Parallel Read

sanvertica · June 2022

I was wondering whether there is a way to perform parallel reads from Vertica. Say you have a 3-node cluster and want to perform 3 parallel reads, one from each node, or in some cases 2 parallel reads from 3 nodes. The idea is to use either node pruning or segmentation pruning so that each query executes only on the node where the data resides and not on the other nodes.

Node pruning could be done with local_node_name, but it doesn't always work correctly: the result depends on where local_node_name is evaluated during query optimization and on which node you are connected to.

Segmentation pruning could be done by taking the HASH of the primary key on which segmentation is performed and filtering on the hash range that belongs to each node. The issue is that the Vertica optimizer doesn't recognize this optimization: it still runs the query on all nodes, using resources and even processing data just to eliminate rows. I'm not sure why it doesn't look at the range of the HASH int (HASH(a) & 0xffffffff) in the WHERE clause filter to eliminate the segment/node altogether. Currently single-node queries are limited to the equality operator; I wonder why they can't be range-based...
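To make the hash-range idea concrete, here is a rough sketch of how the per-node filters could be generated. It assumes the 32-bit hash space is split into equal contiguous ranges, one per node; the real boundaries should be read from the projection's segmentation definition. The table name `my_table` and the helper names are hypothetical, and the mask is written in decimal (4294967295 = 0xffffffff).

```python
HASH_SPACE = 1 << 32  # number of values produced by HASH(a) & 0xffffffff


def hash_ranges(num_nodes):
    """Split [0, 2**32) into num_nodes contiguous half-open ranges.

    Assumption: segments are equal-width; check the projection
    definition for the actual boundaries on a given cluster.
    """
    bounds = [(i * HASH_SPACE) // num_nodes for i in range(num_nodes + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def range_queries(table, key, num_nodes):
    """Build one SELECT per range, each filtering on the masked hash of key."""
    expr = f"(HASH({key}) & 4294967295)"
    return [
        f"SELECT * FROM {table} WHERE {expr} >= {lo} AND {expr} < {hi}"
        for lo, hi in hash_ranges(num_nodes)
    ]


if __name__ == "__main__":
    # Hypothetical table and segmentation column, 3-node cluster.
    for sql in range_queries("my_table", "a", 3):
        print(sql)
```

Each statement could then be run on its own connection. This only partitions the result set into disjoint slices; as described above, the optimizer may still execute every one of these queries on all nodes.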
