ANALYZE_STATISTICS on single node cluster

We have a single node Vertica server, and we had never run ANALYZE_STATISTICS on it before. For a testing, I ran ANALYZE_STATISTICS, but I didn't see any performance improvements (virtually no changes) on our queries after ANALYZE_STATISTICS. I know running ANALYZE_STATISTICS is supposed to improve the system performance, but not sure why it did not improve for us. Could it be our queries, our data or that we are running on a single node?

Comments

  • Hi May, What ANALYZE_STATISTICS() specifically does, the thing with the biggest impact at least, is give the optimizer more information when deciding how to optimize nontrivial joins. (For example, if you're joining ten tables, which join do we perform first in order to get down to the smallest data set as quickly as possible?) If you only have very straightforward joins, probably Vertica was already able to optimize them without a separate call to ANALYZE_STATISTICS(). (Recent versions of Vertica do actually track a few basic statistics automatically without this function call; they can be sufficient for simple cases.) The single-node bit shouldn't matter; we still do joins and still care about the join order. Have you run Vertica's Database Designer? That tool does a heavier-duty optimization that can speed up a much wider variety of queries. Adam
  • Thanks, Adam! That really helps!

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file