Query performance for non-depot data

J_Kelley
edited April 2020 in General Discussion

So the depot is intended to keep low-latency queries fast. How much of a performance impact is there if the data is not already stored in the depot?
Migrating your Vertica Cluster to the Cloud
@SumeetAgrawal @Chris_Daly_HPE

Answers

  • dsprogis (Employee)

    The performance penalty for a depot miss depends on the I/O speed of communal storage, the concurrent load on communal storage, and network bandwidth.
    On AWS, for the simplest of queries against a single table, we have seen a difference of roughly 10x (the query takes about 10 times as long to run).
    On Pure Storage FlashBlade with a dedicated network for inter-node communication, for the simplest of queries we have seen a 30% performance impact (the query takes 30% longer to run). While this 30% penalty seems low, it compounds quickly when multiple joins are involved.
    Whatever platform you choose, we strongly recommend running a POC to measure the performance difference against your own schema and queries, and using the results to inform your hardware configuration.
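
For the POC, one way to put a number on the depot-miss penalty is to time the same query cold (data fetched from communal storage) and warm (data already in the depot) and compare medians. The sketch below is only an illustration: `time_runs`, `depot_miss_penalty`, and the `run_query` callable are hypothetical names, not part of any Vertica client library, and forcing a cold run (for example by clearing the depot beforehand) is left to the caller.

```python
import statistics
import time
from typing import Callable, List, Sequence


def time_runs(run_query: Callable[[], object], repeats: int = 5) -> List[float]:
    """Return wall-clock seconds for each of `repeats` calls to run_query.

    `run_query` is a hypothetical caller-supplied function that executes
    one query through whatever database client you use.
    """
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        timings.append(time.perf_counter() - start)
    return timings


def depot_miss_penalty(cold: Sequence[float], warm: Sequence[float]) -> float:
    """Ratio of median cold (depot miss) time to median warm (depot hit) time."""
    return statistics.median(cold) / statistics.median(warm)
```

On this scale, a result near 10 would match the AWS figure quoted above and a result near 1.3 would match the FlashBlade figure. Using the median rather than the mean keeps a single slow run from skewing the comparison.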
