Optimized Hardware for Vertica

Hello,
I use Vertica to perform all sorts of analytical calculations on large input files (millions to tenths of millions of entries to process).
The hardware I'm using is a PC workstation with Intel i7 Extreme processor, 64GB RAM and 4 512GB SSD drives for the storage which operates as a RAID5 device.
I'm using both single node configuration and triple node configuration.
Each table set relevant for a specific analysis is fully stored on one node (data is not broken between nodes) so there should be no bottleneck related to the network between the workstations.
The current performance I'm getting is not satisfying for my application and I'm searching for ways to improve the results before considering another analytical database.
The queries were already optimized with a help of a certified Vertica expert so the answer is not there, that's why I'm looking at the hardware level solution.
Please let me know what you think about the current hardware I'm using and suggest improvements.

Thanks,
Tal

Comments

  • Still not a single answer?

  • It's hard to say much concrete without more details about the workload. How much concurrency? Do joins spill? Query duration? Disk footprint? What resource is the bottleneck?
    Generically, 64GB of RAM looks a little small compared to other hardware config I've seen. But your problem also looks on the small side (10M rows - I know Vertica databases with 10T+ rows). So tuning may look different at your scale.

  • Jim_KnicelyJim_Knicely Administrator
    edited May 2018

    It's quite obvious. Bad design? Can you share some SQL with explain plans?

  • Hi guys,
    Thanks for the response.
    We do not use joins at all. Concurrency is pretty low (maybe 5-10 users at the same time at extreme cases).
    Query duration takes 5-10 seconds on large input files and can get even higher on some queries.
    Some of the queries are implemented as UDFs.
    As for resources - I used the "top" command and in some cases the CPU reaches above 300%. The RAM utilization is pretty low though (mostly free). The disks are also mostly free.

  • If you want a more traditional % view of seeing CPU in top then press 'i' (in TOP) to turn off Irix mode, otherwise it's pretty normal to see Vertica > 100% (with irix mode on ) - I've witnessed the vertica process at 4000% on a busy 16 CORE platform, with UDFs.

    What sort of values are seeing from 'iostat -dx 1' figures are you getting from the data volume raidset ? Look at %util, and r and w values.
    Did you baseline the SSD volume before with vioperf ? Is throughput anywhere in the region of your baseline ?

    For memory check with a free -m, take a note of the 'used' 'free' and 'cached' values.

  • Hi, I checked the performance of the SSDs using vioperf.
    The results are ~250MB/s for write and ~1000MB/s for read (using RAID5 on 4 SSDs of 500GB).
    Based on this performance, do you think this can explain lower performance?

  • Jim_KnicelyJim_Knicely Administrator

    What does vioperf report for "counter value/core"?

    Minimum and Recommended I/O Performance

    • The minimum required I/O is 20 MB/s read/write per physical processor core on each node, in full duplex (reading and writing) simultaneously, concurrently on all nodes of the cluster.
    • The recommended I/O is 40 MB/s per physical core on each node.
    • The minimum required I/O rate for a node with 2 hyper-threaded six-core CPUs (12 physical cores) is 240 MB/s. Vertica recommends 480 MB/s.

    For example, the I/O rate for a node with 2 hyper-threaded six-core CPUs (12 physical cores) is 240 MB/s required minimum, 480 MB/s recommended.

    See:
    https://my.vertica.com/docs/9.1.x/HTML/index.htm#Authoring/InstallationGuide/scripts/vioperf.htm

  • Jim_KnicelyJim_Knicely Administrator

    Also, you said that you running on a PC workstation. How did you install Linux? In a VM or is it the primary OS? Also, I noticed in your first post you wrote:

    Each table set relevant for a specific analysis is fully stored on one node (data is not broken between nodes) so there should be no bottleneck related to the network between the workstations.

    How are you doing this? Replicating each table? Even the big ones? If so, you're circumventing Vertica's "massively parallel processing (MPP), shared-nothing architecture, in which the query processing workload is divided among all nodes of the Vertica database."

    See:
    https://my.vertica.com/docs/9.1.x/HTML/index.htm#Authoring/InstallationGuide/BeforeYouInstall/platformhardwarereqts.htm

    When using a 3 node configuration, have you implemented Native Load Balancing? This feature helps spread the CPU and memory overhead caused by client connections across the hosts in the database

    See:
    https://my.vertica.com/docs/9.1.x/HTML/index.htm#Authoring/AdministratorsGuide/LoadBalancing/AboutNativeConnectionLoadBalancing.htm

  • Hi,
    Here are some answers to your questions:

    • vioperf report for "counter value/core" - ~30 for write and ~125 for read.
    • How did you install Linux? - directly on the hardware (no VM). We're using CentOS (latest).
    • How are you doing this? Replicating each table? Even the big ones? -
    1. We did not implemented node load balancing
    2. The reason we prefer to keep our data in a single node is because:
      a. A single analysis table is not big enough to be broken into multiple nodes
      b. We have a lot of non-parallel calculations (using analytic / transformation functions)

    We also done some performance tests and indeed keeping the data in one node performed better.

  • Jim_KnicelyJim_Knicely Administrator

    Your VIOPERF numbers seem fine.

    What is the size of your DB? I'm guessing it's relatively small thus your performance on a single node would be better than introducing the complexity of a cluster of nodes.

  • We currently using up to 1TB for the DB.
    So you say we should avoid using the 3-node cluster and use a single node instead to improve performance?

  • Jim_KnicelyJim_Knicely Administrator

    I was referring to your comment:

    We also done some performance tests and indeed keeping the data in one node performed better.

    I'm still not sure what that means. If you have a 3 node cluster, with a K-Safety set to 1, you can't have all of the data (i.e. segments) on a single node.

    By one node, I thought you meant you have 1 TB on one node. That single node "might" perform better than 3 nodes (assuming you replicated the data) on a single query.

    If you have multiple nodes you'll want to be sure that you've set up your table projections properly so that your joins occur locally on each node (i.e. replicate dimension tables, segment fact tables).

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file