Is Vertica CPU utilization related to the disk space consumed?

Hello,
I have a Vertica cluster deployed on Kubernetes. We are currently seeing high CPU utilization, peaking at 100%, while memory usage peaks at 40%. Total disk utilization is 77% of 11T. Would increasing the disk space reduce CPU utilization and improve the performance of the DB?

Answers

  • moshegmosheg Vertica Employee Administrator
    edited August 19

    While increasing disk space may not directly reduce CPU usage, ensuring that disk I/O is not a bottleneck is important. High disk latency can indirectly affect CPU usage if processes are waiting on I/O operations. Generally, you want to leave about 10-20% free disk space to avoid fragmentation and provide enough free storage for Vertica's TEMP space.
    The TEMP space location holds temporary files created by query execution operations, such as hash joins and sorts, when they need to spill to disk. Such spills can occur during queries, recovery processes, projection refreshes, and other database activities.
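
To put the 10-20% guideline next to the numbers in the question, here is a minimal arithmetic sketch in Python. It assumes "11T" means 11 TB of total capacity and treats the guideline as two thresholds; the function name and its parameters are hypothetical and not part of any Vertica tooling.

```python
def free_space_report(total_tb, used_fraction, min_free=0.10, max_free=0.20):
    """Compare free disk space against a free-space guideline.

    total_tb and used_fraction come from the question (11T, 77% used);
    min_free and max_free encode the 10-20% guideline from the answer.
    """
    free_fraction = 1.0 - used_fraction
    free_tb = total_tb * free_fraction
    return {
        "free_fraction": free_fraction,
        "free_tb": free_tb,
        # True when at least the lower bound of the guideline is free
        "meets_minimum": free_fraction >= min_free,
        # True when at least the upper bound of the guideline is free
        "meets_upper_guideline": free_fraction >= max_free,
    }


report = free_space_report(total_tb=11, used_fraction=0.77)
print(f"free: {report['free_fraction']:.0%} (~{report['free_tb']:.2f} TB)")
print("meets 10% minimum:", report["meets_minimum"])
print("meets 20% guideline:", report["meets_upper_guideline"])
```

With 77% used, roughly 23% (about 2.53 TB) is still free, so by this measure the cluster already sits inside the guideline, which is consistent with the point that adding disk space alone is unlikely to lower CPU usage.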

    Please consider the following to troubleshoot the issue:

    1. Vertica provides validation utilities that help determine whether your CPU configuration, I/O, and network can properly handle the workload, and that help troubleshoot performance issues:
      vcpuperf - a CPU performance test used to verify your CPU performance.
      vioperf - an Input/Output test used to verify the speed and consistency of your hard drives.
      vnetperf - a Network test used to test the latency and throughput of your network between hosts.
      See: https://docs.vertica.com/latest/en/setup/set-up-on-premises/install-using-command-line/validation-scripts

    2. Does your system utilization reach 100% CPU utilization during off-hours?
      High system utilization does not necessarily indicate a problem unless you are constantly at 100% utilization or your canary queries show high elapsed times that exceed your users' acknowledged SLA.
      Canary queries are SQL statements run at regular intervals to monitor system performance.
      See: https://www.linkedin.com/pulse/how-monitor-opentext-vertica-canary-queries-moshe-goldberg

    3. Does the high system utilization relate to a specific workload, such as long-running queries, high concurrency, or heavy ETL processes?
      Try to isolate one or more of the suspected contributors. Once you have a clearer picture of the culprit, consider controlling it via Vertica resource pools or a specific subcluster.
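
The canary-query idea from item 2 can be sketched roughly as follows. This is only an illustration, not the method from the linked article: `run_canary`, `monitor`, and the `run_query` callable are hypothetical names, and in a real setup `run_query` would execute a fixed SQL statement through whatever Vertica client you use (for example the vertica-python package, which is an assumption here). The SLA threshold and interval are placeholder values.

```python
import time


def run_canary(run_query, sla_seconds, clock=time.perf_counter):
    """Run one canary query and report its elapsed time against an SLA."""
    start = clock()
    run_query()  # hypothetical callable that executes the canary SQL
    elapsed = clock() - start
    return {"elapsed": elapsed, "within_sla": elapsed <= sla_seconds}


def monitor(run_query, sla_seconds, interval_seconds, iterations, sleep=time.sleep):
    """Run the canary at a regular interval and collect the measurements."""
    results = []
    for i in range(iterations):
        results.append(run_canary(run_query, sla_seconds))
        if i < iterations - 1:
            sleep(interval_seconds)  # wait before the next sample
    return results


if __name__ == "__main__":
    # Stand-in for a real query so the sketch runs without a database.
    def fake_query():
        time.sleep(0.01)

    for sample in monitor(fake_query, sla_seconds=1.0, interval_seconds=0.1, iterations=3):
        status = "ok" if sample["within_sla"] else "SLA exceeded"
        print(f"{sample['elapsed']:.3f}s {status}")
```

Samples that repeatedly exceed the SLA while CPU is pinned at 100% point to a real problem; high CPU with canaries comfortably inside the SLA usually does not.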

This discussion has been closed.