Vertica, filesystems, and safety

Ross (Community Edition User)

Understood "ext4 is recommended" and that xfs may work but testing is less robust and upon any issues, Vertica may have to push back to Redhat or other OS vendor to address issues with non-ext4 filesystems.
With those caveats stated and acknowledged, some questions:

1) Does Vertica expect ext4 to be mounted in journaling mode?
2) If so, using the ext4 default of data=ordered?
3) Does Vertica rely on direct I/O (I/O to disk that bypasses the OS filesystem cache, typically using the O_DIRECT flag) for high performance?
4) Other than "vioperf", has Vertica identified any performance tools for gathering metrics?
5) Other than "minimum 20 MB/s per core", does Vertica offer any guidelines for storage configuration?

Anyone with answers, insights, or simply comments is thanked in advance!

Answers

  • Ross -

    • EXT4 has the largest install base and cumulative time in production
    • XFS is the most common on new installations
    • Both are robust file systems

    Before I answer your questions directly, keep in mind that economics matter more at scale. Vertica is very flexible (it runs in many environments, even in a VM on my 2-core laptop) and deployments range from very small to very large. Insufficient disk throughput means the CPUs are not fully utilized, so operations take longer and hold on to memory for more time - memory may be filled, but not well utilized. Also, inconsistencies between "buddy" nodes due to file system issues may be recoverable [in some cases with intervention], but even automatic recovery, or the occasional intervention, reduces overall throughput. The odds of at least one server being impacted increase as the number of nodes goes up, while a 3 to 5 node cluster may go months or years between hardware failures - just statistics.

    At small scale, having all the Vertica features can be more important than getting optimal CPU and memory utilization. At large scale, where CPU and memory cost a lot more than storage (especially spinning disk and/or object storage), minimizing interventions as node count increases becomes more important. Larger clusters also tend to serve mixed workloads and attract a more varied user community to the data, where again maximizing CPU and memory utilization via sufficient storage throughput avoids surprises. So how much you have to optimize depends on your intended scale.

    1. EXT4 journal mode is not required; the default ordering of file system data vs. metadata operations is sufficient
    2. EXT4 defaults are "standard", see also: https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/UsingVerticaOnAzure/ConfiguringStorage.htm (a quick way to confirm which options a mount is actually running with is sketched below this list)
    3. Vertica takes advantage of the OS cache. That's why the hardware guides suggest setting the RAID card to 80-90% write caching, letting the OS take care of read caching. The OS typically has access to much more memory than a RAID card, so this scales well
    4. "vioperf" is designed to be a standard measure for comparing throughput vs expected results across the Vertica install base. If you're tuning for absolutes, any of the more generic storage benchmarks may help you optimize RAID controller and OS settings before you re-run vioperf
    5. Obviously durability is the most important storage requirement, but having "buddy" nodes makes many node crashes automatically recoverable. Second priority is throughput, but with a nod to economics - if you invest in lots of servers full of high-TDP CPUs, and lots of memory bandwidth to keep those cores fed, then 80-100 MiB/s per core may be appropriate for optimal utilization (a quick sizing calculation follows below). If your use case is more about ease of data integration and other Vertica features, then you will be just as happy at 40% CPU utilization as at 90%, and lower storage throughput won't be an issue
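
    For questions 1 and 2, an easy way to confirm what the data directory's filesystem is actually mounted with is to read /proc/mounts. The sketch below is only an illustration, not a Vertica utility; the /data/vertica path is a placeholder, and note that recent kernels omit data=ordered from the option list when it is the ext4 default.

    ```python
    #!/usr/bin/env python3
    """Minimal sketch: report filesystem type and mount options for a directory.

    Assumes Linux with a readable /proc/mounts. DATA_DIR is a placeholder for
    your Vertica data/catalog location, not a Vertica default.
    """
    import os

    DATA_DIR = "/data/vertica"  # placeholder path


    def mount_entry_for(path):
        """Return (device, mountpoint, fstype, options) for the longest matching mount."""
        real = os.path.realpath(path)
        best = None
        with open("/proc/mounts") as f:
            for line in f:
                device, mountpoint, fstype, options = line.split()[:4]
                prefix = mountpoint.rstrip("/") + "/"
                if real == mountpoint or real.startswith(prefix):
                    if best is None or len(mountpoint) > len(best[1]):
                        best = (device, mountpoint, fstype, options)
        return best


    entry = mount_entry_for(DATA_DIR)
    if entry:
        device, mountpoint, fstype, options = entry
        print(f"{DATA_DIR} -> {mountpoint} [{fstype}] options: {options}")
        if fstype == "ext4" and "data=" not in options:
            # Newer kernels hide the default journaling option, which is data=ordered.
            print("no explicit data= option; ext4 default (data=ordered) is in effect")
    ```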
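
    For question 5, the per-core guideline turns into a large aggregate number quickly, which is where the economics come in. A back-of-the-envelope calculation (the node and core counts below are made up, purely for illustration):

    ```python
    # Back-of-the-envelope storage throughput sizing; all inputs are illustrative.
    cores_per_node = 32   # assumed cores per node
    nodes = 6             # assumed cluster size
    mib_per_core = 80     # low end of the 80-100 MiB/s per core figure above

    per_node = cores_per_node * mib_per_core
    cluster = nodes * per_node
    print(f"per node: {per_node} MiB/s (~{per_node / 1024:.1f} GiB/s)")
    print(f"cluster : {cluster} MiB/s (~{cluster / 1024:.1f} GiB/s)")
    # 32 cores x 80 MiB/s = 2560 MiB/s per node, so the storage layer has to be
    # sized alongside CPU and memory if you want those cores fully utilized.
    ```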
