Vertica, filesystems, and safety
Understood "ext4 is recommended" and that xfs may work but testing is less robust and upon any issues, Vertica may have to push back to Redhat or other OS vendor to address issues with non-ext4 filesystems.
With those caveats stated and acknowledged, some questions:
1) Does Vertica expect ext4 to be mounted in journalling mode?
2) If so, using the ext4 default of data=ordered? (A way to check the actual mount options is sketched right after this list.)
3) Does Vertica rely on direct IO (IO to disk that bypasses the OS filesystem cache, typically via the O_DIRECT flag; see the second sketch after this list) for high performance?
4) Other than "vioperf", has Vertica yet identified any performance tools for gathering metrics?
5) Other than "minimum 20 MB/s per core", does Vertica assert any other guidelines for storage configuration?
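For questions 1 and 2, here is a minimal sketch of how I'd verify locally what a data directory is actually mounted with; the /data/vertica path is just a placeholder, and this says nothing about what Vertica itself expects:

```c
/* Sketch only: check which filesystem and mount options back a directory.
 * This just reads /proc/mounts; it says nothing about what Vertica requires.
 * "/data/vertica" is a placeholder path. */
#include <mntent.h>
#include <stdio.h>
#include <string.h>

int main(void) {
    const char *data_dir = "/data/vertica";          /* placeholder */
    FILE *mounts = setmntent("/proc/mounts", "r");
    if (!mounts) { perror("setmntent"); return 1; }

    char fstype[64] = "", opts[512] = "";
    size_t best = 0;                                 /* longest matching mount point */
    struct mntent *m;
    while ((m = getmntent(mounts)) != NULL) {
        size_t len = strlen(m->mnt_dir);
        if (len > best && strncmp(data_dir, m->mnt_dir, len) == 0 &&
            (len == 1 || data_dir[len] == '/' || data_dir[len] == '\0')) {
            best = len;
            snprintf(fstype, sizeof fstype, "%s", m->mnt_type);
            snprintf(opts, sizeof opts, "%s", m->mnt_opts);
        }
    }
    endmntent(mounts);

    printf("filesystem:    %s\n", fstype);
    printf("mount options: %s\n", opts);
    printf("data=ordered:  %s\n",
           strstr(opts, "data=ordered") ? "explicitly set"
                                        : "not listed (may still be the ext4 default)");
    return 0;
}
```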
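And to clarify what I mean by direct IO in question 3, a generic Linux sketch follows; this is not Vertica's actual I/O path, the file path is a placeholder, and 4 KiB alignment is assumed:

```c
/* Sketch only: what "direct IO" means here -- open with O_DIRECT so the read
 * bypasses the OS page cache. Buffer, offset, and size must be aligned
 * (4 KiB assumed below). Generic Linux usage, not Vertica's actual I/O path;
 * the file path is a placeholder. */
#define _GNU_SOURCE                      /* exposes O_DIRECT */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void) {
    const size_t align = 4096;           /* assumed logical block size */
    const size_t size  = 1 << 20;        /* 1 MiB transfer */

    void *buf = NULL;
    if (posix_memalign(&buf, align, size) != 0) return 1;   /* aligned buffer */

    int fd = open("/data/vertica/somefile", O_RDONLY | O_DIRECT);
    if (fd < 0) { perror("open(O_DIRECT)"); free(buf); return 1; }

    ssize_t n = read(fd, buf, size);     /* served from disk, not the page cache */
    if (n < 0) perror("read");
    else printf("read %zd bytes via O_DIRECT\n", n);

    close(fd);
    free(buf);
    return 0;
}
```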
Anyone with answers, insights, or simply comments is thanked in advance!
Answers
Ross -
Before I answer your questions directly, keep in mind that economics make more of a difference at scale. Vertica is very flexible (it runs in many environments, even a VM on my 2-core laptop), and deployments range from very small to very large.

Insufficient disk throughput means CPUs don't get fully utilized, which makes operations take longer and therefore camp on memory for more time - so memory may be filled, but not well utilized. Also, inconsistencies between "buddy" nodes due to file system issues may be recoverable [in some cases with intervention], but even automatic recovery or infrequent intervention impacts overall throughput. The odds of at least one server being impacted increase as the number of nodes goes up, while a 3-to-5-node cluster may go months or years between hardware failures - just statistics.

At small scale, having all the Vertica features can be more important than getting optimal CPU and memory utilization. At large scale, where CPU and memory cost a lot more than storage (especially spinning disk and/or object storage), minimizing interventions as node count increases becomes more important. Also, larger clusters tend to serve mixed workloads and attract a more varied user community to the data, where again maximizing CPU and memory utilization via sufficient storage throughput avoids surprises. Therefore, how much you have to optimize depends on your intended scale.
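As a rough illustration of the "20 MB/s per core" guideline you mentioned, here is a back-of-the-envelope sketch only, not a substitute for vioperf; the path and 1 GiB file size are placeholders, and a meaningful test should use a file much larger than RAM so the page cache isn't what gets measured:

```c
/* Back-of-the-envelope sketch only -- NOT a substitute for vioperf.
 * Times a sequential write and compares the result to cores x 20 MB/s.
 * The path and 1 GiB size are placeholders; a meaningful test needs a file
 * much larger than RAM (or O_DIRECT) so the page cache isn't what's measured. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

int main(void) {
    const size_t chunk = 1 << 20;                    /* 1 MiB per write */
    const size_t total = (size_t)1024 * chunk;       /* 1 GiB total (placeholder) */

    char *buf = malloc(chunk);
    if (!buf) return 1;
    memset(buf, 'v', chunk);

    /* O_SYNC forces each write to stable storage before returning. */
    int fd = open("/data/vertica/throughput.tmp",
                  O_WRONLY | O_CREAT | O_TRUNC | O_SYNC, 0644);
    if (fd < 0) { perror("open"); return 1; }

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (size_t done = 0; done < total; done += chunk)
        if (write(fd, buf, chunk) != (ssize_t)chunk) { perror("write"); return 1; }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    close(fd);

    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    double mbps = (total / (1024.0 * 1024.0)) / secs;
    long cores  = sysconf(_SC_NPROCESSORS_ONLN);

    printf("measured:  %.0f MB/s sequential write\n", mbps);
    printf("guideline: %ld cores x 20 MB/s = %ld MB/s minimum\n", cores, cores * 20);
    return 0;
}
```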