I/O throughput "per core" for vioperf on VMs
I have heard of several instances where customers and consultants run vioperf and see well over 100 MB/s of write and read disk throughput, yet because their Vertica VMs have 8 or more vCPUs they fail to meet the documented recommended minimum of 20 MB/s "per core". Obviously, in the virtual world even the term "per core" is ambiguous, since we are talking about virtual CPUs here. The end user often has no idea how many physical cores are on the underlying server, whether HT is enabled, etc.
The question is this: when vioperf comes back at or just under the minimum, is the RIGHT thing to do to tell the customer to reduce the vCPU count for the VM? In other words, if they get 18 MB/s "per core" write on 8 vCPUs (144 MB/s actual disk speed), should we advise that they simply reboot with 4 vCPUs and rerun the test? Sure, this will pass the vioperf threshold, but we have not increased the disk speed and we have reduced the overall resources available on the node(s). Are we THAT sure our vioperf "per core" threshold truly represents how well Vertica works in a virtualized environment?
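To make the arithmetic in the question concrete, here is a minimal sketch. It assumes the "per core" figure is simply total disk throughput divided by the vCPU count, which is how the numbers above work out; the function name and the constant are made up for illustration and are not part of vioperf.

```python
# Hypothetical helper: models "per core" as total throughput / vCPU count.
# This is an assumption based on the numbers in the question, not vioperf's code.
DOCUMENTED_MIN_MB_S_PER_CORE = 20  # the documented minimum quoted above


def per_core_throughput(total_mb_s: float, vcpus: int) -> float:
    """Return the per-vCPU share of a fixed total disk throughput."""
    if vcpus <= 0:
        raise ValueError("vcpus must be positive")
    return total_mb_s / vcpus


total_mb_s = 18 * 8  # 18 MB/s "per core" on 8 vCPUs = 144 MB/s actual disk speed

for vcpus in (8, 4):
    rate = per_core_throughput(total_mb_s, vcpus)
    verdict = "pass" if rate >= DOCUMENTED_MIN_MB_S_PER_CORE else "fail"
    print(f"{vcpus} vCPUs: {rate:.1f} MB/s per core ({verdict})")
```

The disk delivers the same 144 MB/s in both rows; only the divisor changes, which is exactly the concern raised in the question.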
Best Answer
Jim_Knicely - Administrator
From my experience... in a virtual environment, make sure to run vioperf on every node at the same time. Most likely everything is shared in the VM (i.e., compute and storage), so you need to measure performance simultaneously.
For best performance of Vertica, you want 40-60 MB/s per core on each node. Again, from my experience, a VM offers 20% less performance than physical hardware.
In your case, you have a limited I/O pipe. You have to limit how many cores can write/read!
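Read the other way around, the advice above amounts to asking how many vCPUs a fixed I/O pipe can feed at a given per-core rate. The sketch below works that out for the 144 MB/s example from the question. It assumes the 40-60 per core figure is in MB/s and that throughput splits evenly across vCPUs; the function name is hypothetical.

```python
import math


def max_vcpus_for_target(total_mb_s: float, target_mb_s_per_core: float) -> int:
    """Largest vCPU count whose even share of total_mb_s still meets the target.

    Assumes throughput divides evenly across vCPUs (a simplification).
    """
    if target_mb_s_per_core <= 0:
        raise ValueError("target must be positive")
    return math.floor(total_mb_s / target_mb_s_per_core)


total_mb_s = 144  # measured disk throughput from the question

# 20 is the documented minimum; 40 and 60 are the range suggested in this answer.
for target in (20, 40, 60):
    print(f"{target} MB/s per core -> at most {max_vcpus_for_target(total_mb_s, target)} vCPUs")
```

Under these assumptions a 144 MB/s pipe supports 7 vCPUs at the 20 MB/s minimum, but only 2 to 3 at the 40-60 MB/s range.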
Answers
Thanks to Jim and others who replied offline for the tips! Understood that faster I/O is always better when possible.
Running vioperf simultaneously on the different nodes is a good tip - especially if the VMs reference the same underlying data store. Even when data stores are separate, there may be other bottlenecks (like the vNIC) that will show up when all VM nodes run vioperf at the same time. Existing documentation suggests allocating Vertica VMs across different servers (anti-affinity) and configuring point-to-point spread as well. ref: https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/SupportedPlatforms/Virtualization.htm
As for whether actually reducing the vCPU count helps - I am still not convinced, because more I/O threads that each run slower would not necessarily deliver less total throughput than fewer threads that each run faster. The total I/O bandwidth available to the VM remains the same. However - the point is well taken that IF a VM is close to the MB per second per core guidance threshold, THEN one should take steps to improve the I/O anyway.