How to configure Vertica EON mode for NetApp ONTAP
jordzilla
Vertica Customer ✭
We have a requirement of installing Vertica 12 in EON mode on a NetApp ONTAP (NOT StorageGrid) 9 based environment.
Does anyone have any experience or suggestions on how to install Vertica 12 using S3 communal storage?
Comments
Hi,
NetApp ONTAP has not been validated with Vertica yet, so there is no specific information on whether it is compatible with Vertica, nor is there any performance data to indicate whether it performs well. And finally, there's no guide to setting up and configuring the ONTAP side for Vertica EON mode communal storage. We have had recent talks with NetApp about ONTAP and how it compares to StorageGRID, but to date they have not run the Vertica benchmark tests to validate that it is API- and performance-compatible.
Regards,
Hi @s_crossman ,
Thank you for the response.
We did find this article on NetApp's website regarding connecting to the ONTAP 9 S3 provider.
https://docs.netapp.com/us-en/ontap/s3-config/enable-client-access-from-s3-app-task.html
Maybe it will help?
Kind regards,
J.
Hi @s_crossman ,
You mentioned the Vertica benchmark tests have not been run yet. Is that a stand-alone tool we can provide to our customer so they can at least confirm locally that it will work?
Thank you,
J.
Hi,
Regarding the link to the ONTAP S3 article, it is helpful in the context of ONTAP but not for Vertica. NetApp tried to run our API test on ONTAP back in 2021 and it failed. They stated that they have since made improvements to the API, but they haven't rerun the API test to confirm the original issues were resolved. Once that's confirmed, the benchmark testing has to be run to ensure that the APIs work in a simulated customer environment with various data sizes and concurrent loads. The benchmark tests are run on a very specific Vertica hardware setup, and the results are compared with those we captured in AWS using S3. The object storage numbers have to meet or exceed the AWS numbers for a validation pass.
Regarding the test suite, Vertica works with object storage partners to validate their S3 object stores. We also build a technical relationship with them so once validation is complete we can put collateral together about specific nuances of the pairing, tuning info, best practices, observations, etc. We also establish a support relationship that allows us to properly support customers who implement the two products in their environment.
To date we have not sent the suite to any customers, as the validation process requires experts on both the Vertica and object storage sides to run, diagnose, tune, rerun, etc. The object storage partners know their S3 storage better than we do, and we know Vertica better than they do. So it takes a joint effort to get the validation environment set up, the tests run, and the validation completed. It is not something that would work as well in a customer environment, especially if the two tech partners haven't seen the combination together yet.
I would recommend contacting your NetApp account rep and letting them know you have an interest in running Vertica with NetApp ONTAP. The technical folks at NetApp have the latest test suite, docs, and such. I believe they were holding off on validation until they got some indication that there was potential customer business to justify allocating resources to it.
Regards,
Thank you so much @s_crossman ! I appreciate the time and details. We will follow up with our customer to see if they can move this forward via their NetApp account rep.
No problem. I passed some info related to this to my contacts at NetApp, so they are aware they may get a request from a customer. It might at least plant a seed to start planning out the validation testing.
So, quick update.
The customer is on NetApp ONTAP 9.11.1P7.
We were able to configure the S3 storage provider with certs.
We successfully tested using a couple of utilities (s3cmd and mc) on the same server we installed Vertica (12.1) on.
For example, the MinIO mc command can fully connect and authenticate, and can add and remove files from the target bucket; a quick sketch of those checks follows.
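Roughly, the mc checks looked like this (the "ontap" alias name, endpoint, credentials, and bucket name here are placeholders, not our real values):

mc alias set ontap https://ip:port ACCESS_KEY SECRET_KEY   # register the ONTAP S3 endpoint
mc cp ./testfile.txt ontap/bucketname/testfile.txt         # PUT an object
mc ls ontap/bucketname                                     # list the bucket contents
mc cat ontap/bucketname/testfile.txt                       # read the object back
mc rm ontap/bucketname/testfile.txt                        # clean up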
But when we run create_db, we can see that Vertica is able to connect to the ONTAP S3 endpoint and actually create the test file in the target bucket, but it is then unable to read it back (it gets zero bytes instead of the 44 bytes of data in the file), and the run ends. And, we did ask them to try to get their NetApp rep to reach out to all y'all for help too. :-)
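For reference, a minimal sketch of the kind of create_db invocation and auth file we are using, with the bucket, node names, paths, shard count, and credentials replaced by placeholders (awsauth, awsendpoint, and awsenablehttps are the parameters Vertica documents for on-premises S3 communal storage):

# auth_params.conf (placeholder values)
awsauth = ACCESS_KEY:SECRET_KEY
awsendpoint = ip:port
awsenablehttps = 1

# create the EON mode database against the ONTAP bucket
admintools -t create_db -d verticadb -p 'password' \
  -s vnode01,vnode02,vnode03 \
  --communal-storage-location=s3://bucketname/verticadb \
  --depot-path=/home/dbadmin/depot \
  --shard-count=3 \
  -x auth_params.conf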
But, from their perspective, it is now something on the Vertica side.
(And yes, we know, it's not an officially supported platform.)
Still, any clues or thoughts on where we might look would be appreciated.
Hi,
My guess is that the failure is during the bootstrap routine of create_db, where it writes a test file and then immediately reads it back for size info to confirm it was written correctly. The bootstrap log file may have more details, although it may not show a failure as such, just that it successfully read 0 bytes.
We saw this with one other storage partner's object store. They had to fix their consistency level: Vertica expects immediate (read-after-write) consistency rather than eventual consistency. If the S3 object store takes too long to complete the object write, an immediate GET after the PUT behaves as if the file doesn't exist or is 0 bytes.
Here's a link that covers AWS S3 consistency model, which Vertica requires.
https://aws.amazon.com/s3/consistency/
Because we haven't qualified ONTAP I'm not sure if they follow the S3 consistency model or not.
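One quick manual check, sketched here with placeholder endpoint, profile, and bucket values, is to PUT an object and immediately HEAD it to see whether the reported size is correct right away:

echo "1234567890123456789012345678901234567890123" > rw_access_test.txt   # 44 bytes including the newline
aws s3 cp rw_access_test.txt s3://bucketname/rw_access_test.txt --endpoint-url http://ip:port --profile profname
# a strongly consistent store reports 44 immediately; an eventually consistent
# store may return a 404 or a stale size until the write settles
aws s3api head-object --bucket bucketname --key rw_access_test.txt --endpoint-url http://ip:port --profile profname --query ContentLength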
Hi,
I dug up a script that we used with the other object store that was exhibiting consistency issues. Below are the details, plus some good and bad output. It may help confirm this is what you are running into. My suspicion is that since the S3 API in ONTAP is fairly new, it may not support strong read-after-write consistency, or there may be knobs that need to be adjusted to support it. I reached out to my NetApp contacts with that question just to confirm.
The following script loops over just the operations that are key to the issue observed during create_db, namely a PUT followed by a bucket listing:
• each run is a sequence of PUT, delay x, bucket listing, DELETE
• x = .1 up to .9 seconds
• 100 times each
• 0 means bucket listing had the object listed
• 1 means bucket listing was empty
• at the end of each line, the operations that resulted in a consistent bucket listing are summed up and shown as a percentage.
The script requires the AWS CLI to be installed and an AWS CLI profile configured for the object store. Basically, the script does a cp, ls, and rm of a test file that's the same size as the one Vertica uses.
** before running you'll need to:
replace the "ip:port", "profname", and "bucketname" refs
create an rw_access_test.txt file containing 44 bytes of info, e.g. echo "1234567890123456789012345678901234567890123" > rw_access_test.txt (43 characters plus echo's trailing newline)
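For example, the CLI profile can be set up like this (the profile name and endpoint are placeholders; the test file itself is created with echo as above):

aws configure --profile profname                             # prompts for the access key, secret key, and region
aws s3 ls --endpoint-url http://ip:port --profile profname   # sanity check: the target bucket should be listed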
for _delay in $(seq 1 1 9); do
  echo -n "Delay .$_delay: "; __sum=0
  for x in $(seq 1 1 100); do
    # PUT runs in the background, so the delay is measured from the start of the write
    (aws s3 --endpoint-url http://ip:port --profile profname cp ./rw_access_test.txt s3://bucketname/rw_access_test_${x}.txt &> /dev/null &)
    sleep .$_delay
    # bucket listing: egrep -q exits 0 if the object is listed, 1 if it is missing
    aws s3 ls --endpoint-url http://ip:port --profile profname s3://bucketname 2> /dev/null | egrep -q "rw_access_test_${x}.txt"
    __return=$?
    echo -n $__return; ((__sum += $__return))
    aws s3 rm s3://bucketname/rw_access_test_${x}.txt --endpoint-url http://ip:port --profile profname &> /dev/null
  done
  echo -n " : "
  # __sum is the number of misses out of 100 runs; convert to a consistency percentage
  echo "100 - ($__sum * 100 / 100)" | bc | tr -d "\n"
  echo "%"
done
BAD object store with no load, which was failing the create_db bootstrap read/write access test due to eventual vs. strong consistency:
Delay .1: 1111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111 : 0%
Delay .2: 1111111111111111111111111111111111111111111111111111111111111011111111111111111111111111111111111111 : 1%
Delay .3: 1111111111111111111111111111111111011111111111111111101011111111111011111111111011111111111111111111 : 5%
Delay .4: 1110110111111111110111111111011111111111110111101111111111111111111111101111011111111111111111110111 : 9%
Delay .5: 0111110000101001101111010001011011100011001011111111111001001111111110111111101111100100111110110011 : 34%
Delay .6: 0011101111000010111010000100010110100000001110010010001100110000011110010101011000111011001101101111 : 52%
Delay .7: 0000111001100000111100000000110100001000000000100000101100110001001000000000110001001000000001000011 : 72%
Delay .8: 1000000000010000000001000000000000000000000000010000000100000000000000000000011000001100000000000000 : 91%
Delay .9: 0000000000000000000000000000000000000000000000000000000000000001000000000000000000000000000000000010 : 98%
Good in-house Pure system, which is tuned but has a heavy load on it most of the time:
Delay .1: 1000011001010010001001001011010000010000000111010001010000010100001010001101111100000101010001000110 : 64%
Delay .2: 0000000000000000000000010010010000001000010000100000000001001010000000000000000110000000100000000110 : 86%
Delay .3: 0000001000000000000000000001100000000000000000000000001000100100001000000000010000000000000100100000 : 90%
Good in-house MinIO, untuned, on a single VM host with no additional load:
Delay .1: 1011100001001011010010001011000110010110110010001011010011100011011100011100101101100001000101110111 : 51%
Delay .2: 0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 : 100%
Delay .3: 0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 : 100%
The output is a visualization of the consistency of the bucket listings. In this case, the good object stores returned a high success percentage (strong consistency) with minimal delay, while the bad object store required a much longer delay for a good probability of success (eventual consistency).
Just checking back to see if that script I sent was used, and if so, whether it helped confirm that this might be an eventual vs. strong consistency issue. My NetApp contact mentioned that the S3 API support in ONTAP has evolved quite a bit over the last year or so, and that the ONTAP version and platform might be important for narrowing down which level of compatibility they have.
Hi @s_crossman ,
Thank you for providing the script and checking back in.
We provided the link to this thread, and they said they shared it with their NetApp rep.
They confirmed that they could not move off of ONTAP (to StorageGrid) as they are using features that are not available in the latter.
So, hopefully, NetApp will be reaching out and working with you and Vertica to get ONTAP's S3 stack certified to use with V12.
Did your NetApp contact mention which specific version of ONTAP they felt should be compatible?
Hello @fatimachoudary1, as per the information shared by @s_crossman, NetApp ONTAP has not been validated with Vertica yet. Therefore, we don't have any information regarding its performance or functional compatibility with Vertica.
Hello @s_crossman @VikasGarg
We also have one, if not multiple, customers who are requesting ONTAP S3 validation. While the StorageGrid validation is great, it isn't always the best fit for every configuration. I will be asking our account team to add us to the list of customers asking for ONTAP certification. Thank you. S.
Hello @s_cipresse, please have your account team contact me through email (Jawahar A). Thanks.