Machine Learning using UDFs

Hey All,

Has anyone considered using Vertica for data science? Vertica 8.1 does have some good built-in features for analytics and modeling, but from a customization perspective, it is limited to what it's built to do.

  1. Is writing UDFs a solution to that? Can we write different kinds of classification, regression, and association learning models in UDFs and save our models? If yes, can someone point me to an example? To be clear, I mean custom models, not the predefined Vertica modeling functions.

  2. Can we achieve parallel processing gains through UDFs without a partition by clause? A simple example: I want to perform market basket analysis using the standard algorithm found on the web. The input is a transactional data set listing the items sold in every order. Let's say this transactional set is huge (many orders). If I try to run this using UDFs in R, it takes a very long time even for a small number of rows. However, I can run the same function on my local R machine in less than 15 seconds. I cannot partition this data since I need it to be interpreted as one complete dataset.

Any suggestions would be helpful!
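
To make the second question concrete, here is a minimal sketch of the pair-counting step of a market basket analysis, written in plain Python outside of Vertica. It is not the R UDF from the post and not a Vertica API; the function names `pair_counts` and `frequent_pairs` and the sample orders are made up for illustration. One thing it shows is that support counts are additive: as long as each order stays whole inside a single subset, counts computed per subset can be summed to give the same result as one pass over the full dataset. Whether that translates into a speedup inside a Vertica UDF is not tested here.

```python
from collections import Counter
from itertools import combinations


def pair_counts(orders):
    """Count how many orders contain each unordered pair of items."""
    counts = Counter()
    for items in orders:
        # Deduplicate items within an order and sort them so that
        # ("bread", "milk") and ("milk", "bread") map to the same key.
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1
    return counts


def frequent_pairs(counts, n_orders, min_support):
    """Keep pairs whose support (fraction of orders) is at least min_support."""
    return {
        pair: count / n_orders
        for pair, count in counts.items()
        if count / n_orders >= min_support
    }


# Hypothetical sample data: one list of items per order.
orders = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]

# Counting over the whole dataset in one pass ...
whole = pair_counts(orders)

# ... matches summing the counts from two disjoint subsets of orders.
split = pair_counts(orders[:2]) + pair_counts(orders[2:])

frequent = frequent_pairs(whole, len(orders), min_support=0.5)
```

In this sketch the support threshold is applied only after the per-subset counts have been merged, since a pair that is rare in one subset can still be frequent overall.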

