Options

Create a Python UDx to Order a List of Values

Jim_KnicelyJim_Knicely - Select Field - Administrator

User-Defined Extensions (UDxs) are functions contained in external shared libraries that are developed in C++, Python, Java, or R using the Vertica SDK. The external libraries are defined in the Vertica catalog using the CREATE LIBRARY statement. They are best suited for analytic operations that are difficult to perform in SQL, or need to be performed frequently enough that their speed is a major concern.

An engineering buddy of mine, George Jen, recently wrote a cool Python UDx that will sort a string of values alphabetically. He graciously agreed to let me use it as an example showing how simple it is to create a UDx in Vertica!

Example:

dbadmin=> \! cat /home/dbadmin/sortDelimitedString.py
import datetime
import vertica_sdk

class sortDelimitedString(vertica_sdk.ScalarFunction):
    """
    """
    def processBlock(self, server_interface, arg_reader, res_writer):
        while True:
            if arg_reader.isNull(0):
                res_writer.setNull()
            else:
                x = arg_reader.getString(0)
                y = arg_reader.getString(1)
                arr = x.split(y)
                arr = sorted(arr)
                z=y.join(arr)
                res_writer.setString(z)
            res_writer.next()
            if not arg_reader.next():
                break

class sortDelimitedString_factory(vertica_sdk.ScalarFunctionFactory):
    def getPrototype(self, srv_interface, arg_types, return_type):
        arg_types.addVarchar()
        arg_types.addVarchar()
        return_type.addVarchar()
    def getReturnType(self, srv_interface, arg_types, return_type):
        return_type.addVarchar(64000)
    def createScalarFunction(self, srv):
        return sortDelimitedString()

dbadmin=> CREATE or replace LIBRARY sortDelimitedStringlib AS '/home/dbadmin/sortDelimitedString.py' LANGUAGE 'Python';
CREATE LIBRARY

dbadmin=> CREATE or replace FUNCTION sortDelimitedString AS LANGUAGE 'Python' NAME 'sortDelimitedString_factory' LIBRARY sortDelimitedStringlib fenced;
CREATE FUNCTION

dbadmin=> SELECT sortDelimitedString('Z,S,A,X,C,B', ',');
sortDelimitedString
---------------------
A,B,C,S,X,Z
(1 row)

That was easy! Thanks George!

Helpful Links:
https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/ExtendingVertica/UDx/DevelopingUDxs
https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/ExtendingVertica/Python/IntroductionPythonSDK.htm

Have fun!

Comments

Sign In or Register to comment.