Request for HP Vertica C++ SDK Usage Example

Hi. I work for Syncsort Inc., a Vertica Partner. We are developing a high-performance loader from our ETL tool DMX to a Vertica table. We would like to use the HP Vertica C++ SDK to retrieve table metadata, including the column name and column attributes such as data length, character encoding, and date/time components and format. Would it be possible for you to share a simple SDK call sequence that would accomplish this? If you know where I could find other sample SDK programs, that would be useful as well. Thanks.

Comments

  • Hi Thomas, Vertica's SDK examples are automatically installed to /opt/vertica/sdk/examples/ on every node in your cluster. There are additional examples on our GitHub site, http://github.com/vertica/ . It sounds like you are already writing UDxs? If so, you are presumably aware of the getReturnType() method on the UDx Factory class, which you have to override as part of your implementation of the Factory. This method gives you a SizedColumnTypes object. That object contains most of the information that you're looking for. Its methods are documented here: https://my.vertica.com/docs/6.1.x/SDK/html/class_vertica_1_1_sized_column_types.htm It specifically gives you the column name, the data type, and the type information for each column in the input data set. So you have the object; just call the methods that you want on that object to get the data that you want. Many other methods give you this SizedColumnTypes object, either directly or attached to the *Reader and *Writer classes passed to processBlock(). Feel free to review the SDK for details, and to ask if you have specific questions. A few methods give you an ordinary ColumnTypes object, which is similar to SizedColumnTypes except that it has less-specific information about the data type of the input arguments (for example, it doesn't know the size of string columns). With regard to column encoding, I think you want not the SDK, but our system tables. (UDx functions don't operate on projections, which would have an encoding; they operate on data sets, which could come from ROS, WOS, UNIONs of multiple tables with different encodings, subqueries, other functions, etc.) In particular, the PROJECTION_COLUMNS system table provides that information.
Further system tables are documented here: https://my.vertica.com/docs/6.1.x/HTML/index.htm#12288.htm Note that these tables can also provide you with most of the other information that you're looking for, if you're building an external application that can run queries to get at this information. If you want a mechanism that provides you with specific information about the encoding on a particular column of a particular projection, we recommend a view or a SQL macro of some sort. Adam
  • Hi Adam, Thank you for your response. I have been reading the SDK doc, and I think looking at a few code examples will go a long way toward answering my initial questions. I have the Vertica client installed locally on my Windows machine, but it appears that the SDK and associated examples are not part of the client installation. I am currently tracking down our on-site Vertica server installations in the hope of locating the SDK directory there. Going to the system tables to get encoding information makes sense. Is the character data stored in the data sets as UTF-8 or some other universal encoding? Thanks again, Thomas
  • Hi Thomas, For the client drivers, you're correct that they don't come with a copy of the SDK. Vertica is a native Linux program, and SDK plugins are run from within Vertica, so they generally have to be compiled on Linux for Vertica to be able to load and run them. If you run Windows locally, we recommend that you download and develop against the prepackaged Community Edition VM from http://my.vertica.com/ . It comes with the full SDK, and with free third-party programs like VMware Player or VirtualBox you can run it on a Windows computer. Also, I believe you guys are sponsoring our User Conference in a couple of weeks? If you'll be there in person, I know we have the Hackathon on Monday, and hopefully we'll have some stuff there that'd help you with UDx development. All that said: if you want to write a Windows program that uses Vertica, you probably don't want our SDK; you probably want our ODBC or ADO.NET drivers (depending on whether your C++ app is native or managed code). Or JDBC, if you do Java. Both ODBC and ADO.NET are popular general-purpose standards (I suspect you've heard of one or both), and you can find plenty of documentation and examples online if you need them. You can access some Vertica metadata via standard ODBC/ADO.NET function calls; you can also access any/all metadata by running SQL queries against our system tables. Regarding character-data format, our client drivers will give you the string in any of various standard formats. ODBC defaults to a UTF-8 char*; I believe ADO.NET just gives you a generic string object, with an unspecified internal format that you can encode however you want. You can't get at system tables directly from within a UDx, but unless otherwise specified, all server-internal strings are in UTF-8 format. (Though in UDx code, they are NOT necessarily null-terminated; our VString class will give you the string's length.) Adam
  • Hi Adam, Thanks for your second response last week. I was able to access the SDK examples this morning. It seems the use case for the SDK is to write a custom function that can be called from within VSQL. We were thinking along the lines of the ODBC interface, where one would establish a connection from within a user program, make a SQLColumns call for a particular table, and then call SQLDescribeCol to get the column information. We would use this information to map our DMX data types to column types. Our use case is a Hadoop environment where Vertica is installed, but not necessarily an ODBC driver. Though I did not see any evidence of this in the examples or SDK documentation, I would like to ask: is there a way for a user program to initiate a request through the SDK and obtain column metadata for a particular table? Thomas
  • Hi Thomas, Ah, yeah, it does sound like what you want is something more like ODBC than like our SDK. In that case, ODBC or similar is really the only option that we currently provide. You could always bundle the driver with your tool. (Disclaimer, I can't speak to legalese regarding bundling the driver in a useful tool; you'll have to read the license yourself.) (If you're using Hadoop, I'm a little surprised you aren't using JDBC?) Regarding metadata from the SDK, the answer to that is in my first post (above). You can't do anything from within the SDK beyond what's described there. You can, however, do a bunch more via ODBC/etc. Again, see my first post; particularly the link regarding system-table info. Adam
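
To illustrate the point above that strings in UDx code are UTF-8 but not necessarily null-terminated, here is a minimal sketch in plain C++. It does not use the Vertica SDK: the LengthDelimited struct is a hypothetical stand-in for any pointer-plus-length string (such as what VString is described as exposing), and both helper names are made up for illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for a length-delimited string buffer.
// This is NOT the Vertica VString class; it only mimics the idea of
// "a pointer plus an explicit length, with no terminating '\0'".
struct LengthDelimited {
    const char* data;
    std::size_t length;
};

// Copy exactly `length` bytes. Using the (pointer, length) constructor
// avoids reading past the buffer, which strlen()-style code would do
// when no null terminator is present.
std::string toStdString(const LengthDelimited& s) {
    return std::string(s.data, s.length);
}

// Count UTF-8 code points by skipping continuation bytes (10xxxxxx).
// Assumes the input is valid UTF-8; no validation is performed.
std::size_t utf8CodePoints(const std::string& s) {
    std::size_t count = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

The byte length and the code-point count can differ for non-ASCII data, which may matter when mapping column lengths to an external tool's character types.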
