Automated profiling tools to sift through the pre-made scripts for object migration

Min_Yun_ChanMin_Yun_Chan Vertica Employee Employee

Which are the automated profiling tools to sift through the pre-made scripts for object migration. Can you please share the links? Is there any documentation on this?


Question from The shortest path to Vertica: Best practices for data warehouse migration & ETL
@marcothesane @Maurizio

Answers

  • marcothesanemarcothesane - Select Field - Administrator

    Hi @Min_Yun_Chan -
    d2l - which reads one csv file and generates a matching CREATE TABLE statement, can be found here:
    https://github.com/marco-the-sane/d2l
    And I have a combination of two pretty complex scripts, SQL generating SQL; they scan the "v_catalog.columns" system table to create matching profiling queries, and generate runnable SQL code out of those scans.

    • genrecast.sql scans the string representation of all strings in the table, and creates a CREATE VIEW that creates a view with better high-level data types, like NUMERIC, TIMESTAMP, or, in fact , strings.
    • genpf2ddl.sql scans the views created in the previous step, to discern DATEs from TIMESTAMPs, INTEGERs from NUMERICs, etc. to deduct ideal and performant data types, creating a CREATE TABLE statement, and an INSERT ... SELECT statement selecting from the view from the previous step and writing into the target table.

    genrecast.sql is a 99-line script; genpf2ddl.sql is a270-line script. They are subject to changes at any time, as they are maturing along with our presales organisation's work on real assignments. So they're not yet on GitHub. On request, with a heavy caveat on its un-foolproofness and volatility, I might change my mind, though ...
    Good luck ...

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file