Options

tool that can convert MySQL queries to Vertica queries

pgvpgv Community Edition User

Can someone please point me to a tool that can convert MySQL queries to Vertica queries. My code is 100% on MySQL 5.6 need to get that converted to Vertica.

Answers

  • Options

    It's all SQL. To what extent does it require a conversion? Most of that SQL should run as-is in Vertica.

  • Options
    SankarmnSankarmn Community Edition User ✭✭

    @pgv, Mostly MySQL is similar with minor changes at Vertica. Post samples to understand your question :)

  • Options
    pgvpgv Community Edition User

    Dear Vertica_Curtis,
    Most SQL grammers vary by platform. So it is over simplistic to say that you can run the code as-is. I have a 3000 line computation procedure so it is a fair bit of conversion. I need a tool that get me quickly from MySQL to Vertica. Do you know of a reliable tool that can help me do the job in a couple of days?
    Thanks,
    PG

  • Options
    pgvpgv Community Edition User

    @Sankarmn said:
    @pgv, Mostly MySQL is similar with minor changes at Vertica. Post samples to understand your question :)

    @Sankarmn,
    The following sproc is coded for MySQL 5.6 and works great on AWS Aurora Serverless. What it does is tokenize text from a column to rows.
    So if the input is:
    Hello! how do you do?
    the the output will be as follows:
    Hello!
    how
    do
    you
    do?
    So it goes through a loop where it processes one record at a time. The greater the number of records and the data size per record the response becomes exponentially slow. Hence, I want to know whether Vertica give me the speed to process the records at a fraction of the time it takes on Aurora.
    The SPROC is as follows:
    DELIMITER $$
    DROP PROCEDURE IF EXISTS tokenizer $$
    CREATE PROCEDURE tokenizer()
    BEGIN
    DECLARE r_len INTEGER;
    DECLARE r_id BIGINT;
    DECLARE r_name LONGTEXT;
    DECLARE i INT DEFAULT 0;
    DECLARE splitted_name VARCHAR(500);
    DECLARE occurances INT DEFAULT 0;
    DECLARE done INT DEFAULT 0;
    DECLARE cur CURSOR FOR SELECT text_01.id,text_01.text_preprocess1 FROM text_01;
    DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;
    DROP TABLE IF EXISTS text_02 ;
    CREATE TABLE text_02(id BIGINT NOT NULL PRIMARY KEY AUTO_INCREMENT,record_id BIGINT,token VARCHAR(500)) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
    OPEN cur;
    read_loop: LOOP
    FETCH cur INTO r_id,r_name;
    IF done THEN
    LEAVE read_loop;
    END IF;
    SET occurances = (SELECT LENGTH(r_name) - LENGTH(REPLACE(r_name, ',', ''))+1);
    SET i = 1;
    WHILE i <= occurances DO
    SET splitted_name = (SELECT REPLACE(SUBSTRING(SUBSTRING_INDEX(r_name, ',', i),
    LENGTH(SUBSTRING_INDEX(r_name, ',', i - 1)) + 1), ',', ''));
    INSERT INTO text_02 (record_id,token) VALUES (r_id, splitted_name);
    SET i = i + 1;
    END WHILE;
    END LOOP;
    CLOSE cur;
    END $$
    DELIMITER ;
    Can some one from Vertica take a look at the above requirement and let me know if it will be worthwile for me to go for vertica.
    Also in addition to the above my MySQL script contains monstorous 2 table inner joins that require read and write ops. Is Vertica the game changer solution for me. Kindly reply.

    Regards,
    PG

  • Options
    pgvpgv Community Edition User

    Dear Vertica_Curtis,
    Most SQL grammers vary by platform. So it is over simplistic to say that you can run the code as-is. I have a 3000 line computation procedure so it is a fair bit of conversion. I need a tool that get me quickly from MySQL to Vertica. Do you know of a reliable tool that can help me do the job in a couple of days?
    Thanks,
    PG

  • Options
    marcothesanemarcothesane - Select Field - Administrator

    This is not really SQL you're trying to migrate, but non-standard Stored Procedures Language.
    The answer , @pgv, is that it is not done like that in Vertica. in over 90% of the cases, you'll discover that you don't need to migrate your Stored Procedures. Just the sheer number of functions that Vertica has and other DBMSs don't have makes a ton of them unnecessary.

    Identify exactly what the Stored Procedure does, and use Vertica's more powerful SQL to do the same job better.

    We are a Big Data Platform, and therefore try to free everything from the need of nested loops and/or managing a specific memory stack . Therefore , stored procedures of that type don't exist (yet?).
    I'd solve your verticalising problem in a completely different way - and I bet it would be faster not only in Vertica:

    WITH
    input(s) AS (
              SELECT 'Hello! how do you do?'
    )
    ,
    i(i) AS ( -- help table with indexes - increase if you need more numbers
              SELECT  1 UNION ALL SELECT  2
    UNION ALL SELECT  3 UNION ALL SELECT  4
    UNION ALL SELECT  5 UNION ALL SELECT  6
    UNION ALL SELECT  7 UNION ALL SELECT  8
    UNION ALL SELECT  9 UNION ALL SELECT 10
    )
    SELECT
      i
    , SPLIT_PART(s,' ',i) AS subs
    FROM input CROSS JOIN i
    WHERE SPLIT_PART(s,' ',i) <>''
    ORDER BY i;
    -- out  i |  subs  
    -- out ---+--------
    -- out  1 | Hello!
    -- out  2 | how
    -- out  3 | do
    -- out  4 | you
    -- out  5 | do?
    
  • Options
    marcothesanemarcothesane - Select Field - Administrator

    If you need hundreds of consecutive integers in the Common Table Expression I named i above, you can "mis-" use the Vertica TIMESERIES clause for that:

    WITH
    -- create a series of 10 integers 
    -- keep this code-snippet as you'll need it often
    l(l) AS (
    SELECT TIMESTAMPADD(us,  1  , DATE '2000-01-01' ) UNION ALL
    SELECT TIMESTAMPADD(us, 10  , DATE '2000-01-01' )
    )
    ,i(i) AS (
      SELECT
        MICROSECOND(ts)
      FROM l
      TIMESERIES ts AS '1 us' OVER(ORDER BY l)
    )
    SELECT * FROM i;
    -- out  i  
    -- out ----
    -- out   1
    -- out   2
    -- out   3
    -- out   4
    -- out   5
    -- out   6
    -- out   7
    -- out   8
    -- out   9
    -- out  10
    
  • Options
    SankarmnSankarmn Community Edition User ✭✭

    @pgv, you need a data modeling and reverse engineering tool to do your needs where you can convert codes to various DB flavors. Power designer is one such tool where we did reverse engineer from SQL server and Oracle into MySQL pretty quick. See the link below that suggests other tools that can help you.
    https://dbmstools.com/categories/database-diagram-tools/mysql?export[]=SQL script
    Hope this Helps!

  • Options
    pgvpgv Community Edition User

    Dear Vertica_Curtis,
    Most SQL grammers vary by platform. So it is over simplistic to say that you can run the code as-is. I have a 3000 line computation procedure so it is a fair bit of conversion. I need a tool that get me quickly from MySQL to Vertica. Do you know of a reliable tool that can help me do the job in a couple of days?
    Thanks,
    PG> @marcothesane said:

    This is not really SQL you're trying to migrate, but non-standard Stored Procedures Language.
    The answer , @pgv, is that it is not done like that in Vertica. in over 90% of the cases, you'll discover that you don't need to migrate your Stored Procedures. Just the sheer number of functions that Vertica has and other DBMSs don't have makes a ton of them unnecessary.

    Identify exactly what the Stored Procedure does, and use Vertica's more powerful SQL to do the same job better.

    We are a Big Data Platform, and therefore try to free everything from the need of nested loops and/or managing a specific memory stack . Therefore , stored procedures of that type don't exist (yet?).
    I'd solve your verticalising problem in a completely different way - and I bet it would be faster not only in Vertica:

    WITH
    input(s) AS (
              SELECT 'Hello! how do you do?'
    )
    ,
    i(i) AS ( -- help table with indexes - increase if you need more numbers
              SELECT  1 UNION ALL SELECT  2
    UNION ALL SELECT  3 UNION ALL SELECT  4
    UNION ALL SELECT  5 UNION ALL SELECT  6
    UNION ALL SELECT  7 UNION ALL SELECT  8
    UNION ALL SELECT  9 UNION ALL SELECT 10
    )
    SELECT
      i
    , SPLIT_PART(s,' ',i) AS subs
    FROM input CROSS JOIN i
    WHERE SPLIT_PART(s,' ',i) <>''
    ORDER BY i;
    -- out  i |  subs  
    -- out ---+--------
    -- out  1 | Hello!
    -- out  2 | how
    -- out  3 | do
    -- out  4 | you
    -- out  5 | do?
    

    @marcothesane
    Your reply with an alternate code is much appreciated. My use case is a little different and simple if I may say so... I am developing a micro service to classify text and images. This is a standard solution industry agnostic. Hence, SPROCs such as the one I showed are handy since variables are defined and any user can just plug the data and get results. Your solution is brilliant, but I cannot be feeding input to the database every time. Long story short... I am not doing a one time big data solution. I am creating a platform so that a user can just load the data and the algorithm takes care of the rest. Right now Aurora Serverless is doing a brilliant job... But it is just slow.
    I would like to replace Aurora with Vertica... So I need some conversion help with the code (Just a few hundered lines like the above). Once I have some templates... I will do the full conversion... Kindly respond and we can get started ASAP...
    Thanks,
    PG

  • Options
    pgvpgv Community Edition User

    Dear Vertica_Curtis,
    Most SQL grammers vary by platform. So it is over simplistic to say that you can run the code as-is. I have a 3000 line computation procedure so it is a fair bit of conversion. I need a tool that get me quickly from MySQL to Vertica. Do you know of a reliable tool that can help me do the job in a couple of days?
    Thanks,
    PG> @marcothesane said:

    If you need hundreds of consecutive integers in the Common Table Expression I named i above, you can "mis-" use the Vertica TIMESERIES clause for that:

    WITH
    -- create a series of 10 integers
    -- keep this code-snippet as you'll need it often
    l(l) AS (
    SELECT TIMESTAMPADD(us, 1 , DATE '2000-01-01' ) UNION ALL
    SELECT TIMESTAMPADD(us, 10 , DATE '2000-01-01' )
    )
    ,i(i) AS (
    SELECT
    MICROSECOND(ts)
    FROM l
    TIMESERIES ts AS '1 us' OVER(ORDER BY l)
    )
    SELECT * FROM i;
    -- out i
    -- out ----
    -- out 1
    -- out 2
    -- out 3
    -- out 4
    -- out 5
    -- out 6
    -- out 7
    -- out 8
    -- out 9
    -- out 10

    @marcothesane,
    Thanks for the code snippet. I will give the complete project a try. I am hoping that Vertica will do the magic for me.. Will keep the feedback coming. You timely replies are much appreciated.
    Regards,
    PG

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file