Hive to Vertica data export with Unix named pipe

Hi,
Can someone please help me do a large, fast export from Hive to Vertica without using a Hadoop connector?

Currently, I am exporting the data through a Unix named pipe, but the performance is not that good.

I run about 5 parallel threads to load the data into Vertica, and it takes approximately 230 minutes for 1.6 billion records.

Can someone please help me improve this performance and tell me whether this export can be optimized?

Thank You
kosmiktechnologies.com

Comments

  • Jim_Knicely - Administrator
    edited January 2018
  • Hi!

    Can you specify where the bottleneck is (reading the data from Hive, or writing it to Vertica)? Is it an IO problem or a network problem? Is it a CPU bottleneck (for example, some custom parser)?

    I run about 5 parallel threads to load the data into Vertica, and it takes approximately 230 minutes for 1.6 billion records.

    That doesn't tell us much on its own.

    • How much data is it (bytes/MB/GB/TB)?
    • How many nodes?
    • What is your physical configuration (are Hive and Vertica on the same servers)?
    • Did you check IO on Vertica (maybe the IO limit has been reached)?
    • Do all 5 threads load to the same node, or are the loads distributed over the cluster nodes?
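
    To show why the data volume matters, here is a rough sketch in Python that turns the quoted figures into rates. The row count, duration, and thread count come from the question; the 200-byte average row size is purely a placeholder assumption, since the real size has not been given. It works out to roughly 116,000 rows per second overall, but without the row size there is no way to tell whether that is close to a disk or network limit.

```python
# Figures quoted in the question.
TOTAL_ROWS = 1_600_000_000
ELAPSED_MINUTES = 230
PARALLEL_LOADS = 5

# ASSUMPTION: placeholder only; the real average row size is unknown.
ASSUMED_ROW_BYTES = 200


def rows_per_second(total_rows, elapsed_minutes):
    """Average row rate over the whole export."""
    return total_rows / (elapsed_minutes * 60)


def mb_per_second(row_rate, avg_row_bytes):
    """Convert a row rate into MB/s (1 MB = 1,000,000 bytes)."""
    return row_rate * avg_row_bytes / 1_000_000


overall = rows_per_second(TOTAL_ROWS, ELAPSED_MINUTES)
per_load = overall / PARALLEL_LOADS  # assumes the 5 loads share the work evenly

print(f"overall : {overall:,.0f} rows/s")
print(f"per load: {per_load:,.0f} rows/s")
print(f"at {ASSUMED_ROW_BYTES} bytes/row: "
      f"{mb_per_second(overall, ASSUMED_ROW_BYTES):.1f} MB/s overall")
```

    Replace ASSUMED_ROW_BYTES with the measured average row size and compare the MB/s figure with what your disks and network can actually sustain.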

    PS: I know that is a lot of questions, but you need to pin down where the problem is (it may be a hardware problem, and your disks simply can't deliver better IO).
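
    One way to narrow it down is to time each side separately. The sketch below is only an illustration: measure_throughput is a hypothetical helper, not part of Hive or Vertica, that drains any binary stream and reports how fast the producer can deliver data when nothing is being written to Vertica. If the Hive side alone is no faster than your full export, the reader is the bottleneck; if it is much faster, look at the load side.

```python
import io
import time


def measure_throughput(stream, chunk_size=1 << 20):
    """Drain a binary stream; return (total_bytes, seconds, mb_per_sec)."""
    total = 0
    start = time.perf_counter()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # empty read means the writer closed the stream
            break
        total += len(chunk)
    elapsed = time.perf_counter() - start
    rate = total / elapsed / 1_000_000 if elapsed > 0 else float("inf")
    return total, elapsed, rate


if __name__ == "__main__":
    # Demo on in-memory data so the sketch runs anywhere.
    # The sample row layout is made up for illustration.
    sample = io.BytesIO(b"1|alice|2018-01-01\n" * 100_000)
    total, seconds, rate = measure_throughput(sample)
    print(f"{total} bytes in {seconds:.4f} s ({rate:.1f} MB/s)")
```

    To run it against the real export, open your named pipe in binary mode (for example open("/path/to/your/pipe", "rb"), where the path is a placeholder) and pass that file object in place of the in-memory sample.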
