Hive to Vertica data export with Unix named pipe

kosmik5 Registered User

Hi,
Can someone please help me do a large, fast export from Hive to Vertica without any Hadoop connector?

Currently I am exporting the data through a Unix named pipe, but the performance is not good.

With about 5 parallel threads loading the data into Vertica, the export takes approximately 230 minutes for 1.6 billion records.

Can someone please help me improve this performance and tell me whether this export can be optimized?

Thank You
kosmiktechnologies.com

Comments

  • Jim_Knicely Administrator, Moderator, Employee, Registered User, VerticaExpert
    edited January 24
  • sKwa Registered User

    Hi!

    Can you specify where the bottleneck is (reading data from Hive or writing data to Vertica)? Is it an I/O problem or a network problem? Is it a CPU bottleneck (for example, a custom parser)?

    "almost 5 parallel thread to load the data into Vertica and time is approx 230 min for 1.6 billion recordsets?"

    That doesn't tell us anything on its own.

    • How much data is that in bytes (MB/GB/TB)?
    • How many nodes are in the cluster?
    • What is your physical configuration (are Hive and Vertica on the same servers)?
    • Did you check I/O on Vertica (the I/O limit may have been reached)?
    • Do all 5 threads load into the same node, or are the loads distributed across the cluster nodes?
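To show why the row count alone is not enough, here is a rough sketch that turns the figures from the question (1.6 billion records, 230 minutes, 5 threads) into rows per second. The average row widths are purely hypothetical, since the thread never states the data size; they only illustrate how much the byte rate depends on that missing number.

```python
# Figures quoted in the question.
TOTAL_ROWS = 1_600_000_000
ELAPSED_MINUTES = 230
PARALLEL_STREAMS = 5


def rows_per_second(total_rows, minutes):
    """Average row rate over the whole export."""
    return total_rows / (minutes * 60)


def megabytes_per_second(rows_per_sec, avg_row_bytes):
    """Convert a row rate to MB/s for a given average row width."""
    return rows_per_sec * avg_row_bytes / 1_000_000


overall = rows_per_second(TOTAL_ROWS, ELAPSED_MINUTES)
per_stream = overall / PARALLEL_STREAMS

print(f"overall:    {overall:,.0f} rows/s")
print(f"per stream: {per_stream:,.0f} rows/s")

# ASSUMPTION: hypothetical average row widths in bytes.
# The real value is unknown and has to come from the actual data.
for avg_row_bytes in (50, 200, 1000):
    rate = megabytes_per_second(overall, avg_row_bytes)
    print(f"{avg_row_bytes:>5} bytes/row -> {rate:,.1f} MB/s in total")
```

Depending on the row width, the same row rate can mean a few MB/s or more than a hundred MB/s, which is why the size in bytes matters before anyone can say whether the load is slow.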

    PS: There are too many open questions.
    You need to identify where the problem is (it may be a hardware problem, and your disks may not be able to deliver better I/O).
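One way to narrow this down, as a minimal sketch and not a Vertica or Hive tool: read the export stream to the end while discarding the data, so that only the producer side is measured. The function names here are made up for illustration, and the in-memory buffer is only a stand-in for the real source (for example, the named pipe opened in binary mode).

```python
import io
import time


def drain_and_measure(stream, chunk_size=1 << 20):
    """Read a binary stream to EOF, discard the data, return (bytes, seconds)."""
    total = 0
    start = time.perf_counter()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # empty read means end of stream
            break
        total += len(chunk)
    return total, time.perf_counter() - start


def mb_per_second(total_bytes, seconds):
    """Throughput in MB/s; guards against a zero-length interval."""
    return total_bytes / 1_000_000 / seconds if seconds > 0 else float("inf")


if __name__ == "__main__":
    # ASSUMPTION: stand-in data source. In a real test this would be the
    # export stream itself, e.g. the named pipe opened with mode "rb".
    sample = io.BytesIO(b"x" * 5_000_000)
    n_bytes, secs = drain_and_measure(sample)
    print(f"read {n_bytes:,} bytes at {mb_per_second(n_bytes, secs):,.1f} MB/s")
```

If the producer alone is no faster than the rate seen during the real load, the Hive side is the likely limit. If it is much faster, the network or the Vertica side deserves a closer look.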
