Error copying large files from HDFS to Vertica

I am trying to copy a large amount of data from HDFS to a Vertica cluster. I'm using the Vertica COPY command from Scala code, which runs as an Oozie job. The COPY command I am using is:
COPY myTable FROM 'hdfs:///mydirectory/ofmanyfiles/part*';
The Vertica nodes are configured as Hadoop edge nodes, and I am running the query with vertica-jdbc-8.1.1-18.jar. For large amounts of data (> ), it gives the following error (roughly after 5 minutes):

Please note: the COPY command finishes even after the error below, but the time taken to load the data is not consistent. Also, for 2 GB of data it doesn't give the error, but for 23 GB it throws it.

Exception in thread "main" java.sql.SQLRecoverableException: [Vertica]VJDBC IOException while communicating with server: Broken pipe (Write failed).
at Source)
at Source)
at com.vertica.core.VConnection.handleError(Unknown Source)
at com.vertica.dataengine.VQueryExecutor.executeSimpleProtocol(Unknown Source)
at com.vertica.dataengine.VQueryExecutor.execute(Unknown Source)
at com.vertica.jdbc.common.SStatement.executeNoParams(Unknown Source)
at com.vertica.jdbc.common.SStatement.executeUpdate(Unknown Source)
at com.export.core.DataLoader$.execute(DataLoader.scala:44)
at com.export.core.CheckPointsManager$$anonfun$doExport$1$$anonfun$apply$2.apply(CheckPointsManager.scala:31)
at com.export.core.CheckPointsManager$$anonfun$doExport$1$$anonfun$apply$2.apply(CheckPointsManager.scala:30)
at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
at scala.collection.mutable.LinkedHashMap.foreach(LinkedHashMap.scala:139)
at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)
at com.export.core.CheckPointsManager$$anonfun$doExport$1.apply(CheckPointsManager.scala:30)
at com.export.core.CheckPointsManager$$anonfun$doExport$1.apply(CheckPointsManager.scala:23)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at com.export.core.CheckPointsManager$.doExport(CheckPointsManager.scala:23)
at com.export.StartMain$.main(StartMain.scala:17)
at com.export.StartMain.main(StartMain.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(
at sun.reflect.DelegatingMethodAccessorImpl.invoke(
at java.lang.reflect.Method.invoke(
at org.apache.hadoop.util.RunJar.main(
Caused by: [Vertica]VJDBC IOException while communicating with server: Broken pipe (Write failed).
... 26 more
Caused by: Broken pipe (Write failed)
at Method)
at Source)
... 25 more
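For reference, the JDBC call that issues this COPY can be sketched as follows (shown in Java; the Scala call is analogous since both go through the same vertica-jdbc driver). The class name, helper names, URL, and credentials are illustrative placeholders, not taken from the actual job:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class HdfsCopyLoader {
    // Build the COPY statement for a target table and an HDFS glob.
    public static String buildCopySql(String table, String hdfsGlob) {
        return "COPY " + table + " FROM '" + hdfsGlob + "'";
    }

    // Issue the COPY over JDBC and return the statement's update count.
    // The URL, user, and password are placeholders for your cluster.
    public static int runCopy(String jdbcUrl, String user, String password,
                              String table, String hdfsGlob) throws Exception {
        try (Connection conn = DriverManager.getConnection(jdbcUrl, user, password);
             Statement stmt = conn.createStatement()) {
            return stmt.executeUpdate(buildCopySql(table, hdfsGlob));
        }
    }
}
```

The "Broken pipe" surfaces inside executeUpdate: the driver is blocked waiting on the server's response when the connection drops, even though the server-side load continues.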


  • SruthiA Vertica Employee Administrator

    Maybe you are hitting a Hadoop connection timeout for the large volume of data. Please try increasing the value of the 'HadoopFSConnectionTimeout' parameter and see if it helps.

  • anup1007

    I couldn't find any property "HadoopFSConnectionTimeout". Could you please tell me where to find it?

  • SruthiA Vertica Employee Administrator

    Please try running the command below.
    SELECT SET_CONFIG_PARAMETER('HadoopFSConnectionTimeout', 600);

  • anup1007
    edited October 2019

    I tried running the COPY command after setting this parameter, and I'm still getting the same error - Broken pipe (Write failed) - after 5 minutes.

  • SruthiA Vertica Employee Administrator

    Please open a support case if you can. It looks like we need to review the logs to debug further.

  • anup1007
    edited October 2019

    I checked the log and found only the following FATAL entries:

    [Txn] Commit Complete: Txn: 1b0000000026be3 at epoch 0x5f95f0 and new global catalog version 7850619
    2019-10-17 21:31:39.320 Init Session:7fe319fge708 [EE] [Bulkload] DIRECT copy targeting TABLE_NAME from pipe committed at version 7650699.
    2019-10-17 21:31:39.321 Init Session:7fe519ffe700 @v_vertica247_node0018: 08006/2907: Could not send data to client: No such file or directory
    @v_vertica247_node0018: 08006/2607: Client has disconnected
    LOCATION: internal_handle_error, pqcomm.c:731
    Session _vertica247_node0018-53602:0x40ef73 ended; closing connection (connCnt 1)
    2019-10-17 21:30:40.300 Init Session:7fe319fge708 @v_vertica247_node0018: 00000/2705: Connection received: port=60810 (connCnt 1)

    However, this data load also succeeded even after throwing the "Broken Pipe" error.
    Can you please let me know if this helps?


  • ersaurabhex

    Hi Anup,
    I am also facing a similar issue while loading a 40 GB file using PDI (Pentaho) with Insert Direct. What I observed is that the load fails (Pentaho gives an IO File exception) exactly 10 minutes in. Even though I get this error, the data load completes. I couldn't find any issue in Vertica.log at that time.
    I checked and found LockTimeout set to 600 seconds for the 'I' (Insert) lock type, which could be the reason for the error, so I increased it to 25 minutes. Still, I keep receiving the error after about 10 minutes (though no longer at exactly 10 minutes) into the load cycle.
    Did you get any resolution for this? Please help out so that others can benefit.
    PS - I am using Vertica 9.0 Enterprise Mode.


  • anup1007

    Hi Saurabh,

    I didn't find a resolution to this, so I used a workaround instead: I poll the system table v_monitor.query_profiles to check whether the query has finished executing, using a stream name to identify the particular query.
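    That polling workaround can be sketched roughly as below (Java; class name, helper names, and polling interval are illustrative). It assumes the COPY statement carries a distinctive stream name, e.g. via Vertica's STREAM NAME option, and that your Vertica version exposes the query and is_executing columns in v_monitor.query_profiles - verify the column names against your release:

    ```java
    import java.sql.Connection;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class CopyCompletionPoller {
        // Count still-executing statements whose text contains the stream
        // name. Assumes the stream name contains no quote characters.
        public static String buildPollSql(String streamName) {
            return "SELECT COUNT(*) FROM v_monitor.query_profiles "
                 + "WHERE query LIKE '%" + streamName + "%' AND is_executing";
        }

        // Poll on a separate connection until no matching query is still
        // executing, checking every pollMillis milliseconds.
        public static void awaitCompletion(Connection conn, String streamName,
                                           long pollMillis) throws Exception {
            while (true) {
                try (Statement stmt = conn.createStatement();
                     ResultSet rs = stmt.executeQuery(buildPollSql(streamName))) {
                    rs.next();
                    if (rs.getLong(1) == 0) return;  // no executing query left
                }
                Thread.sleep(pollMillis);
            }
        }
    }
    ```

    Because the server-side COPY keeps running after the client's "Broken pipe", polling from a fresh connection is what lets the driver-side failure be ignored safely.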


  • Jim_Knicely Administrator
    edited December 2019
  • ersaurabhex

    Sure Jim. Let me try that out first and if we still see the error, I'll get back to you.

