Bulk load from HDFS (using delegation token mechanism) failing with VIAssert SQL Exception

alrach · November 2018

Hi,
I am using the hdfs URL scheme with the delegation token mechanism to bulk load data from our Hadoop cluster into Vertica 9.1. (Reference: https://www.vertica.com/docs/9.1.x/HTML/index.htm#Authoring/HadoopIntegrationGuide/libhdfs/ConfiguringAccessToHDFS.htm?TocPath=Integrating%20with%20Apache%20Hadoop|Using%20HDFS%C2%A0URLs|_____2)

In some instances, the COPY command fails with the SQL exception below.

java.sql.SQLException: [Vertica]VJDBC INTERNAL: VIAssert(offset <= fileSize) failed
[Vertica][VJDBC]Detail: /scratch_a/release/svrtar6521/vbuild/vertica/SAL/WebHdfsFileSystem.cpp: 247
at com.vertica.util.ServerErrorData.buildException(Unknown Source)
at com.vertica.dataengine.VResultSet.fetchChunk(Unknown Source)
at com.vertica.dataengine.VResultSet.initialize(Unknown Source)
at com.vertica.dataengine.VQueryExecutor.readExecuteResponse(Unknown Source)
at com.vertica.dataengine.VQueryExecutor.handleExecuteResponse(Unknown Source)
at com.vertica.dataengine.VQueryExecutor.execute(Unknown Source)
at com.vertica.jdbc.common.SStatement.executeNoParams(Unknown Source)
at com.vertica.jdbc.common.SStatement.executeUpdate(Unknown Source)
at com.att.cpp.etl.ariesogeo.load.SparkAGEOVerticaLoadDriverDT.loadToVertica(SparkAGEOVerticaLoadDriverDT.java:362)
at com.att.cpp.etl.ariesogeo.load.SparkAGEOVerticaLoadDriverDT.main(SparkAGEOVerticaLoadDriverDT.java:176)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:637)
Caused by: com.vertica.support.exceptions.ErrorException: [Vertica]VJDBC INTERNAL: VIAssert(offset <= fileSize) failed
[Vertica][VJDBC]Detail: /scratch_a/release/svrtar6521/vbuild/vertica/SAL/WebHdfsFileSystem.cpp: 247
... 15 more

FYI: the failed batch had DataSize = 49.6G and NumberOfFiles = 1013.
Copy command used (generalized):
copy .() from 'hdfs://lab02/Data/Staging/LOADER/*' DELIMITER AS '|' DIRECT REJECTED DATA AS TABLE .

I would truly appreciate any input.

SruthiA · November 2018

Did you set the HadoopImpersonationConfig parameter before running COPY? Please open a support case, as we may need logs to debug the issue further.

alrach · November 2018

Sruthi, thank you for the quick response.
Yes, I am setting HadoopImpersonationConfig. Below is the generalized command:
ALTER SESSION SET HadoopImpersonationConfig ='[{"nameservice":"LAB02","token":""}]'

Also, whenever this exception occurs, if I manually remove that batch's data, the COPY load picks up the next batch's data without any errors.

This is the only log I have for this error. Please let me know what other information you need, and I can share it here.
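
For readers following along, the two statements described above (setting HadoopImpersonationConfig for the session, then running COPY against an hdfs:// glob) could be issued over JDBC roughly as sketched below. This is only an illustration under assumptions, not the actual loader code from this thread: the class name HdfsCopySketch, its method names, and the table names passed in are hypothetical, and the sketch assumes the nameservice and token contain no quotes or backslashes that would need JSON escaping.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper; names are illustrative and not taken from the thread.
final class HdfsCopySketch {

    // Builds the ALTER SESSION statement in the form quoted in the thread.
    // Assumes nameservice and token need no JSON escaping.
    static String impersonationSql(String nameservice, String token) {
        String json = "[{\"nameservice\":\"" + nameservice
                + "\",\"token\":\"" + token + "\"}]";
        // Double any single quotes so the JSON stays inside one SQL string literal.
        return "ALTER SESSION SET HadoopImpersonationConfig = '"
                + json.replace("'", "''") + "'";
    }

    // Builds a COPY statement with the same options as the generalized command.
    // The target and rejects table names are supplied by the caller.
    static String copySql(String table, String hdfsGlob, String rejectsTable) {
        return "COPY " + table + " FROM '" + hdfsGlob + "'"
                + " DELIMITER AS '|' DIRECT REJECTED DATA AS TABLE " + rejectsTable;
    }

    // Runs both statements on the same connection, since the parameter is
    // set at session level. Returns whatever the driver reports for COPY.
    static int load(Connection conn, String nameservice, String token,
                    String table, String hdfsGlob, String rejectsTable)
            throws SQLException {
        try (Statement st = conn.createStatement()) {
            st.execute(impersonationSql(nameservice, token));
            return st.executeUpdate(copySql(table, hdfsGlob, rejectsTable));
        }
    }
}
```

Both statements have to run on the same connection, because the parameter is set with ALTER SESSION and so applies only to that session.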

SruthiA · November 2018

I sent you a personal message with next steps. Please check.

itpraveen83 · March 2023

What was the cause of the error? We are facing a similar issue in version 11.
