Bulk load from HDFS (using delegation token mechanism) failing with VIAssert SQL Exception

edited March 2023 in General Discussion

Hi,
I am using the hdfs scheme with the delegation token mechanism to bulk load data from our Hadoop cluster into Vertica 9.1 (reference: https://www.vertica.com/docs/9.1.x/HTML/index.htm#Authoring/HadoopIntegrationGuide/libhdfs/ConfiguringAccessToHDFS.htm?TocPath=Integrating%20with%20Apache%20Hadoop|Using%20HDFS%C2%A0URLs|_____2).

In some instances, the COPY command fails with the SQL exception below.

java.sql.SQLException: [Vertica]VJDBC INTERNAL: VIAssert(offset <= fileSize) failed
[Vertica][VJDBC]Detail: /scratch_a/release/svrtar6521/vbuild/vertica/SAL/WebHdfsFileSystem.cpp: 247
at com.vertica.util.ServerErrorData.buildException(Unknown Source)
at com.vertica.dataengine.VResultSet.fetchChunk(Unknown Source)
at com.vertica.dataengine.VResultSet.initialize(Unknown Source)
at com.vertica.dataengine.VQueryExecutor.readExecuteResponse(Unknown Source)
at com.vertica.dataengine.VQueryExecutor.handleExecuteResponse(Unknown Source)
at com.vertica.dataengine.VQueryExecutor.execute(Unknown Source)
at com.vertica.jdbc.common.SStatement.executeNoParams(Unknown Source)
at com.vertica.jdbc.common.SStatement.executeUpdate(Unknown Source)
at com.att.cpp.etl.ariesogeo.load.SparkAGEOVerticaLoadDriverDT.loadToVertica(SparkAGEOVerticaLoadDriverDT.java:362)
at com.att.cpp.etl.ariesogeo.load.SparkAGEOVerticaLoadDriverDT.main(SparkAGEOVerticaLoadDriverDT.java:176)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:637)
Caused by: com.vertica.support.exceptions.ErrorException: [Vertica]VJDBC INTERNAL: VIAssert(offset <= fileSize) failed
[Vertica][VJDBC]Detail: /scratch_a/release/svrtar6521/vbuild/vertica/SAL/WebHdfsFileSystem.cpp: 247
... 15 more

FYI: the failed batch had DataSize = 49.6G and NumberOfFiles = 1013.
COPY command used (generalized):
copy .() from 'hdfs://lab02/Data/Staging/LOADER/*' DELIMITER AS '|' DIRECT REJECTED DATA AS TABLE .

I would truly appreciate any input.
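
The failure reaches the calling code as a java.sql.SQLException whose message contains the assert text shown in the trace above. As a rough sketch only, a loader could check for that text so the failing batch can be logged and handled separately from other load errors. The class and method names here are hypothetical and are not part of the Vertica JDBC driver; matching on message text is an assumption about how the error is reported, based solely on the trace in this post.

```java
import java.sql.SQLException;

// Hypothetical helper: names are illustrative, not a Vertica API.
final class HdfsCopyErrors {
    private HdfsCopyErrors() {}

    // Assert text copied from the exception message reported in this thread.
    static final String OFFSET_ASSERT = "VIAssert(offset <= fileSize)";

    // Upper bound on how far the cause chain is followed.
    private static final int MAX_DEPTH = 32;

    /**
     * Returns true if the throwable, or any throwable in its cause chain,
     * has a message containing the assert text.
     */
    static boolean isOffsetAssert(Throwable t) {
        Throwable cur = t;
        for (int depth = 0; cur != null && depth < MAX_DEPTH; depth++) {
            String msg = cur.getMessage();
            if (msg != null && msg.contains(OFFSET_ASSERT)) {
                return true;
            }
            cur = cur.getCause();
        }
        return false;
    }
}
```

A caller would wrap the executeUpdate call for the COPY statement in a try/catch for SQLException and pass the caught exception to isOffsetAssert. This only identifies the error; it does not explain or fix the underlying cause.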

Comments

  • SruthiA Vertica Employee Administrator

    Did you set the HadoopImpersonationConfig parameter before running COPY? Please open a support case, as we may need logs to debug the issue further.

  • Sruthi, thank you for the quick response.
    Yes, I am setting HadoopImpersonationConfig. Below is the generalized command:
    ALTER SESSION SET HadoopImpersonationConfig ='[{"nameservice":"LAB02","token":""}]'

    Also, whenever this exception occurs, if I manually remove that batch's data, the COPY load picks up from the next batch's data without any errors.

    This is the only log I have for this error. Please let me know what additional information you need, and I can share it here.
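
For readers issuing the same statement from Java, here is a minimal sketch of building the ALTER SESSION text in the form quoted above. The class and method names are hypothetical. The JSON shape (one object with "nameservice" and "token") is taken only from the generalized command in this comment, and the real token value is left to the caller.

```java
// Hypothetical helper: names are illustrative, not a Vertica API.
final class ImpersonationSql {
    private ImpersonationSql() {}

    /** Builds the ALTER SESSION statement in the form quoted in this thread. */
    static String alterSession(String nameservice, String token) {
        String json = "[{\"nameservice\":\"" + jsonEscape(nameservice)
                + "\",\"token\":\"" + jsonEscape(token) + "\"}]";
        // Double any single quote so the JSON stays inside one SQL string literal.
        return "ALTER SESSION SET HadoopImpersonationConfig = '"
                + json.replace("'", "''") + "'";
    }

    /** Escapes backslashes and double quotes for use inside a JSON string. */
    static String jsonEscape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }
}
```

Because ALTER SESSION applies to the current session only, the resulting statement would need to be executed on the same JDBC connection that later runs the COPY.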

  • SruthiA Vertica Employee Administrator

    I sent you a personal message with the next steps. Please check.

  • itpraveen83 Vertica Customer

    What is the cause of the error? We are facing a similar issue in version 11.

This discussion has been closed.