Bulk load from HDFS (using delegation token mechanism) failing with VIAssert SQL exception

Hi,
I am using the hdfs scheme with the delegation token mechanism to bulk load data from our Hadoop cluster into Vertica 9.1 (reference: https://www.vertica.com/docs/9.1.x/HTML/index.htm#Authoring/HadoopIntegrationGuide/libhdfs/ConfiguringAccessToHDFS.htm?TocPath=Integrating%20with%20Apache%20Hadoop|Using%20HDFS%C2%A0URLs|_____2).

In some instances, the COPY command fails with the SQL exception below.

java.sql.SQLException: [Vertica]VJDBC INTERNAL: VIAssert(offset <= fileSize) failed
[Vertica][VJDBC]Detail: /scratch_a/release/svrtar6521/vbuild/vertica/SAL/WebHdfsFileSystem.cpp: 247
at com.vertica.util.ServerErrorData.buildException(Unknown Source)
at com.vertica.dataengine.VResultSet.fetchChunk(Unknown Source)
at com.vertica.dataengine.VResultSet.initialize(Unknown Source)
at com.vertica.dataengine.VQueryExecutor.readExecuteResponse(Unknown Source)
at com.vertica.dataengine.VQueryExecutor.handleExecuteResponse(Unknown Source)
at com.vertica.dataengine.VQueryExecutor.execute(Unknown Source)
at com.vertica.jdbc.common.SStatement.executeNoParams(Unknown Source)
at com.vertica.jdbc.common.SStatement.executeUpdate(Unknown Source)
at com.att.cpp.etl.ariesogeo.load.SparkAGEOVerticaLoadDriverDT.loadToVertica(SparkAGEOVerticaLoadDriverDT.java:362)
at com.att.cpp.etl.ariesogeo.load.SparkAGEOVerticaLoadDriverDT.main(SparkAGEOVerticaLoadDriverDT.java:176)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:637)
Caused by: com.vertica.support.exceptions.ErrorException: [Vertica]VJDBC INTERNAL: VIAssert(offset <= fileSize) failed
[Vertica][VJDBC]Detail: /scratch_a/release/svrtar6521/vbuild/vertica/SAL/WebHdfsFileSystem.cpp: 247
... 15 more

FYI: for the failed batch, DataSize = 49.6G and NumberOfFiles = 1013.
Copy command used (generalized):
copy . () from 'hdfs://lab02/Data/Staging/LOADER/*' DELIMITER AS '|' DIRECT REJECTED DATA AS TABLE .

Truly appreciate any inputs.
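For readers who issue the same kind of load from a Java driver, here is a minimal sketch of assembling a COPY statement shaped like the generalized one above. The class name, method name, and the target, column, and rejects-table arguments are hypothetical placeholders, since the real names are not shown in the post. Only the HDFS glob and the clauses (DELIMITER AS '|', DIRECT, REJECTED DATA AS TABLE) come from the post.

```java
final class CopyStatementSketch {
    // Builds a COPY statement with the same clauses as the generalized command
    // in the post. target, columns and rejectsTable are hypothetical inputs;
    // the post does not show the real names.
    static String buildCopy(String target, String columns,
                            String hdfsGlob, String rejectsTable) {
        // Double any single quote so the path stays a valid SQL string literal.
        String path = hdfsGlob.replace("'", "''");
        return "COPY " + target + " (" + columns + ") FROM '" + path + "'"
             + " DELIMITER AS '|' DIRECT REJECTED DATA AS TABLE " + rejectsTable;
    }
}
```

The resulting string would be passed to Statement.executeUpdate, which is the call that appears in the stack trace above.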

Comments

  • SruthiA Vertica Employee

    Did you set the HadoopImpersonationConfig parameter before running COPY? Please open a support case, as we may need logs to debug the issue further.

  • Sruthi, thank you for the quick response.
    Yes, I am setting HadoopImpersonationConfig. Below is the generalized command:
    ALTER SESSION SET HadoopImpersonationConfig ='[{"nameservice":"LAB02","token":""}]'

    Also, whenever this exception occurs, if I manually remove that batch's data, the COPY load picks up from the next batch's data without any errors.

    This is the only log I have for this error. Please let me know what other information you need, and I can share it here.
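    Since the load resumes once the failing batch is set aside, a loader could recognize this particular failure and quarantine the batch rather than treat it as a generic error. The following is only a sketch under that assumption: the class and method names are hypothetical, and the marker string is copied from the exception message reported in this thread. It does not explain or fix the underlying assertion.

```java
import java.sql.SQLException;

final class ViAssertCheck {
    // Marker text copied from the exception message reported in this thread.
    static final String MARKER = "VIAssert(offset <= fileSize)";

    // Returns true if the throwable or any of its causes carries the marker.
    static boolean isOffsetAssert(Throwable t) {
        int depth = 0;
        while (t != null && depth < 32) { // bound the walk in case of a cyclic cause chain
            String msg = t.getMessage();
            if (msg != null && msg.contains(MARKER)) {
                return true;
            }
            t = t.getCause();
            depth++;
        }
        return false;
    }
}
```

    A caller would wrap executeUpdate in a try/catch for SQLException and, when isOffsetAssert returns true, record the batch path for later inspection before moving on to the next batch.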

  • SruthiA Vertica Employee

    I sent you a personal message about next steps. Please check it.

  • itpraveen83 Vertica Customer

    What is the cause of the error? We are facing a similar issue in version 11.

This discussion has been closed.