We're Moving!

The Vertica Forum is moving to a new OpenText Analytics Database (Vertica) Community.

Join us there to post discussion topics, learn about

product releases, share tips, access the blog, and much more.

Create My New Community Account Now


Bulk load from HDFS(using Delegation token mechanism) failing with VIAssert SQL Exception — Vertica Forum

Bulk load from HDFS(using Delegation token mechanism) failing with VIAssert SQL Exception

edited March 2023 in General Discussion

Hi,
I am using hdfs scheme/Delegation token mechanism to bulk load data from our Hadoop cluster to Vertica 9.1. (reference: https://www.vertica.com/docs/9.1.x/HTML/index.htm#Authoring/HadoopIntegrationGuide/libhdfs/ConfiguringAccessToHDFS.htm?TocPath=Integrating%20with%20Apache%20Hadoop|Using%20HDFS%C2%A0URLs|_____2)

In some instances, copy command is failing with below SQL Exception.

java.sql.SQLException: [Vertica]VJDBC INTERNAL: VIAssert(offset <= fileSize) failed
[Vertica][VJDBC]Detail: /scratch_a/release/svrtar6521/vbuild/vertica/SAL/WebHdfsFileSystem.cpp: 247
at com.vertica.util.ServerErrorData.buildException(Unknown Source)
at com.vertica.dataengine.VResultSet.fetchChunk(Unknown Source)
at com.vertica.dataengine.VResultSet.initialize(Unknown Source)
at com.vertica.dataengine.VQueryExecutor.readExecuteResponse(Unknown Source)
at com.vertica.dataengine.VQueryExecutor.handleExecuteResponse(Unknown Source)
at com.vertica.dataengine.VQueryExecutor.execute(Unknown Source)
at com.vertica.jdbc.common.SStatement.executeNoParams(Unknown Source)
at com.vertica.jdbc.common.SStatement.executeUpdate(Unknown Source)
at com.att.cpp.etl.ariesogeo.load.SparkAGEOVerticaLoadDriverDT.loadToVertica(SparkAGEOVerticaLoadDriverDT.java:362)
at com.att.cpp.etl.ariesogeo.load.SparkAGEOVerticaLoadDriverDT.main(SparkAGEOVerticaLoadDriverDT.java:176)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:637)
Caused by: com.vertica.support.exceptions.ErrorException: [Vertica]VJDBC INTERNAL: VIAssert(offset <= fileSize) failed
[Vertica][VJDBC]Detail: /scratch_a/release/svrtar6521/vbuild/vertica/SAL/WebHdfsFileSystem.cpp: 247
... 15 more

fyi - failed batch DataSize = 49.6G and failed batch NumberOfFiles = 1013.
Copy command used (generalized):
copy .() from 'hdfs://lab02/Data/Staging/LOADER/*' DELIMITER AS '|' DIRECT REJECTED DATA AS TABLE .

Truly appreciate any inputs.

Comments

  • SruthiASruthiA Administrator

    Did you set HadoopImpersonationConfig parameter before running COPY? Please open a support case as we may need logs to debug issue further.

  • Sruthi, thank you for quick response.
    yes, I am setting HadoopImpersonationConfig.. below is the generalized command:
    ALTER SESSION SET HadoopImpersonationConfig ='[{"nameservice":"LAB02","token":""}]'

    also, whenever this exception occurs, if I manually get rid of the batch's data, the copy load is picking up from next batch data without any errors.

    this is the only log I have on this error. please suggest what more information you would need, and I can share here.

  • SruthiASruthiA Administrator

    I sent you personal message on next steps. Please check

  • itpraveen83itpraveen83 Vertica Customer

    what is the cause of the error, we are facing similar issue in Version 11

This discussion has been closed.