COPY from S3 fails

ankit0007smart Registered User
edited February 12 in Vertica Forum

COPY from S3 with a glob (*) fails, but when I try the same command naming each file explicitly, it succeeds:

dbadmin=> COPY poc.hrly_s3 FROM 's3://bucket/data/*' PARQUET;
ERROR 6764: Error opening file [s3://s3://bucket/data/1a4701af353ea0f2-773898e30000000f_415672635_data.0.parq] for read: Access Denied
ERROR 6764: Error opening file [s3://bucket/data/3/1a4701af353ea0f2-773898e300000016_386770378_data.0.parq] for read: S3 operation (open) was canceled
ERROR 7253: Error reading from Parquet parser input stream [s3://bucket/data/1a4701af353ea0f2-773898e300000010_437174618_data.0.parq]: S3 operation (read) was canceled

Comments

  • mtracy Employee, Registered User

    Can you run select set_debug_log('SAL', 'ALL'); and then rerun the query that produces the error and post the relevant logs?
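As a hedged sketch of the sequence (table and path are the ones from the thread; the SAL messages should then appear in vertica.log):

```sql
-- Turn on verbose storage-access-layer (SAL) logging for this session.
SELECT set_debug_log('SAL', 'ALL');

-- Rerun the failing load so the SAL activity is captured in vertica.log.
COPY poc.hrly_s3 FROM 's3://bucket/data/*' PARQUET;
```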

  • ankit0007smart Registered User

    2018-02-12 16:47:04.551 Init Session:7f2d0ffff700-a0000000042f88 [Session] [Query] TX:a0000000042f88(v_verticadb_node0001-321247:0x54744) COPY poc.hrly_s3 FROM 's3://bucket/data/' PARQUET;
    2018-02-12 16:47:04.553 Init Session:7f2d0ffff700-a0000000042f88 [SAL] Adding (txn=a0000000042f88,statement=4) to HDFS cancel callback monitoring
    2018-02-12 16:47:04.553 Init Session:7f2d0ffff700-a0000000042f88 [SAL] [S3UDFS] [s3://bucket/data/
    ]: virtual void SAL::S3FileSystem::glob(const string&, std::vector<std::pair<std::basic_string, stat> >&, SAL::GlobPruner&, bool) const, Glob
    2018-02-12 16:47:04.610 Init Session:7f2d0ffff700-a0000000042f88 [SAL] Disposing cURL handle 0x7f28a0243a00
    2018-02-12 16:47:04.610 Init Session:7f2d0ffff700-a0000000042f88 [SAL] Disposing cURL handle 0x7f28a0243a00
    2018-02-12 16:47:04.610 Init Session:7f2d0ffff700-a0000000042f88 [SAL] Disposing cURL handle 0x7f28a0243a00
    2018-02-12 16:47:04.610 Init Session:7f2d0ffff700-a0000000042f88 [SAL] Disposing cURL handle 0x7f28a0243a00
    2018-02-12 16:47:04.610 Init Session:7f2d0ffff700-a0000000042f88 [SAL] Disposing cURL handle 0x7f28a0243a00
    2018-02-12 16:47:04.610 Init Session:7f2d0ffff700-a0000000042f88 [SAL] Disposing cURL handle 0x7f28a0243a00
    2018-02-12 16:47:04.610 Init Session:7f2d0ffff700-a0000000042f88 [SAL] Disposing cURL handle 0x7f28a0243a00
    2018-02-12 16:47:04.610 Init Session:7f2d0ffff700-a0000000042f88 [SAL] Disposing cURL handle 0x7f28a0243a00
    2018-02-12 16:47:04.610 Init Session:7f2d0ffff700-a0000000042f88 [SAL] Removing (txn=a0000000042f88,statement=4( from HDFS cancel callback monitoring
    2018-02-12 16:47:04.610 Init Session:7f2d0ffff700-a0000000042f88 [SAL] Adding (txn=a0000000042f88,statement=4) to HDFS cancel callback monitoring
    2018-02-12 16:47:04.610 Init Session:7f2d0ffff700-a0000000042f88 @v_verticadb_node0001: 00000/6610: 0 percent of data matched to local nodes (min 0, max 0 percent per node). 0 percent of data can be loaded within same racks
    2018-02-12 16:47:04.610 Init Session:7f2d0ffff700-a0000000042f88 [SAL] Removing (txn=a0000000042f88,statement=4( from HDFS cancel callback monitoring
    2018-02-12 16:47:04.619 Init Session:7f2d0ffff700-a0000000042f88 [Opt] Dump SendRecvNodes Map :

    2018-02-12 16:47:04.625 Init Session:7f2d0ffff700-a0000000042f88 [Opt] Load is using default ExprEval, not LoadExprEval
    2018-02-12 16:47:04.646 Init Session:7f2d0ffff700-a0000000042f88 [SAL] Adding (txn=a0000000042f88,statement=4) to HDFS cancel callback monitoring
    2018-02-12 16:47:04.646 Init Session:7f2d0ffff700-a0000000042f88 [SAL] Removing (txn=a0000000042f88,statement=4( from HDFS cancel callback monitoring
    2018-02-12 16:47:04.646 Init Session:7f2d0ffff700-a0000000042f88 [SAL] Adding (txn=a0000000042f88,statement=4) to HDFS cancel callback monitoring
    2018-02-12 16:47:04.646 PlanParquetFileQueue:7f2d24ffa700 [Util] ActiveQueue: beginning processing
    2018-02-12 16:47:04.646 PlanParquetFileQueue:7f2d0effd700 [Util] ActiveQueue: beginning processing
    2018-02-12 16:47:04.646 PlanParquetFileQueue:7f2d0f7fe700 [Util] ActiveQueue: beginning processing
    2018-02-12 16:47:04.646 PlanParquetFileQueue:7f2d2665e700 [Util] ActiveQueue: beginning processing
    2018-02-12 16:47:04.646 PlanParquetFileQueue:7f2d0e7fc700 [Util] ActiveQueue: beginning processing
    2018-02-12 16:47:04.646 PlanParquetFileQueue:7f2d0dffb700 [Util] ActiveQueue: beginning processing
    2018-02-12 16:47:04.646 PlanParquetFileQueue:7f2d05ffe700 [Util] ActiveQueue: beginning processing
    2018-02-12 16:47:04.646 PlanParquetFileQueue:7f2d057fd700 [Util] ActiveQueue: beginning processing
    2018-02-12 16:47:04.646 PlanParquetFileQueue:7f2cfe7fc700 [Util] ActiveQueue: beginning processing
    2018-02-12 16:47:04.646 PlanParquetFileQueue:7f2cfd7fa700 [Util] ActiveQueue: beginning processing

  • ankit0007smart Registered User

    2018-02-12 16:47:04.646 PlanParquetFileQueue:7f2d24ffa700 [SAL] [S3UDFS] File opened: s3://bucket/data/1a4701af353ea0f2-773898e300000016_386770378_data.0.parq
    2018-02-12 16:47:04.660 PlanParquetFileQueue:7f2d24ffa700 [SAL] [S3UDFS] S3FileCancelHandle :: Adding (txn=a0000000042f88,statement=4) to S3FS cancel callback monitoring
    2018-02-12 16:47:04.660 PlanParquetFileQueue:7f2d24ffa700 [SAL] [S3UDFS] [s3://bucket/data/1a4701af353ea0f2-773898e300000016_386770378_data.0.parq]: virtual Vertica::UDFileOperator* SAL::S3FileSystem::open(const char*, int, mode_t, bool) const, Opened file
    2018-02-12 16:47:04.660 PlanParquetFileQueue:7f2d24ffa700 [SAL] [S3UDFS] [s3://bucket/data/1a4701af353ea0f2-773898e300000016_386770378_data.0.parq]: seek=123843460 (0)
    2018-02-12 16:47:04.660 PlanParquetFileQueue:7f2d24ffa700 [SAL] [S3UDFS] [s3://bucket/data/1a4701af353ea0f2-773898e300000016_386770378_data.0.parq]: virtual size_t SAL::S3FileOperator::read(void*, size_t), 8
    2018-02-12 16:47:04.703 Spread Client:7f2d04ff9700 [VMPI] Hurry: processing for session v_verticadb_node0001-321247:0x54744, call 18 of type 8
    2018-02-12 16:47:04.703 Spread Client:7f2d04ff9700 [Dist] Dist::cancelPlan: canceling state NO EXECUTION(0): a0000000042fa7
    2018-02-12 16:47:04.703 Spread Client:7f2d04ff9700 [Dist] Dist::cancelPlan: canceled state NO EXECUTION(0): a0000000042fa7
    2018-02-12 16:47:04.703 Spread Client:7f2d04ff9700 [Dist] Dist::cancelPlan: canceling state CANCELED(4): a0000000042fa7
    2018-02-12 16:47:04.703 Spread Client:7f2d04ff9700 [Dist] Dist::cancelPlan: already canceled a0000000042fa7
    2018-02-12 16:47:04.703 Spread Client:7f2d04ff9700 [Dist] Dist::cancelPlan: canceled state CANCELED(4): a0000000042fa7
    2018-02-12 16:47:04.703 Spread Client:7f2d04ff9700 [SAL] HDFSWatchdog is canceling HDFS operations in (txn=a0000000042f88,statement=4)
    2018-02-12 16:47:04.703 Spread Client:7f2d04ff9700 [SAL] An HDFS s3://bucket/data/1a4701af353ea0f2-773898e300000016_386770378_data.0.parq operation in (txn=a0000000042f88,statement=4) is going to be canceled
    2018-02-12 16:47:04.703 Spread Client:7f2d04ff9700 [SAL] [S3UDFS] Initiating cancel for [s3://bucket/data/1a4701af353ea0f2-773898e300000016_386770378_data.0.parq]
    2018-02-12 16:47:04.703 Spread Client:7f2d04ff9700 [SAL] [S3UDFS] S3FileCancelHandle:: cancelling (txn=45035996273979272,statement=4)
    2018-02-12 16:47:04.703 Spread Client:7f2d04ff9700 [SAL] An HDFS read : s3://bucket/data/1a4701af353ea0f2-773898e300000016_386770378_data.0.parq operation in (txn=a0000000042f88,statement=4) is going to be canceled
    2018-02-12 16:47:04.703 Spread Client:7f2d04ff9700 [SAL] HDFSWatchdog is done canceling HDFS operations in (txn=a0000000042f88,statement=4)
    2018-02-12 16:47:04.749 PlanParquetFileQueue:7f2d24ffa700 [SAL] [S3UDFS] [SDK] [ERROR] 2018-02-12 16:47:04 AWSClient [139831870990080] No response body. Response code: 4294967295

    2018-02-12 16:47:04.749 PlanParquetFileQueue:7f2d24ffa700 [SAL] [S3UDFS] [s3://bucket/data/1a4701af353ea0f2-773898e300000016_386770378_data.0.parq]: virtual void SAL::S3FileSystem::close(Vertica::UDFileOperator*) const, Closed file
    2018-02-12 16:47:04.749 PlanParquetFileQueue:7f2d24ffa700 [SAL] [S3UDFS] [s3://bucket/data/1a4701af353ea0f2-773898e300000016_386770378_data.0.parq]: void SAL::S3FileOperator::finalize(),
    2018-02-12 16:47:04.749 PlanParquetFileQueue:7f2d24ffa700 [Util] ActiveQueue: processing done
    2018-02-12 16:47:04.749 PlanParquetFileQueue:7f2d2665e700 [Util] ActiveQueue: processing done
    2018-02-12 16:47:04.749 PlanParquetFileQueue:7f2d0f7fe700 [Util] ActiveQueue: processing done
    2018-02-12 16:47:04.749 PlanParquetFileQueue:7f2d0effd700 [Util] ActiveQueue: processing done
    2018-02-12 16:47:04.749 PlanParquetFileQueue:7f2d0e7fc700 [Util] ActiveQueue: processing done
    2018-02-12 16:47:04.749 PlanParquetFileQueue:7f2d0dffb700 [Util] ActiveQueue: processing done
    2018-02-12 16:47:04.749 PlanParquetFileQueue:7f2d05ffe700 [Util] ActiveQueue: processing done
    2018-02-12 16:47:04.749 PlanParquetFileQueue:7f2d057fd700 [Util] ActiveQueue: processing done
    2018-02-12 16:47:04.749 PlanParquetFileQueue:7f2cfe7fc700 [Util] ActiveQueue: processing done
    2018-02-12 16:47:04.749 PlanParquetFileQueue:7f2cfd7fa700 [Util] ActiveQueue: processing done
    2018-02-12 16:47:04.749 Init Session:7f2d0ffff700-a0000000042f88 [SAL] Removing (txn=a0000000042f88,statement=4( from HDFS cancel callback monitoring
    2018-02-12 16:47:04.774 Init Session:7f2d0ffff700-a0000000042f88 @v_verticadb_node0002: 42501/6764: Error opening file [s3://bucket/data/1a4701af353ea0f2-773898e30000000f_415672635_data.0.parq] for read: Access Denied
    LOCATION: initialize, /scratch_a/release/svrtar6887/vbuild/vertica/Util/ParquetInterface.cpp:173
    2018-02-12 16:47:04.774 Init Session:7f2d0ffff700-a0000000042f88 [SAL] Adding (txn=a0000000042f88,statement=4) to HDFS cancel callback monitoring
    2018-02-12 16:47:04.774 Init Session:7f2d0ffff700-a0000000042f88 [SAL] Removing (txn=a0000000042f88,statement=4( from HDFS cancel callback monitoring
    2018-02-12 16:47:04.774 Init Session:7f2d0ffff700-a0000000042f88 [SAL] Adding (txn=a0000000042f88,statement=4) to HDFS cancel callback monitoring
    2018-02-12 16:47:04.774 Init Session:7f2d0ffff700-a0000000042f88 [SAL] Removing (txn=a0000000042f88,statement=4( from HDFS cancel callback monitoring
    2018-02-12 16:47:04.776 Init Session:7f2d0ffff700-a0000000042f88 @v_verticadb_node0001: VX001/7253: Error reading from Parquet parser input stream [s3://bucket/data/1a4701af353ea0f2-773898e300000016_386770378_data.0.parq]: S3 operation (read) was canceled
    LOCATION: ReadUdfo,

  • mtracy Employee, Registered User

    Thanks. So when you COPY just 's3://bucket/data/1a4701af353ea0f2-773898e300000016_386770378_data.0.parq' by itself, it works without error? Could you attach those logs?

  • sKwa Registered User

    Hi!

    Maybe it's just a log message and nothing more, but the protocol is listed twice:

    ERROR 6764: Error opening file [s3://s3://bucket/data/1a4701af353ea0f2-773898e30000000f_415672635_data.0.parq] for read: Access Denied


    The file path is s3://s3://bucket....

  • ankit0007smart Registered User

    That was a typo when I redacted the name; I can't show my production bucket. It's not related.

  • ankit0007smart Registered User
    edited February 12

    mtracy, here are the logs:

    2018-02-12 17:31:47.140 EEThread:7f2cfe7fc700-a0000000043107 [SAL] serviced from reservation: 24 requested 3706984 used 4194304 reserved
    2018-02-12 17:31:47.140 EEThread:7f2cfe7fc700-a0000000043107 [SAL] serviced from reservation: 24 requested 3707008 used 4194304 reserved
    2018-02-12 17:31:47.140 EEThread:7f2cfe7fc700-a0000000043107 [SAL] set savepoint for 23201 statement 2 -> block 9054, offset 11
    2018-02-12 17:31:47.156 Init Session:7f2d05ffe700-a0000000043107 [SAL] Removing (txn=a0000000043107,statement=2( from HDFS cancel callback monitoring
    2018-02-12 17:31:47.174 Init Session:7f2d05ffe700-a0000000043107 [SAL] Adding (txn=a0000000043107,statement=2) to HDFS cancel callback monitoring
    2018-02-12 17:31:47.175 Init Session:7f2d05ffe700-a0000000043107 [SAL] Removing (txn=a0000000043107,statement=2( from HDFS cancel callback monitoring
    2018-02-12 17:31:47.175 Init Session:7f2d05ffe700-a0000000043107 [VMPI] [abandon plan] - New synced txn catalog version : 188 (Old : 0)
    2018-02-12 17:31:47.180 Init Session:7f2d05ffe700-a0000000043107 [Txn] Starting Commit: Txn: a0000000043107 'COPY poc.hrly_s3 FROM 's3://tableau-360-production/v_hrly_flash_dtl_s3/1a4701af353ea0f2-773898e30000001b_1645706297_data.0.parq' PARQUET;' 1804
    2018-02-12 17:31:47.180 Init Session:7f2d05ffe700-a0000000043107 [SAL] Adding (txn=a0000000043107,statement=2) to HDFS cancel callback monitoring exclusion list
    2018-02-12 17:31:47.180 Init Session:7f2d05ffe700-a0000000043107 [TM] updateCPE: Updating projection hrly_s3_b1 CPE to 0x108 from 0xffffffffffffffff for wos 0xa000000036f926
    2018-02-12 17:31:47.182 Init Session:7f2d05ffe700-a0000000043107 [TM] updateCPE: Updating projection hrly_s3_b0 CPE to 0x108 from 0xffffffffffffffff for wos 0xa000000036f866
    2018-02-12 17:31:47.185 Init Session:7f2d05ffe700-a0000000043107 [SAL] Removing (txn=a0000000043107,statement=2) from HDFS cancel callback monitoring exclusion list
    2018-02-12 17:31:47.186 Init Session:7f2d05ffe700 [Txn] Commit Complete: Txn: a0000000043107 at epoch 0x109 and new global catalog version 1805
    2018-02-12 17:31:47.186 Init Session:7f2d05ffe700 [EE] [Bulkload] AUTO copy targeting hrly_s3 from pipe committed at version 1805.
    2018-02-12 17:31:49.000 AnalyzeRowCount:7f2cf6ffd700 [Command] TMTask: row count analyze - start
    2018-02-12 17:31:49.002 AnalyzeRowCount:7f2cf6ffd700-a000000004310a [Txn] Begin Txn: a000000004310a 'getRowCountsForProj'
    2018-02-12 17:31:49.078 AnalyzeRowCount:7f2cf6ffd700-a000000004310a [Txn] Rollback Txn: a000000004310a 'getRowCountsForProj'
    2018-02-12 17:31:49.107 AnalyzeRowCount:7f2cf6ffd700-a000000004310b [Txn] Begin Txn: a000000004310b 'analyze_row_count'
    2018-02-12 17:31:49.109 AnalyzeRowCount:7f2cf6ffd700-a000000004310b [VMPI] [commit transaction] - New synced txn catalog version : 1 (Old : 0)
    2018-02-12 17:31:49.110 AnalyzeRowCount:7f2cf6ffd700-a000000004310b [Txn] Starting Commit: Txn: a000000004310b 'analyze_row_count' 1805
    2018-02-12 17:31:49.110 AnalyzeRowCount:7f2cf6ffd700-a000000004310b [SAL] Adding (txn=a000000004310b,statement=1) to HDFS cancel callback monitoring exclusion list
    2018-02-12 17:31:49.111 AnalyzeRowCount:7f2cf6ffd700-a000000004310b [SAL] Removing (txn=a000000004310b,statement=1) from HDFS cancel callback monitoring exclusion list
    2018-02-12 17:31:49.111 AnalyzeRowCount:7f2cf6ffd700 [Txn] Commit Complete: Txn: a000000004310b at epoch 0x10a and new global catalog version 1806
    2018-02-12 17:31:49.113 AnalyzeRowCount:7f2cf6ffd700 [Command] TMTask: row count analyze - Done
    2018-02-12 17:31:49.114 AnalyzeRowCount:7f2cf6ffd700 [Util] Task 'AnalyzeRowCount' enabled
    2018-02-12 17:31:50.000 DiskSpaceRefresher:7f2d057fd700 [Util] Task 'DiskSpaceRefresher' enabled
    2018-02-12 17:31:50.078 nameless:7f2da1fcb700 [SAL] Running runWatchdog
    2018-02-12 17:31:50.078 nameless:7f2da1fcb700 [SAL] Watchdog is checking for idle connections

  • mtracy Employee, Registered User

    Thanks Ankit. One more quick question. What are the permissions of the files with that path prefix? Just trying to get enough information to reproduce this on our end.

  • ankit0007smart Registered User

    We use IAM accounts, and I have full access to the bucket.
    That is why the individual COPY works; I'm just not sure why * fails.
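Since explicit file names succeed, one stopgap is to turn an S3 listing into per-file COPY statements. A hedged sketch (bucket, prefix, and table are from the thread; the listing line below is a stand-in for real aws s3 ls output):

```shell
# On the node, capture the real listing with:
#   aws s3 ls s3://bucket/data/ > listing.txt
# For illustration, fake one listing line in the same four-column format:
cat > listing.txt <<'EOF'
2018-02-13 14:40:31   79724994 a44372205c100f5b-4e719f8800000010_1273429912_data.0.parq
EOF

# Column 4 of "aws s3 ls" output is the object name; emit one COPY per file.
awk '{print "COPY poc.hrly_s3 FROM '\''s3://bucket/data/" $4 "'\'' PARQUET;"}' listing.txt > load_files.sql
cat load_files.sql

# Then feed the generated statements to vsql:
#   vsql -f load_files.sql
```

This sidesteps the glob entirely, at the cost of one COPY (and one transaction) per file.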

  • ankit0007smart Registered User

    mtracy, any update? Were you able to reproduce it?

  • mtracy Employee, Registered User

    Not yet. It appears as though we have similar setups in many of our tests, so we are trying to identify what could be making your configuration different, as well as how to best proceed.

  • asaeidi Employee, Registered User

    Could you check whether your IAM policy grants list permission? That would explain why a single-file read works but a glob does not: for a glob we need to send a list request to S3.
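For reference, glob expansion needs s3:ListBucket on the bucket itself, in addition to s3:GetObject on the objects, and the two actions take different Resource ARNs. A minimal policy shape with a placeholder bucket name might look like:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::bucket/data/*"
    },
    {
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::bucket"
    }
  ]
}
```

"Full access" granted only on object ARNs (bucket/*) would still produce exactly this split behavior: single-file reads succeed, globs fail.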

  • ankit0007smart Registered User

    I checked; I have full access to all the buckets in the system.

  • asaeidi Employee, Registered User

    Could you use the AWS CLI to list the same bucket from the same node? Does that work?

  • ankit0007smart Registered User
    edited February 13

    here you go

    aws s3 ls s3://bucket/data/
    2018-02-13 14:40:31 79724994 a44372205c100f5b-4e719f8800000010_1273429912_data.0.parq
    2018-02-13 14:40:31 60918026 a44372205c100f5b-4e719f8800000012_835150974_data.0.parq
    2018-02-13 14:40:29 41914935 a44372205c100f5b-4e719f8800000015_28682028_data.0.parq
    2018-02-13 14:40:29 51045058 a44372205c100f5b-4e719f8800000016_1842451061_data.0.parq
    2018-02-13 14:40:29 10284619 a44372205c100f5b-4e719f8800000017_312849041_data.0.parq
    2018-02-13 14:40:29 43433695 a44372205c100f5b-4e719f8800000019_726107042_data.0.parq
    2018-02-13 14:40:30 73811774 a44372205c100f5b-4e719f880000001b_312700690_data.0.parq

  • asaeidi Employee, Registered User

    How many nodes do you have in your cluster? Here is how Vertica handles a glob pattern when loading from S3: we list the files on the initiator, then send the list of files to the executors to read in parallel, which is slightly different from a single-node read.
    The next thing I would check is that all the nodes have read permission. The error is Access Denied, and we return that directly from the S3 API, so there must be a permission issue on at least one of the nodes.
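Given that model (the initiator lists, the executors read), a quick check is whether every node's credentials can reach the prefix. A hedged dry-run sketch; the node hosts and prefix are placeholders, and the leading echo prints the commands instead of executing them:

```shell
# Every node must be able to both list the prefix (for the glob) and read
# the objects (for the load). Drop the leading "echo" to actually run these.
hosts="v_verticadb_node0001 v_verticadb_node0002 v_verticadb_node0003"
for h in $hosts; do
  echo ssh "$h" aws s3 ls s3://bucket/data/
done
```

If one node's listing fails while the others succeed, that node's instance profile or credential file is the likely culprit.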
