Not able to query the s3 external table
Hi,
I have created a external table which refers to a parquet format file. while trying to fire a select query it is giving me below error:
[Vertica]VJDBC ERROR: Cannot expand glob pattern due to error: Access Denied [SQL State=22023, DB Errorcode=7160]
1 statement failed
Where as I am able to list the bucket from cli.
Have created the location and provided the permission as well .
Have followed the steps descried in the below link:
https://www.vertica.com/docs/9.0.x/HTML/index.htm#Authoring/Eon/CreatingExternalTablesFromS3.htm
below is my table creation script:
create external table common.test2
(
record_code varchar(10),
station_code integer,
dim_date_time_id int,
hh_hut_put_impression float,
hh_universe_estimate int,
hh_impression float,
c25_hut_put_impression float,
c25_universe_estimate int,
c25_impression float,
c611_hut_put_impression float,
c611_universe_estimate int,
c611_impression float,
m1214_hut_put_impression float,
m1214_universe_estimate int,
m1214_impression float,
m1517_hut_put_impression float,
m1517_universe_estimate int,
m1517_impression float,
m1820_hut_put_impression float,
m1820_universe_estimate int,
m1820_impression float,
m2124_hut_put_impression float,
m2124_universe_estimate int,
m2124_impression float,
m2534_hut_put_impression float,
m2534_universe_estimate int,
m2534_impression float,
m3549_hut_put_impression float,
m3549_universe_estimate int,
m3549_impression float,
m5054_hut_put_impression float,
m5054_universe_estimate int,
m5054_impression float,
m5564_hut_put_impression float,
m5564_universe_estimate int,
m5564_impression float,
m65plus_hut_put_impression float,
m65plus_universe_estimate int,
m65plus_impression float,
f1214_hut_put_impression float,
f1214_universe_estimate int,
f1214_impression float,
f1517_hut_put_impression float,
f1517_universe_estimate int,
f1517_impression float,
f1820_hut_put_impression float,
f1820_universe_estimate int,
f1820_impression float,
f2124_hut_put_impression float,
f2124_universe_estimate int,
f2124_impression float,
f2534_hut_put_impression float,
f2534_universe_estimate int,
f2534_impression float,
f3549_hut_put_impression float,
f3549_universe_estimate int,
f3549_impression float,
f5054_hut_put_impression float,
f5054_universe_estimate int,
f5054_impression float,
f5564_hut_put_impression float,
f5564_universe_estimate int,
f5564_impression float,
f65plus_hut_put_impression float,
f65plus_universe_estimate int,
f65plus_impression float,
ww_hut_put_impression float,
ww_universe_estimate int,
ww_impression float,
sample_type varchar(100),
sample_type_int integer,
playback_type varchar(20),
rating_stream varchar(20),
reporting_service integer,
sub_sample_indicator integer,
metroahousehold int,
metrobhousehold int,
daily_average_group varchar(300),
daily_group varchar(300),
daily_normal_group varchar(300),
daily_telecast_group varchar(300),
excluded varchar(1),
monthly_average_group varchar(300),
monthly_group varchar(300),
monthly_normal_group varchar(300),
monthly_telecast_group varchar(300),
nielsendow integer,
program_code int,
program_group varchar(100),
program_group_airing_time varchar(100),
program_name varchar(100),
programday_group varchar(100),
swing_break_type varchar(100),
week_number integer,
weekly_average_group varchar(300),
weekly_group varchar(300),
weekly_normal_group varchar(300),
weekly_telecast_group varchar(300),
dow integer,
half_hr_grain varchar(50),
hr_grain varchar(50),
dow_str varchar(50),
day_type varchar(50),
hour_minute integer,
source_creation_epoch int,
etl_timestamp timestamp,
job_run_id varchar(500)
)
AS COPY FROM 's3://mybucket/1_fact_rating_quarter_hour_nielsen/reporting_year=2018/reporting_month=1/market_code=101/rating_stream_int=5/part-00121-e8ac2dc7-39c3-4da2-8451-ffa7238448cb.c000.snappy.parquet' PARQUET;
Vertica version:Vertica Analytic Database v9.2.1-1
Below is my log trace:
2019-07-01 15:07:18.900 Init Session:0x7fe5f0fff700-a000000319e064 [SAL] [S3UDFS] Creating a new session-specific S3 client for bucket 'mybucket'. Region: 'us-east-1', Endpoint: ''
2019-07-01 15:07:18.901 Init Session:0x7fe5f0fff700-a000000319e064 [SAL] [S3UDFS] [SDK] [ERROR] 2019-07-01 15:07:18 EC2MetadataClient [140625567545088] Http request failed with error code 404
2019-07-01 15:07:18.901 Init Session:0x7fe5f0fff700-a000000319e064 [SAL] [S3UDFS] [SDK] [ERROR] 2019-07-01 15:07:18 EC2MetadataClient [140625567545088] Http request failed with error code 404
2019-07-01 15:07:18.911 Init Session:0x7fe5f0fff700-a000000319e064 [SAL] [S3UDFS] [SDK] [ERROR] 2019-07-01 15:07:18 EC2MetadataClient [140625567545088] Http request failed with error code 404
2019-07-01 15:07:18.911 Init Session:0x7fe5f0fff700-a000000319e064 [SAL] [S3UDFS] [SDK] [ERROR] 2019-07-01 15:07:18 EC2MetadataClient [140625567545088] Http request failed with error code 404
2019-07-01 15:07:18.921 Init Session:0x7fe5f0fff700-a000000319e064 [SAL] [S3UDFS] [SDK] [ERROR] 2019-07-01 15:07:18 EC2MetadataClient [140625567545088] Http request failed with error code 404
2019-07-01 15:07:18.921 Init Session:0x7fe5f0fff700-a000000319e064 [SAL] [S3UDFS] [SDK] [ERROR] 2019-07-01 15:07:18 VerticaInstanceProfileCredentialsProvider [140625567545088] Gave up trying to reload configuration profile after 50 retries.
2019-07-01 15:07:18.921 Init Session:0x7fe5f0fff700-a000000319e064 [SAL] [S3UDFS] Found a session-scoped client for bucket 'mybusket'
2019-07-01 15:07:19.306 Init Session:0x7fe5f0fff700-a000000319e064 [SAL] Removing (txn=a000000319e064,statement=4( from UDFS cancel callback monitoring
2019-07-01 15:07:19.306 Init Session:0x7fe5f0fff700-a000000319e064 @v_operative_node0004: 22023/7160: Cannot expand glob pattern due to error: Access Denied
LOCATION: expandGlobLocal, /data/qb_workspaces/jenkins2/ReleaseBuilds/Grader/REL-9_2_1-x_grader/build/vertica/Optimizer/Path/BulkLoad.cpp:2430
any help will be appreciated.
Thanks
Comments
The response say Access Denied. Are you able to access the file in the bucket using the access key that is configured in Vertica? Did you set the session AWS security parameters, e.g.: ALTER SESSION SET AWSAuth='ID:secret';
yes, Have executed below statements:
ALTER SESSION SET AWSAuth='ID:*******';
SELECT AWS_SET_CONFIG('aws_id', '*****');
SELECT AWS_SET_CONFIG('aws_secret', '***);
I am able to access the bucket from vertica as I was successfully able to load the data into vertica managed table via copy command.
So COPY works but not CREATE EXTERNAL TABLE AS COPY?
The AWS_SET_CONFIG function is no longer recommended in 9.2. You should only use ALTER SESSION SET AWSAuth = 'aws_id:aws_secret'. See the latest doc at https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/Eon/CreatingExternalTablesFromS3.htm
You may need to log out or disconnect to clear the session settings, then try to connect again with new session.
thanks after reconnecting, it worked and yes with** ALTER SESSION SET AWSAuth = 'aws_id:aws_secret**' it is working fine now thanks for your quick response.
Though it is giving a different error now but looks it is something to do with the data in parquet file.
[Vertica]VJDBC ERROR: Datatype mismatch: column 6 in the parquet source [s3://mybucket/1_fact_rating_quarter_hour_nielsen/reporting_year=2018/reporting_month=1/market_code=101/rating_stream_int=5/part-00121-e8ac2dc7-39c3-4da2-8451-ffa7238448cb.c000.snappy.parquet] has type INT64, expected float [SQL State=22V24, DB Errorcode=7247]
This seems to be your column 6:
hh_impression float,
.. as from the
create external table common.test2
statement you sent above.The way I read it, in the Parquet file we have a format for a 64 bit integer, while you are declaring a float as a data type.
Try to drop the table
common.test2
and re-create it with:hh_impression int,
instead ...
happy playing ...