Please take this survey to help us learn more about how you use third party tools. Your input is greatly appreciated!

Problems COPYing from server outside of Vertica cluster

Hello. I'm hoping somebody more experienced might be able to help me.

 

We have a script that has previously been used to COPY data into a local, single node Vertica cluster, with all associated processes taking place on the one physical machine. We are in the process of trying to migrate to AWS, and have set up a three node cluster, and also a separate instance which acts as the primary storage location for "flat file" data that is to be imported in to the database cluster.

 

The COPY doesn't seem to be working when the associated scripts are run from the storage instance, and I suspect it's something to do with the use of STDIN which might looking at the Vertica cluster rather than the storage instance for its input. Similarly, the rejected and exceptions log files are created locally in the storage instance, but I think the COPY is looking for these files on the vertica cluster to write its errors to.

 

Does anybody else run this sort of COPY? Hopefully the scripting below is sufficient to understand what we do, it's full of variables of course, but the cat $t.* relates to a file that has been created for loading in to the Vertica cluster.

 

rejected_data_log=$LOAD_DIR/copy_log/${staging_t}_rejected_data.log
exceptions_log=$LOAD_DIR/copy_log/${staging_t}_exceptions.log
analyse_log=$LOAD_DIR/analyse_log/${staging_t}_analyse.log

flow_name_ver=`echo $flow_name_ver_group | cut -c1-8`

cd $LOAD_DIR/$flow_name_ver

vsql_msg "Load $t files into staging table"
# msg "DEBUG: Exceptions will be logged to $exceptions_log"
# msg "DEBUG: Rejected data will be logged into $rejected_data_log"

cat $t.* |\
vsql -h $DS_HOST -d $DS_DATABASE -U $DS_USER -w $DS_PASSWORD -c \
" COPY $DS_SCHEMA.$staging_t
FROM STDIN
DELIMITER '|'
NULL as 'null'
ENFORCELENGTH
EXCEPTIONS '$exceptions_log'
REJECTED DATA '$rejected_data_log';"

 

When I try to run the load, I get lots of the below, but with no data actually being loaded.

 

05:12:58 Delete temp data from D0019001G0798_STAGE
05:12:58 Load D0019001G0798 files into staging table
ERROR 2016: COPY: Could not open rejected data file [/home/dbadmin/load_home/21/copy_log/D0019001G0798_STAGE_rejected_data.log] for writing: No such file or directory

 

I've managed to run the script from within the vertica cluster, and that worked fine, so I think it's just something about trying to copy in to the cluster from something outside of it.

 

Thanks


Ben

Comments

  • Hi 

    My understending from your note is that you trying to load data from local server to remote vertica node 

    of this is the case you need to change your copy command to 

     

    COPY $DS_SCHEMA.$staging_t
    FROM local  STDIN

     

    I hope you will find it usfull 

     

    Thanks 

  • Thank you. Yes, that worked :)

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file

Can't find what you're looking for? Search the Vertica Documentation, Knowledge Base, or Blog for more information.