MERGE on huge external tables (Oracle)
I have created external tables with a built-in shell script to connect to Oracle from Vertica. It works really fast on small tables with up to 250K rows or so, but tables with millions of rows fail, probably because the shell script cannot keep up with the connection to Oracle, or for some other reason. Has anyone tried to merge data in intervals based on a key or date? This would really help me complete a 10-million-row table in 20 chunk intervals.
Comments
Anyone with MERGE expertise is desperately needed... please reply ASAP.
So what is the issue here?
MERGE INTO DSS.T_D_RE_ELIG AS target
USING dssmod.T_D_RE_ELIG_EXT AS Source
ON Source.SAK_RECIP = target.SAK_RECIP
WHEN MATCHED THEN UPDATE SET
SAK_PGM_ELIG = Source.SAK_PGM_ELIG,
SAK_PUB_HLTH = Source.SAK_PUB_HLTH,
DTE_EFFECTIVE = Source.DTE_EFFECTIVE,
DTE_END = Source.DTE_END,
CDE_PRE_ENTRY_STAT = Source.CDE_PRE_ENTRY_STAT,
CDE_STOP = Source.CDE_STOP,
CDE_STATUS1 = Source.CDE_STATUS1,
DTE_CREATED = Source.DTE_CREATED,
DTE_LAST_UPDATED = Source.DTE_LAST_UPDATED,
DTE_ACTIVE_THRU = Source.DTE_ACTIVE_THRU,
NUM_RANK = Source.NUM_RANK
WHEN NOT MATCHED THEN INSERT
(
SAK_RECIP,
SAK_PGM_ELIG,
SAK_PUB_HLTH,
DTE_EFFECTIVE,
DTE_END,
CDE_PRE_ENTRY_STAT,
CDE_STOP,
CDE_STATUS1,
DTE_CREATED,
DTE_LAST_UPDATED,
DTE_ACTIVE_THRU,
NUM_RANK
)
VALUES (
Source.SAK_RECIP,
Source.SAK_PGM_ELIG,
Source.SAK_PUB_HLTH,
Source.DTE_EFFECTIVE,
Source.DTE_END,
Source.CDE_PRE_ENTRY_STAT,
Source.CDE_STOP,
Source.CDE_STATUS1,
Source.DTE_CREATED,
Source.DTE_LAST_UPDATED,
Source.DTE_ACTIVE_THRU,
Source.NUM_RANK
);
This is our basic MERGE, and it works fine for a table with up to 400K rows. But we are trying to see what we can do for 8 million rows: for example, running the MERGE chunk by chunk using some date column in the table itself, or using an increment on a surrogate key.
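The chunked approach described above can be sketched as a driver script. This is a minimal dry-run sketch, not the poster's actual setup: the key bounds (1 to 8,000,000), the 20-chunk split, and the assumption that SAK_RECIP is a reasonably dense surrogate key are all hypothetical. Each iteration restricts the external-table scan with a key range in a USING subquery, so one MERGE only touches one slice; the generated statements would be piped into vsql to actually run.

```shell
#!/bin/bash
# Sketch: drive the big MERGE in 20 key-range chunks (assumed bounds).
# In practice, fetch the real bounds first, e.g.
#   SELECT MIN(SAK_RECIP), MAX(SAK_RECIP) FROM dssmod.T_D_RE_ELIG_EXT;
MIN_KEY=1          # assumed lower bound of SAK_RECIP
MAX_KEY=8000000    # assumed upper bound of SAK_RECIP
CHUNKS=20
STEP=$(( (MAX_KEY - MIN_KEY + CHUNKS) / CHUNKS ))

lo=$MIN_KEY
while [ "$lo" -le "$MAX_KEY" ]; do
  hi=$(( lo + STEP - 1 ))
  [ "$hi" -gt "$MAX_KEY" ] && hi=$MAX_KEY
  # Print each chunked MERGE (dry run); pipe into vsql to execute, e.g.:
  #   ... | vsql -d mydb -U dbadmin
  echo "MERGE INTO DSS.T_D_RE_ELIG AS target
USING (SELECT * FROM dssmod.T_D_RE_ELIG_EXT
       WHERE SAK_RECIP BETWEEN $lo AND $hi) AS Source
ON Source.SAK_RECIP = target.SAK_RECIP
-- ...same WHEN MATCHED / WHEN NOT MATCHED clauses as the full MERGE above...
;"
  lo=$(( hi + 1 ))
done
```

A date column works the same way: replace the BETWEEN on SAK_RECIP with a BETWEEN on the date column and step by day or month instead of by key range.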
#!/bin/bash
ssh user@server005 ARG1=$1 'bash -s' <<'ENDSSH'
cd $HOME/scripts
export ORACLE_HOME=/OraCl11g/instantclient_11_2
export PATH=$PATH:/OraCl11g/instantclient_11_2
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/OraCl11g/instantclient_11_2
./unloader "userid=user/password@oracledb" sqlstmt="select * from $ARG1" arraysize=10000 delimiter="|" enclosure="" null_string=""
#echo "All: " $*
#echo "ARG1: " $ARG1
ENDSSH
This is how I am trying to connect to a remote Oracle DB and merge its data into Vertica. unloader is a C program. I believe it is not able to handle a huge volume of data, which is why we are looking to MERGE in chunks.
It is not clear if this is a data movement (Oracle>Vertica) issue or a MERGE issue. I would suggest you to:
and then:
This is just to understand where your issue comes from and recommend the appropriate solution.
A few questions:
It's actually both. I have created an EXTERNAL TABLE over the Oracle table; source and target have the same structure. What I am trying to do is MERGE the external table (Oracle) into Vertica to get my CDC. Instead of one huge scan at once, I would like to MERGE in chunks to preserve my network. Is there a way to do this?