How to load data type fixed width support thai language
sample data utf8 have thai language
9999Aเจ้าหน้าที่.........................................................................................AX999....XXXX20221214
table structure name ABC
A1 char(5)
A2 varchar(100)
A3 char(1)
A4 char(4)
A5 char(4)
A6 char(4)
A7 char(8)
when I use command copy load
COPY public.ABC ( A1,A2,A3,A4,A5,A6,A7) from local '/data/sampledata.TXT' FIXEDWIDTH colsizes(5,100,1,4,4,4,8) skip 0 rejected data as table public.ABC_RJCT ENFORCELENGTH;
Rows Loaded = 0
and rejected_reason from table reject public.ABC_RJCT
"Row is malformed. Including record terminator, expected size is 127 bytes. Row size is 149 bytes."
I try expand data type A2 is varchar(400) but failed.
reason is the same,
"Row is malformed. Including record terminator, expected size is 127 bytes. Row size is 149 bytes."
Where do I need to make additional settings or edit? to be able to load the data.
Best Answer
-
mosheg Vertica Employee Administrator
It looks like the issue is that the data in the file is not the same size as the columns you have defined in the COPY command.
To fix this, you will need to adjust the colsizes parameter to match the actual size of the data in the file.Try this:
SELECT 'A1 char(4)' as DataType, OCTET_LENGTH('9999'), '9999' as Data UNION ALL SELECT 'A2 varchar(36)', OCTET_LENGTH('Aเจ้าหน้าที่ A'), 'Aเจ้าหน้าที่ A' UNION ALL SELECT 'A3 char(1)', OCTET_LENGTH('X'), 'X' UNION ALL SELECT 'A4 char(4)', OCTET_LENGTH('999 '), '999 ' UNION ALL SELECT 'A5 char(4)', OCTET_LENGTH('XXXX'), 'XXXX' UNION ALL SELECT 'A6 char(4)', OCTET_LENGTH('2022'), '2022' UNION ALL SELECT 'A7 char(4)', OCTET_LENGTH('1214'), '1214'; -- DataType | OCTET_LENGTH | Data -- ----------------+--------------+------------ -- A1 char(4) | 4 | 9999 -- A2 varchar(36) | 36 | Aเจ้าหน้าที่ A -- A3 char(1) | 1 | X -- A4 char(4) | 4 | 999 -- A5 char(4) | 4 | XXXX -- A6 char(4) | 4 | 2022 -- A7 char(4) | 4 | 1214 -- (7 rows) drop table if exists ABC cascade; create table ABC ( A1 char(4), A2 varchar(36), A3 char(1), A4 char(4), A5 char(4), A6 char(4), A7 char(4)); COPY ABC( A1,A2,A3,A4,A5,A6,A7) from STDIN FIXEDWIDTH colsizes(4,36,1,4,4,4,4) ENFORCELENGTH ABORT ON ERROR; 9999Aเจ้าหน้าที่ AX999 XXXX20221214 1234Aเจ้าหน้าที่ A11234123412341234 \. SELECT * FROM ABC; -- A1 | A2 | A3 | A4 | A5 | A6 | A7 -- ------+------------+----+------+------+------+------ -- 1234 | Aเจ้าหน้าที่ A | 1 | 1234 | 1234 | 1234 | 1234 -- 9999 | Aเจ้าหน้าที่ A | X | 999 | XXXX | 2022 | 1214 -- (2 rows)
Another option would be to try using a delimited format for the data instead of fixed width. This would allow you to specify the data type for each column without needing to worry about the exact size of the data in the file. For example, you could use the following COPY command to load the data using a delimited format:
COPY public.ABC ( A1,A2,A3,A4,A5,A6,A7)
FROM local '/data/sampledata.TXT'
DELIMITER ','
REJECTED DATA AS TABLE public.ABC_RJCT
ENFORCELENGTH;In this case, you would need to make sure that the data in the file is properly delimited using commas (or whatever delimiter you choose). You would also need to adjust the data types for each column to match the data in the file.
I hope this helps!
0