Query Read Performance
verban
Vertica Customer ✭
I don't know why Vertica send data slowly!
(I mean when it processed query and want to send it to output)
for example when i run my query (on vertica machine, 1 node), after 2nd second it start to write to file but it tacks about 120 seconds for writing a 1GB file (10M records).
it means about 8MBps!
(it was same when i queried 1M records to file)
i was suspicious of disk limit for writing, i wrote the vsql output on ramdisk but it was same
and also i switched to receiving output remotely but it consumed only about 12% of 1Gbps interface!
how can i find what is the bottleneck?
how can i improve it?
0
Comments
Can you show how you are writing output to a file?
CREATE LOCAL TEMPORARY TABLE foo
ON COMMIT PRESERVE ROWS AS
<your full select>
KSAFE 0;
And time that run.
The difference between the timing you get here and the timing you got before comes from:
1. "Send data to client" in the Vertica plan. A relatively small portion, the only part in Vertica.
2. The network between Vertica and the client. This can be substantial.
3. The data formatting effort in the SQL client: vsql, or whatever you use. This can be substantial, too
4. Finally, writing results wherever you want them.
This should explain it all.
Marco
CREATE LOCAL TEMPORARY TABLE foo
ON COMMIT PRESERVE ROWS AS
( your full select)
KSAFE 0;
And time that run.
The difference between the timing you get here and the timing you got before comes from:
1. "Send data to client" in the Vertica plan. A relatively small portion, the only part in Vertica.
2. The network between Vertica and the client. This can be substantial.
3. The data formatting effort in the SQL client: vsql, or whatever you use. This can be substantial, too
4. Finally, writing results wherever you want them.
This should explain it all.
Marco
Thanks Marco,
It's about 10 times faster...
1M takes about 2.0 sec (10M is about 19sec)
I checked disk write speed and it was good (about 200MB/s)
But why storing data on disk is slow??
(i store fetched record locally with vsql on vertica machine)
@Jim_Knicely
i want to fetched it remotely with jdbc,
last week i found fetching data is very slow and network interface utilization is about 10%,
then i decided to measure vertica max read first
bcoz my disk write is good and vsql write 1GB file continuously on disk during 120 sec!!
i'm not interested in working with file
@marcothesane @Jim_Knicely any comment?