Joins on multiple (more than two) tables - extremely slow

We've designed our tables as follows - a primary table, plus a separate table for each attribute that can hold multiple values (array values).

primary table: f_network_events(flow_id, comp_id, event_time, ip_src, ip_dst, port_src, port_dst, application,.......)
then a few individual tables like
file_name(flow_id, comp_id, file_name)
file_type(flow_id, comp_id, file_type)
server_name(flow_id, comp_id, server_name)

I have a basic understanding of JOINs - I know that to get a merge join between two tables I need suitable projections in place, i.e. projections ordered by flow_id, comp_id. A join between two tables works properly, e.g. the JOIN between f_network_events and file_name:
select * from f_network_events JOIN file_name on f_network_events.flow_id = file_name.flow_id and f_network_events.comp_id = file_name.comp_id;
Since I have projections (ordered by flow_id, comp_id) on both tables, the query optimizer picks the correct projections, as shown by EXPLAIN.

However, if I JOIN more than two tables, the query optimizer does not pick the correct projections (it picks the super projections) and the query is extremely slow, e.g.:
select * from f_network_events JOIN file_name on f_network_events.flow_id = file_name.flow_id and f_network_events.comp_id = file_name.comp_id JOIN file_type on f_network_events.flow_id = file_type.flow_id and f_network_events.comp_id = file_type.comp_id ;

So, in brief, I have two issues:
1. A join on more than two tables is extremely slow, and I don't understand why the correct projections are not being picked by the query optimizer.
2. The JOIN on two tables is also slow when I add an order by event_timestamp to the query:
select * from f_network_events JOIN file_name on f_network_events.flow_id = file_name.flow_id and f_network_events.comp_id = file_name.comp_id order by f_network_events.event_timestamp desc limit 100;
I tried feeding the queries to Database Designer and deployed the suggested projections, but there is still no improvement in query execution - the queries are extremely slow.
Note: the tables have roughly 300 million rows.

Answers

  • Bryan_H (Vertica Employee, Administrator)

    Try specifying the projection to use for each table with the PROJS hint as shown at
    https://www.vertica.com/docs/9.1.x/HTML/index.htm#Authoring/SQLReferenceManual/LanguageElements/Hints/Projs.htm
    It might help to review the table and projection definitions to see if they match the query; I notice the query runs a SELECT *, which means that Vertica may need to use a super projection that contains all fields if the optimized projection does not contain all fields in the anchor table.
    For the second issue, we would probably need the explain plan for the query with the ORDER BY, as well as the table and projection definitions, to determine whether optimization is possible. It may be the same issue, where one or more fields are missing from the projection, so it is necessary to compute the join and ordering from the anchor tables.
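    For example, a hinted version of the three-table query might look roughly like this (a sketch only; the projection names below are placeholders, so substitute whatever names EXPLAIN reports for your optimized projections):

    SELECT /*+PROJS('f_network_events_flowid', 'file_name_flowid', 'file_type_flowid')*/ *
    FROM f_network_events
    JOIN file_name ON f_network_events.flow_id = file_name.flow_id
                  AND f_network_events.comp_id = file_name.comp_id
    JOIN file_type ON f_network_events.flow_id = file_type.flow_id
                  AND f_network_events.comp_id = file_type.comp_id;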

  • marcothesane (Administrator)

    Are flow_id and comp_id integers?
    And - I'd create super projections (any projection containing all columns of the anchor table is a super projection), all ordered by flow_id and then comp_id (or the other way round - lower cardinality first, higher cardinality second) - and drop the original (probably auto-created) projections.
    How is everything segmented? Either UNSEGMENTED ALL NODES for all projections of smaller tables, or SEGMENTED BY HASH(flow_id, comp_id) for all projections of bigger tables. (A minimal sketch for one of the attribute tables follows below.)
    Good luck ---
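    A minimal sketch of that layout for one of the attribute tables from the question (the projection name is made up, and the column list and the choice to segment rather than replicate are assumptions for a ~300-million-row table):

    CREATE PROJECTION file_name_flowid
    (
    flow_id,
    comp_id,
    file_name
    )
    AS
    SELECT flow_id,
    comp_id,
    file_name
    FROM file_name
    ORDER BY flow_id,
    comp_id
    SEGMENTED BY HASH(flow_id, comp_id) ALL NODES;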

  • Maurizio (Vertica Employee)

    Your description of the "correct projections" not being picked up by the optimiser is a little too vague to offer specific comments. A few general comments:
    1. Vertica uses a Cost-Based Optimizer. Its primary goal is to pick the cheapest plan, not the fastest one; no one knows in advance how long your query will take.
    2. The reason the optimiser doesn't use a given projection is that either it doesn't have enough columns to answer the query, or the estimated plan cost using that projection is higher.
    3. The ability to use multiple projections per table is a Vertica feature we should use... but not abuse! Especially if the optimiser doesn't use them: more projections mean more work during loads and more workload for the TM mergeout.

  • rajatpaliwal86 (Vertica Customer)

    @Bryan_H said:
    Try specifying the projection to use for each table with the PROJS hint as shown at
    https://www.vertica.com/docs/9.1.x/HTML/index.htm#Authoring/SQLReferenceManual/LanguageElements/Hints/Projs.htm
    It might help to review the table and projection definitions to see if they match the query; I notice the query runs a SELECT *, which means that Vertica may need to use a super projection that contains all fields if the optimized projection does not contain all fields in the anchor table.
    For the second issue, we would probably need the explain plan for the query with the ORDER BY, as well as the table and projection definitions, to determine whether optimization is possible. It may be the same issue, where one or more fields are missing from the projection, so it is necessary to compute the join and ordering from the anchor tables.

    Here, I'll just talk about the second issue, i.e. the slow ORDER BY on a JOINed query.
    The tables' super projections are ordered by flow_id, component_id.
    The primary table (f_network_events) has two projections:
    1. order by flow_id, component_id
    2. order by event_timestamp, flow_id, component_id

    The other table (f_virtual_fields) has one projection, ordered by flow_id, component_id.

    CREATE PROJECTION athena.f_network_events_super
    (
    id ENCODING DELTARANGE_COMP,
    flow_id ENCODING COMMONDELTA_COMP,
    component_id ENCODING COMMONDELTA_COMP,
    event_timestamp ENCODING COMMONDELTA_COMP,
    ....
    )
    AS
    SELECT f_network_events.id,
    f_network_events.flow_id,
    f_network_events.component_id,
    f_network_events.event_timestamp,
    ....
    FROM athena.f_network_events
    ORDER BY f_network_events.flow_id,
    f_network_events.component_id
    UNSEGMENTED ALL NODES;

    CREATE PROJECTION athena.f_network_events_super_ts
    (
    id ENCODING DELTARANGE_COMP,
    flow_id ENCODING DELTARANGE_COMP,
    component_id ENCODING COMMONDELTA_COMP,
    event_timestamp ENCODING COMMONDELTA_COMP,
    ....)
    AS
    SELECT f_network_events.id,
    f_network_events.flow_id,
    f_network_events.component_id,
    f_network_events.event_timestamp,
    ....
    FROM athena.f_network_events
    ORDER BY f_network_events.event_timestamp,
    f_network_events.flow_id,
    f_network_events.component_id
    UNSEGMENTED ALL NODES;

    CREATE FLEX TABLE athena.f_virtual_fields
    (
    flow_id int NOT NULL,
    component_id int NOT NULL
    );
    ALTER TABLE athena.f_virtual_fields ADD CONSTRAINT C_FOREIGN FOREIGN KEY (flow_id, component_id) references athena.f_network_events (flow_id, component_id);

    CREATE PROJECTION athena.f_virtual_fields_super /*+createtype(D)*/
    (
    raw,
    flow_id ENCODING COMMONDELTA_COMP,
    component_id ENCODING COMMONDELTA_COMP
    )
    AS
    SELECT f_virtual_fields.raw,
    f_virtual_fields.flow_id,
    f_virtual_fields.component_id
    FROM athena.f_virtual_fields
    ORDER BY f_virtual_fields.flow_id,
    f_virtual_fields.component_id
    SEGMENTED BY hash(f_virtual_fields.flow_id, f_virtual_fields.component_id) ALL NODES OFFSET 0;

    select athena.f_network_events.*, maptostring(athena.f_virtual_fields.raw) as eventjson from athena.f_network_events JOIN athena."f_virtual_fields" on athena.f_network_events.flow_id = athena."f_virtual_fields".flow_id AND athena.f_network_events.component_id = athena."f_virtual_fields".component_id AND athena.f_network_events.event_timestamp > to_timestamp_tz(1576795262) AND athena.f_network_events.event_timestamp < to_timestamp_tz(1576816862) limit 100;

    This query returns very fast, i.e. in under 1 second; however, if I add an order by at the end, it runs very slowly, about 80 seconds:

    select athena.f_network_events.*, maptostring(athena.f_virtual_fields.raw) as eventjson from athena.f_network_events JOIN athena."f_virtual_fields" on athena.f_network_events.flow_id = athena."f_virtual_fields".flow_id AND athena.f_network_events.component_id = athena."f_virtual_fields".component_id AND athena.f_network_events.event_timestamp > to_timestamp_tz(1576795262) AND athena.f_network_events.event_timestamp < to_timestamp_tz(1576816862) order by athena.f_network_events.event_timestamp desc limit 100;

    Even order by asc is very slow.

    What should I do to make the order by fast on JOIN queries?

    Note: the order by is slow only in JOINs; on the single table it's very fast for both asc and desc.

    Can you help me make the order by run fast in the JOIN query?
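    One rewrite that sometimes helps with this pattern (a sketch only, not something already tried in this thread) is to apply the time filter, ORDER BY, and LIMIT to f_network_events in a derived table first, so the top 100 events can come from the event_timestamp-sorted projection, and only then join to the flex table. The semantics differ slightly: if any of those 100 events has no matching f_virtual_fields row, fewer than 100 rows come back.

    SELECT e.*, maptostring(v.raw) AS eventjson
    FROM (
        SELECT *
        FROM athena.f_network_events
        WHERE event_timestamp > to_timestamp_tz(1576795262)
          AND event_timestamp < to_timestamp_tz(1576816862)
        ORDER BY event_timestamp DESC
        LIMIT 100
    ) e
    JOIN athena.f_virtual_fields v
      ON e.flow_id = v.flow_id
     AND e.component_id = v.component_id
    ORDER BY e.event_timestamp DESC;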

  • rajatpaliwal86 (Vertica Customer)

    @Bryan_H said:
    Try specifying the projection to use for each table with the PROJS hint as shown at
    https://www.vertica.com/docs/9.1.x/HTML/index.htm#Authoring/SQLReferenceManual/LanguageElements/Hints/Projs.htm
    It might help to review the table and projection definitions to see if they match the query; I notice the query runs a SELECT *, which means that Vertica may need to use a super projection that contains all fields if the optimized projection does not contain all fields in the anchor table.
    For the second issue, we would probably need the explain plan for the query with the ORDER BY, as well as the table and projection definitions, to determine whether optimization is possible. It may be the same issue, where one or more fields are missing from the projection, so it is necessary to compute the join and ordering from the anchor tables.

    explain select athena.f_network_events.*, maptostring(athena.f_virtual_fields.raw) as eventjson from athena.f_network_events JOIN athena."f_virtual_fields" on athena.f_network_events.component_id = athena."f_virtual_fields".component_id and athena.f_network_events.flow_id = athena."f_virtual_fields".flow_id AND athena.f_network_events.event_timestamp > to_timestamp_tz(1576795262) AND athena.f_network_events.event_timestamp < to_timestamp_tz(1576816862) order by athena.f_network_events.event_timestamp asc limit 100;

    Access Path:
    +-SELECT LIMIT 100 [Cost: 660M, Rows: 100] (PATH ID: 0)
    | Output Only: 100 tuples
    | +---> SORT [TOPK] [Cost: 660M, Rows: 4M] (PATH ID: 1)
    | | Order: f_network_events.event_timestamp ASC
    | | Output Only: 100 tuples
    | | +---> JOIN MERGEJOIN(inputs presorted) [Cost: 40M, Rows: 4M] (PATH ID: 2)
    | | | Join Cond: (f_network_events.flow_id = f_virtual_fields.flow_id) AND (f_network_events.component_id = f_virtual_fields.component_id)
    | | | Materialize at Output: f_virtual_fields.raw
    | | | +-- Outer -> STORAGE ACCESS for f_virtual_fields [Cost: 2M, Rows: 380M] (PATH ID: 3)
    | | | | Projection: athena.f_virtual_fields_super
    | | | | Materialize: f_virtual_fields.component_id, f_virtual_fields.flow_id
    | | | | Runtime Filters: (SIP1(MergeJoin): f_virtual_fields.flow_id), (SIP2(MergeJoin): f_virtual_fields.component_id), (SIP3(MergeJoin): f_virtual_fields.flow_id, f_virtual_fields.component_id)
    | | | +-- Inner -> STORAGE ACCESS for f_network_events [Cost: 22M, Rows: 4M] (PATH ID: 4)
    | | | | Projection: athena.f_network_events_super
    | | | | Materialize: f_network_events.event_timestamp, f_network_events.flow_id, f_network_events.component_id, f_network_events.id, f_network_events.event_timestamp_weekid, f_network_events.event_timestamp_day, f_network_events.event_timestamp_hour, f_network_events.event_timestamp_minute, f_network_events.insertion_timestamp, f_network_events.sensor_id, f_network_events.port_initiator, f_network_events.port_responder, f_network_events.mac_initiator, f_network_events.mac_responder, f_network_events.ip_initiator, f_network_events.ip_responder, f_network_events.ip_initiator_location, f_network_events.ip_responder_location, f_network_events.packets, f_network_events.bytes, f_network_events.protocol, f_network_events.application, f_network_events.country_initiator, f_network_events.country_responder, f_network_events.device_name, f_network_events.device_type, f_network_events.domain_category, f_network_events.user_agent, f_network_events.url_reputation, f_network_events.md5_reputation, f_network_events.url_category, f_network_events.browser, f_network_events.server_name, f_network_events.url, f_network_events.response_code, f_network_events.request_method, f_network_events.component_packets_initiator, f_network_events.component_packets_responder, f_network_events.component_bytes_initiator, f_network_events.component_bytes_responder, f_network_events.tor_ip, f_network_events.tor_exit_node, f_network_events.org_name, f_network_events.dga, f_network_events.subdomain_dns, f_network_events.subdomain_length_dns, f_network_events.application_classification, f_network_events.username, f_network_events.method_ftp, f_network_events.method_content_ftp, f_network_events.header_raw_http
    | | | | Filter: ((f_network_events.event_timestamp > '2019-12-19 15:41:02-07'::timestamptz) AND (f_network_events.event_timestamp < '2019-12-19 21:41:02-07'::timestamptz))

    explain select athena.f_network_events.*, maptostring(athena.f_virtual_fields.raw) as eventjson from athena.f_network_events JOIN athena."f_virtual_fields" on athena.f_network_events.component_id = athena."f_virtual_fields".component_id and athena.f_network_events.flow_id = athena."f_virtual_fields".flow_id AND athena.f_network_events.event_timestamp > to_timestamp_tz(1576795262) AND athena.f_network_events.event_timestamp < to_timestamp_tz(1576816862) limit 100;

    Access Path:
    +-SELECT LIMIT 100 [Cost: 40M, Rows: 100] (PATH ID: 0)
    | Output Only: 100 tuples
    | +---> JOIN MERGEJOIN(inputs presorted) [Cost: 40M, Rows: 4M] (PATH ID: 1)
    | | Join Cond: (f_network_events.flow_id = f_virtual_fields.flow_id) AND (f_network_events.component_id = f_virtual_fields.component_id)
    | | Materialize at Output: f_virtual_fields.raw
    | | Output Only: 100 tuples
    | | +-- Outer -> STORAGE ACCESS for f_virtual_fields [Cost: 2M, Rows: 380M] (PATH ID: 2)
    | | | Projection: athena.f_virtual_fields_super
    | | | Materialize: f_virtual_fields.component_id, f_virtual_fields.flow_id
    | | | Output Only: 100 tuples
    | | | Runtime Filters: (SIP1(MergeJoin): f_virtual_fields.flow_id), (SIP2(MergeJoin): f_virtual_fields.component_id), (SIP3(MergeJoin): f_virtual_fields.flow_id, f_virtual_fields.component_id)
    | | +-- Inner -> STORAGE ACCESS for f_network_events [Cost: 22M, Rows: 4M] (PATH ID: 3)
    | | | Projection: athena.f_network_events_super
    | | | Materialize: f_network_events.event_timestamp, f_network_events.flow_id, f_network_events.component_id, f_network_events.id, f_network_events.event_timestamp_weekid, f_network_events.event_timestamp_day, f_network_events.event_timestamp_hour, f_network_events.event_timestamp_minute, f_network_events.insertion_timestamp, f_network_events.sensor_id, f_network_events.port_initiator, f_network_events.port_responder, f_network_events.mac_initiator, f_network_events.mac_responder, f_network_events.ip_initiator, f_network_events.ip_responder, f_network_events.ip_initiator_location, f_network_events.ip_responder_location, f_network_events.packets, f_network_events.bytes, f_network_events.protocol, f_network_events.application, f_network_events.country_initiator, f_network_events.country_responder, f_network_events.device_name, f_network_events.device_type, f_network_events.domain_category, f_network_events.user_agent, f_network_events.url_reputation, f_network_events.md5_reputation, f_network_events.url_category, f_network_events.browser, f_network_events.server_name, f_network_events.url, f_network_events.response_code, f_network_events.request_method, f_network_events.component_packets_initiator, f_network_events.component_packets_responder, f_network_events.component_bytes_initiator, f_network_events.component_bytes_responder, f_network_events.tor_ip, f_network_events.tor_exit_node, f_network_events.org_name, f_network_events.dga, f_network_events.subdomain_dns, f_network_events.subdomain_length_dns, f_network_events.application_classification, f_network_events.username, f_network_events.method_ftp, f_network_events.method_content_ftp, f_network_events.header_raw_http
    | | | Filter: ((f_network_events.event_timestamp > '2019-12-19 15:41:02-07'::timestamptz) AND (f_network_events.event_timestamp < '2019-12-19 21:41:02-07'::timestamptz))
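    Both plans read athena.f_network_events_super rather than the event_timestamp-sorted athena.f_network_events_super_ts, and the SORT [TOPK] step accounts for most of the cost of the ordered plan (660M vs. 40M for the join alone). One way to check whether the timestamp-sorted projection yields a cheaper plan is to name it explicitly with the PROJS hint suggested above (a sketch; this would most likely replace the merge join with a hash join, since f_network_events_super_ts is not sorted on the join keys first):

    SELECT /*+PROJS('athena.f_network_events_super_ts', 'athena.f_virtual_fields_super')*/
           athena.f_network_events.*, maptostring(athena.f_virtual_fields.raw) AS eventjson
    FROM athena.f_network_events
    JOIN athena.f_virtual_fields
      ON athena.f_network_events.flow_id = athena.f_virtual_fields.flow_id
     AND athena.f_network_events.component_id = athena.f_virtual_fields.component_id
    WHERE athena.f_network_events.event_timestamp > to_timestamp_tz(1576795262)
      AND athena.f_network_events.event_timestamp < to_timestamp_tz(1576816862)
    ORDER BY athena.f_network_events.event_timestamp ASC
    LIMIT 100;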

  • rajatpaliwal86 (Vertica Customer)

    @marcothesane said:
    Are flow_id and comp_id integers?
    And - I'd create super projections (any projection containing all columns of the anchor table is a super projection), all ordered by flow_id and then comp_id (or the other way round - lower cardinality first, higher cardinality second) - and drop the original (probably auto-created) projections.
    How is everything segmented? Either UNSEGMENTED ALL NODES for all projections of smaller tables, or SEGMENTED BY HASH(flow_id, comp_id) for all projections of bigger tables.
    Good luck ---

    Yes, both flow_id and comp_id are integers.
    comp_id has low cardinality and flow_id has high cardinality - so I also tried reversing the order by in the projections, but it didn't help.

    It is UNSEGMENTED ALL NODES.

    I want events ordered by event_timestamp desc. The problem is that the order by (asc or desc) works well when only the primary table is involved, but if I join with other tables, the joined query with order by event_timestamp is very slow. What should I do to make the order by fast with JOINs?

  • rajatpaliwal86 (Vertica Customer)

    @Maurizio said:
    Your description of the "correct projections" not being picked up by the optimiser is a little too vague to offer specific comments. A few general comments:
    1. Vertica uses a Cost-Based Optimizer. Its primary goal is to pick the cheapest plan, not the fastest one; no one knows in advance how long your query will take.
    2. The reason the optimiser doesn't use a given projection is that either it doesn't have enough columns to answer the query, or the estimated plan cost using that projection is higher.
    3. The ability to use multiple projections per table is a Vertica feature we should use... but not abuse! Especially if the optimiser doesn't use them: more projections mean more work during loads and more workload for the TM mergeout.

    Thanks for the reply!
    It looks like the query optimizer is picking the correct projections, but I still have the second issue, i.e. order by event_timestamp is very slow on a JOIN query. The order by is fast on the primary table, but it's extremely slow when I join the primary table with other tables - any suggestion to make the order by fast in a JOIN query?

  • mosheg (Vertica Employee, Administrator)

    Try Flattened Tables if you need to JOIN more than two tables.
    With Flattened Tables you can create wide tables that combine several fact and/or dimension table columns that your queries require. These tables can dramatically speed up query execution.
    See: https://www.vertica.com/docs/9.3.x/HTML/Content/Authoring/AnalyzingData/FlattenedTables/FlattenedTables.htm
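    A minimal sketch of what a flattened column could look like here, assuming the table and column names from the original question, an assumed varchar type, and that each (flow_id, comp_id) pair has at most one file_type (the SET USING subquery must return a single value per row, so genuinely multi-valued attributes would still need their own table):

    ALTER TABLE f_network_events ADD COLUMN file_type_flat varchar(64)
        SET USING (
            SELECT ft.file_type
            FROM file_type ft
            WHERE ft.flow_id = f_network_events.flow_id
              AND ft.comp_id = f_network_events.comp_id);

    -- repopulate the derived column after loading new data
    SELECT REFRESH_COLUMNS('f_network_events', 'file_type_flat', 'REBUILD');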

  • rajatpaliwal86 (Vertica Customer)

    @mosheg said:
    Try Flattened Tables, if you need to JOIN more than two tables.
    With Flattened Tables you can create wide tables that combine several fact and/or dimension table columns that your queries require. These tables can dramatically speed up query execution.
    See: https://www.vertica.com/docs/9.3.x/HTML/Content/Authoring/AnalyzingData/FlattenedTables/FlattenedTables.htm

    In my case, there will be multiple large fact tables (billions of rows in each) - will it work well in that case too?
