We're Moving!

The Vertica Forum is moving to a new OpenText Analytics Database (Vertica) Community.

Join us there to post discussion topics, learn about

product releases, share tips, access the blog, and much more.

Create My New Community Account Now


about foreign key — Vertica Forum

about foreign key

I have impression that foreign key definition helps Vertica optimizer. In the scenario that I have a slowly changing dimension and fact table has the dimension key without version, create table product_dim ( product_key int, version int, date_from date, date_to date, price decimal(6,2), primary key (product_key, version) ) create table sales_fact ( date_key int, product_key int, sales_amount decimal(8,2) ) alter table sales_fact add constraint fk_product foreign key (product_key) references product_dim (product_key); If I define foreign key like above, I got error "ROLLBACK 4550: Referenced primary key constraint does not exist" since version is part of primary key but not foreign key. How can I address this or just forget about foreign key? How much benefit foreign key can help with the query performance? Thanks

Comments

  • Hi Mike, The performance advantage of defining the PK/FK relationship can be quite substantial or it can be quite small; it depends on your data and on your queries, and on whether Vertica gets things like the join order correct without the added information. It looks like your relationship is not a PK/FK relationship. The primary key is not "product_key", it is "(product_key, version)". The reason for the performance advantage is because, with a PK/FK relationship, Vertica is told that the primary key is both unique and the join key from that table. If both of these facts are true against the product_key field alone, then you should set that field as the primary key. If they are not both always true against just the product_key field, then Vertica can't make that assumption; you are asking for something that is fundamentally more complicated (a many-to-many join), so it may not perform as well. If there are relatively few distinct values of the product_key field, but the field is not necessarily unique, then you should instead analyze statistics on the two tables. PK/FK is a simple hard constraint, perfectly unique; statistics tell Vertica what to do if that's not quite the case but the data still has interesting relevant properties. Adam

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file