We're Moving!

The Vertica Forum is moving to a new OpenText Analytics Database (Vertica) Community.

Join us there to post discussion topics, learn about

product releases, share tips, access the blog, and much more.

Create My New Community Account Now


deleting duplicated rows — Vertica Forum

deleting duplicated rows

Hi All, Is there an efficient way to delete duplicates (whole row duplicated) in a big vertica table . All I can think of is a group by on all the columns, store the unique rows in a temp table, delete the duplicated chunk, repopulate from the temp table. Thanks, RAvi.

Comments

  • Hi Ravi, My approach would be similar, except, recall that Vertica really doesn't like DELETEs. So I would just do: CREATE TABLE temp AS SELECT DISTINCT * FROM orig; DROP TABLE orig; ALTER TABLE temp RENAME TO orig; (Of course, before dropping orig, you should verify that temp contains what you want. You could also "ALTER TABLE orig RENAME TO old_orig;" or something, to get it out of the way but keep it around.) Adam
  • If your data is partitioned usefully, you can also do this by partition, using DROP PARTITION and MOVE_PARTITIONS_TO_TABLE().

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file