How to speed up order by?

HyeontaeJuHyeontaeJu Vertica Customer

I use muliple join, and i wanna order by specific column.
but, if i use order by, the speed is very low.. How to solve this problem?

Answers

  • Nimmi_guptaNimmi_gupta - Select Field - Employee

    Can you share your query explain plan with and without order by? With respect of timing how slow if you use order by?
    ex:
    explain

  • HyeontaeJuHyeontaeJu Vertica Customer
    edited September 2020

    Sorry, I forgot upload query explain! haha
    This is query explain with order by and without,,

    without order by

    QUERY PLAN DESCRIPTION:

    explain SELECT a.image_link , a.product_id , b.title , b.mall_name , b.brand , b.category_name1 , b.category_name2 , b.category_name3 , b.category_name4 FROM newbiz_product_dev.products_image a LEFT OUTER JOIN newbiz_product_dev.products b ON a.product_id = b.product_id WHERE a.image_status = 'SUCCESS' ORDER BY a.update_time desc LIMIT 50 OFFSET 0

    Access Path:
    +-SELECT LIMIT 50 [Cost: 11M, Rows: 50] (PATH ID: 0)
    | Output Only: 50 tuples
    | +---> SORT [TOPK] [Cost: 11M, Rows: 4M] (PATH ID: 1)
    | | Order: a.update_time DESC
    | | Output Only: 50 tuples
    | | +---> JOIN MERGEJOIN(inputs presorted) [LeftOuter] [Cost: 5M, Rows: 4M] (PATH ID: 2)
    | | | Join Cond: (a.product_id = b.product_id)
    | | | Materialize at Output: a.image_link, a.update_time
    | | | Runtime Filter: (SIP1(TopK): a.update_time)
    | | | +-- Outer -> STORAGE ACCESS for a [Cost: 57K, Rows: 4M] (PATH ID: 3)
    | | | | Projection: newbiz_product_dev.products_image_super
    | | | | Materialize: a.product_id
    | | | | Filter: (a.image_status = 'SUCCESS')
    | | | +-- Inner -> STORAGE ACCESS for b [Cost: 4M, Rows: 44M] (PATH ID: 4)
    | | | | Projection: newbiz_product_dev.products_super
    | | | | Materialize: b.product_id, b.mall_name, b.title, b.category_name1, b.category_name2, b.category_name3, b.category_name4, b.brand

    with order by

    QUERY PLAN DESCRIPTION:

    explain SELECT a.image_link , a.product_id , b.title , b.mall_name , b.brand , b.category_name1 , b.category_name2 , b.category_name3 , b.category_name4 FROM newbiz_product_dev.products_image a LEFT OUTER JOIN newbiz_product_dev.products b ON a.product_id = b.product_id WHERE a.image_status = 'SUCCESS' LIMIT 50 OFFSET 0

    Access Path:
    +-SELECT LIMIT 50 [Cost: 5M, Rows: 50] (PATH ID: 0)
    | Output Only: 50 tuples
    | +---> JOIN MERGEJOIN(inputs presorted) [LeftOuter] [Cost: 5M, Rows: 4M] (PATH ID: 1)
    | | Join Cond: (a.product_id = b.product_id)
    | | Materialize at Output: a.image_link
    | | Output Only: 50 tuples
    | | +-- Outer -> STORAGE ACCESS for a [Cost: 57K, Rows: 4M] (PATH ID: 2)
    | | | Projection: newbiz_product_dev.products_image_super
    | | | Materialize: a.product_id
    | | | Filter: (a.image_status = 'SUCCESS')
    | | +-- Inner -> STORAGE ACCESS for b [Cost: 4M, Rows: 44M] (PATH ID: 3)
    | | | Projection: newbiz_product_dev.products_super
    | | | Materialize: b.product_id, b.mall_name, b.title, b.category_name1, b.category_name2, b.category_name3, b.category_name4, b.brand

  • HyeontaeJuHyeontaeJu Vertica Customer

    and this is graph
    with order by

    without order by

  • Nimmi_guptaNimmi_gupta - Select Field - Employee

    Thanks for providing all the detail information. Explain plan for with and without order by query looks good and are same except order by has to take extra steps[ Runtime Filter: (SIP1(TopK): a.update_time)] for the sorting and that is expected. Can you confirm about the timing how long it's taking? Turn on the timing and run the query with and without order by and capture the time (if possible).

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file