Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: nexmark q9 #7341

Closed
lmatz opened this issue Jan 12, 2023 · 4 comments
Closed

perf: nexmark q9 #7341

lmatz opened this issue Jan 12, 2023 · 4 comments

Comments

@lmatz
Copy link
Contributor

lmatz commented Jan 12, 2023

Query:

CREATE MATERIALIZED VIEW nexmark_q9
AS
SELECT
    id, item_name, description, initial_bid, reserve, date_time, expires, seller, category,
    auction, bidder, price, bid_date_time
FROM (
    SELECT A.*, B.auction, B.bidder, B.price, B.date_time AS bid_date_time,
    ROW_NUMBER() OVER (PARTITION BY A.id ORDER BY B.price DESC, B.date_time ASC) AS rownum
    FROM auction A, bid B
    WHERE A.id = B.auction AND B.date_time BETWEEN A.date_time AND A.expires
) tmp
WHERE rownum <= 1;

Plan:

 StreamMaterialize { columns: [id, item_name, description, initial_bid, reserve, date_time, expires, seller, category, auction, bidder, price, bid_date_time, _
row_id(hidden), _row_id#1(hidden)], pk_columns: [_row_id, _row_id#1, id, auction] }
 └─StreamProject { exprs: [Field(auction, 0:Int32), Field(auction, 1:Int32), Field(auction, 2:Int32), Field(auction, 3:Int32), Field(auction, 4:Int32), Field(a
uction, 5:Int32), Field(auction, 6:Int32), Field(auction, 7:Int32), Field(auction, 8:Int32), Field(bid, 0:Int32), Field(bid, 1:Int32), Field(bid, 2:Int32), Fie
ld(bid, 5:Int32), _row_id, _row_id] }
   └─StreamGroupTopN { order: "[Field(bid, 2:Int32) DESC, Field(bid, 5:Int32) ASC]", limit: 1, offset: 0, group_key: [0] }
     └─StreamProject { exprs: [Field(auction, 0:Int32), Field(auction, 1:Int32), Field(auction, 2:Int32), Field(auction, 3:Int32), Field(auction, 4:Int32), Fie
ld(auction, 5:Int32), Field(auction, 6:Int32), Field(auction, 7:Int32), Field(auction, 8:Int32), Field(auction, 9:Int32), Field(bid, 0:Int32), Field(bid, 1:Int
32), Field(bid, 2:Int32), Field(bid, 3:Int32), Field(bid, 4:Int32), Field(bid, 5:Int32), Field(bid, 6:Int32), _row_id, _row_id] }
       └─StreamFilter { predicate: (Field(bid, 5:Int32) >= Field(auction, 5:Int32)) AND (Field(bid,
(END) 5:Int32) <= Field(auction, 6:Int32)) }
         └─StreamAppendOnlyHashJoin { type: Inner, predicate: Field(auction, 0:Int32) = Field(bid, 0:Int32) }
           ├─StreamExchange { dist: HashShard(Field(auction, 0:Int32)) }
           | └─StreamProject { exprs: [Field(auction, 0:Int32), Field(auction, 1:Int32), Field(auction, 2:Int32), Field(auction, 3:Int32), Field(auction, 4:Int32), Field(auction, 5:Int32), Field(auction, 6:Int32), Field(auction, 7:Int32), Field(auction, 8:Int32), Field(auction, 9:Int32), _row_id] }
           |   └─StreamFilter { predicate: (event_type = 1:Int32) }
           |     └─StreamProject { exprs: [event_type, person, auction, bid, _row_id] }
           |       └─StreamShare { id = 227 }
           |         └─StreamFilter { predicate: ((event_type = 1:Int32) OR (event_type = 2:Int32)) }
           |           └─StreamRowIdGen { row_id_index: 4 }
           |             └─StreamSource { source: "nexmark", columns: ["event_type", "person", "auction", "bid", "_row_id"] }
           └─StreamExchange { dist: HashShard(Field(bid, 0:Int32)) }
             └─StreamProject { exprs: [Field(bid, 0:Int32), Field(bid, 1:Int32), Field(bid, 2:Int32), Field(bid, 3:Int32), Field(bid, 4:Int32), Field(bid, 5:Int32), Field(bid, 6:Int32), _row_id] }
               └─StreamFilter { predicate: (event_type = 2:Int32) }
                 └─StreamProject { exprs: [event_type, person, auction, bid, _row_id] }
                   └─StreamShare { id = 227 }
                     └─StreamFilter { predicate: ((event_type = 1:Int32) OR (event_type = 2:Int32)) }
                       └─StreamRowIdGen { row_id_index: 4 }
                         └─StreamSource { source: "nexmark", columns: ["event_type", "person", "auction", "bid", "_row_id"] }

Related:
#7376

@lmatz
Copy link
Contributor Author

lmatz commented Feb 10, 2023

after #7376,
plan:

 StreamMaterialize { columns: [id, item_name, description, initial_bid, reserve, date_time, expires, seller, category, auction, bidder, price, bid_date_time, _row_id(hidden), _row_id#1(hidden)], pk_columns: [_row_id, _row_id#1, id, auction] }
 └─StreamProject { exprs: [Field(auction, 0:Int32), Field(auction, 1:Int32), Field(auction, 2:Int32), Field(auction, 3:Int32), Field(auction, 4:Int32), Field(auction, 5:Int32), Field(auction, 6:Int32), Field(auction, 7:Int32), Field(auction, 8:Int32), Field(bid, 0:Int32), Field(bid, 1:Int32), Field(bid, 2:Int32), Field(bid, 5:Int32), _row_id, _row_id] }
   └─StreamAppendOnlyGroupTopN { order: "[Field(bid, 2:Int32) DESC, Field(bid, 5:Int32) ASC]", limit: 1, offset: 0, group_key: [0] }
     └─StreamProject { exprs: [Field(auction, 0:Int32), Field(auction, 1:Int32), Field(auction, 2:Int32), Field(auction, 3:Int32), Field(auction, 4:Int32), Field(auction, 5:Int32), Field(auction, 6:Int32), Field(auction, 7:Int32), Field(auction, 8:Int32), Field(auction, 9:Int32), Field(bid, 0:Int32), Field(bid, 1:Int32), Field(bid, 2:Int32), Field(bid, 3:Int32), Field(bid, 4:Int32), Field(bid, 5:Int32), Field(bid, 6:Int32), _row_id, _row_id] }
       └─StreamFilter { predicate: (Field(bid, 5:Int32) >= Field(auction, 5:Int32)) AND (Field(bid, 5:Int32) <= Field(auction, 6:Int32)) }
         └─StreamAppendOnlyHashJoin { type: Inner, predicate: Field(auction, 0:Int32) = Field(bid, 0:Int32) }
           ├─StreamExchange { dist: HashShard(Field(auction, 0:Int32)) }
           | └─StreamProject { exprs: [Field(auction, 0:Int32), Field(auction, 1:Int32), Field(auction, 2:Int32), Field(auction, 3:Int32), Field(auction, 4:Int32), Field(auction, 5:Int32), Field(auction, 6:Int32), Field(auction, 7:Int32), Field(auction, 8:Int32), Field(auction, 9:Int32), _row_id] }
           |   └─StreamFilter { predicate: (event_type = 1:Int32) }
           |     └─StreamProject { exprs: [event_type, person, auction, bid, _row_id] }
           |       └─StreamShare { id = 251 }
           |         └─StreamFilter { predicate: ((event_type = 1:Int32) OR (event_type = 2:Int32)) }
           |           └─StreamRowIdGen { row_id_index: 4 }
           |             └─StreamSource { source: "nexmark", columns: ["event_type", "person", "auction", "bid", "_row_id"] }
           └─StreamExchange { dist: HashShard(Field(bid, 0:Int32)) }
             └─StreamProject { exprs: [Field(bid, 0:Int32), Field(bid, 1:Int32), Field(bid, 2:Int32), Field(bid, 3:Int32), Field(bid, 4:Int32), Field(bid, 5:Int32), Field(bid, 6:Int32), _row_id] }
               └─StreamFilter { predicate: (event_type = 2:Int32) }
                 └─StreamProject { exprs: [event_type, person, auction, bid, _row_id] }
                   └─StreamShare { id = 251 }
                     └─StreamFilter { predicate: ((event_type = 1:Int32) OR (event_type = 2:Int32)) }
                       └─StreamRowIdGen { row_id_index: 4 }
                         └─StreamSource { source: "nexmark", columns: ["event_type", "person", "auction", "bid", "_row_id"] }
(22 rows)

Seems the only difference in the plan is Top N operator.

Performance Changes:
SCR-20230210-hpq

Append-only Top N was merged on Feb 4th.
It seems the throughput gets halved since then. @xxchan Any insights into this?
It could also be other factors, but other queries on the dashboard seem generally ok, I am not entirely sure.

@xxchan
Copy link
Member

xxchan commented Feb 10, 2023

What about q18? Is its performance improved? 👀 🤔

@lmatz
Copy link
Contributor Author

lmatz commented Feb 12, 2023

What about q18? Is its performance improved? 👀 🤔

It's better with append only top n if I manually run q18 and q9, about 30% improvement

SCR-20230212-tyv
Seems the dashboard pipeline is measuring noise... 😓 (could be no resource isolation)

@github-actions
Copy link
Contributor

This issue has been open for 60 days with no activity. Could you please update the status? Feel free to continue discussion or close as not planned.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants