Questions about projections given by DBD

We used DBD for our projection design, and I have some questions about the design that DBD gave us:

1. There are a lot of IPv6 address lookups in many of our queries (in WHERE and GROUP BY clauses), so I would have expected the IP address column to appear in the projection's ORDER BY, but it doesn't. The data type of the IP address column is varbinary. Why didn't DBD put IP addresses in ORDER BY?

2. DBD put a lot of measure columns (NumConnections, NumPackets, etc.) in the ORDER BY clause of the projection. My guess is that it's trying to compress the data more, but I don't think that helps query performance. Does anyone know why it put so many measures in ORDER BY, and what the impact on query performance is?

Comments

  • Hi, I can't speak to the address lookups in particular; I'd have to know more about your particular queries. Regarding the ORDER BY clause -- the DBD doesn't just optimize for query performance. It also optimizes for load performance and for recovery (i.e., restoring a crashed node) performance. Recovery in particular gains a huge performance benefit if each row can be uniquely (or nearly uniquely) identified by just the columns in the ORDER BY clause. Adding further columns to the sort order beyond that point should have negligible cost (if a row is uniquely described by the first three columns, then sorting by any subsequent column is a no-op), so the DBD tends to be conservative. Adam
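
To make the "no-op" point concrete, here is a minimal Python sketch, not Vertica code. The rows, column layout, and the `sort_by` helper are made up for illustration, and the IP addresses are plain strings rather than varbinary. It only shows that once a prefix of the sort key identifies every row uniquely, appending more columns to the key cannot change the resulting row order; it says nothing about how Vertica itself performs the sort.

```python
# Hypothetical rows: (src_ip, dst_ip, start_time, num_connections, num_packets)
rows = [
    ("10.0.0.2", "10.0.0.9", 1003, 7, 120),
    ("10.0.0.1", "10.0.0.9", 1001, 3, 40),
    ("10.0.0.1", "10.0.0.8", 1002, 5, 75),
    ("10.0.0.2", "10.0.0.8", 1001, 1, 10),
]


def sort_by(rows, key_columns):
    """Sort rows by the given column positions, in order."""
    return sorted(rows, key=lambda r: tuple(r[i] for i in key_columns))


# Sort by the first three columns only, then by all five columns.
prefix_order = sort_by(rows, [0, 1, 2])
full_order = sort_by(rows, [0, 1, 2, 3, 4])

# The first three columns are unique per row in this sample data.
prefix_is_unique = len({r[:3] for r in rows}) == len(rows)

print(prefix_is_unique, prefix_order == full_order)
```

With this sample data the two orderings are identical, because the measure columns at the end of the key are never needed to break a tie.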
