Need data collector column descriptions / some minimal documentation

The data_collector tables are considered undocumented, but to use them (even without support) it would help to understand at least what they capture. For example, I am curious what metric is used in the dc_cpu_aggregate_by_hour table for the following columns:

    user_microseconds_start_value | 212190000
    user_microseconds_end_value   | 354350000
    user_microseconds_peak_delta  | 19630000
    user_microseconds_peak_start  | 2013-09-06 11:41:00.002371-07
    user_microseconds_peak_end    | 2013-09-06 11:42:00.001458-07

The start and end values seem to keep increasing, so what are they measuring? Thanks.

Comments

  • Hey Colin, Well, unfortunately, if a DC table is not yet documented, that generally means we're not ready for you to use it directly yet :-) Typical reasons are that the table may change in future releases, or that its values may only be relevant to people within Vertica Support. I think a more common reason is that these tables are often very raw and quite low-level. dc_cpu_aggregate_by_hour is an example: you will find it looks suspiciously like periodic snapshots of certain bits of the data presented in the file /proc/stat on each node, documentation for which ships with the kernel source. So they're not very accessible; in some cases it would take a fair bit of documenting to explain the relevant underlying concepts in enough depth to let users take full advantage of the table. Some DC tables are documented, and that set is growing with time. Many non-DC system tables are actually views over DC tables (not necessarily literal views created with CREATE VIEW, though that approach works too); they reorganize the data and present it in a more useful and accessible way. If you'd like a particular piece of this functionality to be explicitly cleaned up and made available, it would be very helpful if you could add details of what you'd like to know to the "Ideas" section of this site. Adam
  • Incidentally, that's a general note: a number of our DC tables look very much like the data that you might get by sampling numbers from some appropriate underlying system. We don't explicitly document which (because it could change, etc.), but it's often not hard to guess or to look up. If you (or anyone else here) are so inclined, you're welcome to look up how the underlying data is represented, then build (and share) some views that display the data in a more helpful way. Totally unsupported, of course. If it breaks, you get to keep the pieces. If you're interested: as you've noticed, a lot of the data in the DC tables is sampled and/or accumulated. The LAG() analytic function is exceedingly handy for computing deltas between samples.
  • Thanks Adam. I realized the numbers in the dc_*_aggregate tables were surfaced from the OS, but I was unsure what extra manipulation was being added on top. The moment you mentioned /proc/stat it was clear. I don't think anyone expects documentation on OS features; this is public info. I think I am all set for now. I was trying to look at these numbers from a sar output perspective, which is slightly different (% utilization rather than time). The appeal of these tables is that you can use SQL to extract what you need (resource usage during hotspots, for example) rather than go through a data ingestion / management phase for the same data. That is exactly the work the data collector engineers have already done, so why duplicate it? Regards, Colin
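
The LAG() suggestion and the sar comparison above can be sketched roughly as follows. This is an unsupported illustration, not how Vertica computes anything: it runs the query in SQLite through Python's sqlite3 module purely so that it is runnable anywhere (window functions need SQLite 3.25 or newer). The cpu_samples table and its columns are hypothetical, and the sample values are made up, apart from the 19630000 delta, which mirrors the peak delta in the question. The n_cpus divisor assumes the counter sums time across all cores, as the aggregate cpu line in /proc/stat does.

```python
import sqlite3

# Hypothetical samples of a cumulative user-CPU counter (microseconds),
# one row per minute. Values are made up for illustration.
SAMPLES = [
    ("2013-09-06 11:40:00", 100_000_000),
    ("2013-09-06 11:41:00", 112_000_000),
    ("2013-09-06 11:42:00", 131_630_000),
    ("2013-09-06 11:43:00", 140_630_000),
]

# LAG() returns the previous row's value in sample_time order, so the
# subtraction yields the CPU time consumed between consecutive samples.
QUERY = """
SELECT sample_time,
       user_microseconds
         - LAG(user_microseconds) OVER (ORDER BY sample_time) AS delta_us
FROM cpu_samples
ORDER BY sample_time
"""


def cpu_deltas(samples):
    """Load samples into an in-memory table and return (time, delta) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cpu_samples (sample_time TEXT, user_microseconds INTEGER)"
    )
    conn.executemany("INSERT INTO cpu_samples VALUES (?, ?)", samples)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


def utilization_pct(delta_us, interval_seconds, n_cpus=1):
    """Convert a CPU-time delta into a sar-style utilization percentage."""
    return 100.0 * delta_us / (interval_seconds * 1_000_000 * n_cpus)


if __name__ == "__main__":
    for sample_time, delta_us in cpu_deltas(SAMPLES):
        if delta_us is None:
            # The first sample has no predecessor, so LAG() yields NULL.
            print(sample_time, "no previous sample")
        else:
            print(sample_time, delta_us, utilization_pct(delta_us, 60))
```

The first row has no predecessor, so its delta is NULL (None in Python); callers have to skip or handle it. The same LAG() expression should carry over to a Vertica query, but the table and column names would need to match whatever the DC table actually exposes.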
