Can anyone explain time spent between queue_entry_timestamp and start_timestamp

VerDba · December 2015

I dont see any contention or lack of resources but sometimes query run in 6 Seconds sometimes 26 seconds.....

When I looked more I saw in query_request or query profile table, query start_timestamp is 23:47:11. But when I look at resourec_acquistions table for same transaction I see queue_entry_timestamp 23:47:32.

Where did this (23:47:32 - 23:47:11) = 21 Seconds spend or go? This additional 21 seconds got added to execution time....

colin_loghin · December 2015

The 21 seconds were spend before the resource manager added it to the queue. In the same resource_acquisitions table you would find

* queue_entry_timestamp,

* acquisition_timestamp,

* release_timestamp.

If there is no delay between the 2 timestamps then there was no resource contention.

If the start_timestamp is from query_requests table and there are 21 elapsed sec before query execution, i would suspect delays in query planning phase.

For that , please look into dc_query_executions for the given session_id / statement_id / request_id and check the various phases

select execution_step, completion_time - time, * from dc_query_executions where session_id = ... etc ;

If you see a large InitPlan or any non execution step , then this is where time was spent...

Let me know if this helps it.

VerticaDba · December 2015

Thanks for reply...it helps....I have both the scenario....Lets start with #1 ....it is taking longer to get in the queue....when I look at execution _engine_profile i see more time spend on: sometimes input queue wait, sometimes consumer stall, sometimes network wait, sometimes all of the 3. When it runs faster I dont see any of this. Question is what is causing this?

input queue wait (us)

consumer stall (us)

network wait (us)

colin_loghin · December 2015

Your response is not clear to me. I thought the queue_entry_timestamp 23:47:32 is of concern and the 21 sec spent before that. None of that is visible in the execution_engine_profiles because it occurs before execution.

Best option is to clarify with actual time stamps.

For the execution_engine_profiles , the difference in counters are likely caused by overall concurrent workloads if you happen to have any. There could be other causes, but it is hard to tell form the info provided.

VerticaDba · December 2015

Thanks ...You are right ...I have confused you.......Let me try to explain again. I have 2 issue.

1) Sometimes it takes longer to get into queue:

For this you recommended to look at dc_query_executions. (Once that scenario happen in a big way again i will provide more data.) But now sometimes I see for a tiny canary query one time compile plan takes 2.47 secs.

CompilePlan

0 00:00:02.472947

another time it takes 0.24 secs

CompilePlan

0 00:00:00.244708

why is it different as system does not seem to have any resource conflict.

2) sometime same query takes longer:

In execution_engine_profile sometimes these counters (initial_queue_wait, consumer_stall, network wait) takes longer for tiny canary query.

You said " ...overall concurrent workloads ..." might caused it. I have a separate resource pool with enough resources for this canary query. Why concurrent wokload will affect this tny query?

How can I get consistent timing for same query (I have put it in a separate resourcepool ) irrespective of load?

Thanks again for helping.......

VerticaDba · December 2015

I have #3 scenario:

From Cron it runs top of every minute like 18:01:01, 18:02:01, 18:03:02. Most of the time in query requests i see this ime. But during slow time i see it takes 12 seconds to go to qury request. Like in query request i see it is like 18:04:12 instead of 18:04:01. What happened to this 12 seconds before it get into query request....

colin_loghin · December 2015

on 1), i don't find valuable fighting over .2 sec -> 2.4 sec. This is just too small of a time delta and the cluster is tuned for OLAP work ( minutes) not OLTP ( ms). There could be multiple reasons why the difference occurs ( hw for example) , but i am suggesting it is not a worthwhile effort.

On 2) Regardless of resource pools, you are still sharing CPU / IO / network. Those are not carved out using precise science and they can influence entire environment. Again HW performance may have to do with it. Use sar tools to monitor cluster performance. Ideally, for benchmarking run your queries in isolation to avoid side effects from concurrent workloads.

3) Are you saying that query starttime is 12 sec delayed during heavy workloads ? I would check in the logs whether the time is spend establishing DB connection or elsewhere . It is not a common scenario I have seen unless the HW is underprovisioned.

Can anyone explain time spent between queue_entry_timestamp and start_timestamp

Comments

Leave a Comment