Using the python built-in profiler with a script runninng in one processor (and no multithreading)
time python -m cProfile myscript.py
the CPU time reported by the profiler is 345.710 CPU seconds
24184348 function calls (24183732 primitive calls) in 345.710 CPU seconds
and the real, user and sys time are:
real 5m45.926s
user 1m59.340s
sys 0m39.452s
As you can see the CPU time is nearly the real time (345.710 = 5m45.710s).
So, given that result, is it possible to assume that CPU time reported by the profiler includes the time slices used by other processes and time the process spends blocked? i.e. that the profiler CPU time is not the process time (user+sys) but the wall clock time as it is explained in
What do 'real', 'user' and 'sys' mean in the output of time(1)?
Thank very much in advance
This answers beautifully details on the meaning of real, user and sys timings. To quote:
‘Real’ is wall clock time – time from start to finish of the call. This
is all elapsed time including time slices used by other processes and
time the process spends blocked (for example if it is waiting for I/O
to complete).
‘User’ is the amount of CPU time spent in user-mode code (outside the
kernel) within the process. This is only actual CPU time used in
executing the process. Other processes and time the process spends
blocked do not count towards this figure.
‘Sys’ is the amount of CPU time spent in the kernel within the process.
This means executing CPU time spent in system calls within the
kernel, as opposed to library code, which is still running in
user-space. Like ‘user’, this is only CPU time used by the process.
From the above explanation it appears like User+Sys time should be equal to CPU seconds. Instead, its closer to ‘real’ time. Strange!
There is a fair explanation for that. ‘User’ time does not include CPU seconds spent on I/O operations within the process. It just measures CPU time spent on user-mode code within the memory. The rule of thumb is:
To validate this I made a
urllib2.urlopen(urllib2.Request(url))call doing intensive I/O. Here are the results:Here, 2.076-(0.352+0.093), i.e, 1.631 CPU secs were consumed in I/O operations (mainly
_socket.socketand_socket.getaddrinfo). Rest of the time, 2.255-2.076, were spent in cold start of the code.Hope that was useful.
Update: In multiple core systems, where multiple CPUs work in parallel, the case is slightly different. Total CPU seconds reported by cProfile is the sum total of time spent by all CPUs individually. For eg: In a 2 core system, if one CPU works for 10 secs. In parallel, another CPU works for 15 secs. Total CPU seconds reported will be 25 secs. Although the real time elapsed may be just 15 secs. Hence CPU time may be more than Real Time in multi core systems. As the CPUs are working in parallel