I have several questions regarding Python threads.
- Is a Python thread a Python or OS implementation?
- When I use htop, a multi-threaded script has multiple entries: the same memory consumption, the same command, but a different PID. Does this mean that a [Python] thread is actually a special kind of process? (I know there is a setting in htop to show these threads as one process: Hide userland threads.)
- Documentation says:
A thread can be flagged as a “daemon thread”. The significance of this
flag is that the entire Python program exits when only daemon threads
are left.
My interpretation/understanding was: main thread terminates when all non-daemon threads are terminated.
So are Python daemon threads not part of the Python program, if “the entire Python program exits when only daemon threads are left”?
Python threads are implemented using OS threads in all implementations I know (CPython, PyPy and Jython). For each Python thread, there is an underlying OS thread.
Some operating systems (Linux being one of them) show all different threads launched by the same executable in the list of all running processes. This is an implementation detail of the OS, not of Python. On some other operating systems, you may not see those threads when listing all the processes.
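One way to observe this mapping from inside Python is to compare the process ID with each thread's native ID. The sketch below assumes Python 3.8 or later, where `threading.get_native_id()` is available; the helper name `collect_thread_ids` is made up for illustration. On Linux, the native IDs should correspond to the per-thread entries that tools such as htop list.

```python
import os
import threading


def collect_thread_ids(count=3):
    """Start `count` threads and return one (pid, native_id) pair per thread."""
    results = []
    lock = threading.Lock()
    # The barrier keeps every thread alive until all have recorded their IDs,
    # so the OS cannot reuse a finished thread's ID for a later one.
    barrier = threading.Barrier(count)

    def record():
        ids = (os.getpid(), threading.get_native_id())
        with lock:
            results.append(ids)
        barrier.wait()

    threads = [threading.Thread(target=record) for _ in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


ids = collect_thread_ids()
for pid, native_id in ids:
    print(f"process id {pid}, native thread id {native_id}")
```

Every line shows the same process ID but a different native thread ID, which matches the picture of one process containing several OS threads.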
The process will terminate when the last non-daemon thread finishes. At that point, all the daemon threads will be terminated. So, those threads are part of your process, but they do not prevent it from terminating (whereas a regular thread does). That is implemented in pure Python. A process terminates when the system `_exit` function is called (it will kill all threads), and when the main thread terminates (or `sys.exit` is called), the Python interpreter checks whether another non-daemon thread is still running. If there is none, it calls `_exit`; otherwise it waits for the non-daemon threads to finish.
The daemon thread flag is implemented in pure Python by the `threading` module. When the module is loaded, a `Thread` object is created to represent the main thread, and its `_exitfunc` method is registered as an `atexit` hook.
The code of this function is:
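As a rough illustration of what such a hook has to do, here is a simplified sketch of the waiting logic. It is not the actual `threading` source, and the function name `wait_for_non_daemon_threads` is hypothetical; it relies only on the public `threading.enumerate()`, `Thread.daemon` and `Thread.join()` APIs.

```python
import threading
import time


def wait_for_non_daemon_threads():
    """Block until every other non-daemon thread has finished (illustrative only)."""
    skip = (threading.current_thread(), threading.main_thread())
    while True:
        # Re-scan on every pass: a running thread may have started new ones.
        pending = [
            t for t in threading.enumerate()
            if t not in skip and not t.daemon
        ]
        if not pending:
            return
        for t in pending:
            t.join()


# A short-lived regular thread and a daemon thread that outlives the wait.
regular = threading.Thread(target=time.sleep, args=(0.2,))
background = threading.Thread(target=time.sleep, args=(5,), daemon=True)
regular.start()
background.start()

wait_for_non_daemon_threads()
print("regular alive:", regular.is_alive())
print("daemon alive:", background.is_alive())
```

The wait returns as soon as the regular thread is done, even though the daemon thread is still running; the daemon thread is simply ignored.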
This function will be called by the Python interpreter when `sys.exit` is called, or when the main thread terminates. When the function returns, the interpreter will call the system `_exit` function. The function itself returns once only daemon threads (if any) are left running.
When the `_exit` function is called, the OS will terminate all of the process's threads, and then terminate the process. The Python runtime will not call the `_exit` function until all the non-daemon threads are done.
All threads are part of the process.
Your understanding is incorrect. For the OS, a process is composed of many threads, all of which are equal (there is nothing special about the main thread for the OS, except that the C runtime adds a call to `_exit` at the end of the `main` function). And the OS doesn’t know about daemon threads. This is purely a Python concept.
The Python interpreter uses native threads to implement Python threads, but has to remember the list of threads created. Using its `atexit` hook, it ensures that the `_exit` function is called only after the last non-daemon thread terminates. When using “the entire Python program”, the documentation refers to the whole process.
The following program can help understand the difference between a daemon thread and a regular thread:
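A minimal sketch of such a program, assuming Python 3 and a `--use_daemon` command-line flag; the names `work`, `start_worker` and `main` are illustrative choices, not the original listing:

```python
import sys
import threading
import time


def work():
    """Print a line twice a second, forever."""
    while True:
        print("Working hard")
        time.sleep(0.5)


def start_worker(use_daemon):
    """Start the worker as a daemon or as a regular thread and return it."""
    worker = threading.Thread(target=work, daemon=use_daemon)
    worker.start()
    return worker


def main(args):
    use_daemon = "--use_daemon" in args
    start_worker(use_daemon)
    # Let the worker print a few lines, then end the main thread.
    # 1.2 s falls between two prints, so the exit does not race with one.
    time.sleep(1.2)
    sys.exit(0)
```

To run it as a script, call `main(sys.argv[1:])` at the bottom of the file under an `if __name__ == "__main__":` guard.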
If you execute this program with the `--use_daemon` flag, you will see that the program prints only a small number of `Working hard` lines. Without this flag, the program will not terminate even when the main thread finishes, and it will keep printing `Working hard` lines until it is killed.