I am having some trouble understanding how to use Unix’s fork(). I am used to, when in need of parallelization, spawning threads in my application. It’s always something of the form
CreateNewThread(MyFunctionToRun);
void MyFunctionToRun() { ... }
Now, when learning about Unix’s fork(), I was given examples of the form:
fork();
printf("%d\n", 123);
in which the code after the fork is “split up”. I can’t understand how fork() can be useful. Why doesn’t fork() have a similar syntax to the above CreateNewThread(), where you pass it the address of a function you want to run?
To accomplish something similar to CreateNewThread(), I’d have to be creative and do something like
// pseudo code
id = fork();
if (id == 0) { // I'm the child
    FunctionToRun();
} else { // I'm the parent
    wait();
}
Maybe the problem is that I am so used to spawning threads the .NET way that I can’t think clearly about this. What am I missing here? What are the advantages of fork() over CreateNewThread()?
PS: I know fork() will spawn a new process, while CreateNewThread() will spawn a new thread.
Thanks
fork() says “copy the current process state into a new process and start it running from right here.” Because the code is then running in two processes, it in fact returns twice: once in the parent process (where it returns the child process’s process identifier) and once in the child (where it returns zero).
There are a lot of restrictions on what it is safe to call in the child process after fork() (see below). The expectation is that the fork() call was part one of spawning a new process running a new executable with its own state. Part two of this process is a call to execve() or one of its variants, which specifies the path to an executable to be loaded into the currently running process, the arguments to be provided to that process, and the environment variables to surround that process. (There is nothing to stop you from re-executing the currently running executable and providing a flag that will make it pick up where the parent left off, if that’s what you really want.)
The UNIX fork()-exec() dance is roughly the equivalent of the Windows CreateProcess(). A newer function is even more like it: posix_spawn().
As a practical example of using fork(), consider a shell, such as bash. A command shell uses fork() all the time. When you tell the shell to run an external program (such as /bin/echo "hello world"), it forks itself and then execs that program. A pipeline is a collection of forked processes with stdout and stdin rigged up appropriately by the parent in between fork() and exec().
If you want to create a new thread, you should use the POSIX threads library. You create a new POSIX thread (pthread) using pthread_create(), which, like your CreateNewThread() example, takes the function the new thread should run.
Before threads were available, fork() was the closest thing UNIX provided to multithreading. Now that threads are available, usage of fork() is almost entirely limited to spawning a new process to execute a different executable.
About the restrictions mentioned above: they exist because fork() predates multithreading, so only the thread that calls fork() continues to execute in the child process. Per POSIX, a child forked from a multithreaded process may execute only async-signal-safe operations until it calls one of the exec functions.
Because any library function you call could have spawned a thread on your behalf, the paranoid assumption is that you are always limited to executing async-signal-safe operations in the child process between calling fork() and exec().