Process
Latest revision as of 18:27, 4 October 2024
A program is a passive set of machine code instructions and data stored in an executable image. A process can be thought of as this passive program in action. More formally, it is one or more threads in their own address space.
Processes are independent, separate tasks. If one process crashes, it does not directly affect any other process. The end goal of processes is to create the illusion of having multiple CPUs when executing multiple programs. This is achieved through virtualization of the CPU.
Machine state
Because a process represents the execution of a program, it includes first and foremost the state of that program. This includes:
- The program counter
- CPU registers
- The process stack
- The address space
- etc.
Beyond the state of the program, each process is allocated its own process identifier (PID) and virtual memory among other things.
Virtualization
Virtualization of the CPU is what allows more processes than CPUs to exist. It is implemented by low-level time-sharing mechanisms. On top of these mechanisms reside scheduling policies, which decide which process should run next.
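As a rough illustration of time-sharing, the sketch below simulates one CPU being shared by several processes. It assumes a round-robin policy with a fixed time slice, which is just one simple policy chosen for illustration; the function name round_robin and the process names are hypothetical.

```python
from collections import deque


def round_robin(burst_times, time_slice):
    """Simulate one CPU shared by several processes.

    burst_times maps a (hypothetical) process name to the CPU time it needs.
    Returns the order in which processes ran, as (name, ticks_run) pairs.
    """
    ready = deque(burst_times.items())  # queue of (name, remaining_time)
    timeline = []
    while ready:
        name, remaining = ready.popleft()
        ran = min(time_slice, remaining)  # run for at most one time slice
        timeline.append((name, ran))
        if remaining > ran:
            # Not finished: go to the back of the ready queue.
            ready.append((name, remaining - ran))
    return timeline


if __name__ == "__main__":
    print(round_robin({"A": 3, "B": 5, "C": 2}, time_slice=2))
```

Each process makes progress in turn, so all three appear to run at once even though only one holds the CPU at any moment.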
API
Typical process APIs include the following:
- Create
- Destroy - Kill a process forcefully
- Wait - Wait for a process to stop running
- Status - Obtain current information about the process
Many other miscellaneous controls are possible, such as suspending a process and resuming it later.
Process creation
- Load code and static data into memory. In modern operating systems, this is done lazily.
- Allocate memory for the runtime stack and the heap.
- Perform initialization tasks, especially those related to I/O.
- Start the program at its entry point.
Process state
A process is in one of three states:
- Running - Its instructions are currently being executed by the CPU
- Ready - Ready to run, but the OS is not running it for some reason (such as time-sharing)
- Blocked - Waiting for some I/O to finish, such as a requested write to disk
A process is scheduled when the OS moves it from Ready to Running. It is descheduled when it is moved from Running to Ready. This decision is made by the scheduler.
The kernel also keeps a table holding the metadata of every process. This includes the PID, the parent, the page tables, etc.
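The states and the process table described above can be sketched as a small data structure. This is only an illustrative model, not how any real kernel lays out its table; the names State, ProcessEntry, schedule and deschedule are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class State(Enum):
    READY = "ready"
    RUNNING = "running"
    BLOCKED = "blocked"


@dataclass
class ProcessEntry:
    """One row of a toy process table (real kernels store far more)."""
    pid: int
    parent: Optional[int]
    state: State = State.READY


def schedule(entry):
    """Ready -> Running."""
    if entry.state is not State.READY:
        raise ValueError("only a Ready process can be scheduled")
    entry.state = State.RUNNING


def deschedule(entry):
    """Running -> Ready."""
    if entry.state is not State.RUNNING:
        raise ValueError("only a Running process can be descheduled")
    entry.state = State.READY


if __name__ == "__main__":
    # The table maps each PID to its metadata.
    table = {1: ProcessEntry(pid=1, parent=None),
             2: ProcessEntry(pid=2, parent=1)}
    schedule(table[2])
    print(table[2].state)
    deschedule(table[2])
    print(table[2].state)
```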
Processes in Linux
When the computer starts, the only thing the kernel executes in user space is an init command (shown with ps -f 1). This process is allocated a PID of 1 and is the only user-space process that is not created by another process.
In contrast to the init process, all other processes are created by parent processes. This is done with fork and exec. For example, when the shell runs a command, it first calls fork to create a clone of the current process, and then exec replaces the program in the cloned process with another program.
The fork syscall clones the calling process almost perfectly. The child does not start running at main; instead, it continues from the point where fork returns, just as the parent does. The main visible difference between the two processes is the return value of fork: in the parent it is the PID of the child, whereas in the child it is 0. (The child also receives its own PID.) The resulting execution order is notably non-deterministic, as the CPU scheduler is free to run either process before the other.
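The two return values of fork can be observed with a short sketch. It uses Python's os module, which wraps the underlying calls on Unix-like systems; the function name fork_return_values and the pipe used to report the child's value back are choices made for this illustration.

```python
import os


def fork_return_values():
    """Fork once and return (value seen by parent, value seen by child)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: fork returned 0. Report that value through the pipe.
        os.close(read_fd)
        os.write(write_fd, str(pid).encode())
        os._exit(0)  # leave immediately so the child runs nothing else
    # Parent: fork returned the PID of the child.
    os.close(write_fd)
    child_saw = int(os.read(read_fd, 16).decode())
    os.close(read_fd)
    os.waitpid(pid, 0)  # reap the child so it does not linger as a zombie
    return pid, child_saw


if __name__ == "__main__":
    parent_saw, child_saw = fork_return_values()
    print("parent saw a child PID greater than 0:", parent_saw > 0)
    print("child saw:", child_saw)
```

Both branches run the same code after the fork call; only the return value tells them apart.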
The wait syscall suspends execution of the parent until a child has terminated. It also lets the parent collect the child's exit status.
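A minimal sketch of a parent waiting for its child follows. It uses os.waitpid, the variant that waits for one specific child, rather than plain os.wait; the function name run_child_and_wait is hypothetical.

```python
import os


def run_child_and_wait(exit_code):
    """Fork a child that exits with exit_code, then wait for it.

    Returns (whether the reaped PID matches the child, child's exit status).
    """
    pid = os.fork()
    if pid == 0:
        # Child: terminate right away with the requested status.
        os._exit(exit_code)
    # Parent: blocks here until the child has terminated.
    reaped_pid, status = os.waitpid(pid, 0)
    return reaped_pid == pid, os.WEXITSTATUS(status)


if __name__ == "__main__":
    print(run_child_and_wait(3))
```

Because the parent blocks in waitpid, any code after it is guaranteed to run only once the child is done, which removes the ordering uncertainty of a bare fork.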
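To run a different program, the exec syscall is used. It does not create a new process; instead, it replaces the program running in the current process with the one specified in the arguments. On success, exec never returns to the caller.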
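Combining fork, exec and wait gives the usual pattern for launching a program. The sketch below assumes a Unix-like system and uses os.execvp, which searches PATH for the program; the function name spawn and the exit code 127 for a failed exec are conventions chosen for this illustration.

```python
import os
import sys


def spawn(argv):
    """Run argv as a child process and return its exit status."""
    pid = os.fork()
    if pid == 0:
        # Child: replace this process's program with the requested one.
        try:
            os.execvp(argv[0], argv)  # does not return on success
        except OSError:
            os._exit(127)  # exec failed, e.g. program not found
    # Parent: wait for the child to finish.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)


if __name__ == "__main__":
    # Run a second Python interpreter that exits with status 5.
    print(spawn([sys.executable, "-c", "import sys; sys.exit(5)"]))
```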
Motivation of fork
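The motivation behind this unintuitive set of APIs (fork, wait, exec) is that separating fork from exec leaves a window in which code can alter the environment of the child process before the new program starts. For example, a shell can redirect the child's output to a file in this window.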
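One common use of the gap between fork and exec is output redirection, as a shell does for "command > file". The following is a sketch under that assumption; the function name run_with_stdout_to_file is hypothetical, and error handling is kept to a minimum.

```python
import os
import sys


def run_with_stdout_to_file(argv, path):
    """Run argv with its standard output redirected to path."""
    pid = os.fork()
    if pid == 0:
        try:
            # Between fork and exec: rearrange the child's file descriptors.
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
            if fd != 1:
                os.dup2(fd, 1)  # descriptor 1 (stdout) now refers to the file
                os.close(fd)
            os.execvp(argv[0], argv)  # the new program inherits descriptor 1
        except OSError:
            os._exit(127)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)


if __name__ == "__main__":
    code = run_with_stdout_to_file(
        [sys.executable, "-c", "print('hello')"], "out.txt")
    print("exit status:", code)
```

The program being run needs no knowledge of the redirection: it simply writes to standard output, which was re-pointed before exec.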
Sources
- Linux documentation: https://tldp.org/LDP/tlk/tlk.html
- Linux processes, init, fork/exec, ps, kill, fg, bg, jobs: https://www.youtube.com/watch?v=TJzltwv7jJs
- Operating Systems: Three Easy Pieces