.hpf hyphen.local
.P1
.de PT
.tl 'SYS101 - SYSTEM CALL INTERFACE 1'\*[CH]'PD-1C301-01'
.tl 'File: sys1.c''Section 16'
.tl '''Issue 1, January 1976'
..
.2C
.ne 10
.
.LP
.LG
.B break
.SM
.sp 1n
.
.LP
.I CALL
.
.LP
break()
.sp 1n
.
.LP
.I RETURNS
.
.LP
No value is returned.
.ne 4
.sp 1n
.
.LP
.I SYNOPSIS
.
.LP
Sets the program break. Break system call.
.ne 4
.sp 1n
.
.LP
.I DESCRIPTION
.
.LP
The break system call sets the program break which is the highest address
in a program. It is used for dynamic storage allocation (alloc and free
system calls).
.
.LP
The program break may be set to increase or decrease the existing size of a
program. The single argument that is passed (indirectly in "u_arg[0]") is
the address of the new break, not an increment to be added or subtracted
from the current break. Sys1.c/sbreak uses this absolute value to calculate
the difference between the old size and new size. New space is added to or
subtracted from the data area ("u_dsize"). No area can be added to the
reentrant text ("u_tsize") or the stack ("u_ssize") area. To insure that the
program break can be set to the new value, main.c/estabur is called to test
fit the new virtual address space and load the User Memory Management
registers.
.
.LP
If the new program break decreases the existing size, then the user's stack
is moved down in the program's physical area, so that it is in the proper
position for the virtual address map. The slp.c/expand function is then
used to release the extra area. For break calls that increase the size of
the program, a larger physical space must be acquired before any internal
adjustments are made. Slp.c/expand is used to increase the physical space
occupied by a process. It may roadblock the break call if there is not
enough memory available (see slp.c/expand). When the physical area is
finally enlarged, sys1.c/sbreak moves the stack downward to the proper
position for the virtual address map and the new data area is initialized
to zero.
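.
.LP
The size calculation described above can be sketched roughly as follows.
This is only an illustration, not the sys1.c/sbreak source: the function
names are hypothetical, the 64 byte memory block is the figure quoted
elsewhere in this section, and the data area is assumed to begin at
virtual address zero.

```c
#include <assert.h>

enum { BLOCK_BYTES = 64 };   /* memory block size quoted in this section */

/* Round a byte count up to whole memory blocks. */
unsigned bytes_to_blocks(unsigned nbytes)
{
    return (nbytes + BLOCK_BYTES - 1) / BLOCK_BYTES;
}

/*
 * Hypothetical helper: given the absolute new break address and the
 * current data size in blocks, return the signed change in blocks.
 * A positive result means the data area grows, a negative one that
 * it shrinks. Assumes the data area starts at virtual address 0.
 */
int break_delta_blocks(unsigned new_break, unsigned cur_dsize_blocks)
{
    assert(cur_dsize_blocks <= 1024);   /* 64K bytes of 64 byte blocks */
    return (int)bytes_to_blocks(new_break) - (int)cur_dsize_blocks;
}
```

.LP
The sign of the result selects between the shrinking and growing paths
described in the text.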
.sp 1m
.ne 10
.
.LP
.LG
.B exec
.SM
.sp 1n
.
.LP
.I CALL
.
.LP
exec()
.sp 1n
.
.LP
.I RETURNS
.
.LP
No value is returned.
.ne 4
.sp 1n
.
.LP
.I SYNOPSIS
.
.LP
Overlay system call.
.ne 4
.sp 1n
.
.LP
.I DESCRIPTION
.
.LP
This function performs the exec (overlay) system call. Overlaying is a
multi-step procedure in the system fraught with difficulties. The chief
hazard is that since the exec system call requires two buffers, one to hold
system arguments and one to read the program in, a deadly embrace is
possible. The risk of this is currently reduced by restricting the number
of exec's that can be in progress at one time.
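.
.LP
A minimal sketch of such a limit is given here. It is not taken from
sys1.c: the structure and function names are hypothetical, and the limit
of 4 is the NEXEC value quoted later in this section. Sleeping and waking
are reduced to counters so that the admission rule can be seen in
isolation.

```c
#include <assert.h>

enum { EXEC_LIMIT = 4 };   /* assumed equal to the NEXEC value in the text */

struct exec_gate {
    int active;    /* processes currently doing an exec (the "execnt" role) */
    int waiting;   /* processes roadblocked at the gate */
};

/* Try to enter exec. Returns 1 if admitted, 0 if the caller must wait. */
int exec_gate_enter(struct exec_gate *g)
{
    if (g->active >= EXEC_LIMIT) {
        g->waiting++;
        return 0;
    }
    g->active++;
    return 1;
}

/* Leave exec. Returns the number of waiters that should be awakened. */
int exec_gate_leave(struct exec_gate *g)
{
    int to_wake = g->waiting;

    assert(g->active > 0);
    g->active--;
    g->waiting = 0;   /* every waiter is awakened and must retry */
    return to_wake;
}
```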
.
.LP
The exec system call passes the name of the program to be executed along
with any arguments to that program. It is the responsibility of sys1.c/exec
to determine whether the program is reentrant or not and to pass these
arguments to the program. In line with the hierarchy of processes under
UNIX, overlaying programs inherit certain attributes of the antecedent
process. The chief attributes inherited are:
.IP 1. 4
The working directory is inherited.
.IP 2. 4
All open files of the preceding process are inherited.
.IP 3. 4
The age and process id are inherited. In short, all Process Table
information is inherited.
.IP 4. 4
All per process information (U block), with the exception of signals and
context information, is inherited. Ignored signals are inherited but user
handled signals are not.
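.
.LP
The signal rule in item 4 might be pictured as in the sketch below. The
encoding of signal actions (0 for the default action, 1 for ignore, any
other value standing for a user handler) is an assumption made for this
illustration, and the function name is hypothetical.

```c
#include <assert.h>

enum { ACT_DEFAULT = 0, ACT_IGNORE = 1 };   /* other values: user handler */

/*
 * Reset user handled signals to the default action and leave ignored
 * signals alone. Returns the number of entries that were reset.
 */
int reset_caught_signals(int action[], int n)
{
    int i, changed = 0;

    assert(n >= 0);
    for (i = 0; i < n; i++) {
        if (action[i] != ACT_DEFAULT && action[i] != ACT_IGNORE) {
            action[i] = ACT_DEFAULT;
            changed++;
        }
    }
    return changed;
}
```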
.
.LP
The steps in doing an overlay are:
.IP 1. 4
The system determines what program is to be executed. The first argument to
the exec system call is the name of the overlaying program. The i-node for
the file containing the program is brought into the Inode Table
(nami.c/nami). If no such named program exists, the exec is terminated and
the error posted by nami.c/nami is returned to the user.
.IP 2. 4
In order to forestall the occurrence of a deadly embrace, the count of the
number of processes currently doing an exec ("execnt") is checked to see if
it has reached the limit NEXEC (4). If it has, the process calling for the
overlay is roadblocked until the count drops below NEXEC. (The count is
decremented as each process finishes the exec and a wakeup is issued if
any process is roadblocked awaiting use of the exec function.) This
procedure does not absolutely prevent the occurrence of a deadly embrace,
but does decrease its probability significantly.
.IP 3. 4
Once entrance to the exec function is permitted, a system buffer is
allocated (bio.c/getblk) to hold the remaining arguments from the exec
call. The buffer is allocated to the "bfreelist" queue ("b_forw", "b_back"
list), so that no device association occurs. The arguments are then fetched
from the user's address space (using mch.s/fubyte). (The exec system call
passes an array of arguments. The array contains pointers to the actual
arguments. The arguments themselves must be ASCII strings terminated by
nulls (octal 0), or must appear to be strings terminated by nulls.) The
pointers to the actual arguments are fetched and eventually discarded. Only
the arguments themselves, separated by null characters, are placed in the
argument buffer. Only 511 characters of argument strings and their (null)
separators are allowed. If this limit is exceeded, the overlay is
terminated and the system error E2BIG is posted in "u_error". The arguments
are taken from the program in the same order that they appear in the system
call.
.IP 4. 4
The first 8 bytes of the program to be executed are read into the argument
array ("u_arg[]") in the U block. This is done in a manner similar to that
in which a core image of a process is produced (sig.c/core). Before calling
rdwri.c/readi, the variables "u_count", "u_offset[]", "u_segflg" and
"u_base" are set, respectively, to the number of bytes for the transfer,
the offset into the file (0), whether the transfer is into the system (1)
and the virtual destination address in the system ("u_arg[0]"). If any
system errors occur in
doing this read, the exec is terminated. The header (first 16 bytes) of
every object file contains the following information:
.RS
.IP a. 4
Magic number. This indicates whether the program is reentrant (410) or not
(407) and in the future will indicate whether the program is to be
separated into I and D space (411).
.IP b. 4
Text size in bytes.
.IP c. 4
Data size in bytes.
.IP d. 4
Bss size in bytes.
.IP e. 4
The remaining 4 words contain information about the symbol table and
relocation bits (see UNIX Programmers Manual, A.OUT(V)).
.RE
.IP
These arguments are checked and regrouped. Any file not beginning with one
of the magic numbers is assumed to be unexecutable and the exec is
terminated. For nonreentrant programs, the text and data area sizes are
combined in what is called the data area. This insures that they will not be
separated when the prototype Memory Management registers are set up
(main.c/estabur).
.IP 5. 4
The prototype Memory Management registers are set up after rounding the
text and data sizes up to the nearest memory block (64 bytes). The
initial stack size (SSIZE) is set up for 20 memory blocks (1280 bytes). If
the program is too big or there is physically not enough user memory on the
machine, the exec call will be terminated and the system error ENOMEM will
be posted by main.c/estabur. A check is also made to insure that no program
is updating the file which is to be overlaid. Rather than wait for the
update to occur, the exec is terminated.
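.
.LP
The header check and regrouping of steps 4 and 5 can be illustrated by
the sketch below. It is not the sys1.c/exec source. The names are
hypothetical, the magic numbers 407 and 410 are taken to be octal, and the
bss size is assumed to be added to the data size before rounding, which
matches the later statement that the data size consists of data and bss.

```c
#include <assert.h>

enum { MAGIC_PLAIN = 0407, MAGIC_REENTRANT = 0410 };   /* assumed octal */
enum { HDR_BLOCK = 64 };                               /* bytes per memory block */

struct exec_sizes {
    unsigned text_blocks;
    unsigned data_blocks;
};

/* Round a byte count up to whole memory blocks. */
unsigned round_to_blocks(unsigned nbytes)
{
    return (nbytes + HDR_BLOCK - 1) / HDR_BLOCK;
}

/*
 * hdr[0..3] hold the magic number and the text, data and bss sizes in
 * bytes. Returns 0 on success, or -1 if the file is not executable.
 */
int regroup_header(const unsigned hdr[4], struct exec_sizes *out)
{
    unsigned text = hdr[1];
    unsigned data = hdr[2] + hdr[3];

    assert(out != 0);
    if (hdr[0] == MAGIC_PLAIN) {
        data += text;           /* nonreentrant: text joins the data area */
        text = 0;
    } else if (hdr[0] != MAGIC_REENTRANT) {
        return -1;              /* unknown magic number */
    }
    out->text_blocks = round_to_blocks(text);
    out->data_blocks = round_to_blocks(data);
    return 0;
}
```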
.
.LP
At this point, the program has satisfied all of the criteria for being
allowed into the system. The procedure for bringing the program into memory
is as follows:
.IP 1. 4
Any text associated with the old process is freed (text.c/xfree).
.IP 2. 4
The remainder of the old process is truncated to the size of the U block
(slp.c/expand).
.IP 3. 4
Any reentrant portion of the program is read in and a copy placed on the
swap device if there is not already a copy in the system (text.c/xalloc).
This may result in the exec call being delayed while necessary I/O occurs.
.IP 4. 4
The U block (truncated from the original process) is expanded to the size
of the data and stack. (The data size consists of data and bss for
reentrant programs and includes the size of the text for nonreentrant
programs.) The allocated memory is cleared. Because of the way memory
expansions are done, a swap may occur at this point to grow the process
size.
.IP 5. 4
The user's Memory Management registers are set up so that there is only a
data area (no stack), and the data (and bss) portion of the program is read
in. The read is accomplished by setting "u_base", "u_offset[]", "u_count"
and "u_segflg" to indicate, respectively, that the read is to start at
virtual address 0, is to begin in the object file after the 16 byte header,
is to include all data (and text for nonreentrant programs) and is to be
done into the user's virtual address space.
.IP 6. 4
The correct virtual address space is set up for the program
(main.c/estabur), the User Memory Management registers are loaded and the
size of the process is recorded (in "p_size"). When setting the correct
virtual address space, the reentrant text is included and an initial stack
area (SSIZE) of 1280 bytes is used.
.IP 7. 4
The arguments to the overlaid program are set up. This is done by storing
the (ASCII string) arguments onto the user's stack. The argument strings
are placed on the stack in the same order that they were taken from the
calling program. Pointers to each string are loaded directly below this.
.IP 8. 4
There are 2 indicators in the file access permissions associated with each
file. These two indicators allow the user id or group id of a program to
assume those associated with the i-node rather than those of the user
executing the program. These two indicators are referred to as the
"set-user-id" and "set-group-id" bits and sys1.c/exec changes the user or
group id to the appropriate value if they are set.
.IP 9. 4
As mentioned previously, only ignored signals are inherited by the
overlaid processes. User processed signals are reset so that the
standard system action is taken.
.IP 10. 4
The user's registers must be zeroed (including floating point registers),
so that the overlaid process starts out with all registers initialized to
zero. Since the system and user processes use the same set of registers
(General Register Set 0), the registers cannot be zeroed directly. Rather
the context information saved when the exec system call is made is zeroed.
The PC on the stack is set to zero so that when the overlaid program begins
executing it will start executing at virtual address zero.
.IP 11. 4
The i-node for the overlaying file can be released and the buffer used for
holding arguments can be released (bio.c/brelse) to the pool of available
buffers.
.IP 12. 4
Any processes that were roadblocked because they attempted an exec and
there were too many processes in the midst of doing an exec are awakened.
The number of processes doing an exec ("execnt") is decremented.
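.
.LP
The argument gathering described earlier, in which only the strings and
their null separators are kept and at most 511 characters are allowed,
might look roughly like the sketch below. The function name is
hypothetical and the strings are read from ordinary memory rather than
fetched from the user's address space, so this shows only the packing
rule and the point at which E2BIG would be posted.

```c
#include <assert.h>

enum { ARG_LIMIT = 511 };   /* strings plus null separators, from the text */

/*
 * Copy each of the n strings in argv, with its terminating null, into
 * buf in the order given. Returns the number of bytes used, or -1 if
 * the limit would be exceeded (the E2BIG case).
 */
int pack_args(const char *argv[], int n, char buf[ARG_LIMIT])
{
    int used = 0, i;

    assert(n >= 0);
    for (i = 0; i < n; i++) {
        const char *s = argv[i];

        do {
            if (used >= ARG_LIMIT)
                return -1;
            buf[used++] = *s;       /* the null is copied as well */
        } while (*s++ != 0);
    }
    return used;
}
```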
.sp 1m
.ne 10
.
.LP
.LG
.B exit
.SM
.sp 1n
.
.LP
.I CALL
.
.LP
exit()
.sp 1n
.
.LP
.I RETURNS
.
.LP
No value is returned.
.ne 4
.sp 1n
.
.LP
.I SYNOPSIS
.
.LP
Terminates a process, that is, makes a process a ZOMBIE.
.ne 4
.sp 1n
.
.LP
.I DESCRIPTION
.
.LP
When a process terminates, it enters the zombie state ("p_stat" = SZOMB)
until its parent finds it. Besides the need to pass back termination status
of children processes, the CPU time (user and system) used by the child is
accumulated by the parent. As CPU times are kept in the U block and as a
parent may have many children processes, it is convenient to keep the U
block of the deceased process around until the parent disposes of it.
.
.LP
Since the parent process may disappear from the system before any of its
children, a mechanism must exist so that the children can be disposed of.
The INIT process in the system is the process that spawns the line monitor
programs. It allows users to log on and off so it is always in the system.
(Since it is a user process though, it can be killed.) If a parent dies
before any of its children, then INIT is made their parent and will
dispose of them when they terminate.
.
.LP
The steps in the termination of a process are:
.IP 1. 4
All of the process's signals are reset so that they are ignored. This is
done since subsequent steps may require the process to roadblock, at which
time a signal might be caught.
.IP 2. 4
All of the files that the process had open are closed.
.IP 3. 4
The i-node corresponding to the working directory is released from the
Inode Table.
.IP 4. 4
Any reentrant text is abandoned (text.c/xfree).
.IP 5. 4
Space is allocated on the swap device to place the zombie. The zombie is
only 256 bytes in size, but 8 times that much space is allocated to reduce
fragmentation. This is done because although most processes dispose of
zombies quickly (by waiting for them), a zombie remains in the system until
a parent finds it. Overallocating space on the swap device reduces
fragmentation (hopefully).
.IP 6. 4
A system buffer is obtained to copy the first 512 bytes of the U block
into. Since the relevant information in the U block is in the first 512
bytes ("u" array), buffered I/O can be used rather than doing physical I/O
to the swap device. A synchronous buffered write is performed to insure
that the data reaches the swap device and does not linger in the I/O
subsystem.
.IP 7. 4
The memory occupied by the process is freed and the process state is
changed to that of a zombie ("p_stat" = SZOMB), and the address of the
process on the swap device is set up ("p_addr"). The data and stack areas
could have been freed (malloc.c/mfree) before the copy was made in 6, but
the U block would then have to be freed and fragmentation in memory would
probably be increased.
.IP 8. 4
Arrangements are made for the disposition of the terminating process and
all of its children.
.RS
.IP a. 4
The Process Table is searched for the parent of the terminating process. If
the parent is found, INIT is awakened first. This is done because INIT will
inherit all of the terminating process's children as its children. As some
of these children may already be deceased, INIT can dispose of them. A
wakeup (slp.c/wakeup) is also issued to the parent process. The Process
Table is rescanned and all of the children of the terminating process are
made children of INIT (process number 1). Finally, the processor is
relinquished by the terminated process.
.IP b. 4
If the parent of the terminating process cannot be found, then the
terminating process is made a child of INIT and the procedure in a is
repeated.
.IP
If INIT is somehow destroyed in the Process Table, a system panic
occurs ("PANIC NO INIT PROCESS"). INIT may be killed and become a zombie
itself; however, since there will then no longer be a process to remove
zombies as described above, they will accumulate in the system.
.RE
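.
.LP
The rescan that hands the children of a terminating process to INIT can
be sketched as follows. The table layout and names are hypothetical and
much simpler than the real Process Table. Only the statement that INIT
is process number 1 is taken from the text.

```c
#include <assert.h>

enum { INIT_PID = 1 };   /* INIT is process number 1 per the text */

struct proc_entry {
    int pid;
    int ppid;
    int in_use;
};

/*
 * Make every child of the process dying_pid a child of INIT.
 * Returns the number of entries that were changed.
 */
int reparent_to_init(struct proc_entry table[], int n, int dying_pid)
{
    int i, moved = 0;

    assert(dying_pid != INIT_PID);   /* INIT itself exiting is not modelled */
    for (i = 0; i < n; i++) {
        if (table[i].in_use && table[i].ppid == dying_pid) {
            table[i].ppid = INIT_PID;
            moved++;
        }
    }
    return moved;
}
```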
.sp 1m
.ne 10
.
.LP
.LG
.B fork
.SM
.sp 1n
.
.LP
.I CALL
.
.LP
fork()
.sp 1n
.
.LP
.I RETURNS
.
.LP
Posts an error if there is not room in the system to create a new process.
Also, returns the identity of the created child process to the parent and
the identity of the parent to the child.
.ne 4
.sp 1n
.
.LP
.I SYNOPSIS
.
.LP
Fork system call. Creates a new process in the system.
.ne 4
.sp 1n
.
.LP
.I DESCRIPTION
.
.LP
Fork is the mechanism by which a new process enters the system. It creates
a mirror image copy of the process making the fork call.
.
.LP
Most of the work in creating the mirror image process is done by
slp.c/newproc. This function creates the new image and sets up a new
Process Table entry for the child. After it completes its work, there will
be two processes in the system which will return from the slp.c/newproc
function. The parent will return directly from slp.c/newproc, returning a
zero; the child, however, will return through the process switcher
(slp.c/swtch) and will return a one. Sys1.c/fork uses this distinction to
allow it to perform some additional initialization for the child. In
particular, it zeroes the cumulative user and system times ("u_utime" and
"u_stime") of the child and the cumulative user and system time of the
child's children ("u_cutime[]" and "u_cstime[]"). In addition, it returns
the ID of the parent to the child. (The actual C library interface for the
fork system call causes a zero to be returned to the child.) For the parent
process, sys1.c/fork returns the ID of the child and advances the PC for
the parent process (on the stack frame), so that a different point in the C
library is entered.
.
.LP
Before starting the creation of the child, sys1.c/fork scans the Process
Table to see if there is a slot available. If there is none, then the error
EAGAIN is posted.
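.
.LP
The slot check could be pictured as in the sketch below. The table is
reduced to an array of state values in which zero is assumed to mean an
unused slot, and the function name is hypothetical. EAGAIN is taken from
the standard errno.h header instead of a kernel header.

```c
#include <assert.h>
#include <errno.h>

enum { STATE_UNUSED = 0 };   /* assumed marker for a free slot */

/*
 * Return the index of the first free slot and set *err to 0, or
 * return -1 and set *err to EAGAIN when the table is full.
 */
int find_free_slot(const int state[], int n, int *err)
{
    int i;

    assert(err != 0);
    for (i = 0; i < n; i++) {
        if (state[i] == STATE_UNUSED) {
            *err = 0;
            return i;
        }
    }
    *err = EAGAIN;
    return -1;
}
```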
.sp 1m
.ne 10
.
.LP
.LG
.B rexit
.SM
.sp 1n
.
.LP
.I CALL
.
.LP
rexit()
.sp 1n
.
.LP
.I RETURNS
.
.LP
No value is returned.
.ne 4
.sp 1n
.
.LP
.I SYNOPSIS
.
.LP
The exit system call.
.ne 4
.sp 1n
.
.LP
.I DESCRIPTION
.
.LP
This is the System Entry point corresponding to the exit system call. The
exit system call can pass a one byte status indicator as an argument.
Sys1.c/rexit saves this value (passed in R0) in "u_arg[0]" for convenience
and calls sys1.c/exit to terminate the process.
.sp 1m
.ne 10
.
.LP
.LG
.B wait
.SM
.sp 1n
.
.LP
.I CALL
.
.LP
wait()
.sp 1n
.
.LP
.I RETURNS
.
.LP
A system error is posted if a process waits, but has no children.
.ne 4
.sp 1n
.
.LP
.I SYNOPSIS
.
.LP
The wait system call. Wait for a child process to die.
.ne 4
.sp 1n
.
.LP
.I DESCRIPTION
.
.LP
When a process dies, it becomes a zombie until the parent finds and
disposes of it. If a parent does not wait for the child, then the zombie
will remain in the system for the lifetime of the parent. If the parent
leaves the system before the child dies, then the child is made a child of
the INIT process.
.
.LP
Making a process a zombie allows status and execution time of the child to
be examined. In particular, the exit status of the deceased child is passed
back to the parent and the cumulative execution time (user and system time)
of the deceased child and all of its children is added to that of the
parent.
.
.LP
The wait system call does not wait for the death of a particular child
process. Rather, the call returns when the first child process terminates.
.
.LP
When sys1.c/wait is called, a linear search of the Process Table is done.
If the process making the wait system call has no children then a system
error (ECHILD) is posted (in "u_error") and the wait is terminated. If the
process does have children, but none are yet deceased, the process is
roadblocked (at low priority, "p_pri" = PWAIT).
.
.LP
When a child does terminate, the roadblocked parent is reawakened and the
linear search is repeated.
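.
.LP
One pass of the linear search, with its three possible outcomes, can be
sketched as follows. The slot layout, state values and return codes are
hypothetical. ECHILD comes from the standard errno.h header. A caller
would sleep and repeat the scan when told that children exist but none
has yet died.

```c
#include <assert.h>
#include <errno.h>

enum { SCAN_NO_CHILDREN = -1, SCAN_MUST_SLEEP = -2 };
enum { SLOT_UNUSED = 0, SLOT_ALIVE = 1, SLOT_ZOMBIE = 2 };

struct wait_slot {
    int ppid;
    int state;
};

/*
 * Scan the table for children of parent. Returns the index of the
 * first zombie child, SCAN_MUST_SLEEP if there are children but none
 * is dead, or SCAN_NO_CHILDREN with *err set to ECHILD.
 */
int scan_for_zombie(const struct wait_slot t[], int n, int parent, int *err)
{
    int i, children = 0;

    assert(err != 0);
    *err = 0;
    for (i = 0; i < n; i++) {
        if (t[i].state == SLOT_UNUSED || t[i].ppid != parent)
            continue;
        if (t[i].state == SLOT_ZOMBIE)
            return i;
        children++;
    }
    if (children == 0) {
        *err = ECHILD;
        return SCAN_NO_CHILDREN;
    }
    return SCAN_MUST_SLEEP;
}
```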
.
.LP
When a zombie child is found, the following steps are performed to dispose
of it.
.IP 1. 4
The process ID of the deceased child is returned in register R0 and the
status of the dead child is returned in register R1. The status consists of
two bytes. The low order byte contains the number of any signal that may
have been received by the process to cause termination and an indication of
whether a core image was produced. The high order byte contains any status
information returned by the child.
.IP 2. 4
The U block of the zombie is read into one of the system buffers by doing a
synchronous buffered read (bio.c/bread) from the swap device and the area
occupied by the zombie on the swap area is freed. It should be remembered
that this area was overallocated (8 disk blocks rather than one) to reduce
fragmentation on the swap area.
.IP 3. 4
The Process Table entry of the deceased child is cleared. Only the
important entries are zeroed; "p_stat" - process state; "p_pid" - process
ID; "p_ppid" - parent ID; "p_sig" - indication of pending signal; "p_ttyp" -
controlling teletype; "p_flag" - location of process and flag indicators
for process.
.IP 4. 4
The execution time of the deceased child and all of its children is added
to that of the parent.
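.
.LP
The two byte status described in step 1 might be packed and unpacked as
in the sketch below. The text does not say which bit of the low order
byte marks a core image, so bit 0200 is an assumption here, as are the
function names.

```c
#include <assert.h>

enum { CORE_FLAG = 0200 };   /* assumed position of the core image bit */

/* Build the status word from the exit byte, signal number and core flag. */
unsigned make_status(unsigned exit_byte, unsigned sig, int core)
{
    assert(sig < CORE_FLAG);   /* signal number must fit below the flag */
    return ((exit_byte & 0377) << 8) | sig | (core ? CORE_FLAG : 0);
}

/* High order byte: status returned by the child. */
unsigned status_exit_byte(unsigned w)
{
    return (w >> 8) & 0377;
}

/* Low order byte without the flag: terminating signal number, if any. */
unsigned status_signal(unsigned w)
{
    return w & 0177;
}

/* Nonzero if the low order byte says a core image was produced. */
int status_dumped_core(unsigned w)
{
    return (w & CORE_FLAG) != 0;
}
```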
.
.LP
Associated with every process there are several time values kept.
.IP 1. 4
"u_utime" - This is a one word entry which records the cumulative user CPU
time of the process (that is, the time actually spent executing the user's
program). The entry is kept in sixtieths of a second and is only an
approximation since it cannot discount any interrupt handling processed
between clock ticks (1/60 second).
.IP 2. 4
"u_stime" - This one word entry records the cumulative system CPU time used
by the process. It records all of the time spent by the system in handling
system calls for the process and is subject to the same limitations as 1.
.IP 3. 4
"u_cutime[]" - This is a two word entry (long integer) used to record the
cumulative user time of all children of the process.
.IP 4. 4
"u_cstime[]" - This is a two word entry used to record the cumulative system
time of all children processes.
.
.LP
When a process terminates, the (user and system) execution times of the
deceased are added to the cumulative times ("u_cutime[]" and "u_cstime[]")
of the parent. That is, the parent's times are adjusted as follows:
.
.LP
"u_cutime[]" = "u_utime"+"u_cutime[]"
.
.LP
"u_cstime[]" = "u_stime"+"u_cstime[]"
.
.LP
In both of these equations, values on the left correspond to parent times,
while all values on the right correspond to child times.
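.
.LP
A small sketch of this bookkeeping follows. It reads "added to" as
accumulation into the parent's running totals, so that several children
can be absorbed in turn. That reading is an assumption, since the
equations as printed show only one child's contribution. The structure
and function names are hypothetical, and all times are in clock ticks.

```c
#include <assert.h>

struct proc_times {
    unsigned utime;   /* own user time, in ticks */
    unsigned stime;   /* own system time, in ticks */
    long cutime;      /* user time of all children */
    long cstime;      /* system time of all children */
};

/*
 * Add the child's own times and the times of the child's children to
 * the parent's cumulative child times.
 */
void absorb_child_times(struct proc_times *parent,
                        const struct proc_times *child)
{
    assert(parent != 0 && child != 0);
    parent->cutime += (long)child->utime + child->cutime;
    parent->cstime += (long)child->stime + child->cstime;
}
```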
.
.LP
The system buffer containing the zombie U block is released to the buffer
pool (bio.c/brelse).
