Friday, February 5, 2010

UNIX QA

OPERATING SYSTEM
1. What is an Operating System?
The operating system can be viewed as a set of software programs normally supplied along with the
hardware for the effective and easy use of the machine. The two benefits that enhance its utilities are
provision of ‘Security/Confidentiality’ of the information to users and elimination of duplicate
efforts by hundreds of programmers in developing tedious and complicated routines.
2. What is a kernel?
In computing, the kernel is the central component of most computer operating systems (OS). Its
responsibilities include managing the system's resources and the communication between hardware
and software components. As a basic component of an operating system, a kernel provides the
lowest level of abstraction layer for the resources (especially memory, processors and I/O devices)
that applications must control to perform their function. It typically makes these facilities available
to application processes through inter-process communication mechanisms and system calls. These
tasks are done differently by different kernels, depending on their design and implementation. While
monolithic kernels will try to achieve these goals by executing all the code in the same address space
to increase the performance of the system, microkernels run most of their services in user space,
aiming to improve maintainability and modularity of the code base. A range of possibilities exists
between these two extremes.
3. What is a monolithic kernel?
A monolithic kernel is a kernel architecture where the entire kernel is run in kernel space in
supervisor mode. In common with other architectures (microkernel, hybrid kernels), the kernel
defines a high-level virtual interface over computer hardware, with a set of primitives or system calls
to implement operating system services such as process management, concurrency, and memory
management in one or more modules.
4. What are the basic responsibilities of the kernel?
The kernel's primary purpose is to manage the computer's resources and allow other programs to run
and use these resources. Typically, the resources consist of:
The CPU (frequently called the processor). This is the most central part of a computer system,
responsible for running or executing programs on it. The kernel takes responsibility for deciding at
any time which of the many running programs should be allocated to the processor or processors
(each of which can usually run only one program at once)
The computer's memory. Memory is used to store both program instructions and data. Typically,
both need to be present in memory in order for a program to execute. Often multiple programs will
want access to memory, frequently demanding more memory than the computer has available. The
kernel is responsible for deciding which memory each process can use, and determining what to do
when not enough is available.
Any Input/Output (I/O) devices present in the computer, such as disk drives, printers, displays, etc.
The kernel allocates requests from applications to perform I/O to an appropriate device (or
subsection of a device, in the case of files on a disk or windows on a display) and provides
convenient methods for using the device (typically abstracted to the point where the application does
not need to know implementation details of the device)
Kernels also usually provide methods for synchronization and communication between processes
(called inter-process communication or IPC).
A kernel may implement these features itself, or rely on some of the processes it runs to provide the
facilities to other processes, although in this case it must provide some means of IPC to allow
processes to access the facilities provided by each other.
Finally, a kernel must provide running programs with a method to make requests to access these
facilities.
5. What is a microkernel?
A microkernel is a minimal computer operating system kernel providing only basic operating system
services (system calls), while other services (commonly provided by kernels) are provided by userspace
programs called servers. Commonly, microkernels provide services such as address space
management, thread management, and inter-process communication, but not, for example, networking
or display.
[Figure: basic overview of the system architecture.]
6. What is process management?
The main task of a kernel is to allow the execution of applications and support them with features
such as hardware abstractions. To run an application, a kernel typically sets up an address space for
the application, loads the file containing the application's code into memory (perhaps via demand
paging), sets up a stack for the program and branches to a given location inside the program, thus
starting its execution.
7. What is memory management?
Memory management is the act of managing computer memory. In its simpler forms, this involves
providing ways to allocate portions of memory to programs at their request, and freeing it for reuse
when no longer needed.
8. What is device management?
To perform useful functions, processes need access to the peripherals connected to the computer,
which are controlled by the kernel through device drivers. For example, to show the user something
on the screen, an application would make a request to the kernel, which would forward the request to
its display driver, which is then responsible for actually plotting the character/pixel.
System calls. To actually perform useful work, a process must be able to access the services provided
by the kernel. This is implemented differently by each kernel, but most provide a C library or an
API, which in turn invokes the related kernel functions.
The method of invoking the kernel function varies from kernel to kernel. If memory isolation is in
use, it is impossible for a user process to call the kernel directly, because that would be a violation of
the processor's access control rules. A few possibilities are:
Using a software-simulated interrupt. This method is available on most hardware, and is therefore
very common.
Using a call gate. A call gate is a special address which the kernel has added to a list stored in kernel
memory and which the processor knows the location of. When the processor detects a call to that
location, it instead redirects to the target location without causing an access violation. Requires
hardware support, but the hardware for it is quite common.
Using a special system call instruction. This technique requires special hardware support, which
common architectures (notably, x86) may lack. System call instructions have been added to recent
models of x86 processors, however, and some (but not all) operating systems for PCs make use of
them when available.
Using a memory-based queue. An application that makes large numbers of requests but does not
need to wait for the result of each may add details of requests to an area of memory that the kernel
periodically scans to find requests.
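The relationship between a C library wrapper and the underlying system call can be sketched as follows. This is a minimal illustration that assumes Linux with glibc, where the generic syscall() function and the SYS_getpid number are available; the function name wrapper_matches_raw_syscall is made up for this example.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>   /* SYS_getpid (Linux-specific assumption) */
#include <sys/types.h>
#include <unistd.h>        /* getpid(), syscall() */

/* Returns 1 if the C library wrapper and the raw system call agree. */
int wrapper_matches_raw_syscall(void)
{
    pid_t via_wrapper = getpid();                /* ordinary C library call */
    pid_t via_raw = (pid_t)syscall(SYS_getpid);  /* request the service by number */
    return via_wrapper == via_raw;
}
```

Both paths end up in the same kernel function; the wrapper only hides the mechanism (software interrupt or special instruction) used to enter the kernel.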
9. What is a file system?
In computing, a file system is a method for storing and organizing computer files and data they
contain to make it easy to find and access them.
10. What is context switching?
A context switch is the computing process of storing and restoring the state (context) of a CPU such
that multiple processes can share a single CPU resource. The context switch is an essential feature of
a multitasking operating system. Context switches are usually computationally intensive and much
of the design of operating systems is to optimize the use of context switches. A context switch can
mean a register context switch, a task context switch, a thread context switch, or a process context
switch. What constitutes the context is determined by the processor and the operating system.
11. What is multitasking?
Most commonly, within some scheduling scheme, one process needs to be switched out of the CPU
so another process can run. Within a preemptive multitasking operating system, the scheduler allows
every task to run for some certain amount of time, called its time slice. If a process does not
voluntarily yield the CPU (for example, by performing an I/O operation), a timer interrupt fires, and
the operating system schedules another process for execution instead. This ensures that the CPU
cannot be monopolized by any one processor-intensive application.
12. What is user and kernel mode switching?
When a transition between user mode and kernel mode is required in an operating system, a context
switch is not necessary; a mode transition is not by itself a context switch. However, depending on
the operating system, a context switch may also take place at this time.
13. What is Inter-process Communication?
Inter-Process Communication (IPC) is a set of techniques for the exchange of data between two or
more threads in one or more processes. Processes may be running on one or more computers
connected by a network. IPC techniques are divided into methods for message passing,
synchronization, shared memory, and remote procedure calls (RPC). The method of IPC used may
vary based on the bandwidth and latency of communication between the threads, and the type of data
being communicated.
14. What is a race condition?
A race condition or race hazard is a flaw in a system or process whereby the output of the process is
unexpectedly and critically dependent on the sequence or timing of other events. The term originates
with the idea of two signals racing each other to influence the output first. Race conditions can occur
in poorly-designed electronics systems, especially logic circuits, but they can and often do also arise
in computer software.
15. What is a critical section?
In computer programming a critical section is a piece of code that accesses a shared resource (data
structure or device) that must not be concurrently accessed by more than one thread of execution. A
critical section will usually terminate in fixed time, and a thread, task or process will only have to
wait a fixed time to enter it. Some synchronization mechanism is required at the entry and exit of the
critical section to ensure exclusive use, for example a semaphore.
16. What is atomicity?
An atomic operation in computer science refers to a set of operations that can be combined so that
they appear to the rest of the system to be a single operation. Usually the critical sections must be
atomic.
17. What is a deadlock?
A deadlock is a situation wherein two or more competing actions are waiting for the other to finish,
and thus neither ever does. It is often seen in a paradox like 'the chicken or the egg'. In the
computing world deadlock refers to a specific condition when two or more processes are each
waiting for another to release a resource, or more than two processes are waiting for resources in a
circular chain.
18. What are the prerequisites for a deadlock?
There are four necessary conditions for a deadlock to occur, known as the Coffman conditions from
their first description in a 1971 article by E. G. Coffman.
1. Mutual exclusion condition: a resource is either assigned to one process or it is available
2. Hold and wait condition: processes already holding resources may request new resources
3. No preemption condition: only a process holding a resource may release it
4. Circular wait condition: two or more processes form a circular chain where each process waits
for a resource that the next process in the chain holds
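The circular wait condition can be checked mechanically. The sketch below assumes a simplified model in which each process waits for at most one other process; the array layout and the name has_circular_wait are assumptions made for this example, not part of any real kernel.

```c
#include <stdbool.h>

/* wait_for[i] is the index of the process that process i is waiting on,
 * or -1 if process i is not waiting. All indices are assumed valid.
 * Returns true if some chain of waits loops back on itself. */
bool has_circular_wait(const int wait_for[], int n)
{
    for (int start = 0; start < n; start++) {
        int p = start;
        int steps = 0;
        /* A chain without a cycle reaches -1 within n hops. */
        while (p != -1 && steps < n) {
            p = wait_for[p];
            steps++;
        }
        if (p != -1)
            return true;   /* still waiting after n hops: the chain repeats */
    }
    return false;
}
```

For example, {1, 0} (process 0 waits on 1, process 1 waits on 0) is a circular wait, while {1, 2, -1} is not.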
19. What is dynamic memory allocation?
In computer science, dynamic memory allocation is the allocation of memory storage for use in a
computer program during the runtime of that program. It is a way of distributing ownership of
limited memory resources among many pieces of data and code. A dynamically allocated object
remains allocated until it is de-allocated explicitly, either by the programmer or by a garbage
collector; this is notably different from automatic and static memory allocation. It is said that such an
object has dynamic lifetime.
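A small sketch of dynamic lifetime using the standard malloc() and free() functions; the helper name make_sequence is made up for this example.

```c
#include <stdlib.h>

/* Returns a heap-allocated array holding 0..n-1, or NULL on failure.
 * The array outlives this function; the caller must free() it. */
int *make_sequence(size_t n)
{
    int *a = malloc(n * sizeof *a);
    if (a == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        a[i] = (int)i;
    return a;
}
```

Unlike an automatic (stack) array, the returned block stays allocated after make_sequence() returns and is released only by an explicit free().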
20. What is garbage collection?
In computer science, garbage collection (also known as GC) is a form of automatic memory
management. The garbage collector or collector attempts to reclaim garbage, or memory used by
objects that will never again be accessed or mutated by the application. Garbage collection was
invented by John McCarthy around 1959 to solve the problems of manual memory management in
his Lisp programming language.
21. What is paging?
In computer operating systems, paging memory allocation (also called memory address translation)
algorithms divide computer memory into small partitions, and allocate memory using a page as the
smallest building block.
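With a power-of-two page size, an address splits into a page number and an offset within the page. The sketch below assumes a page size of 4096 bytes, which is common but not universal; the names are illustrative.

```c
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096u   /* assumed page size */

/* Which page the address falls in. */
uint64_t page_number(uint64_t addr) { return addr / DEMO_PAGE_SIZE; }

/* Position of the address inside that page. */
uint64_t page_offset(uint64_t addr) { return addr % DEMO_PAGE_SIZE; }
```

For address 8195 this gives page 2 and offset 3, since 8195 = 2 * 4096 + 3.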
22. What is demand paging?
In computer operating systems, demand paging is a simple method of implementing virtual memory.
In a system that uses demand paging, the operating system copies a page into physical memory only
if an attempt is made to access it (i.e., if a page fault occurs). It follows that a process begins
execution with none of its pages in physical memory, and many page faults will occur until most of a
process's working set of pages is located in physical memory.
23. What is segmentation?
Segmentation means that a part or parts of the memory will be sealed off from the currently running
process, through the use of hardware registers. If the data that is about to be read or written to is
outside the permitted address space of that process, a segmentation fault will result.
24. What are the following?
Contiguous memory management - Please refer to Achyut S Godbole, Pg. 296
Fixed partitioned memory management - Please refer to Achyut S Godbole, Pg. 298
Variable partitions - Please refer to Achyut S Godbole, Pg. 310
Non-contiguous allocation - Please refer to Achyut S Godbole, Pg. 318
25. What is MMU?
MMU, short for memory management unit, is a class of computer hardware components responsible
for handling memory accesses requested by the CPU. Among the functions of such devices are the
translation of virtual addresses to physical addresses (i.e., virtual memory management), memory
protection, cache control, bus arbitration, and, in simpler computer architectures (especially 8-bit
systems), bank switching.
File Management
1. What are the responsibilities of the file system?
The file subsystem is responsible for managing files, allocating file space, administering free space,
controlling access to files and retrieving data for users.
2. What is a buffer?
A buffer is an in-memory copy of a disk block.
3. What are the different parts of the buffer?
Each buffer has two parts: a memory array that contains data from the disk and a buffer header
that identifies the buffer.
4. What does a buffer header contain?
The buffer header maintains a status field, which records the following conditions:
1. The buffer is locked or busy.
2. The buffer contains valid data.
3. The buffer is marked “delayed write”.
4. The kernel is currently reading or writing the contents to the disk.
5. A process is currently waiting for the buffer to become free.
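One way to picture such a status field is as a set of bit flags. The sketch below is purely illustrative: the flag names, their values and the struct layout are assumptions for this example and do not reproduce any particular kernel's buffer header.

```c
/* Hypothetical status bits, one per condition listed above. */
enum {
    BUF_BUSY   = 1 << 0,   /* locked or busy */
    BUF_VALID  = 1 << 1,   /* contains valid data */
    BUF_DELWRI = 1 << 2,   /* marked "delayed write" */
    BUF_IO     = 1 << 3,   /* kernel is reading or writing it */
    BUF_WANTED = 1 << 4    /* a process is waiting for it */
};

/* Hypothetical buffer header: identifies the block and carries the status. */
struct buf_header {
    int      dev;      /* logical device number */
    long     blkno;    /* block number on that device */
    unsigned status;   /* combination of the BUF_* bits */
};

/* A buffer is free when it is not locked. */
int buf_is_free(const struct buf_header *b)
{
    return (b->status & BUF_BUSY) == 0;
}

/* A "delayed write" buffer must be written to disk before it is reused. */
int buf_needs_writeback(const struct buf_header *b)
{
    return (b->status & BUF_DELWRI) != 0;
}
```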
5. What is the algorithm used for buffer allocation?
The kernel caches data in the buffer pool according to a least recently used (LRU) algorithm.
6. What is the data structure used by the buffer cache?
Hash queues, hashed on the device and block number, together with a free list of buffers.
7. What are the different scenarios the kernel follows to allocate a buffer for a disk block?
1. The kernel finds the block on its hash queue, and its buffer is free.
2. The kernel cannot find the block on the hash queue, so it allocates a buffer from the free list.
3. The kernel cannot find the block on the hash queue and, while looking for a buffer on the free list,
finds one that has been marked “delayed write”. The kernel writes that buffer to the disk and
allocates another buffer.
4. The kernel cannot find the block on the hash queue, and the free list of buffers is empty (the
process blocks).
5. The kernel finds the block on the hash queue, but its buffer is currently busy (the process blocks).
8. What happens when a buffer is released?
The kernel wakes up the processes waiting for this buffer, or for any buffer, to become free.
If the buffer contents are valid, the buffer is put at the end of the free list; otherwise it is put at the
beginning of the free list.
9. What are the advantages and disadvantages of the buffer cache?
Advantages
- The parts of the kernel that do disk I/O have the same interface irrespective of the underlying device.
- The programmer need not worry about word alignment and can write hardware-independent code.
- It helps in reducing disk traffic, thereby increasing system throughput and reducing the
response time.
- It maintains file system integrity: if two processes simultaneously attempt to write to the same
disk block, BMS serializes their access.
Disadvantages
- Since the kernel does not immediately write data to the disk (“delayed write”), the system is
vulnerable to crashes that can leave the disk data in an incorrect state.
- Use of the buffer cache requires an extra data copy when transmitting data to and fro between the
user program and the kernel. This affects performance mainly when there is a large amount
of data to be transmitted.
10. What is a file system?
A file system is a logical method for organizing and storing large amounts of information in a way
that makes it easy to manage.
11. What are the different types of files in UNIX?
Ordinary files, directories, device special files, pipes, sockets (Unix domain sockets) and
symbolic links.
12. What are the different blocks of the File System ?
Boot block, Super block, Inode block, Data block.
13. What is a hole in a file?
Files may have holes in them, which are created by moving the file pointer past the end of the file
and then writing data.
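Creating a hole can be sketched with lseek() and write(). The example assumes a POSIX system with a writable /tmp; the function name make_file_with_hole is made up for illustration.

```c
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Creates a temporary file, moves the file pointer gap bytes past the
 * start of the (empty) file and writes one byte there.
 * Returns the resulting file size, or -1 on error. */
off_t make_file_with_hole(off_t gap)
{
    char path[] = "/tmp/holeXXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);   /* the file disappears once fd is closed */

    off_t size = -1;
    struct stat st;
    if (lseek(fd, gap, SEEK_SET) == gap &&   /* seek past the end */
        write(fd, "x", 1) == 1 &&            /* writing creates the hole */
        fstat(fd, &st) == 0)
        size = st.st_size;

    close(fd);
    return size;
}
```

The reported size includes the hole (gap + 1 bytes), although a file system that supports sparse files need not allocate data blocks for the skipped range; reading the hole returns zero bytes.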
14. What are the contents of the boot block?
The boot block is at the beginning of a file system, typically in the first sector, and may contain the
bootstrap code that is read into the machine to boot, or initialize, the operating system. Although
only one boot block is needed to boot the system, every file system has a (possibly empty) boot block.
15. What are the contents of the super block?
The super block describes the state of a file system. It consists of the following fields:
1. Size of the file system
2. No. of free blocks in the file system
3. A list of free blocks available on the file system
4. Index of the next free block in the free block list
5. Size of the I-node list
6. No. of free I-nodes in the file system.
7. Index of the next free I-node in the file system
16. What are the contents of the inode?
Each I-node consists of the following information:
File ownership
File type
File access permissions
Time the inode was last changed
Modification time
Time of last access
Number of links to the file, representing the number of names the file has
File size
Array of 13 pointers to the file's data blocks
17. What are the file data structures?
1. User File Descriptor Table.
2. File Table.
3. Inode table
18. What is the relationship between the file data structures?
19. When does a file get deleted?
A file gets deleted when its number of (hard) links becomes 0 and no process still has it open.
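The role of the link count can be observed from a program. The sketch below assumes a POSIX system with a writable /tmp; the function name unlinked_file_still_readable is hypothetical. It removes the only name of an open file and shows that the data stays readable through the descriptor until it is closed.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if, after unlink(), the link count is 0, the name is gone,
 * and the data can still be read through the open descriptor.
 * Returns 0 if an observation fails, -1 on setup error. */
int unlinked_file_still_readable(void)
{
    char path[] = "/tmp/unlinkdemoXXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;

    int ok = -1;
    struct stat st;
    char buf[4] = {0};
    if (write(fd, "data", 4) == 4 &&
        unlink(path) == 0 &&          /* remove the only name */
        fstat(fd, &st) == 0) {
        ok = st.st_nlink == 0 &&              /* no links remain */
             access(path, F_OK) != 0 &&       /* the name is gone */
             pread(fd, buf, 4, 0) == 4 &&     /* the data is still there */
             memcmp(buf, "data", 4) == 0;
    }
    unlink(path);   /* harmless if the name is already removed */
    close(fd);      /* last reference dropped: the kernel can free the file */
    return ok;
}
```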
20. What is a hard link?
A file has one inode but may have several names; each name is a hard link to that inode.
21. What are the two types of links?
Hard link and symbolic link.
22. What is the command used to create a hard link and a symbolic link?
ln is the command to create a hard link and
ln -s is the command to create a symbolic link.
23. What is a symbolic link?
A soft or symbolic link to a file is implemented as a file containing an absolute or relative pathname.
24. What are the differences between a hard link and a symbolic link?
Symlinks
- Symlinks are distinctly different from normal files, so we can distinguish a symlink from the
original it points to.
- A symlink can point to any type of file (normal file, directory, device file, symlink, etc.).
- Symlinks refer to names, so they can point to files on other filesystems.
- If you rename or delete the original file pointed to by a symlink, the symlink gets broken.
- Symlinks may take up additional disk space (to store the name pointed to).
Hard links
- Multiple hard-link style names for the same file are indistinguishable; the term ‘hard link’ is
merely conventional.
- Hard links may not point to a directory (or, on some non-Linux systems, to a symlink).
- Hard links work by inode number, so they can only work within a single filesystem.
- Renaming or deleting the ‘original’ file pointed to by a hard link has no effect on the hard link.
- Hard links only need as much disk space as a directory entry.
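These differences can be checked with the link(), symlink(), stat() and lstat() calls. The following is a sketch under the assumption of a POSIX system whose /tmp supports hard links and symlinks; the function name link_demo and the file names are made up.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 1 if all observations hold, 0 if one fails, -1 on setup error. */
int link_demo(void)
{
    char dir[] = "/tmp/linkdemoXXXXXX";
    char orig[64], hard[64], soft[64];
    struct stat so, sh, ss;
    int ok = -1;

    if (mkdtemp(dir) == NULL)
        return -1;
    snprintf(orig, sizeof orig, "%s/orig", dir);
    snprintf(hard, sizeof hard, "%s/hard", dir);
    snprintf(soft, sizeof soft, "%s/soft", dir);

    FILE *f = fopen(orig, "w");
    if (f != NULL) {
        fputs("data", f);
        fclose(f);
        if (link(orig, hard) == 0 && symlink(orig, soft) == 0 &&
            stat(orig, &so) == 0 && stat(hard, &sh) == 0 &&
            lstat(soft, &ss) == 0) {
            /* Hard link: same inode, link count 2. Symlink: its own inode. */
            ok = so.st_ino == sh.st_ino && so.st_nlink == 2 &&
                 S_ISLNK(ss.st_mode) && ss.st_ino != so.st_ino;
            /* Delete the original name: the hard link still works,
             * the symlink is now broken. */
            unlink(orig);
            ok = ok && stat(hard, &sh) == 0 && sh.st_nlink == 1 &&
                 stat(soft, &ss) != 0;
        }
    }
    /* Clean up; failures here are ignored in this sketch. */
    unlink(orig);
    unlink(hard);
    unlink(soft);
    rmdir(dir);
    return ok;
}
```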
25. How does the kernel determine if the inode is free?
An inode is free if its file type field is zero.
26. What is the namei algorithm?
It converts a user-level path name to an inode.
27. What is an inode?
An inode contains the information necessary for a process to access a file. It exists in static form on
the disk.
28. What is the in-core inode?
The in-memory copy of the disk inode is called the in-core inode. The kernel makes a copy of the
disk inode in memory when processes access files.
29. What are the contents of the in-core inode?
The in-core inode contains the usual information of a disk inode in addition to the following:
- Status: whether the inode is locked, whether a process is waiting for the inode to be unlocked,
whether the file data has changed recently, and whether the inode data itself has changed
- Logical device number of the file system
- The inode number (not required in the disk copy)
- Pointers to other in-core inodes
- Reference count indicating the number of active instances of the file
30. When is the in-core inode released?
In-core inodes are released when the in-core reference count drops to zero.
31. What is a directory?
A directory is a file whose data is a sequence of entries, each consisting of an inode number and a
file name.
32. What are the two entries contained in every directory?
Every directory contains the file names . and ..
- . contains the inode number of the current directory.
- .. contains the inode number of the parent directory.
An inode number of 0 indicates that the directory entry is empty.
The program mkfs initializes a file system (FS) so that . and .. of the root directory both have the
root inode number of the file system.
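The special case of the root directory can be checked with stat(). This is a small sketch for a POSIX system; the function name dotdot_of_root_is_root is made up for the example.

```c
#include <sys/stat.h>

/* Returns 1 if "/" and "/.." refer to the same inode on the same device,
 * 0 if they differ, -1 on error. */
int dotdot_of_root_is_root(void)
{
    struct stat root, parent;
    if (stat("/", &root) != 0 || stat("/..", &parent) != 0)
        return -1;
    return root.st_dev == parent.st_dev && root.st_ino == parent.st_ino;
}
```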
33. What is the data structure used by the super block to hold free inodes and disk blocks?
Free inodes are held in an array and free disk blocks in a list.
34. What are the different types of files in UNIX?
A file in Unix can be
A network connection
A FIFO queue
A pipe
A terminal
A real on-the-disk file
Or just about anything else
35. What are the system calls which return a file descriptor?
open(), creat(), dup(), pipe(), socket(), accept()
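As an example of descriptors in use, the sketch below sends a short message through a pipe. pipe() fills in two descriptors, one for reading and one for writing. The function name pipe_roundtrip is an assumption of this example, and the message is assumed to be short enough to fit in the pipe without blocking.

```c
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Writes msg into a pipe and reads it back into out (capacity n).
 * Returns the number of bytes read, or -1 on error. */
ssize_t pipe_roundtrip(const char *msg, char *out, size_t n)
{
    int fds[2];                 /* fds[0]: read end, fds[1]: write end */
    if (pipe(fds) != 0)
        return -1;

    ssize_t got = -1;
    size_t len = strlen(msg);
    if (write(fds[1], msg, len) == (ssize_t)len)
        got = read(fds[0], out, n);

    close(fds[0]);
    close(fds[1]);
    return got;
}
```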
36. What is a file descriptor?
It is a small non-negative integer that indexes the process's user file descriptor table.
37. What are the standard descriptors ?
0 stdin, 1 stdout, 2 stderr.
38. What is the difference between open() and creat()?
The open() system call can be used either to open an existing file or, with the O_CREAT flag, to
create a new file. The creat() system call is used only to create new files (an existing file is truncated).
PROCESS MANAGEMENT
1. What is a process?
An executing instance of a program is called a process.
2. What are the components of a UNIX process context?
The process context can be viewed as a combination of following components:
User level: The processes address space such as text, data, stack etc.
Register level: The cpu register contents used by process while executing.
System level context: The kernel data structures
Process table entry
U-area
P-region table entry
Region table entry
Kernel stack
3. What are the kernel data structures maintained for process management?
The following kernel data structures are maintained:
Process table entry
U-area
P-region table entry
Region table entry
4. What are the different states of a process?
A UNIX process has the following nine states:
- Created
- Ready to run in memory
- Ready to run, swapped
- Kernel running
- User running
- Preempted
- Asleep in memory
- Asleep, swapped
- Zombie
5. What are the two modes of execution of a unix process?
The two modes of execution for a unix process are User running mode and kernel running mode.
6. What is the difference between the two modes of execution?
Following are the differences:
- In kernel running mode the process can access both user and kernel address space, whereas
in user mode the process can access only user address space.
- In kernel running mode a process is non-preemptible, whereas in user running mode it is
preemptible.
7. When is context switch possible in unix OS?
A context switch can occur in the following situations:
- When a process returns from kernel running to user running mode.
- When a process terminates.
- When a process goes to sleep state.
8. Why is kernel mode of execution used by UNIX processes?
The UNIX kernel maintains a kernel mode of execution in order to safeguard system resources and
make sure that users can access them only via the kernel. Kernel mode is non-preemptible in order
to preserve the consistency of system resources.
9. What do you mean by u-area (user area) or u-block?
U-area is a kernel data structure maintained for every process in the system. It contains private data
associated with process that is manipulated only by the Kernel.
10. Brief about the initial process sequence while the system boots up.
While booting, the first process is created with process ID 0 and is called the swapper. The swapper
manages memory allocation for processes and influences CPU allocation. It creates several
processes to maintain the system, one of which is the init process, created with pid 1. The init process
has a special relationship with the other processes. It reads system files to properly start the system.
It forks and execs a getty program per terminal. When a user logs in to a terminal, the corresponding
getty execs the login program, and if the user name and password are correct, login execs the default
shell. When the user logs out, init immediately spawns another getty program to monitor the
terminal.
11. What are various IDs associated with a process?
Every process is associated with following identification numbers:
- Process id: UNIX identifies each process with a unique integer called the process ID (PID).
- Parent process id: The process that executes fork() to create the new process is called the parent
process, and its PID is the child's parent process ID.
- User id: Every process has an owner who has privileges over the process. The owner is the user who
executes the process, identified by the user ID.
- Effective uid: A process also has an effective user ID, which determines its access privileges for
resources like files.
The following system calls return the different identification numbers:
getpid() - process id
getppid() - parent process id
getuid() - user id
geteuid() - effective user id
12. Explain fork() system call.
The fork() system call is used to create a new process from an existing process. The new process
is called the child process, and the existing process is called the parent. The kernel creates the new
process by making a new entry in the process table. The new process gets copies of the parent’s
address space and u-area, so the child inherits the same group, session, ownership etc. The fork()
system call returns twice: to the parent it returns the newly created process’s pid, and to the child
process it returns zero.
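The two return values of fork() are usually told apart as in the sketch below. It assumes a POSIX system; the function name run_child is made up for this example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that exits immediately with the given code.
 * Returns the child's exit status as seen by the parent, or -1 on error. */
int run_child(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;               /* fork failed: no child was created */
    if (pid == 0)
        _exit(code);             /* child: fork() returned 0 */

    /* parent: fork() returned the child's pid */
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```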
13. What will be the output of the following program code?
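#include <stdio.h>
#include <unistd.h>

int main(void)
{
    fork();                  /* from here on, parent and child both run */
    printf("Hello World!");
    return 0;
}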
Hello World!Hello World!
The fork creates a child that is a duplicate of the parent process. The child begins from the fork(). All
the statements after the call to fork() are executed twice (once by the parent process and once by the
child). A statement before fork() is executed only by the parent process.
14. Predict the output of the following program code
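#include <stdio.h>
#include <unistd.h>

int main(void)
{
    fork(); fork(); fork();  /* each call doubles the number of processes */
    printf("Hello World!");
    return 0;
}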
"Hello World" will be printed 8 times.
After n calls to fork() there are 2^n processes in total, so the line is printed 2^n times. The number
of child processes created is 2^n - 1.
15. List the attributes which a child process inherits from its parent.
The following attributes are inherited by the child process from its parent:
- real user id, real group id, effective user id, effective group id
- process group id
- process session id
- controlling terminal
- setuid flag and set group id flag
- current working directory
- file mode creation mask (umask)
- signal mask
- close on exec flags for any open file descriptors
- environment
- attached shared memory segments
- resource limits
16. List the attributes which a child process doesn’t inherit from its parent.
The following attributes are not inherited from the parent process:
- return value from fork()
- process id
- parent process id
- time stamps (cleared in the child)
- file locks
- pending alarms (cleared in the child)
- set of pending signals (cleared in the child)
17. List the system calls used for process management:
System call   Description
fork()        To create a new process
exec()        To execute a new program in a process
wait()        To wait until a created process completes its execution
exit()        To exit from a process execution
getpid()      To get the process identifier of the current process
getppid()     To get the parent process identifier
nice()        To change the existing priority of a process
brk()         To increase/decrease the data segment size of a process
18. How can you get/set an environment variable from a program?
Getting the value of an environment variable is done by using `getenv()'.
Setting the value of an environment variable is done by using `putenv()'.
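A short sketch of both calls follows. The variable name UNIXQA_DEMO and the function name set_and_get are made up for this example. Note that putenv() keeps the pointer it is given rather than copying the string, so the string must not be an automatic (stack) variable.

```c
#include <stdlib.h>

/* putenv() stores this pointer in the environment, so the string
 * has static storage duration. The variable name is hypothetical. */
static char demo_var[] = "UNIXQA_DEMO=hello";

/* Sets UNIXQA_DEMO and returns its value as seen through getenv(),
 * or NULL if putenv() fails. */
const char *set_and_get(void)
{
    if (putenv(demo_var) != 0)
        return NULL;
    return getenv("UNIXQA_DEMO");
}
```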
19. Differentiate between the fork() and vfork() system calls.
During the fork() system call the kernel makes a copy of the parent process’s address space and
attaches it to the child process.
The vfork() system call doesn’t copy the parent’s address space; instead, parent and child share the
same address space, which makes vfork() faster than fork(). Another
difference is that after fork() the order of execution of parent and child cannot be
predicted, whereas vfork() suspends the parent process until the child calls either exec or an exit
function. Therefore the child always executes first. When the child calls exec, it gets its own new
address space.
20. What are the possible ways for a process to terminate?
There are five ways for a process to terminate:
1. Normal termination
- return from main (calls exit implicitly)
- calling exit
- calling _exit
2. Abnormal termination
- calling abort
- terminated by a signal
Regardless of how a process terminates, the same code in the kernel is eventually executed by
calling _exit(). This kernel code closes all the open file descriptors for the process, releases memory
that it was using etc.
21. What are exit status and termination status of a process?
A terminating process must be able to notify its parent how it terminated; this is achieved
by the exit and termination status. The exit status is an integer value passed as the argument to exit()
or _exit(), or returned from main(). The exit status is converted to a termination status when
_exit is finally called.
In the case of an abnormal termination the kernel generates a termination status to indicate the
reason for the abnormal termination.
In any case the termination status can be obtained by the parent by calling either wait() or waitpid()
to analyse the reason for child process’s termination.
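For an abnormal termination, the parent can inspect the termination status with the WIFSIGNALED and WTERMSIG macros. The sketch below assumes a POSIX system; the function name child_killed_by is hypothetical.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that sends itself the given signal.
 * Returns the number of the signal that terminated the child,
 * 0 if the child exited normally, or -1 on error. */
int child_killed_by(int sig)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        raise(sig);
        _exit(0);   /* reached only if the signal did not terminate us */
    }

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (WIFSIGNALED(status))
        return WTERMSIG(status);   /* abnormal termination: which signal */
    return 0;                      /* normal termination */
}
```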
22. What is the difference between exit() and _exit()?
exit() is a function defined by ANSI C. It terminates the calling process after calling the exit
handlers and flushing and closing all standard I/O streams. Since ANSI C doesn’t deal with file
descriptors, multiple processes and job control, the definition of exit is incomplete for a UNIX
system.
_exit() is a system call defined by the UNIX system. It is called by the exit function and handles the
UNIX-specific details of terminating the calling process, for example closing all the open file
descriptors for the process, releasing memory that it was using etc.
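The difference in exit handler processing can be observed with atexit(). In the sketch below a child registers a handler that writes one byte into a pipe; the parent then checks whether the byte arrived. It assumes a POSIX system, and the names report_exit and exit_handler_ran are made up for this example.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int report_fd = -1;   /* write end of the pipe, used by the handler */

/* Exit handler: tells the parent that it ran. */
static void report_exit(void)
{
    if (write(report_fd, "x", 1) != 1) {
        /* nothing useful to do on failure in this sketch */
    }
}

/* Child terminates with exit() if use_exit is nonzero, else with _exit().
 * Returns 1 if the exit handler ran, 0 if not, -1 on error. */
int exit_handler_ran(int use_exit)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);
        report_fd = fds[1];
        atexit(report_exit);
        if (use_exit)
            exit(0);    /* runs exit handlers, then calls _exit() */
        _exit(0);       /* goes straight to the kernel */
    }

    close(fds[1]);
    char c;
    ssize_t n = read(fds[0], &c, 1);   /* 1 byte, or 0 (EOF) if no handler ran */
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return n == 1;
}
```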
23. What is a zombie?
When a program forks and the child finishes before the parent, the kernel still keeps some minimal
information about the child in case the parent might need it. Minimally, this information consists of
the child’s process id, the termination status of the process and the amount of CPU time taken by the
process. The kernel discards all the memory used by the process and closes all open files. In UNIX
terminology such a process is called a zombie process. To be able to get this information, the
parent calls `wait()'. In the interval between the child terminating and the parent calling `wait()', the
child is said to be a `zombie'. (If you do `ps', the child will have a `Z' in its status field to indicate
this.)
24. Explain wait().
The prototype of the system call is:
pid_t wait(int *status);
wait() is called by a process in order to collect its zombie child processes. There are three
possibilities:
- If the calling process doesn’t have any child processes, the system call returns -1 to indicate an
error.
- If the calling process has any zombie child processes, the termination status of one of them is
written to the address passed to the system call as an argument, the zombie is removed, and the
system call returns the pid of the zombie process whose entry has been collected.
- If the calling process has child processes but none of them has terminated yet, the system call
blocks the calling process until at least one child process terminates. Upon termination of a child
process it returns as described in the previous case.
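The cases above can be tried out with a small sketch. The helper names child_exit_status() and wait_without_children() are hypothetical and chosen only for this illustration; the sketch assumes the caller has no other child processes.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits with `code`, then collect it with wait().
 * Returns the child's exit status, or -1 on error or abnormal termination. */
int child_exit_status(int code)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;                      /* fork failed */
    if (pid == 0)
        _exit(code);                    /* child: terminate immediately */

    /* Parent: blocks until the child terminates, then removes the zombie. */
    if (wait(&status) != pid)
        return -1;
    if (WIFEXITED(status))
        return WEXITSTATUS(status);     /* normal termination */
    return -1;                          /* e.g. killed by a signal */
}

/* With no child processes, wait() returns -1 (errno is ECHILD). */
int wait_without_children(void)
{
    return (int) wait(NULL);
}
```

WIFEXITED() and WEXITSTATUS() are the standard macros for turning the termination status back into the exit status that the child passed to _exit().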
25.What Happens when a program is executed on a UNIX system?
When a program is executed on a UNIX system, the system creates a special environment for that
program. This environment contains information needed by the system to run the program as if no
other program were running on the system. Each process has process context, which is everything
that is unique about the state of the program currently running. Every time a program is executed the
currently running shell does a fork, which performs a series of operations to create a process context
and then execute your program in that context. The steps include the following:
- Allocate a slot in the process table, a list of currently running programs kept by UNIX.
- Assign a unique process identifier (PID) to the process.
- Copy the context of the parent, the process that requested the spawning of the new process.
- Return the new PID to the parent process. This enables the parent process to examine or control the
process directly.
After the fork is complete, UNIX loads the program into the address space of the newly created
process by calling any of the exec family functions.
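A minimal sketch of the fork-then-exec sequence described above, roughly as a shell might do it. The function name run_program() and the convention of returning 127 when exec fails are assumptions made for this example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run the program named by argv[0] in a child process and wait for it.
 * Returns the program's exit status, 127 if exec failed, or -1 on error. */
int run_program(char *const argv[])
{
    int status;
    pid_t pid = fork();         /* new process context, a copy of the parent */

    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);  /* overlay the child with the new program */
        _exit(127);             /* reached only if exec failed */
    }

    /* Parent: fork() returned the child's PID, so it can wait on the child. */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The argv array must end with a NULL pointer, for example { "/bin/sh", "-c", "exit 3", NULL }.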
26. How is a program loaded into the memory?
In a UNIX system, a program is loaded into the system memory by calling any of the exec family
functions which call the execve() system call. This system call overlays the address space of the
calling process with the text, data and stack of the new program which is specified to the system call
as first argument. The reference count of current region table entries is decremented and memory
pages of current address space are released if the count drops to zero. New region table entries are
made and new pages are allocated to load the given program. The same process now executes this
program from first instruction.
27. What is the return value of exec functions?
The exec family functions don’t return on success, because the address space from which the
function call was made doesn’t exist anymore and the process has been loaded with a different
address space.
On error, the system call returns -1.
28. Predict the output of the following code.
#include <unistd.h>

int main(void) {
    execlp("ls", "ls", (char *) 0);   /* on success, never returns */
    fork();                           /* reached only if the exec failed */
    execlp("ls", "ls", (char *) 0);
    return 1;
}
The output of the above program is a listing of the current working directory, printed only once
(assuming the first exec succeeds). The fork() and the second call to exec don’t execute because the
current address space is overlaid with the ls program when the first exec is executed by the process.
Since the original text doesn’t exist anymore, any instructions after a successful call to exec are
never executed.
29. What is a process group and session?
A process group is a collection of one or more processes. Each process group has a unique process
group id. A process by default inherits the process group from its parent. Each process group can
have a process group leader. The leader is identified by having its process group id equal to its
process id. The function getpgrp() returns the process group id of the calling process. For example in
the command line ls|wc the two processes belong to the same process group.
The knowledge of group can be used to send signals to processes collectively and also waitpid() can
wait for child processes in a certain group.
A session is collection of one or more process groups. For example, all the processes in a login
belong to the same session by default. It may have a session leader which is identified by having its
session id equal to its process id. The function getsid() returns the process’s session id.
30. Does a command executed from command prompt inherit process group and session from
shell?
The commands executed from shell prompt get a new process group whereas the session is shared by
all the processes in the same login unless explicitly changed by calling appropriate system calls.
The shell calls setpgid() to set the process group of the command process to its own pid immediately
after fork(). Therefore in a login we can have any number of background process groups and a single
foreground process group.
31. What is a Daemon?
A daemon is a process that detaches itself from the terminal and runs, disconnected, in the
background, waiting for requests and responding to them. It can also be defined as the background
process that does not belong to a terminal session. Many system functions are commonly performed
by daemons. Some of the most common daemons are:
- init: Takes over the basic running of the system when the kernel has finished the boot process.
- inetd: Responsible for starting network services that do not have their own stand-alone daemons.
For example, inetd usually takes care of incoming rlogin, telnet, and ftp connections.
- cron: Responsible for running repetitive tasks on a regular schedule
32. How is a daemon process created?
There are some basic rules to coding a daemon:
1) the first thing to do is call fork() and have the parent exit. This serves two purposes. First,
if the daemon was started as a simple shell command, having the parent terminate makes the shell
think that the command is done. Second, the child inherits the process group id from parent but gets
a new process id, so we’ve guaranteed that the child is not a process group leader. This is a
prerequisite for the call to setsid that is done next.
2) call setsid to create a new session. As a result of setsid(), the process becomes a session
leader of a new session, becomes the process group leader of a new process group, and has no
controlling terminal.
3) change the current working directory to the root directory. The current working directory
inherited from the parent could be on a mounted filesystem. Since daemons normally exist until the
system is rebooted, if the daemon stays on a mounted filesystem, that filesystem cannot be
unmounted. Alternately, some daemons may change the current working directory to some specific
location, where they will do all their work. For example, line printer spooling daemons often change
to their spool directory.
4) set the file mode creation mask to 0. The file mode creation mask that’s inherited could be
set to deny certain permissions. For example, if the daemon specifically creates files with group read
and group write enabled, a file mode creation mask that turns off either of these permissions would
undo its efforts.
5) Unneeded file descriptors should be closed. This prevents the daemon from holding open
any descriptors that it may have inherited from its parent. Exactly which descriptors to close,
however, depends on the daemon.
6) It needs to handle error messages.
7) It needs to handle death of child signal, if any.
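Steps 1 to 5 above can be sketched roughly as follows. The function name become_daemon() is hypothetical, and redirecting only the three standard descriptors to /dev/null is an assumption of this sketch; a real daemon decides for itself which descriptors to close. Steps 6 and 7 (error reporting and SIGCHLD handling) are left out.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Turn the calling process into a daemon, following steps 1 to 5.
 * Returns 0 in the daemon and -1 on failure. The original process
 * exits inside this function and never returns from it on success. */
int become_daemon(void)
{
    int fd;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid > 0)
        _exit(0);               /* 1) parent exits; the shell sees "done" */

    if (setsid() < 0)           /* 2) new session, no controlling terminal */
        return -1;
    if (chdir("/") < 0)         /* 3) don't keep a mounted filesystem busy */
        return -1;
    umask(0);                   /* 4) clear the file mode creation mask */

    /* 5) Assumption: only stdin, stdout and stderr need detaching,
     * so point them at /dev/null rather than closing everything. */
    fd = open("/dev/null", O_RDWR);
    if (fd < 0)
        return -1;
    dup2(fd, STDIN_FILENO);
    dup2(fd, STDOUT_FILENO);
    dup2(fd, STDERR_FILENO);
    if (fd > STDERR_FILENO)
        close(fd);
    return 0;
}
```

After a successful return the process is both session leader and process group leader, so getsid(0) and getpgrp() both equal getpid().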
33. How can a process be killed?
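The kill command can be used to kill a process; it takes the process id of the process to be killed as
an argument and sends that process a signal (SIGTERM by default). A program can do the same with the
kill() system call.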
SCHEDULING IN UNIX
1. Which is the class of scheduling policy observed by UNIX system?
The UNIX system belongs to the general class of schedulers known as round robin with multilevel
feedback, meaning that the kernel allocates the CPU to a process for a time quantum, preempts a
process that exceeds its time quantum, and feeds it back into one of several priority queues.
2. Briefly describe the scheduling algorithm.
At the conclusion of a context switch, the kernel executes the algorithm to schedule a process. The
following steps are performed:
- Selects the highest priority process from those in the states “ready to run and loaded in memory”
and “preempted”.
- If several processes tie for the highest priority, the kernel picks the process which has been “ready
to run” for the longest time, following round robin scheduling policy.
- If there are no processes eligible for execution, the processor idles until the next interrupt, which
will happen in at most one clock tick; after handling that interrupt, the kernel again attempts to
schedule a process.
3.What are the scheduling parameters for a UNIX process?
The UNIX scheduler associates a priority value with each process in the system which is a function
of process’s CPU usage or the event on which the process is going to sleep. This priority is used by
the scheduler to schedule processes. A process with highest priority is chosen as described in
previous question.
4.What do you mean by priority wrt a UNIX process? What are its types?
Each process table entry contains a priority field which is used for process scheduling. The priority
is an integer value which is calculated by the kernel depending upon the state of a process. A higher
numerical value indicates a lower priority and vice versa.
The two types of priorities are:
user level priority: The user level priority is a function of its recent CPU usage. The processes which
have used CPU recently get a lower priority ( ie a higher numerical value )
kernel level priority: The kernel level priority is assigned when a process goes to sleep. It is a
hard-coded, fixed priority value that depends on the reason for which the process is going to sleep.
5.What do you mean by the base level priority?
The base level priority is the threshold value which divides the user and kernel level priorities. The
user level priorities are below this base priority and the kernel level priorities are above the base
priority. The base level priority is used in the calculation of user level priority of processes returning
from kernel to user mode.
6.What are the states when kernel calculates the priority of a process?
The kernel calculates the priority of a process in following process states:
- It assigns priority to a process about to go to sleep, which is a fixed value depending upon the
reason for sleep. This is known as kernel priority.
- The kernel readjusts the priority of a process that returns from kernel mode to user mode,
depending upon the percentage of CPU usage.
- The clock handler adjusts the priorities of all processes in user mode at 1 second intervals
(System V UNIX) and causes the kernel to call the scheduler, to prevent a process from monopolizing
use of the CPU.
7. What are types of kernel level priorities?
There are two types of kernel level priority:
- Interruptible: these are low level kernel priorities. A process sleeping on interruptible priority can
be woken up upon receipt of a signal.
- Non-interruptible: these are high level kernel priorities. A process sleeping on non-interruptible
priority doesn’t wake up on receipt of signal.
8. Can a process control its priority?
Processes can exercise crude control over their scheduling priority by using the nice() system call.
It takes an integer value as its argument, which is added to the priority of the process.
For example: nice(value)
In this case the priority is calculated as follows:
Priority = (“recent CPU usage”/constant) + (base priority) + (nice value)
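As a worked example of the formula above, here is a small sketch. The divisor 2 and the base priority 60 are assumed values picked for illustration (they match the figures commonly used when describing System V), not constants taken from any particular kernel.

```c
/* Illustrative constants; real kernels choose their own values. */
#define CPU_DIVISOR   2     /* the "constant" in the formula (assumed) */
#define BASE_PRIORITY 60    /* base level user priority (assumed) */

/* User-level priority from the formula above.
 * A larger result means a LOWER scheduling priority. */
int user_priority(int recent_cpu_usage, int nice_value)
{
    return recent_cpu_usage / CPU_DIVISOR + BASE_PRIORITY + nice_value;
}
```

With these values, a process with a recent CPU usage of 30 and a nice value of 5 gets priority 30/2 + 60 + 5 = 80, which is worse than the 60 of a process that has not used the CPU recently.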
MEMORY MANAGEMENT
1. What is the difference between Swapping and Paging?
Swapping:
In swapping policies the whole process is moved from main memory to the swap device and vice versa.
The process size must be less than or equal to the available main memory. Swapping is easier to
implement but is less flexible in managing main memory space compared to demand paging.
Paging:
In demand paging, memory pages are moved from main memory to the swap device and vice versa as
required. Each page is provided to the process on demand. The process size should be less than or
equal to the swap device and main memory available. Demand paging provides greater flexibility in
mapping the virtual address space into the physical memory of the machine, and it allows a larger
number of processes to fit in main memory simultaneously.
2. What is a Map?
Map is an in-core array, which contains the addresses of the free space in the swap device that are
allocatable resources, and the number of the resource units available there.
Initially the Map contains one entry – address (block offset from the starting of the swap area) and
the total number of resources. Kernel treats each unit of Map as a group of disk blocks. On the
allocation and freeing of the resources Kernel updates the Map for accurate information.
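A minimal sketch of such a map with first-fit allocation. The structure layout, the names swap_map, map_init() and map_alloc(), and the fixed capacity are all assumptions made for this illustration; freeing (which would merge neighbouring entries) is omitted.

```c
#define MAP_ENTRIES 16          /* capacity chosen for this sketch */

struct map_entry {
    unsigned addr;              /* first free unit (block offset in swap area) */
    unsigned units;             /* number of free units starting at addr */
};

/* The list of free extents ends at the first entry with units == 0. */
struct map_entry swap_map[MAP_ENTRIES];

/* Initially the map holds one entry covering the whole swap area. */
void map_init(unsigned start, unsigned total_units)
{
    swap_map[0].addr = start;
    swap_map[0].units = total_units;
    swap_map[1].units = 0;
}

/* First-fit allocation: returns the address of `units` contiguous units,
 * or 0 if no entry is large enough (so address 0 must not be in the map). */
unsigned map_alloc(unsigned units)
{
    struct map_entry *e;
    unsigned addr;

    if (units == 0)
        return 0;
    for (e = swap_map; e->units != 0; e++) {
        if (e->units < units)
            continue;
        addr = e->addr;
        e->addr += units;       /* shrink the entry from the front */
        e->units -= units;
        if (e->units == 0) {
            /* Entry used up: shift later entries down, terminator included. */
            do {
                e[0] = e[1];
                e++;
            } while (e[-1].units != 0);
        }
        return addr;
    }
    return 0;
}
```

For example, after map_init(1, 10000), a call to map_alloc(100) returns address 1 and leaves a single entry with address 101 and 9900 units.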
3. What is a Region?
A Region is a contiguous area of a process’s address space (such as text, data and stack). The kernel
maintains regions in a ‘Region Table’ that is local to the process. Regions are sharable among
processes.
4. How does the UNIX kernel swap a process out of main memory?
When the Kernel swaps a process out of primary memory, it performs the following steps:
- The Kernel decrements the reference count of each region of the process. If the reference count
becomes zero, it swaps the region out of main memory.
- It then allocates space for the swapping process on the swap device, locking the region table
entries while the current swapping operation is going on.
- Finally it saves the swap address of the region in the region table and unlocks it.
5.What are the entities that are swapped out of the main memory while swapping the process
out of the main memory?
All memory space occupied by the process, process’s u-area, and Kernel stack are swapped out,
theoretically. Practically, if the process’s u-area contains the Address Translation Tables for the
process then Kernel implementations do not swap the u-area.
6.What is Fork swap?
fork() is a system call to create a process. When the parent process calls fork(), the child process
is created, and if memory is short the child process is placed in the ready-to-run state on the swap
device, while the parent process returns to user mode without being swapped. When memory becomes
available, the child process is swapped into main memory.
7. What is Expansion swap?
When a process requires more memory than it is currently allocated, the Kernel performs an expansion
swap. To do this the Kernel reserves enough space on the swap device. Then the address translation
mapping is adjusted for the new virtual address space, but the physical memory is not allocated.
Finally the Kernel swaps the process out to the assigned space on the swap device. Later, when the
Kernel swaps the process back into main memory, it assigns memory according to the new address
translation mapping, thereby allocating space for the previously unallocated virtual pages.
8. How does the Swapper work?
The swapper is the only process that swaps processes; it is the first process created during system
boot. The swapper operates only in kernel mode and does not use system calls; instead it uses
internal kernel functions for swapping. It is the archetype of all kernel processes.
9. What are the processes that are not bothered by the swapper? Give Reason.
The following processes are not considered by swapper:
- Zombie processes: they do not take up any physical memory.
- Processes locked in memory that are updating the region of the process.
The Kernel swaps out sleeping processes before ‘ready-to-run’ processes, because ready-to-run
processes have a higher probability of being scheduled than sleeping processes.
10. What are the criteria for choosing a process for swapping into memory from the swap
device?
The resident time of the processes in the swap device, the priority of the processes and the amount of
time the processes had been swapped out.
11. What are the criteria for choosing a process for swapping out of the memory to the swap
device?
The process’s memory resident time, priority of the process and the nice value.
12. What do you mean by nice value?
The nice value is a value that adjusts (increments or decrements) the priority of the process. The
equation for using the nice value is:
Priority = (“recent CPU usage”/constant) + (base priority) + (nice value)
Only the root user can supply a negative nice value.
13. What are the requirements for a machine to support Demand Paging?
In order to support demand paging the machine’s memory architecture must be based on pages and
the machine must support ‘restartable’ instructions.
14. What is ‘the principle of locality’?
The principle of locality states that a process tends to reference only a small, localized subset of
its total data space at any given time; for example, the process frequently calls the same
subroutines or executes the instructions of a loop.
15. What is the working set of a process?
The working set is the set of pages that the process has referenced in its last ‘n’ memory
references, where ‘n’ is called the window of the working set of the process. A page outside of the
working set may be considered invalid and is eligible for swapping.
16. What is the window of the working set of a process?
The window of the working set is the number of most recent memory references used to determine the
working set, so it also bounds the size of the working set. For example, if the window is 5, the
working set of the process can have up to 5 pages. Any page which hasn’t been referenced in the last
5 references is outside the working set and may get swapped.
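To make the definition concrete, the sketch below computes the working set from a recorded sequence of page references. The function name working_set() and the array representation of the reference string are assumptions made for this example; a real kernel approximates this with reference and age bits rather than keeping a full history.

```c
#include <stddef.h>

/* Compute the working set after the first `t` references of `refs`,
 * using a window of `window` references. Distinct page numbers are
 * written to `out` (which needs room for `window` entries), ordered
 * from most to least recently used. Returns the working set size. */
size_t working_set(const int *refs, size_t t, size_t window, int *out)
{
    size_t start = t > window ? t - window : 0;
    size_t count = 0;
    size_t i, j;

    /* Walk the window backwards so the most recent reference comes first. */
    for (i = t; i > start; i--) {
        int page = refs[i - 1];

        for (j = 0; j < count; j++)
            if (out[j] == page)
                break;
        if (j == count)             /* not seen yet in this window */
            out[count++] = page;
    }
    return count;
}
```

For the reference string 24, 15, 18, 23, 24, 17, 18, 24, 18, 17 and a window of 3, the working set after all ten references is {17, 18, 24}; after the first nine it is {18, 24}, because page 18 appears twice inside the window.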
17. What are data structures that are used for Demand Paging?
Kernel uses following data structures for Demand paging:
Page table entries
Disk block descriptors
Page frame data table (pfdata)
Swap-use table.
18. What are the bits present in page table entries for demand paging?
A page table entry has the following bits:
Valid: indicates whether the page is valid. When set, the page is valid and is part of the working
set of the process. When clear, a reference to the page is invalid.
Reference: set when the page is referenced; cleared by the page stealer.
Modify: set when the page contents are modified.
Copy on write: when set, a copy of the page is to be made on write. It is used in fork().
Age: indicates the age of the page; it is used by the page stealer.
19. How does Kernel handle fork() system call in traditional Unix and in the System V Unix,
while swapping?
Kernel in traditional Unix, makes the duplicate copy of the parent’s address space and attaches it to
the child’s process, while swapping. Kernel in System V Unix, manipulates the region tables, page
table, and pfdata table entries, by incrementing the reference count of the region table of shared
regions.
20. What is Page-Stealer process?
In demand paging it is the job of the page stealer process to make room for incoming pages by
swapping out memory pages that are not part of the working set of a process. The page stealer is
created by the Kernel at system initialization, and the Kernel invokes it throughout the lifetime of
the system. The Kernel locks a region when a process faults on a page in the region, so that the page
stealer cannot steal the page which is being faulted in.
21. Name two paging states for a page in memory?
The two paging states are:
The page is aging and is not yet eligible for swapping,
The page is eligible for swapping but not yet eligible for reassignment to other virtual address space.
22. What are the phases of swapping a page from the memory in demand paging?
Page stealer finds the page eligible for swapping and places the page number in the list of pages to
be swapped. Kernel copies the page to a swap device when necessary and clears the valid bit in the
page table entry, decrements the pfdata reference count, and places the pfdata table entry at the end
of the free list if its reference count is 0.
23. What is a page fault? What are the types of page fault?
Page fault is an interrupt which occurs in case of a page reference which is not in memory.
There are two types of page fault:
1. Validity fault 2. Protection fault
24. In what way the Fault Handlers and the Interrupt handlers are different?
Fault handlers are a kind of interrupt handler, with one difference: ordinary interrupt handlers
cannot sleep, whereas fault handlers can. Fault handlers sleep in the context of the process that
caused the memory fault. The fault refers to the running process, so no arbitrary processes are put
to sleep.
25. What is validity fault?
If a process references a page in the main memory whose valid bit is not set, it results in validity
fault. The valid bit is not set for those pages:
1. that are outside the virtual address space of a process,
2. that are the part of the virtual address space of the process but no physical address is assigned to it.
26.What does the swapping system do if it identifies the illegal page for swapping?
If the disk block descriptor does not contain any record of the faulted page, the attempted memory
reference is invalid and the kernel sends a “Segmentation fault” signal (SIGSEGV) to the offending
process. This happens whenever the swapping system identifies an invalid memory reference.
27. What are the possible states for a page faulted by a process?
The faulted page can be in any of the following states:
1. On a swap device and not in memory 2. On the free page list in the main memory
3. In an executable file 4. Marked “demand zero”
5. Marked “demand fill”
28. How does the validity fault handler conclude?
After loading the faulted page into memory, it sets the valid bit of the page and clears the modify
bit.
29. What is the mode of execution of the fault handlers?
The fault handlers are executed in the Kernel Mode as all interrupts are handled in kernel mode by
the unix OS.
30. What do you mean by the protection fault?
Protection fault occurs when a process accesses pages which do not have the access permission. A
process also incurs the protection fault when it attempts to write a page whose copy on write bit was
set during the fork() system call.
31. How does Unix kernel handle copy on write bit of a page when set?
When the copy on write bit of a page is set and the page is shared by more than one process (as
indicated by the pfdata table entry of the page), the Kernel allocates a new page and copies the
contents to it; the other processes retain their references to the old page. After copying, the
Kernel updates the page table entry with the new page number and then decrements the reference count
of the old pfdata table entry. If the copy on write bit is set and no other process is sharing the
page, the Kernel allows the physical page to be reused by the process. In both cases it clears the
copy on write bit.
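The two cases can be modelled in user space with a toy simulation. Everything here is hypothetical: struct frame stands in for a pfdata entry, struct pte for a page table entry, and a single int stands in for the page contents. It is a sketch of the bookkeeping only, not of real kernel code.

```c
#define NFRAMES 8               /* tiny "physical memory" for this sketch */

struct frame { int refcount; int data; };      /* stand-in for a pfdata entry */
struct pte   { int frame; int copy_on_write; };/* stand-in for a page table entry */

struct frame frames[NFRAMES];

/* Find an unused frame and claim it; returns -1 if none is free. */
static int alloc_frame(void)
{
    int i;

    for (i = 0; i < NFRAMES; i++) {
        if (frames[i].refcount == 0) {
            frames[i].refcount = 1;
            return i;
        }
    }
    return -1;
}

/* Handle a write to a page whose copy on write bit is set.
 * Returns 0 on success, -1 if no frame is available. */
int cow_write_fault(struct pte *p)
{
    if (frames[p->frame].refcount > 1) {
        /* Shared page: give the writer a private copy. */
        int n = alloc_frame();

        if (n < 0)
            return -1;
        frames[n].data = frames[p->frame].data;  /* copy the contents */
        frames[p->frame].refcount--;             /* others keep the old page */
        p->frame = n;                            /* point at the new page */
    }
    /* Shared or not, the page is now private to this process. */
    p->copy_on_write = 0;
    return 0;
}
```

If a parent and child share frame 0 after fork(), the first of them to write gets a fresh frame, and the second then finds the reference count at 1 and simply reuses frame 0.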
32 .For which kind of fault the page is checked first?
The page is checked for validity first. If a process incurs a protection fault on a page that turns
out to be invalid (valid bit is clear), the protection fault handler returns immediately and the
process incurs a validity fault instead. The Kernel handles the validity fault, and the process then
incurs the protection fault again if one is still pending.
33. How does the protection fault handler conclude?
After finishing the execution of the fault handler, it sets the modify bit and protection bits and clears
the copy on write bit.
Sockets
1.What is a socket ?
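A socket is an endpoint for inter-process communication, either between processes on the same machine
or across a network.
2. What are the different types of internet sockets?
There are two main types: one is "Stream Sockets" (SOCK_STREAM); the other is "Datagram Sockets"
(SOCK_DGRAM).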
3.How can I tell when a socket is closed on the other end?
If the peer calls close() or exits, then our calls to read() should return 0.
4. What's with the second parameter in bind()?
It is a pointer to a "struct sockaddr". The sockaddr struct, though, is just a placeholder for the
structure it really wants. You have to pass different structures depending on what kind of socket you
have. For an AF_INET socket, you need the sockaddr_in structure. It has three fields of interest:
sin_family
Set this to AF_INET.
sin_port
The 16-bit port number, in network byte order.
sin_addr
The host's IP address. This is a struct in_addr, which contains only one field, s_addr, which is a
32-bit unsigned integer (a u_long on older systems) in network byte order.
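A short sketch of filling in a sockaddr_in and handing it to bind(). The helper names make_inet_addr() and bind_to_port() are hypothetical, and binding to INADDR_ANY (any local interface) is an assumption made for this example.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Fill in an IPv4 address for the given port on any local interface. */
void make_inet_addr(struct sockaddr_in *addr, unsigned short port)
{
    memset(addr, 0, sizeof *addr);
    addr->sin_family = AF_INET;
    addr->sin_port = htons(port);               /* network byte order */
    addr->sin_addr.s_addr = htonl(INADDR_ANY);  /* any local address */
}

/* Bind a socket to the port; note the cast to the generic sockaddr. */
int bind_to_port(int sock, unsigned short port)
{
    struct sockaddr_in addr;

    make_inet_addr(&addr, port);
    return bind(sock, (struct sockaddr *) &addr, sizeof addr);
}
```

The cast in bind_to_port() is the "place holder" at work: bind() is declared to take a struct sockaddr pointer, and the third argument tells it how large the real structure is.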
5. How do I get the port number for a given service?
Use the getservbyname() routine. This will return a pointer to a servent structure. You are interested
in the s_port field, which contains the port number, with correct byte ordering (so you don't need to
call htons() on it). Here is a sample routine:
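A minimal sketch of such a routine. The name port_for_service() is hypothetical, and the fallback that accepts a plain decimal port number when the service name is not found is an assumption added for convenience, not something the text above requires.

```c
#include <arpa/inet.h>
#include <netdb.h>
#include <stdlib.h>

/* Look up the port for `service` (e.g. "ftp") and `proto` ("tcp" or "udp").
 * Falls back to parsing `service` as a decimal port number.
 * Returns the port in network byte order, or -1 if it cannot be resolved. */
int port_for_service(const char *service, const char *proto)
{
    struct servent *sp;
    char *end;
    long port;

    sp = getservbyname(service, proto);
    if (sp != NULL)
        return sp->s_port;          /* already in network byte order */

    port = strtol(service, &end, 10);
    if (end == service || *end != '\0' || port <= 0 || port > 65535)
        return -1;
    return htons((unsigned short) port);
}
```

The result can be stored directly in sin_port. Which service names resolve depends on the system's services database (typically /etc/services).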
6. When does bind fail?
bind may fail if the port number is already in use (errno is set to EADDRINUSE).
7. When should I use shutdown()?
shutdown() is useful for delineating when you are done providing a request to a server using TCP. A
typical use is to send a request to a server followed by a shutdown(). The server will read your
request followed by an EOF (a read of 0 on most unix implementations). This tells the server that it
has your full request. You then block reading on the socket. The server will process your request
and send the necessary data back to you, followed by a close. When you have finished reading all of
the response to your request you will read an EOF, signifying that you have the whole response.
It should be noted that T/TCP (TCP for Transactions -- see R. Stevens's home page) provides for a
better method of tcp transaction management.
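The request side of that exchange might look like the sketch below. The function name send_request() is hypothetical, and the sketch assumes a connected stream socket and does not retry writes interrupted by a signal.

```c
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Send a complete request on a connected stream socket, then half-close
 * the connection so the peer reads EOF after the request.
 * Returns 0 on success, -1 on error. */
int send_request(int sock, const char *request)
{
    size_t len = strlen(request);

    while (len > 0) {
        ssize_t n = write(sock, request, len);

        if (n < 0)
            return -1;
        request += n;               /* write() may send only part of it */
        len -= (size_t) n;
    }
    /* No more data will be sent; the socket stays open for reading. */
    return shutdown(sock, SHUT_WR);
}
```

After send_request() returns, the caller keeps calling read() on the same socket until it returns 0, which marks the end of the server's response.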
8. What are the pros/cons of select(), non-blocking I/O?
Using non-blocking I/O means that you have to poll sockets to see if there is data to be read from
them. Polling should usually be avoided since it uses more CPU time than other techniques.
Using select() is great if your application has to accept data from more than one socket at a time
since it will block until any one of a number of sockets is ready with data. One other advantage to
select() is that you can set a time-out value after which control will be returned to you whether any
of the sockets have data for you or not.
9. How come select says there is data, but read returns zero?
The data that causes select to return is the EOF because the other side has closed the connection.
This causes read to return zero.
10. What's the difference between select() and poll()?
The basic difference is that select()'s fd_set is a bit mask and therefore has some fixed size. It would
be possible for the kernel to not limit this size when the kernel is compiled, allowing the application
to define FD_SETSIZE to whatever it wants (as the comments in the system header imply today)
With poll(), however, the user must allocate an array of pollfd structures, and pass the number of
entries in this array, so there's no fundamental limit. As Casper notes, fewer systems have poll() than
select, so the latter is more portable.
11. What is the difference between read() and recv()?
read() is equivalent to recv() with a flags parameter of 0. Other values for the flags parameter
change the behaviour of recv(). Similarly, write() is equivalent to send() with flags == 0.
Portability note: non-unix systems may not allow read()/write() on sockets, but recv()/send() are
usually ok.
12. When will my application receive SIGPIPE?
With TCP you get SIGPIPE if your end of the connection has received an RST from the other end.
What this also means is that if you were using select instead of write, the select would have indicated
the socket as being readable, since the RST is there for you to read (read will return an error with
errno set to ECONNRESET). Basically an RST is TCP’s response to some packet that it doesn’t
expect and has no other way of dealing with. A common case is when the peer closes the connection
(sending you a FIN) but you ignore it because you’re writing and not reading. (You should be using
select.) So you write to a connection that has been closed by the other end, and the other end’s TCP
responds with an RST.
13. What are socket exceptions? What is out-of-band data?
Unlike exceptions in C++, socket exceptions do not indicate that an error has occurred. Socket
exceptions usually refer to the notification that out-of-band data has arrived. Out-of-band data
(called "urgent data" in TCP) looks to the application like a separate stream of data from the main
data stream. This can be useful for separating two different kinds of data. Note that just because it
is called "urgent data" does not mean that it will be delivered any faster, or with higher priority,
than data in the in-band data stream. Also beware that, unlike the main data stream, the out-of-band
data may be lost if your application can't keep up with it.
14.How to manage multiple connections ?
Use select() or poll().
Note: select() was introduced in BSD, whereas poll() is an artifact of SysV STREAMS. As such,
there are portability issues; pure BSD systems may still lack poll(), whereas some older SVR3
systems may not have select(). SVR4 added select(), and the Posix.1g standard defines both.
select() and poll() essentially do the same thing, just differently. Both of them examine a set of file
descriptors to see if specific events are pending on any, and then optionally wait for a specified time
for an event to happen.
[Important note: neither select() nor poll() do anything useful when applied to plain files; they are
useful for sockets, pipes, ptys, ttys & possibly other character devices, but this is system-dependent.]
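For comparison with select(), here is a minimal poll() sketch that watches two descriptors. The function name wait_readable() and its return conventions are assumptions made for this example.

```c
#include <poll.h>

/* Wait up to `timeout_ms` milliseconds for either descriptor to become
 * readable. Returns the ready descriptor (fd_a is preferred if both are
 * ready), -1 on timeout, or -2 on error. */
int wait_readable(int fd_a, int fd_b, int timeout_ms)
{
    struct pollfd fds[2];
    int rc;

    fds[0].fd = fd_a;  fds[0].events = POLLIN;  fds[0].revents = 0;
    fds[1].fd = fd_b;  fds[1].events = POLLIN;  fds[1].revents = 0;

    rc = poll(fds, 2, timeout_ms);
    if (rc < 0)
        return -2;
    if (rc == 0)
        return -1;                          /* timed out */
    if (fds[0].revents & (POLLIN | POLLHUP))
        return fd_a;                        /* data or EOF on fd_a */
    if (fds[1].revents & (POLLIN | POLLHUP))
        return fd_b;                        /* data or EOF on fd_b */
    return -2;                              /* POLLERR or POLLNVAL */
}
```

A timeout of 0 polls without blocking, and a negative timeout waits indefinitely, which mirrors the zero and NULL timeval cases of select().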
15. How do I use select?
The interface to select() is primarily based on the concept of an fd_set, which is a set of FDs (usually
implemented as a bit-vector). In times past, it was common to assume that FDs were smaller than 32,
and just use an int to store the set, but these days, one usually has more FDs available, so it is
important to use the standard macros for manipulating fd_sets:
fd_set set;
FD_ZERO(&set); /* empties the set */
FD_SET(fd,&set); /* adds FD to the set */
FD_CLR(fd,&set); /* removes FD from the set */
FD_ISSET(fd,&set) /* true if FD is in the set */
In most cases, it is the system's responsibility to ensure that fdsets can handle the whole range of
file descriptors, but in some cases you may have to predefine the FD_SETSIZE macro. This is
system-dependent; check your select() manpage. Also, some systems have problems handling more than
1024 file descriptors in select().
The basic interface to select is simple:
int select(int nfds, fd_set *readset,
fd_set *writeset,
fd_set *exceptset, struct timeval *timeout);
where
nfds
the number of FDs to examine; this must be greater than the largest FD in any of the fdsets, not the
actual number of FDs specified
readset
the set of FDs to examine for readability
writeset
the set of FDs to examine for writability
exceptset
the set of FDs to examine for exceptional status (note: errors are not exceptional statuses)
timeout
NULL for infinite timeout, or points to a timeval specifying the maximum wait time (if tv_sec and
tv_usec both equal zero, then the status of the FDs is polled, but the call never blocks)
The call returns the number of `ready' FDs found, and the three fdsets are modified in-place, with
only the ready FDs left in the sets. Use the FD_ISSET macro to test the returned sets.
Here's a simple example of testing a single FD for readability:
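#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Returns 1 if fd is readable right now, 0 if not, -1 on error. */
int isready(int fd)
{
    int rc;
    fd_set fds;
    struct timeval tv;

    FD_ZERO(&fds);
    FD_SET(fd, &fds);
    tv.tv_sec = tv.tv_usec = 0;   /* zero timeout: poll, never block */

    rc = select(fd + 1, &fds, NULL, NULL, &tv);
    if (rc < 0)
        return -1;
    return FD_ISSET(fd, &fds) ? 1 : 0;
}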
Note that we can pass NULL for fdsets that we aren't interested in testing.
16. How do I know that the other end of a connection is shut down?
If you try to read from a pipe, socket, FIFO etc. when the writing end of the connection has been
closed, you get an end-of-file indication (read() returns 0 bytes read). If you try and write to a pipe,
socket etc. when the reading end has closed, then a SIGPIPE signal will be delivered to the process,
killing it unless the signal is caught. (If you ignore or block the signal, the write() call fails with
EPIPE.)
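The EPIPE case can be demonstrated with a pipe. The function name write_to_closed_pipe() is hypothetical; the sketch ignores SIGPIPE only for the duration of the call and then restores the previous disposition.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Write to a pipe whose reading end is already closed, with SIGPIPE
 * ignored. Returns the errno from the failed write (expected: EPIPE),
 * 0 if the write unexpectedly succeeded, or -1 if setup failed. */
int write_to_closed_pipe(void)
{
    int fds[2];
    int err = 0;
    void (*old)(int);

    if (pipe(fds) < 0)
        return -1;
    old = signal(SIGPIPE, SIG_IGN);   /* otherwise the signal kills us */
    close(fds[0]);                    /* no reader is left */

    if (write(fds[1], "x", 1) < 0)
        err = errno;                  /* save before anything changes it */

    close(fds[1]);
    if (old != SIG_ERR)
        signal(SIGPIPE, old);         /* restore the previous disposition */
    return err;
}
```

Servers that write to many clients commonly ignore SIGPIPE in this way and check each write() for EPIPE instead.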
17. How to read directories.
The function opendir() opens a specified directory; readdir() reads directory entries from it in a
standardised format; closedir() does the obvious. Also provided are rewinddir(), telldir() and
seekdir() which should also be obvious.
If you are looking to expand a wildcard filename, then most systems have the glob() function; also
check out fnmatch() to match filenames against a wildcard, or ftw() to traverse entire directory trees.
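A minimal sketch of the opendir()/readdir()/closedir() loop. The function name count_entries() is hypothetical; it simply counts the names it sees.

```c
#include <dirent.h>
#include <string.h>

/* Count the entries in a directory, skipping "." and "..".
 * Returns the count, or -1 if the directory cannot be opened. */
long count_entries(const char *path)
{
    DIR *dir = opendir(path);
    struct dirent *entry;
    long count = 0;

    if (dir == NULL)
        return -1;
    while ((entry = readdir(dir)) != NULL) {
        if (strcmp(entry->d_name, ".") == 0 ||
            strcmp(entry->d_name, "..") == 0)
            continue;
        count++;                /* entry->d_name holds the file name */
    }
    closedir(dir);
    return count;
}
```

The d_name field is the only member of struct dirent that is portable everywhere; use stat() on the full path if you need more information about an entry.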
18. How to find the file size.
Use stat(), or fstat() if you have the file open.
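Both variants can be sketched in a few lines. The function names file_size() and file_size_fd() are hypothetical wrappers introduced for this example.

```c
#include <sys/stat.h>
#include <sys/types.h>

/* Size in bytes of the file at `path`, or -1 on error. */
off_t file_size(const char *path)
{
    struct stat st;

    return stat(path, &st) == 0 ? st.st_size : (off_t) -1;
}

/* Same, for a file that is already open. */
off_t file_size_fd(int fd)
{
    struct stat st;

    return fstat(fd, &st) == 0 ? st.st_size : (off_t) -1;
}
```

The st_size field is meaningful for regular files; for pipes, sockets and devices its value is system-dependent.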
