Tuesday, February 8, 2011

Linux Process - Tips

What is a Process?
When a program is read from disk into memory and its execution begins, the currently executing image is called a process.

A process ID (PID) is a number between 1 and 32767 by default; the upper limit is configurable. On 64-bit machines the limit can be raised as high as 2^22 (4,194,304); on 32-bit machines 32768 is the maximum. To raise the limit, run the following as root:
# echo 4194303 > /proc/sys/kernel/pid_max
Alternatively, add a line to /etc/sysctl.conf instead of running the command above. This is the more natural solution for a permanent change, but you will need to reboot the system or use the sysctl program for it to take effect. Append the following to /etc/sysctl.conf:
#Allow for more PIDs (to reduce rollover problems); may break some programs
kernel.pid_max = 4194303
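
The value currently in effect can be read back from the same /proc file, which is a quick way to confirm that a change took hold. A minimal sketch (reading needs no root access; the variable name pid_max is just for illustration):

```shell
# Read the PID limit that is currently in effect.
pid_max=$(cat /proc/sys/kernel/pid_max)
echo "kernel.pid_max is currently $pid_max"

# After editing /etc/sysctl.conf, root can apply it without a reboot:
#   sysctl -p
```
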
Each process in Linux has a parent. When the system starts, a single process is created, called init, whose PID is 1. The init process then brings the system up, creating processes as needed. These newly created processes may start other processes, but the ultimate ancestor is always init.
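
The parent chain can be followed by hand through the PPid field in /proc/<PID>/status. The loop below is a small sketch that starts at the current shell and walks upward; it assumes /proc is mounted and awk is installed. Inside a container the walk may stop early at a parent PID of 0 instead of reaching PID 1.

```shell
# Walk from this shell up through its ancestors.
pid=$$
chain=$pid
while [ "$pid" -gt 1 ]; do
    # PPid holds the parent's PID; 0 means no visible parent.
    ppid=$(awk '/^PPid:/ { print $2 }' "/proc/$pid/status" 2>/dev/null)
    [ -n "$ppid" ] && [ "$ppid" -gt 0 ] || break
    chain="$chain <- $ppid"
    pid=$ppid
done
echo "$chain"
```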

PS command

The "ps" command with no options shows:
  • Process ID (PID)
  • The terminal (TTY)
  • The amount of CPU time that the process has accumulated (TIME)
  • The command used (CMD)

"ps -f" gives more information (the full option). It displays the following fields in addition to those above:
  • The parent PID (PPID)
  • The process start time (STIME)
  • The user ID (UID)
Processes other than your own can also be checked with "ps" by using the -e option, which displays every process on the system. Other options narrow the listing:
  • -u limits the display to the given users
  • -g limits the display to the given groups
  • -p limits the display to the given PIDs
  • -t limits the display to the given terminals
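
A few of these options in use. This is only a sketch and assumes the procps version of ps found on most Linux distributions; the variable names are illustrative.

```shell
me=$(id -un)                    # current user name
ps -f -p "$$"                   # full listing for this shell only
total=$(ps -e | wc -l)          # lines in the system-wide listing (header included)
mine=$(ps -u "$me" | wc -l)     # the same, limited to the current user
echo "$mine of $total listing lines belong to $me"
```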

Managing Processes in Linux
There are usually two ways to manage processes:
1. Using the signaling system (sending signals to processes with the kill, skill and pkill commands)
     Signals are software interrupts used to communicate status and information between processes. The TERM (terminate) signal (15) is the default signal sent by the kill command; it can be caught or ignored. The KILL signal (9) cannot be caught or ignored, and causes immediate termination of the process. Ctrl-C sends the INT (interrupt) signal (2) to the foreground process, which is why it normally terminates the process. Ctrl-\ sends the QUIT signal (3) to the running process.
The HUP signal (1) was originally generated by a modem hangup. It often tells a daemon to reconfigure (restart) itself.
"kill -1" sends HUP to the given PID; sent to your login shell, it kills the shell and logs you out.
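
The difference between TERM and KILL can be seen with two throwaway child shells. This is a rough sketch: the first child catches TERM and exits on its own terms, the second ignores TERM and INT but still cannot survive KILL. The one-second sleeps are only there to give each child time to set up its trap.

```shell
# 1. A child that catches TERM and exits cleanly with status 0.
sh -c 'trap "exit 0" TERM; while :; do sleep 1; done' &
polite=$!
sleep 1
kill -TERM "$polite"            # same as plain "kill $polite"
term_status=0
wait "$polite" || term_status=$?

# 2. A child that ignores TERM and INT; KILL still ends it.
sh -c 'trap "" TERM INT; while :; do sleep 1; done' &
stubborn=$!
sleep 1
kill -KILL "$stubborn"          # same as "kill -9 $stubborn"
kill_status=0
wait "$stubborn" || kill_status=$?

echo "TERM was caught, exit status: $term_status"
echo "KILL could not be caught, exit status: $kill_status (128 + 9)"
```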

2. Using the /proc interface
     Much of a process's information is available to a user through a special interface known as the /proc file system. The /proc file system does not exist on disk; it is an interface to the running system and the running kernel, and it presents kernel and process information in an easy-to-access manner. Every process running on a Linux system has a corresponding directory in /proc, named with the PID of the process.
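
A quick way to get a feel for this is to look at the entry for the current shell. The sketch below assumes /proc is mounted and awk is available; it lists the directory, reads two fields from the human-readable status file, and follows the cwd link.

```shell
pid=$$                          # PID of the current shell
ls "/proc/$pid"                 # entries such as cmdline, cwd, environ, fd, status

# Pull two fields out of the status file.
name=$(awk '/^Name:/ { print $2 }' "/proc/$pid/status")
state=$(awk '/^State:/ { print $2 }' "/proc/$pid/status")
echo "PID $pid is $name, state $state"

# cwd is a symbolic link to the working directory of the process.
cwd=$(readlink "/proc/$pid/cwd")
echo "working directory: $cwd"
```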

Managing processes using /proc:
There is a wealth of information about a running process in its /proc entry. Most of this information is meant for use by programs like ps, so we need to do some pre-processing before we can view it. The "tr" command can do this: by translating ASCII NUL characters to LF (line feed) characters, we get a readable display.
# tr '\0' '\n' < /proc/1223/environ
This example shows the environment of process 1223.

"environ" is the file that contains the environment of the process.
"cwd" is a link to the current working directory of the process.
"fd" is a directory containing a link to every file the process has open. A file descriptor (fd) is a number used by a program to identify an open file. Each process in the /proc file system has an "fd" directory. This is vital information for a system administrator trying to manage a large and complex system. For instance, a file system cannot be unmounted while any process has a file open on it. By checking /proc, an administrator can find the process responsible and resolve the problem.

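# umount /home
umount: /home: device is busy
# ls -al /proc/*/fd | grep home
This lists every open file whose path contains "home". The grep drops the "/proc/<PID>/fd:" header lines, so run the ls again without the filter to see which PID owns each file.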
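
To get the PID and the file on one line, the fd links can be read one at a time. The function below is only a sketch; procs_using is a made-up helper name, and it can only see processes whose fd directory you are allowed to read (run it as root to see everything).

```shell
# Print "PID path" for every open file at or below the given directory.
procs_using() {
    dir=$1
    for fd in /proc/[0-9]*/fd/*; do
        # Unreadable or vanished entries are skipped.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            "$dir"|"$dir"/*)
                pid=${fd#/proc/}            # strip the leading /proc/
                echo "${pid%%/*} $target"   # keep only the PID part
                ;;
        esac
    done | sort -u
}

procs_using /home
```

Where they are installed, tools such as fuser (fuser -m /home) and lsof answer the same question with less typing.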

Killing a Job
# kill -9 %1
A % must be placed before the job number. This makes the shell replace the job number with the process ID of the job.

Log Files, Errors and Status

Syslog facilities:

Local0 - Local7

Syslog Priority:
emerg -  Emergency condition, such as an imminent system crash, usually broadcast to all users
alert -    Condition that should be corrected immediately, such as a corrupted system database
crit -     Critical condition, such as a hardware error
err -      Ordinary error
warning - Warning
notice -  Condition that is not an error, but possibly should be handled in a special way
info -    Informational message
debug -  Messages that are used when debugging programs
none -   Do not send messages from the indicated facility to the selected file. For example, specifying
*.debug;mail.none sends all messages except mail messages to the selected file.
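
The order of the list matters: a selector written for one priority also matches every more severe priority, so *.warning picks up warning, err, crit, alert and emerg. The sketch below models that rule with two hypothetical helper functions (severity and selected are illustrative names, not part of syslog).

```shell
# Numeric severity for each priority name; 0 is the most severe.
severity() {
    case $1 in
        emerg)   echo 0 ;;
        alert)   echo 1 ;;
        crit)    echo 2 ;;
        err)     echo 3 ;;
        warning) echo 4 ;;
        notice)  echo 5 ;;
        info)    echo 6 ;;
        debug)   echo 7 ;;
        *)       return 1 ;;
    esac
}

# True when a message at priority $2 is matched by a rule written for $1.
selected() {
    rule=$(severity "$1") || return 2
    msg=$(severity "$2") || return 2
    [ "$msg" -le "$rule" ]
}

selected warning err  && echo "err is matched by *.warning"
selected warning info || echo "info is not matched by *.warning"
```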

In a typical default configuration, logrotate keeps 4 weeks of logs before the oldest log is rotated out and deleted. Syslog entries all share a common format. The entry starts with the date and time, followed by the name of the system that logged the message.

CORE Error handling:
When unexpected errors occur, the system may create a core file. A core file contains a copy of the memory image of the process at the time the error occurred. It is named "core" because main system memory was originally called core memory, as it was made up of ferrite donuts (cores) that were wired together through their holes.

A core file can be used to autopsy a dead process. Even if you are not a programmer and do not have access to core analysis tools, core files can still be used to find information that may help you identify the cause of the program's death.

The first thing to do with a "core" file is to use the "file" command to determine which program dumped core and which signal (if any) triggered the dump. Core files are normally called core or core.xxxx, where "xxxx" is the PID of the process before it died. "man 7 signal" brings up a list of signals. From this we can narrow down the problem, and the program's author can be notified if it points to a bug (an invalid memory reference, for example).
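
When no core file is at hand, the fatal signal can also be recovered from the exit status, which the shell reports as 128 plus the signal number. The sketch below makes a child shell send itself SEGV, with core dumps switched off for that child so no file is left behind. The variable names are illustrative.

```shell
ulimit -c                       # current core size limit; 0 means no core files

status=0
# The child kills itself with SEGV; "ulimit -c 0" applies to the subshell only.
( ulimit -c 0; sh -c 'kill -SEGV $$' ) 2>/dev/null || status=$?

signum=$((status - 128))        # shells report 128 + signal number
signame=$(kill -l "$signum")    # translate the number into a name
echo "child died from signal $signum ($signame)"

# With a real core file in the current directory:
#   file core
```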

strings Command:
     The strings program displays the printable strings in a binary file. Using strings on a core file, you can display all of the strings included in the core image. At the end of the core file will be the process environment. This includes the command used to start the program.
This information can give vital clues to the cause of death. Looking through the core file for pathnames can also give information about the configuration files and shared libraries required to run the program.
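
The pathname trick can be tried without a real core file. In this sketch a small binary file stands in for the core; its contents (HOME=/home/demo and /etc/demo.conf) are made up for the demonstration. It assumes the strings program from binutils is installed.

```shell
# Build a stand-in "core": two strings wrapped in non-printable bytes.
sample=$(mktemp)
printf '\001\002HOME=/home/demo\003\004/etc/demo.conf\377' > "$sample"

strings "$sample"                           # every printable run of 4+ characters
paths=$(strings "$sample" | grep '^/')      # keep only lines that look like paths
echo "paths found: $paths"

# On a real core file the same filter would be:
#   strings core | grep '^/'
```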

Customizing the Shell
     In Bash, four prompt strings are used, and all of them can be customized. These strings are held in the variables PS1, PS2, PS3 and PS4. The normal command prompt, which is displayed to indicate that the shell is ready for a new command, is found in the PS1 variable. Should a command require more than a single line of input, the secondary prompt string PS2 is displayed. This can be seen when typing in flow control statements interactively.
The select statement uses PS3 as the prompt for the generated menu. The default is "#? ".
Finally, the PS4 prompt is used when debugging shell scripts. The shell can produce an execution trace, showing each command as it is executed. This is enabled with the -x option to the shell, or with "set -x" at the start of the script.

PS3 can be set to any text, and that text is displayed unchanged; there is no way to place variable text within it. PS1 and PS2 (and, in Bash, PS4 as well) can contain text that is evaluated each time the prompt is displayed. This can be done with the $(command) syntax, or with a special set of backslash-escaped characters used specifically for the purpose.
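
As a rough illustration, the four prompts could be set like this in ~/.bashrc. The prompt texts are arbitrary examples; \u, \h and \w are the Bash escapes for user name, host name and working directory. The last two lines run a throwaway Bash process to show the PS4 text in front of a traced command.

```shell
# Normal prompt: time from a command substitution, then user@host:dir.
PS1='[$(date +%H:%M)] \u@\h:\w\$ '
# Continuation prompt, shown while a command is still incomplete.
PS2='... '
# Prompt used by "select" menus.
PS3='Pick a number: '
# Prefix for each line of "set -x" trace output.
PS4='+trace: '

# See PS4 in action without disturbing the current shell.
trace=$(bash -c 'PS4="+trace: "; set -x; true' 2>&1)
echo "$trace"
```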

Some notes about Linux File System
     The structure of a file system determines its use and the manner in which commands and utilities interact with it. This is especially true of management commands that change or affect the file system. Because of this, we need to explore the structure of a Linux file system before we can look at the file system management commands.

All Linux file systems have a similar logical structure as far as the user or system commands are concerned. This is achieved by the file system driver logic in the Linux kernel, regardless of the underlying data layout. A file system usually consists of a master information table called the superblock, a list of file summary information blocks called inodes, and the data blocks associated with your data.

Every file system has its own root directory. On ext2/ext3 file systems it is always identified by inode number 2, which is the first usable inode. This directory is special, in that it can be used to attach the file system to the main, or root, file system. The directory on the root or parent file system at which the new file system is attached is called the mount point.
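
The inode number of a directory can be checked with the -i option of ls. A minimal sketch, assuming awk is available: on an ext2/ext3/ext4 root file system the number printed for / should be 2, while other file system types (and containers) may show something else.

```shell
# Inode number of the root directory of the root file system.
root_inode=$(ls -di / | awk '{ print $1 }')
echo "inode of /: $root_inode"

# The root directory of any other mounted file system can be checked
# the same way, for example /proc.
ls -di /proc
```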

/dev files:
     A /dev entry looks like any other file except that it does not have a size. Instead it has a major and minor device number, and a block or character designation. The major number identifies which device driver is being used. There are two kinds of device drivers, block and character, each with its own set of major numbers.

The minor number identifies the sub-device or operation for the device driver. For example, a tape drive may have different minor numbers for operation in compressed and uncompressed mode. There are a number of general-purpose devices as well. The /dev/null file is also known as the "bit bucket" because it takes anything that is written to it and discards it. It is often used to discard unwanted error messages or to test commands. A similar file is /dev/zero, which does the same for writes but, when read, returns as many NUL (hex 00) characters as you ask for. This is often used to create zero-filled files for testing or for database initialization.
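
Both devices are easy to try out. In the sketch below the temporary file name comes from mktemp and the 4 KB size is an arbitrary choice.

```shell
# The mode starts with "c" (character device); the "major, minor" pair
# appears where a regular file would show its size.
ls -l /dev/null /dev/zero

# Discard an unwanted error message.
ls /no/such/path 2>/dev/null || true

# Create a 4096-byte file filled with NUL bytes.
zero_file=$(mktemp)
dd if=/dev/zero of="$zero_file" bs=1024 count=4 2>/dev/null
size=$(wc -c < "$zero_file")
echo "created $size zero bytes in $zero_file"
```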

lost+found Directory:
     Every file system requires a directory called lost+found in the root directory of the file system. This is used by the system when checking and rebuilding a corrupted file system. Files that have inodes, but no directory entry, are moved to the lost+found directory and named with their inode number. Files in this directory are therefore an indication that the file system has suffered some damage.