Linux System Programming: Difference between revisions

From miki
== Reference ==
* [http://perl.plover.com/yak/commands-perl/ System Programming in Perl: The Unix Process Model (slides)]


== Process ==

== <tt>/proc</tt> filesystem ==
Some links:
* [http://blog.ksplice.com/2011/01/solving-problems-with-proc/ Solving problems with proc]
** ''phantom progress bar'': showing the progress of a process through a file, after the process was launched

* For a shell, <tt>/dev/fd/###</tt>, <tt>/proc/self/fd/###</tt> and <tt>/proc/$$/fd/###</tt> refer to the same file

* If a program only takes a filename as an argument and you want it to process a file from standard input, use the fake file <tt>/proc/self/fd/0</tt> ([http://unix.derkeiler.com/Newsgroups/comp.unix.questions/2003-06/0077.html]):
<source lang="bash">
antiword /proc/self/fd/0 < test.doc > test.txt
somecommand | antiword /proc/self/fd/0 > test.txt # Not guaranteed to work, because a pipe does not support random access
antiword <(somecommand) >test.txt # Using bash process substitution
</source>
* Read the ''manpages'' for more information:
<source lang="bash">man proc</source>

== pthreads ==

* [http://www.domaigne.com/blog/computing/condvars-signal-with-mutex-locked-or-not/ condvars: signal with mutex locked or not?]
* [http://stackoverflow.com/questions/4544234/calling-pthread-cond-signal-without-locking-mutex Calling pthread_cond_signal without locking mutex]
* [https://groups.google.com/forum/?hl=ky&fromgroups#!msg/comp.programming.threads/wEUgPq541v8/ZByyyS8acqMJ basic question about concurrency]

== Input / output (IO) ==

Some traps:

* Make sure to call <code>write</code> in a loop because write may do ''partial'' writes [https://www.gnu.org/software/libc/manual/html_node/I_002fO-Primitives.html#I_002fO-Primitives].
* When <code>fsync</code> fails, Linux may mark the affected dirty pages as clean or drop them from the page cache, meaning that the data can be '''lost'''. Note that the next fsync may then succeed even though the data never reached the disk. See more on fsync [https://lwn.net/Articles/752063/ here], [https://stackoverflow.com/questions/42434872/writing-programs-to-cope-with-i-o-errors-causing-lost-writes-on-linux here], and [https://stackoverflow.com/questions/37288453/calling-fsync2-after-close2/50158433 here].

References:
* [https://danluu.com/file-consistency/ Files are hard] (how to write to disk in a robust way).
: A very detailed account of how to write to files and deal with I/O errors.

== Memory protection ==
See the [[C]] page for how we used <code>mprotect</code> to change the protection (read / write / execute) of a memory region.

== Examples ==

=== Kill on ALARM signal ===
Small script to kill a process on the ALARM signal, implementing a kind of timeout kill (see [[Perl#Kill on ALARM signal|Perl page]]).

Latest revision as of 23:16, 13 February 2020

Reference

Process

A process contains:

  • A unique process ID number ('pid')
  • A current working directory ('cwd')
  • A user ID and group ID number (actually more than one of each)
  • An open file table
  • An 'environment' (just a bunch of string data)
  • A signal table
  • An alarm clock
  • Lots of other equipment

Why fork & exec

  • 2 main primitives of kernel to manage processes:
    • fork: create a new process
    • exec: replace a process's object code with the contents of a file
  • Launching a new process is done in 2 steps:
    1. Fork the current process
    2. In the child process, exec the new program file, which replaces the child's code but keeps its environment.
  • Why are fork and exec two separate calls?
    Because the child process can alter its environment before it calls exec. For example:
    ps > /tmp/procs                # The child shell first redirects its standard output, then execs ps; ps never knows
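
The redirection above can be sketched in Python, whose os module maps closely onto the underlying system calls. This is only an illustration of the fork-then-exec pattern, not what any particular shell does internally; the helper name run_redirected is hypothetical.

```python
import os


def run_redirected(argv, out_path):
    """Fork, point the child's stdout at out_path, then exec argv.

    Returns the child's exit code, or minus the signal number if it
    was killed by a signal. (Hypothetical helper for illustration.)
    """
    pid = os.fork()
    if pid == 0:
        # Child: alter our own environment before exec.
        try:
            fd = os.open(out_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
            if fd != 1:
                os.dup2(fd, 1)   # FD 1 (stdout) now refers to the file
                os.close(fd)
            os.execvp(argv[0], argv)  # only returns (by raising) on failure
        finally:
            os._exit(127)        # reached only if open or exec failed
    # Parent: wait for the child and decode its status.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)
```

Calling run_redirected(["ps"], "/tmp/procs") would correspond to the shell line above: ps simply writes to FD 1 and never learns that it is a file.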
    

Fork

  • The child gets its own copy of the file descriptor (FD) table.
    So closing a file in the child does not affect the parent.
  • However, the system open file table is not copied; parent and child share the same entries. Why? Because the parent must see the same seek pointer as the child, even after the child has exited:
cat ./pie
# #!/bin/sh
# echo "I like pie.";                      # Will move seek pointer, also in parent
# echo "Especially blackberry.";           # Will write starting from new position of seek pointer!
./pie > piefile
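
The shared seek pointer can also be observed directly. The following Python sketch mirrors the pie example under the assumption of a regular file opened before the fork; the function name shared_offset_demo is hypothetical.

```python
import os


def shared_offset_demo(path):
    """Show that a child's write moves the parent's seek pointer.

    Returns the parent's file offset after the child has exited.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    pid = os.fork()
    if pid == 0:
        # Child: write through the inherited descriptor, then exit.
        try:
            os.write(fd, b"I like pie.\n")
        finally:
            os._exit(0)
    os.waitpid(pid, 0)
    # Parent: the offset has moved although the parent wrote nothing yet.
    offset = os.lseek(fd, 0, os.SEEK_CUR)
    # This write lands after the child's line instead of overwriting it.
    os.write(fd, b"Especially blackberry.\n")
    os.close(fd)
    return offset
```

If the open file table entry were copied instead of shared, the parent's write would start at offset 0 and overwrite the child's output.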

/proc filesystem

Some links:

  • For a shell, /dev/fd/###, /proc/self/fd/### and /proc/$$/fd/### refer to the same file
  • If a program only takes a filename as an argument and you want it to process a file from standard input, use the fake file /proc/self/fd/0 ([1]):
antiword /proc/self/fd/0 < test.doc > test.txt
somecommand | antiword /proc/self/fd/0 > test.txt   # Not guaranteed to work, because a pipe does not support random access
antiword <(somecommand) >test.txt                   # Using bash process substitution
  • Read the manpages for more information:
man proc
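
The same trick works from inside a program: an already-open descriptor can be handed to code that only accepts a path. A minimal Linux-only sketch in Python follows; the helper name read_via_proc_fd is hypothetical.

```python
import os


def read_via_proc_fd(fd):
    """Read an already-open descriptor through its /proc/self/fd/N name.

    Linux-only. For a regular file, opening this path gives a fresh
    descriptor positioned at offset 0, so the whole file is read.
    """
    with open(f"/proc/self/fd/{fd}", "rb") as f:
        return f.read()
```

For a pipe the path opens the pipe itself, so, as with the antiword example, anything that needs to seek may fail.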

pthreads

Input / output (IO)

Some traps:

  • Make sure to call write in a loop because write may do partial writes [2].
  • When fsync fails, Linux may mark the affected dirty pages as clean or drop them from the page cache, meaning that the data can be lost. Note that the next fsync may then succeed even though the data never reached the disk. See more on fsync here, here, and here.
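
Both traps can be handled roughly as follows. This is a sketch in Python rather than C, and the names write_all and write_durably are hypothetical; os.write returns the number of bytes actually written, just like write(2), and os.fsync raises OSError when fsync(2) fails.

```python
import os


def write_all(fd, data):
    """Call write in a loop until every byte of data has been written."""
    view = memoryview(data)
    total = 0
    while total < len(view):
        # os.write may write fewer bytes than requested (partial write).
        total += os.write(fd, view[total:])
    return total


def write_durably(path, data):
    """Write data to path and fsync it, letting any OSError propagate.

    If fsync raises, the data must be assumed lost: retrying fsync alone
    is not enough, the caller has to rewrite the data from its own copy.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        write_all(fd, data)
        os.fsync(fd)
    finally:
        os.close(fd)
```

The important design point is that write_durably never swallows the fsync error and never retries fsync on its own, because a later successful fsync says nothing about the pages that failed earlier.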

References:

  • Files are hard (how to write to disk in a robust way): a very detailed account of how to write to files and deal with I/O errors.

Memory protection

See the C page for how we used mprotect to change the protection (read / write / execute) of a memory region.

Examples

Kill on ALARM signal

Small script to kill a process on the ALARM signal, implementing a kind of timeout kill (see Perl page).
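
The idea can be sketched in Python as well. This is not the script from the Perl page, only an illustration of the same technique under the assumption that it runs in the main thread; the function name run_with_timeout is hypothetical.

```python
import os
import signal


def run_with_timeout(argv, seconds):
    """Run argv in a child and kill it if it outlives the alarm clock.

    Returns the raw wait status of the child.
    """
    pid = os.fork()
    if pid == 0:
        # Child: become the requested program.
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # reached only if exec failed

    def on_alarm(signum, frame):
        # The alarm clock rang: the child took too long, kill it.
        os.kill(pid, signal.SIGKILL)

    previous = signal.signal(signal.SIGALRM, on_alarm)
    signal.alarm(seconds)          # arm the parent's alarm clock
    try:
        _, status = os.waitpid(pid, 0)
    finally:
        signal.alarm(0)            # disarm the alarm
        signal.signal(signal.SIGALRM, previous)
    return status
```

The returned status can be inspected with os.WIFSIGNALED and os.WTERMSIG to tell a timeout kill from a normal exit.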