Bash Tips and Pitfalls

Latest revision as of 07:48, 17 July 2024

Reference

Local page:

Bash

External links:

Tips for Robust Scripts

Reference: [1], [2], [3].

Use set -u

This detects uninitialized variables, the king of all evils!

#! /bin/bash
set -o nounset                          # Or "set -u"

chroot=$1                               # With nounset, the script aborts here if $1 is not given
rm -r "$chroot/etc"                     # Without nounset, this would delete /etc if $1 is not given!!!
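
A minimal sketch of how nounset interacts with default values. The greet function and the variable names are hypothetical, chosen only for illustration. An expansion such as ${1:-default} stays safe under nounset, while a plain reference to an unset variable aborts the (sub)shell:

```bash
#!/bin/bash
set -o nounset

# Hypothetical helper: greets its first argument, or "world" if none is given
greet() {
    local name=${1:-world}          # ${1:-...} never trips nounset
    echo "hello $name"
}

greet                               # hello world
greet bash                          # hello bash

# Referencing an unset variable makes the subshell exit with an error
( echo "$undefined_variable" ) 2>/dev/null || echo "unset variable detected"
```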

Use set -e

Script will exit if any command fails. But beware of the gotchas.

#! /bin/bash
set -o errexit                                               # Or "set -e"

# Don't do
command                                                      # Will fail and exit!
if [ "$?" -ne 0 ]; then echo "command failed"; exit 1; fi    # Never reached
# But do instead:
command || { echo "command failed"; exit 1; }                # Ok

# Temporarily disable the check for some code section
set +e
command1
command2
set -e
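
The following sketch shows one way to capture the exit status of a failing command without tripping errexit. Here might_fail is a hypothetical stand-in for any command; the point is only that a command on the left of || is exempt from the check:

```bash
#!/bin/bash
set -o errexit

# Hypothetical command that always fails with status 3
might_fail() { return 3; }

status=0
might_fail || status=$?             # errexit is not triggered on the left side of ||
echo "status=$status"               # status=3
```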

Expect space in filenames

if [ $filename = "foo" ];                      # WRONG
if [ "$filename" = "foo" ];                    # Correct

for i in $@; do echo $i; done                  # WRONG
for i in "$@"; do echo "$i"; done              # Correct

find | xargs ls                                # WRONG
find | xargs -d '\n' ls                        # Correct
find -print0 | xargs -0 ls                     # Better

for f in $(locate .pdf); do basename $f; done  # WRONG
locate .pdf | xargs -d '\n' -n 1 basename      # Correct
locate -0 .pdf | xargs -0 -n 1 basename        # Better

for f in $(ls); do basename $f; done           # WRONG
for f in *; do basename "$f"; done             # Correct
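
When the file list comes from find and must be processed in a shell loop rather than by xargs, a NUL-delimited read loop is one option. This is a sketch that assumes GNU find and GNU sort (for -print0 and -z); the directory and file names are made up for the demonstration:

```bash
#!/bin/bash
dir=$(mktemp -d)                            # scratch directory for the demo
touch "$dir/a b.txt" "$dir/c.txt"           # one name contains a space

names=()
while IFS= read -r -d '' f; do              # read NUL-terminated records
    names+=("${f##*/}")                     # keep only the file name
done < <(find "$dir" -type f -print0 | sort -z)

printf '<%s>\n' "${names[@]}"               # one name per line, spaces preserved
rm -rf "$dir"
```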

More safe shell tips

From mit.edu:

Use set -euf -o pipefail. This enables:

set -e, exit on fails.
set -u, exit on undefined variables.
set -f, disable filename expansion (globbing), when seeing * ?...
set -o pipefail, fails when one step in a pipeline fails (otherwise, only last step is checked).

In addition:

Quote liberally *all* variables (use "$filename").
Always use -- to make sure variables are passed as positional parameter (sudo -u nobody -- "$@" safer than sudo -u nobody "$@" if $@ expands to -u root reboot).
Use shellcheck.
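
The effect of pipefail can be seen with a two-command pipeline whose first step fails. A minimal sketch (the variable names are only illustrative):

```bash
#!/bin/bash
set +o pipefail
false | true
without=$?                          # status of the last command only

set -o pipefail
false | true
with=$?                             # status of the failing step
set +o pipefail

echo "without=$without with=$with"  # without=0 with=1
```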

Use signals to fail cleanly

if [ ! -e $lockfile ]; then
   trap "rm -f $lockfile; exit" INT TERM EXIT        # Do we need HUP?
   touch $lockfile                                   # !!! race-condition. gap between testing and file creation
   critical-section
   rm $lockfile
   trap - INT TERM EXIT
else
   echo "critical-section is already running"
fi

(Not sure we need to trap INT and TERM. Note that we can't trap KILL anyway).

A better solution without TOCTTOU (time-of-check to time-of-use) race condition:

if mkdir "$lockdir"; then                            # mkdir is atomic on most fs
   trap 'rmdir "$lockdir"; exit' INT TERM EXIT ERR
   critical-section
   rmdir "$lockdir"
   trap - INT TERM EXIT ERR
else
   echo "critical-section is already running"
fi

Some extra tips:

Use trap ERR to trap exit due to the -e shell option.

set -e
trap "die 1 'ERR signal trapped'" ERR

die() {
   local code=$1
   shift
   >&2 echo "$0: Error - $*"
   exit "$code"
}
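
To see the ERR trap in action without ending the current shell, the failing script can be run in a child bash process. This is only a sketch; the message text is arbitrary:

```bash
#!/bin/bash
status=0
output=$(bash <<'EOF'
set -e
trap 'echo "ERR trapped, status $?"' ERR
false                               # fails: the ERR trap runs, then errexit ends the script
echo "never printed"
EOF
) || status=$?

echo "$output (exit status $status)"
```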

It is not necessary to clear the trap handler at the end of the script.
Set up the trap as early as possible in the script, since errors may occur at any command.

Create temp file and cleanup using signals

From [4]:

tempfiles=( )
cleanup() {
  rm -f "${tempfiles[@]}"
}
trap cleanup EXIT           # Note that there is no need to trap TERM or KILL

Create a temporary file with

temp_foo="$(mktemp -t foobar.XXXXXX)"
tempfiles+=( "$temp_foo" )
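
Putting the two pieces together, the sketch below runs the pattern inside a subshell function (demo is a hypothetical name) so that the EXIT trap fires when the subshell ends, and then checks that the temporary file is gone:

```bash
#!/bin/bash
demo() (                            # ( ... ) body: the function runs in a subshell
    tempfiles=()
    cleanup() { rm -f "${tempfiles[@]}"; }
    trap cleanup EXIT               # runs when the subshell exits

    tmp=$(mktemp)
    tempfiles+=("$tmp")
    echo "$tmp"                     # report the path to the caller
)

path=$(demo)
if [ -e "$path" ]; then echo "still there"; else echo "cleaned up"; fi
```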

Alternatively, one can create a file, open file descriptors to it, then remove the file immediately:

touch 'temp.txt'
exec 3<'temp.txt'                 # No space allowed between the fd number and the operator
exec 4>'temp.txt'
rm -f 'temp.txt'
# Now we can still use fd 3 and 4, but the file is no longer on the fs

Beware of Race conditions

References:

There is a race condition between the test for the file and its creation. If two processes run simultaneously, they might both pass the test and think that they are running alone. To solve this, we need an operation that tests for and creates the file atomically.

The safest solution is to use mkdir, which is atomic on most filesystems [5]. It fails if the directory already exists and creates it otherwise, atomically in both cases.

lockdir=/var/tmp/mylock
pidfile=/var/tmp/mylock/pid

if ( mkdir ${lockdir} ) 2> /dev/null; then
        echo $$ > $pidfile
        trap 'rm -rf "$lockdir"; exit $?' INT TERM EXIT
        # do stuff here

        # clean up after yourself, and release your trap
        rm -rf "$lockdir"
        trap - INT TERM EXIT
else
        echo "Lock Exists: $lockdir owned by $(cat $pidfile)"
fi

The PID of locking script is stored in a file in locked directory. This way, another script can detect stale lock (by verifying that the owner script is still running).

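Note that when a signal is received, the trap above is executed twice: once for the signal, and once more for the EXIT triggered by the handler's own exit. The variant below resets the trap inside the handler to avoid this: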

lockdir=/var/tmp/mylock
pidfile=/var/tmp/mylock/pid

if ( mkdir ${lockdir} ) 2> /dev/null; then
        echo $$ > $pidfile
        trap 'trap - INT TERM EXIT; rm -rf "$lockdir"; exit $?' INT TERM EXIT
        # do stuff here

        # exit explicitly to call the trap
        exit 0
else
        echo "Lock Exists: $lockdir owned by $(cat $pidfile)"
fi

Here is a complete example of how to manage a lock directory and stale processes [6]:

#!/bin/bash
 
# lock dirs/files
LOCKDIR="/tmp/statsgen-lock"
PIDFILE="${LOCKDIR}/PID"
 
# exit codes and text
ENO_SUCCESS=0; ETXT[0]="ENO_SUCCESS"
ENO_GENERAL=1; ETXT[1]="ENO_GENERAL"
ENO_LOCKFAIL=2; ETXT[2]="ENO_LOCKFAIL"
ENO_RECVSIG=3; ETXT[3]="ENO_RECVSIG"
 
###
### start locking attempt
###
 
trap 'ECODE=$?; echo "[statsgen] Exit: ${ETXT[ECODE]}($ECODE)" >&2' 0
echo -n "[statsgen] Locking: " >&2
 
if mkdir "${LOCKDIR}" &>/dev/null; then
 
    # lock succeeded, install signal handlers before storing the PID just in case 
    # storing the PID fails
    trap 'ECODE=$?;
          echo "[statsgen] Removing lock. Exit: ${ETXT[ECODE]}($ECODE)" >&2
          rm -rf "${LOCKDIR}"' 0
    echo "$$" >"${PIDFILE}" 
    # the following handler will exit the script upon receiving these signals
    # the trap on "0" (EXIT) from above will be triggered by this trap's "exit" command!
    trap 'echo "[statsgen] Killed by a signal." >&2
          exit ${ENO_RECVSIG}' 1 2 3 15
    echo "success, installed signal handlers"
 
else
 
    # lock failed, check if the other PID is alive
    OTHERPID="$(cat "${PIDFILE}")"
 
    # if cat isn't able to read the file, another instance is probably
    # about to remove the lock -- exit, we're *still* locked
    #  Thanks to Grzegorz Wierzowiecki for pointing out this race condition on
    #  http://wiki.grzegorz.wierzowiecki.pl/code:mutex-in-bash
    if [ $? != 0 ]; then
      echo "lock failed, PID ${OTHERPID} is active" >&2
      exit ${ENO_LOCKFAIL}
    fi
 
    if ! kill -0 $OTHERPID &>/dev/null; then
        # lock is stale, remove it and restart
        echo "removing stale lock of nonexistent PID ${OTHERPID}" >&2
        rm -r "${LOCKDIR}"
        if [ $? != 0 ]; then
          echo "lock failed, another script is cleaning up stale lock" >&2
          exit ${ENO_LOCKFAIL}
        fi
        echo "[statsgen] restarting myself" >&2
        exec "$0" "$@"
    else
        # lock is valid and OTHERPID is active - exit, we're locked!
        echo "lock failed, PID ${OTHERPID} is active" >&2
        exit ${ENO_LOCKFAIL}
    fi
 
fi

Issue: there is a race condition when the lock is stale and two scripts try to clean it up. One script could remove the stale lock and create a new one while the other script still believes the lock is stale and successfully removes the new lock with rm -r.

Another approach, from [7] and [8], is to use I/O redirection and bash's noclobber mode, which refuses to redirect to an existing file:

if ( set -o noclobber; echo "$$" > "$lockfile") 2> /dev/null; 
then
   trap 'rm -f "$lockfile"; exit $?' INT TERM EXIT

   # critical-section
   
   rm -f "$lockfile"
   trap - INT TERM EXIT
else
   echo "Failed to acquire lockfile: $lockfile." 
   echo "Held by $(cat $lockfile)"
fi
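
The atomic test-and-create behaviour of noclobber can be checked by trying to take the same lock twice. A small sketch, assuming a mktemp that supports -u (generate a name without creating the file); try_lock is a hypothetical helper name:

```bash
#!/bin/bash
lockfile=$(mktemp -u)               # unused path for the demo lock

# Succeeds only if the lock file did not exist yet
try_lock() {
    ( set -o noclobber; echo "$$" > "$lockfile" ) 2>/dev/null
}

if try_lock; then first=acquired; else first=busy; fi
if try_lock; then second=acquired; else second=busy; fi
echo "$first $second"               # acquired busy

rm -f "$lockfile"
```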

The shortest solution [9]:

set -o noclobber
{ > file ; } &> /dev/null

A more thorough example below from [10]:

~/test-locking.sh

#!/bin/sh


# Lock (mutex) sample code for Bourne shell
#
# Stephen Thomas <flabdablet@gmail.com> 14-Oct-2009
#
# This is free software - do whatever you like with it
# except hold me accountable for any grief it causes you.



# Acquire specified lock
# Return 0 if successful, 1 if not

acquire_lock () {
	local me=$(sh -c 'echo $PPID')
	local owner
	local shell
	local status
	local result
	local flags=$-
	set -o noclobber #make output redirection into atomic test-and-set
	if echo $me $$ valid >"$1"
	then
		result=0
	else
		read owner shell status <"$1"
		test "$owner $shell $status" = "$me $$ valid"
		result=$?
	fi 2>/dev/null
	set +$- -$flags
	return $result
}



# Remove specified lock if stale (valid, but neither the
# owning process nor the shell that spawned it are still
# running)

purge_stale_lock () {
	local owner
	local shell
	local status
	if
		read owner shell status <"$1" &&
		test "$status" = valid &&
		! ps p "$shell" &&
		! ps p "$owner" 
	then
		rm -f "$1"
	fi >/dev/null 2>&1
}



# Exercise locking functions

test_locking () {
	local me=$(sh -c 'echo $PPID')
	echo Process $me from shell $$ attempting to acquire lock $1
	if acquire_lock "$1"
	then
		echo Process $me from shell $$ acquired lock - sleeping 5 seconds
		sleep 5
		echo Process $me from shell $$ attempting to re-acquire same lock
		if acquire_lock "$1"
		then
			echo Process $me from shell $$ re-acquired same lock - sleeping 5 seconds
			sleep 5
		else
			echo Process $me from shell $$ failed to re-acquire lock
		fi
		echo Process $me from shell $$ releasing lock
		rm -f "$1"
	else
		echo Process $me from shell $$ locked out
	fi
}

lock=~/test.lck

purge_stale_lock "$lock"

for i in $(seq 1 10)
do
	test_locking "$lock" &
done

Alternate solutions using flock:

exec 200>"$LOCK_FILE"
flock -e -n 200 || exit 1
# ...critical section...
rm "$LOCK_FILE"                   # Optional

Use unique variable names in functions

In bash, assigning to a variable inside a function also changes that variable in the calling functions, even if the variable was declared local in the caller!

To avoid conflicts, use unique variable names. If you control all the functions being called, declaring the variables local in every child function is enough, but this is potentially unsafe: one forgotten local is enough to clobber the caller's variable.

function achild() {
  A=achild
  MYSCRIPT_ACHILD=achild
  echo $A $MYSCRIPT_ACHILD
}

function a() {
  local A=a                # Name too generic. Potential name clash!
  local MYSCRIPT_A=a       # Unique name, using script name as prefix
  echo $A $MYSCRIPT_A
  achild
  echo $A $MYSCRIPT_A
}

a           # a a
            # achild achild
            # achild a
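
For comparison, here is a sketch of the same call chain where the child declares its variable local. The function names child and parent are hypothetical; the caller's value survives the call:

```bash
#!/bin/bash
child() {
    local A=child                   # shadows the caller's A inside child only
    echo "in child: $A"
}

parent() {
    local A=parent
    child
    echo "in parent: $A"            # still "parent"
}

parent
```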

Avoid `eval` like the plague

Trap EXIT or RETURN for cleanup

Trap the EXIT pseudo-signal to perform cleanup in all cases (either normal exit, or kill by a trappable signal).

tmp=$(mktemp)
trap 'rm -f "$tmp"' EXIT           # Single quotes: $tmp is expanded when the trap runs

For functions, trap RETURN signal. Note that the signal handler will be automatically called and removed from handler list.

foo() {
    trap "echo 'Cleanup from foo'" RETURN
    return
}

bar() {
    return
}
baz() {
    trap "echo 'Cleanup from baz'" RETURN
    return
}

foo # Will call foo cleanup
bar # no call here
baz # Will call baz cleanup

Using trap, we can build a defer operator similar to Go [12]:

#!/bin/sh
#
# TODO: Quoting not perfect!
DEFER=
defer() {
    DEFER="$*; ${DEFER}"
    trap "{ $DEFER }" EXIT
}

Example of use:

# Mount /tmp as tmpfs and umount it on script exit.
mount -t tmpfs tmpfs /tmp
defer umount -f /tmp

# Create a temporary file and delete it on script exit.
TEMP=$(mktemp)
echo "Hello!" > "$TEMP"
defer rm -f "$TEMP"

Tips for Fast Scripts

Avoid forking

Avoid calling external programs. Use Bash internal commands as much as possible. Here are some common replacements:

cat FILE | some_pgm                     # DON'T
<FILE some_pgm                          # DO: don't cat, use redirection!
A=$(<FILE)                              # DO: put FILE content into A

basename FILE                           # DON'T
echo ${FILE/*\/}                        # DO: remove everything up to last slash

ps aux | grep ssh-agent && ...          # DON'T
[[ $(ps aux) =~ ssh-agent ]] && ...     # DO: use built-in regex engine
[[ $(ps aux) == *ssh-agent* ]] && ...   # DO: use built-in pattern matching
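
As a further illustration, parameter expansion can stand in for both basename and dirname on ordinary paths. This is a sketch with a made-up path; it does not cover edge cases such as a path without any slash, where dirname would print a dot:

```bash
#!/bin/bash
FILE=/var/log/app/server.log        # hypothetical path

base=${FILE##*/}                    # strip the longest prefix ending in '/'
dir=${FILE%/*}                      # strip the shortest suffix starting with '/'

echo "$base"                        # server.log
echo "$dir"                         # /var/log/app
```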

Syntax Tips

Function body

The { ... } after a function is actually not a function body but a compound command [13]:

function name () {
    ...
}

We can do more fancy things like:

function fileExists () [[ -f $1 ]]
function isEven () (( $1 % 2 == 0 ))
function sleep1 () while :; do "$@"; sleep 1; done

# Below we run the function in its own shell, meaning we don't need to save previous values:
function caseInsensitiveMatch () (
    shopt -s nocasematch
    ....
)
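
A runnable sketch of the subshell-body idea: the function below (ci_match is a hypothetical name) enables nocasematch for its own comparison only, and the caller's shell options are left untouched:

```bash
#!/bin/bash
# The ( ... ) body runs in a subshell, so the shopt change does not leak out
function ci_match () (
    shopt -s nocasematch
    [[ $1 == "$2" ]]
)

ci_match Hello hELLO && echo "match"
ci_match Hello World || echo "no match"
shopt -q nocasematch || echo "nocasematch is still off in the caller"
```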

External tools

Most come from the MIT hacker-tools lectures on the command-line environment.

fasd A command-line productivity booster, with fuzzy matcher for cd similar to z.
bat A cat(1) clone with wings (syntax highlighting).
fd A simple, fast and user-friendly alternative to find (inspired from ripgrep).
rg An ultra-fast grep replacement.
tldr Simplified and community-driven man pages.

Template

Minimal safe

#!/bin/bash

set -Eeuo pipefail
trap cleanup SIGINT SIGTERM ERR EXIT

script_dir=$(cd "$(dirname "${BASH_SOURCE[0]}")" &>/dev/null && pwd -P)

cmd(){ echo $(basename "$0"); }

usage() {
  # ...
  exit
}

cleanup() {
  trap - SIGINT SIGTERM ERR EXIT
  # script cleanup here
}

die() {
  local code=$1
  shift
  echo >&2 -e "$(cmd): Error: $*"
  exit "$code"
}

parse_params() {
  # ...
  :                 # No-op: bash rejects a function body that contains only comments
}

parse_params "$@"

# script logic here

Long

From betterdev blog [14]:

#!/usr/bin/env bash

set -Eeuo pipefail
trap cleanup SIGINT SIGTERM ERR EXIT

script_dir=$(cd "$(dirname "${BASH_SOURCE[0]}")" &>/dev/null && pwd -P)

usage() {
  cat <<EOF
Usage: $(basename "${BASH_SOURCE[0]}") [-h] [-v] [-f] -p param_value arg1 [arg2...]

Script description here.

Available options:

-h, --help      Print this help and exit
-v, --verbose   Print script debug info
-f, --flag      Some flag description
-p, --param     Some param description
EOF
  exit
}

cleanup() {
  trap - SIGINT SIGTERM ERR EXIT
  # script cleanup here
}

setup_colors() {
  if [[ -t 2 ]] && [[ -z "${NO_COLOR-}" ]] && [[ "${TERM-}" != "dumb" ]]; then
    NOFORMAT='\033[0m' RED='\033[0;31m' GREEN='\033[0;32m' ORANGE='\033[0;33m' BLUE='\033[0;34m' PURPLE='\033[0;35m' CYAN='\033[0;36m' YELLOW='\033[1;33m'
  else
    NOFORMAT='' RED='' GREEN='' ORANGE='' BLUE='' PURPLE='' CYAN='' YELLOW=''
  fi
}

msg() {
  echo >&2 -e "${1-}"
}

die() {
  local msg=$1
  local code=${2-1} # default exit status 1
  msg "$msg"
  exit "$code"
}

parse_params() {
  # default values of variables set from params
  flag=0
  param=''

  while :; do
    case "${1-}" in
    -h | --help) usage ;;
    -v | --verbose) set -x ;;
    --no-color) NO_COLOR=1 ;;
    -f | --flag) flag=1 ;; # example flag
    -p | --param) # example named parameter
      param="${2-}"
      shift
      ;;
    -?*) die "Unknown option: $1" ;;
    *) break ;;
    esac
    shift
  done

  args=("$@")

  # check required params and arguments
  [[ -z "${param-}" ]] && die "Missing required parameter: param"
  [[ ${#args[@]} -eq 0 ]] && die "Missing script arguments"

  return 0
}

parse_params "$@"
setup_colors

# script logic here

msg "${RED}Read parameters:${NOFORMAT}"
msg "- flag: ${flag}"
msg "- param: ${param}"
msg "- arguments: ${args[*]-}"

Tips

Parsing command-line option parameters (getopt/getopts)

getopt

To ease parsing, pre-parse with executable getopt (see here for more information and examples).

#!/bin/bash
# Gets the command name without path
cmd(){ echo $(basename "$0"); }

# Help command output
usage(){
    echo "`cmd` [OPTION...]"
    column -t -s  ";" << __USAGE__
    -a; hey
    -b; bee
    -c FILE; cee FILE.
__USAGE__
    exit $1
}

# (old getopt syntax)
args=$(getopt abc: "$@")
[ $? -eq 0 ] || usage 1

set -- $args
for i
do
    case "$i" in
        -c) shift; echo "flag c set to $1"; shift ;;
        -a) shift; echo "flag a set" ;;
        -b) shift; echo "flag b set" ;;
    esac
done

$ ./g -abc "foo"
flag a set
flag b set
flag c set to foo

A more complete example with getopt using both short and long options (from SO, Cosimo (GitHub), and shakefu (GitHub)):

# Gets the command name without path
cmd(){ echo $(basename "$0"); }

# Error message
error(){
    echo "`cmd`: invalid option -- '$1'";
    echo "Try '`cmd` -h' for more information.";
    exit 1;
}

# Help command output
usage(){
    echo "`cmd` [OPTION...]"
    column -t -s  ";" << __USAGE__
    -e, --exclude VALUE; Add VALUE to exclude.
    -h, --help; Print this help.
    -v, --verbose; Enable verbose output (include multiple times for more
                 ; verbosity, e.g. -vvv).
__USAGE__
    exit $1
}

# Parse options
OPTS="$(getopt -o e:hv -l exclude:,help,verbose --name "`cmd`" -- "$@")"
[ $? -eq 0 ] || usage 1

eval set -- "$OPTS"
unset OPTS

EXCLUDES=()
VERBOSE=false         # Or leave empty, and use [ -n "$VERBOSE" ]

while true
do
    case $1 in
        -e | --exclude ) EXCLUDES+=("$2"); shift; shift ;; # Note: $2 can't be empty here
        -h | --help )    usage 0 ;;
        -v | --verbose ) VERBOSE=true; shift ;;
        -- )             shift; break ;;
        * )              error $1 ;;
    esac
done

getopts (Bash built-in)

A slightly lighter alternative is to use builtin command getopts (see here for more information and examples).

#!/bin/bash
cmd(){ echo $(basename "$0"); }

# Error message
error(){
    echo "`cmd`: invalid option -- '$1'";
    echo "Try '`cmd` -h' for more information.";
    exit 1;
}

usage(){
    echo "`cmd` [OPTION...] [--] ARGS"
    column -t -s  ";" << __USAGE__
    -a; hey.
    -b; bee.
    -c FILE; cee FILE.
    -h; Print this help.
__USAGE__
    exit $1
}

while getopts  "abc:h" flag
do
    case "$flag" in
        a) echo "$OPTIND: flag a set" ;;
        b) echo "$OPTIND: flag b set" ;;
        c) echo "$OPTIND: flag c set to $OPTARG" ;;
        h) usage 0 ;;
        *) error $flag ;; # ?) Unknown flag / :) Missing arg
    esac
done
shift $((OPTIND-1))
echo ARGS: $@

$ ./g -abc "foo" "bar"
1: flag a set
1: flag b set
3: flag c set to foo
ARGS: bar

To parse option like --value=name ([15])

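while (( $# > 0 )); do
  if [[ ${1:0:2} = '--' ]]; then
    PAIR=${1:2}
    PARAMETER=${PAIR%%=*}                            # Name: everything before the first '='
    PARAMETER=${PARAMETER//-/_}                      # Replace '-' with '_'
    printf -v "P_${PARAMETER^^}" '%s' "${PAIR#*=}"   # Assign to P_NAME without eval
  fi
  shift
done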

Another built-in example:

N_ARGS="$#"
while [ "$#" -gt 0 ]
do
    case "$1" in
        # List long options '--*' *FIRST*
        --verbose) VERBOSE=1
            ;;
        --output) OUTPUT="$2"
            shift
            ;;
        --*) die_usage "Illegal option '$1'"
            ;;
        -*)
            OPTS="$1"
            while [ "$OPTS" != "-" ]; do
                case "$OPTS" in
                    # Options that takes an extra param does not have a trailing '*'
                    # because they must be the last in the group.
                    -b) BAR="$2"
                        shift
                        ;;
                    -f*) FOO=1
                        ;;
                    -q*) QUIET=1
                        ;;
                    -*) die_usage "Illegal option '-${OPTS:1:1}'"
                        ;;
                esac
                OPTS=${OPTS/-?/-}              # Get next option
            done
            ;;
        *)  break
            ;;
    esac
    shift
done
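# No extra shift is needed here: the loop above has already shifted the options away
# $1 $2 ... contains positional args ($((N_ARGS - $#)) words were consumed)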

Empty a file keeping permissions

Empty a file named filename, keeping the same permission and user/group:

>filename

Print multi-lines with echo

Print multi-lines text with echo:

$ echo -e "Some text\n...on 2 lines..."                    # Enable interpretation of backslash escapes (must be quoted!)
Some text
...on 2 lines...

Print multi-line variables with echo

One can save the multi-line output of a command in a variable. Later this variable can be echoed while preserving the linefeeds, provided the variable is enclosed in quotes "...":

$ mymultilinevar=$(<myfile.txt sed -n -e '/first line/,/last line/p')
$ echo "$mymultilinevar"
first line
second line
...
last line

Echo with colors

References:

The command echo can display colors thanks to escape sequence commands [17]:

echo -e "\033[35;1m Shocking \033[0m"       #Display "shocking" in bright purple

The first character is the escape character 27 (033 in octal). One can also type directly ^[ (i.e. Ctrl-AltGr-[). The syntax is (where spaces were added for clarity)

\033 [ <command> m
\033 [ <command> ; <command> m

Note that commands can be chained. The set of commands is given in the color table below:

code	style	code	foreground	code	foreground	code	background	code	background
0	default colour			90	dark grey	40	black	100	dark grey
1	bold	31	red	91	light red	41	red	101	light red
4	underlined	32	green	92	light green	42	green	102	light green
5	flashing text	33	orange	93	yellow	43	orange	103	yellow
7	reverse field	34	blue	94	light blue	44	blue	104	light blue
		35	purple	95	light purple	45	purple	105	light purple
		36	cyan	96	turquoise	46	cyan	106	turquoise
		37	grey			47	grey

A more portable solution is to use tput.

ANSI Color Code Variables

See [18]. Use echo -e "${Red}Red" to use them:

# Reset
Color_Off='\e[0m'       # Text Reset

# Regular Colors
Black='\e[0;30m'        # Black
Red='\e[0;31m'          # Red
Green='\e[0;32m'        # Green
Yellow='\e[0;33m'       # Yellow
Blue='\e[0;34m'         # Blue
Purple='\e[0;35m'       # Purple
Cyan='\e[0;36m'         # Cyan
White='\e[0;37m'        # White

# Bold
BBlack='\e[1;30m'       # Black
BRed='\e[1;31m'         # Red
BGreen='\e[1;32m'       # Green
BYellow='\e[1;33m'      # Yellow
BBlue='\e[1;34m'        # Blue
BPurple='\e[1;35m'      # Purple
BCyan='\e[1;36m'        # Cyan
BWhite='\e[1;37m'       # White

# Underline
UBlack='\e[4;30m'       # Black
URed='\e[4;31m'         # Red
UGreen='\e[4;32m'       # Green
UYellow='\e[4;33m'      # Yellow
UBlue='\e[4;34m'        # Blue
UPurple='\e[4;35m'      # Purple
UCyan='\e[4;36m'        # Cyan
UWhite='\e[4;37m'       # White

# Background
On_Black='\e[40m'       # Black
On_Red='\e[41m'         # Red
On_Green='\e[42m'       # Green
On_Yellow='\e[43m'      # Yellow
On_Blue='\e[44m'        # Blue
On_Purple='\e[45m'      # Purple
On_Cyan='\e[46m'        # Cyan
On_White='\e[47m'       # White

# High Intensity
IBlack='\e[0;90m'       # Black
IRed='\e[0;91m'         # Red
IGreen='\e[0;92m'       # Green
IYellow='\e[0;93m'      # Yellow
IBlue='\e[0;94m'        # Blue
IPurple='\e[0;95m'      # Purple
ICyan='\e[0;96m'        # Cyan
IWhite='\e[0;97m'       # White

# Bold High Intensity
BIBlack='\e[1;90m'      # Black
BIRed='\e[1;91m'        # Red
BIGreen='\e[1;92m'      # Green
BIYellow='\e[1;93m'     # Yellow
BIBlue='\e[1;94m'       # Blue
BIPurple='\e[1;95m'     # Purple
BICyan='\e[1;96m'       # Cyan
BIWhite='\e[1;97m'      # White

# High Intensity backgrounds
On_IBlack='\e[0;100m'   # Black
On_IRed='\e[0;101m'     # Red
On_IGreen='\e[0;102m'   # Green
On_IYellow='\e[0;103m'  # Yellow
On_IBlue='\e[0;104m'    # Blue
On_IPurple='\e[0;105m'  # Purple
On_ICyan='\e[0;106m'    # Cyan
On_IWhite='\e[0;107m'   # White

Using tput

tput is a utility for using terminal-dependent capabilities from the shell.

Example of use:

# See 'man 5 terminfo' for a list of capabilities
echo "$(tput sgr0)This text is displayed normally."
echo "$(tput setaf 1)This text is displayed in RED."
echo "$(tput setaf 2)This text is displayed in GREEN."
echo "$(tput sgr0)This text is displayed normally."

Assuming that tput always generates the same escape sequence for a given capability, we can avoid the extra subshell calls by calling tput once for every format:

Z="$(tput sgr0)"
R="$(tput setaf 1)"
G="$(tput setaf 2)"
echo "${Z}This text is displayed normally."
echo "${R}This text is displayed in RED."
echo "${G}This text is displayed in GREEN."
echo "${Z}This text is displayed normally."

Get file size

The different ways to extract file size in a Bash script:

SIZE=$(stat -c%s "$FILENAME")                              # Using stat (GNU coreutils)
SIZE=$(ls -l "$FILENAME" | awk '{ print $5 }')             # Using ls / awk
SIZE=$(du -b "$FILENAME" | sed 's/\([0-9]*\)\(.*\)/\1/')   # Using du (GNU du)
SIZE=$(wc -c < "$FILENAME")                                # Using wc (no need for cat)
SIZE=$(ls -l "$FILENAME" | cut -d " " -f 5)                # Using ls / cut (fragile: assumes single spaces)

Read file content into env variable

Read the content of a file into a shell variable:

PID=$(cat "$PIDFILE")             # Whole file, using cat
read -r PID < "$PIDFILE"          # First line only, no fork

Get the PID of a new / background process

Getting the pid of a new process (when other processes with same name are already running)

oldPID=`pidofproc /usr/bin/ssh`
/usr/bin/ssh -f -N -n -q -D 1080 noekeon
RETVAL=$?
newPID=`pidofproc /usr/bin/ssh`
uniqPID=`echo $oldPID $newPID|sed -e 's/ /\n/g'|sort|uniq -u`
echo $uniqPID

Or if the process was launched in the background in a script [19]:

foo &
FOO_PID=$!
# do other stuff
kill $FOO_PID

Get the PID of a running process

Getting the pid of a running process

pid=$(pidof -o $$ -o $PPID -o %PPID -x /bin/ssh)

Detect if a given process is running

This is actually a tricky one. Some good solutions, all giving answer in $?:

[ -e /proc/$pid ]               # PID  - nice, but is it portable?
ps -p $pid >/dev/null           # PID  - need redirect, otherwise ps will print the process found
pgrep "^$name$"                 # NAME - probably the best using command-name
pkill -0 $name                  # NAME - ... similar & less robust (fail if process can't accept signal)
/bin/kill -0 $pid 2>/dev/null   # PID  - need redirect, otherwise kill will complain if no process found
                                #        ... also works with bash built-in kill
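
The kill -0 test can be wrapped in a small predicate. The sketch below (is_running is a hypothetical helper name) checks a background sleep before and after killing it; it uses wait to make sure the process has been reaped before the second check:

```bash
#!/bin/bash
# Succeeds if a process with the given PID exists and can be signalled
is_running() { kill -0 "$1" 2>/dev/null; }

sleep 30 &
pid=$!

if is_running "$pid"; then before=yes; else before=no; fi
kill "$pid"
wait "$pid" 2>/dev/null             # reap the child so that its PID disappears
if is_running "$pid"; then after=yes; else after=no; fi

echo "before=$before after=$after"
```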

Using ... =~ ...:

if [[ $(ps $pid) =~ $name ]];   # Test both PID and process name

Some wrong / bad solutions:

ps -aef | grep $pid                   # --== FAIL ==-- Will match grep process itself + $pid as ppid
ps -aef | grep $name                  # --== FAIL ==-- Will match grep process itself
ps -aef | grep -v grep | grep $pid    # --== UGLY ==-- ... and slow. Better use ps -fp $(pgrep $pid)
ps -p $pid | grep $pid                # --== SLOW ==-- better test $? immediately

Don't use these methods for locking in startup scripts, and be careful with race conditions. The best solution is to use a mutex or an atomic command (like mkdir), as described in the section Beware of Race conditions.

Launch a process in the background

Different ways to launch process in the background (unordered - might be useful one day...). The double ampersand trick comes from here.

myprocess.exe &
exec myprocess.exe
exec myprocess.exe &
( ( exec myprocess.exe & ) & )
nohup myprocess.exe &
( ( nohup myprocess.exe & ) & )

Display the name / body of functions

To list the functions declared in the current environment, or to list the body of a function:

declare -f                    # List all defined functions and their bodies
declare -f name               # List the body of function "name"
declare -F                    # List name of all defined functions

Or alternatively use bash built-in type:

type name                     # Works with commands, builtins, function, aliases...

Return the subnet address

Solution from [20].

/sbin/ifconfig eth0 |
grep 'inet addr' | tr .: '  ' |
(read inet addr a b c d Bcast e f g h Mask i j k l;
echo $(( $a & $i )).$(( $b & $j )).$(( $c & $k )).$(( $d & $l )) )

Remove file name extensions

FILENAME="myfile.pdf"
echo ${FILENAME%%.pdf}          # only matches '.pdf', not '.PDF'
echo ${FILENAME%%.???}          # only matches 3-letter extension
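
The difference between % (shortest match) and %% (longest match) shows up with multi-part extensions. A short sketch with a made-up file name:

```bash
#!/bin/bash
FILENAME="archive.tar.gz"           # hypothetical file name

short=${FILENAME%.*}                # remove the shortest suffix matching '.*'
long=${FILENAME%%.*}                # remove the longest suffix matching '.*'
ext=${FILENAME##*.}                 # keep only the last extension

echo "$short"                       # archive.tar
echo "$long"                        # archive
echo "$ext"                         # gz
```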

Formatted output / printing using printf

printf is a Bash built-in command that prints formatted output, much like the standard C printf function.

printf "%02d" 1                  # outputs '01'
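
Bash's printf can also store its result in a variable with -v, which avoids a command substitution. A brief sketch (the variable names are arbitrary):

```bash
#!/bin/bash
printf -v padded '%05d' 42                  # zero-padded to 5 digits
printf -v line '%-8s|%8s|' left right       # left- and right-aligned columns

echo "$padded"                              # 00042
echo "$line"                                # left    |   right|
```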

Delete files with special characters

find . -inum [inode] -exec rm -i {} \;     # Use inode
rm -- -foo                                 # Special case for name with a leading dash
rm ./-foo

Remove useless invocation of 'cat'

There are basically only 3 valid uses of cat:

Show the content of a file in a terminal
Write a "here" document or standard input to a file in a terminal
Concatenating several files together (hence the name of cat)

However, cat is frequently used for other purposes, like piping a file into a process. This is a bad habit: it is slow and adds an unnecessary process. A better alternative is to use the file redirection feature of the shell:

Correct use

cat file                       # Correct
cat <<EOF >file                # Correct
cat file1 file2                # Correct

Bad use (and fix)

cat file | myprocess           # Bad
$(cat file)                    # Bad
<file myprocess                # Correct
$(< file)                      # Correct

Using Process Substitution

The process substitution feature of Bash takes the form <(list) or >(list). The process list is run with its input or output connected to a FIFO (named pipe) or a file in /dev/fd. The name of this file is then passed as an argument to the current command (as a result of the expansion). We can see this explicitly with the following examples:

echo >(true)
# /dev/fd/63
echo <(true)
# /dev/fd/63

This feature can be used to build some very advanced redirection [21]:

diff <(ls dir1) <(ls dir2)                                         # Compare the content of 2 directories
sort -k 9 <(ls -l /bin) <(ls -l /usr/bin) <(ls -l /usr/X11R6/bin)  # Sort content of 3 directories
tar cf >(gzip -c > file.tar.gz) $directory                         # Equivalent of tar czf file.tar.gz $directory

It can also be used to use variables that would otherwise be limited to some subprocess, like:

: | ((x++))           # This actually starts a subprocess
: | ( ((x++)) )       # ... like this. 
echo $x               # ... so 'x' is undefined here

((x++)) < <(:)        # now variable 'x' remains in the main process
echo $x               # x is defined

One can use lastpipe to tell Bash to run the last pipe in the current shell though:

set +m                # Optional in script - disable job control (needed for lastpipe)
shopt -s lastpipe
: | ( ((x++)) )       # Now variable 'x' remains in the main process
echo $x               # x is defined
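
A common practical case is counting or collecting lines in a read loop. The sketch below assumes the default settings (lastpipe not enabled) and uses illustrative variable names. The counter updated on the right side of a pipe is lost, while the one fed by process substitution is kept:

```bash
#!/bin/bash
count_pipe=0
printf 'a\nb\nc\n' | while IFS= read -r _; do (( ++count_pipe )); done

count_procsub=0
while IFS= read -r _; do (( ++count_procsub )); done < <(printf 'a\nb\nc\n')

echo "pipe=$count_pipe procsub=$count_procsub"   # pipe=0 procsub=3
```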

Redirecting stdout and stderr with tee and a pipe

Using tee and the standard piping mechanism, it is easy to copy stdout to a file while still printing it on stdout:

command | tee stdout.log           # Keep a copy of 'command' output in file 'stdout.log'

What if we also want to do the same with stderr? In other words, can we also pipe stderr?
Yes, in Bash this is easy! We only need to use the process substitution feature (reference [22])!

command |& tee stdoutnerr.log                        # Pipe BOTH stdout and stderr
command 2> >(tee stderr.log >&2)                     # Keep a copy of 'command' stderr in file 'stderr.log'
command > >(tee stdout.log) 2> >(tee stderr.log >&2) # Keep both a copy of stdout and stderr in separate files

Note that tee always prints the content of stdin to stdout. That's why we need the redirection >&2 to redirect it back to stderr.

To redirect stdout for current script:

#! /bin/bash

exec > >(tee foo)

To redirect both stdout and stderr for current script:

#! /bin/bash

exec > >(tee foo) 2>&1

Forcing program to read from standard input instead of file

See /proc filesystem

Finding symbolic link target

Use readlink:

target=$(readlink -n source)       # Return the target of link 'source' as stored in the link (may be relative)
target=$(readlink -nf source)      # Return the canonical absolute path of the target of link 'source'

Escape special / meta- character in a string

Use printf "%q" to automatically escape special characters in a string, so that they can be reused as shell input:

printf "%q" 'pipe:[12345]'         # Returns "pipe:\[12345\]"
safefname=$(printf "%q" "$fname")  # Protects file name if it contains special character

Find intersection between 2 files

grep -f file1 file2
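
Note that with plain -f, each line of file1 is used as a regular expression and may match a substring. A minimal sketch of two stricter variants follows, on made-up sample files created in a temporary directory: grep with -F -x (fixed strings, whole lines) and comm on sorted input.

```bash
#!/bin/bash
# Hypothetical sample data.
tmpdir=$(mktemp -d)
printf '%s\n' apple banana cherry          > "$tmpdir/file1"
printf '%s\n' banana cherry date pineapple > "$tmpdir/file2"

# -F: fixed strings (no regex), -x: match whole lines only.
# Without -x, 'apple' would also match 'pineapple'.
common_grep=$(grep -Fxf "$tmpdir/file1" "$tmpdir/file2")

# comm needs sorted input; -12 keeps only the lines common to both files.
common_comm=$(comm -12 <(sort "$tmpdir/file1") <(sort "$tmpdir/file2"))

echo "$common_grep"
rm -rf "$tmpdir"
```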

Join lines with comma

pgrep -P $somepid | sed -re ':a N; s/\n/,/; b a'                    # With Sed
pgrep -P $somepid | perl -e '@_=<>; chomp @_; print join ",",@_'    # With Perl
pgrep -P $somepid | perl -e '@_=<>; chomp @_; $,=","; print @_'     # With Perl

Another example using tr:

echo -n "$(pgrep -P $somepid)" | tr '\n' ','                        # use -n "..." so that interim newline are kept, but none added at the end
echo $(pgrep -P $somepid) | tr ' ' ','                              # Here echo will translate interim newlines to space
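
Yet another option is paste, which joins all lines of its input when given -s. The helper name join_lines below is hypothetical, and the numbers stand in for the output of pgrep.

```bash
#!/bin/bash
# join_lines: hypothetical helper joining the lines of stdin with a
# one-character delimiter (comma by default).
join_lines() {
    paste -s -d "${1:-,}" -
}

printf '%s\n' 101 102 103 | join_lines        # comma by default
printf '%s\n' 101 102 103 | join_lines :      # custom delimiter
```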

Join arrays with delimiters

From StackOverflow:

# Multi-character delimiter
function join_by {
  local d=${1-} f=${2-}
  if shift 2; then
    printf %s "$f" "${@/#/$d}"
  fi
}

join_by , a b c #a,b,c
join_by ' , ' a b c #a , b , c
join_by ')|(' a b c #a)|(b)|(c
join_by ' %s ' a b c #a %s b %s c
join_by $'\n' a b c #a<newline>b<newline>c

# Single-character delimiter
function join_by { local IFS="$1"; shift; echo "$*"; }

join_by , a "b c" d #a,b c,d
join_by / var local tmp #var/local/tmp
join_by , "${FOO[@]}" #a,b,c

Force single trailing slash in directory

#function single() { echo ${1%%\/*}/; }             # WRONG!
function single() { A=${1%//}; echo ${A%/}/; }

for i in / // . ./ .// dir dir/ dir// /home/john; do single $i; done
# /
# /
# ./
# ./
# ./
# dir/
# dir/
# dir/
# /home/john/

Keep Color with Less

colordiff -bu file1 file2 | less -R            # Use -R to preserve color with less pager

Pad with newlines

Padding with newlines is a bit difficult because we cannot use a function and a command substitution because the latter will always remove the trailing newlines no matter what. A solution is as follows:

function padln()
{
    PAD=
    local N=$1
    while (( N-- > 0 )); do 
        PAD=$PAD$'\n'
    done
}
padln 2
VAR=$'line1\nline2\n'$PAD
echo "$VAR" | wc                   # Don't forget quotes!
#     5 ...                        (4 newlines in VAR, plus the one added by echo)
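
The loop can also be avoided. The sketch below is a hypothetical variant (padln_v is not part of the original tip) that builds the padding with printf -v and a pattern substitution.

```bash
#!/bin/bash
# padln_v: hypothetical variant of padln storing N newlines in the variable PAD.
padln_v() {
    local n=$1
    printf -v PAD '%*s' "$n" ''     # PAD = n spaces (empty when n is 0)
    PAD=${PAD// /$'\n'}             # turn every space into a newline
}

padln_v 3
VAR="line1"$PAD
printf '%s' "$VAR" | wc -l          # counts the newlines held in VAR
```

As with padln, the result is returned through a global variable, because a command substitution would strip the trailing newlines.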

Avoid duplicate entries in PATH

From [23]:

function addpath()
{
  new_entry=$1
  case ":$PATH:" in
    *":$new_entry:"*) :;; # already there
    *) PATH="$new_entry:$PATH";; # or PATH="$PATH:$new_entry"
  esac
}

Or using == operator:

function addpath()
{
  if ! [[ ":$PATH:" == *":$1:"* ]]; then   # Surround $PATH with : so that first and last entries match too
    export PATH="$1:$PATH"  # or PATH="$PATH:$1"
  fi
}

Another option is to use [ $(expr match ":$PATH:" ".*:$1:.*") -eq 0 ], but this spawns a process and hence is much slower.

Remove directory from PATH

Several solutions available from SO. Using the pure bash one (w/o process spawn):

rmpath() {
   local d
   d=":$PATH:"    # Surround $PATH with :
   d=${d//:$1:/:} # Replace all occurrences of :$1: with :
   d=${d#:}       # Remove leading :
   PATH=${d%:}    # Remove trailing :
}

Get directory of a sourced script

The best and simplest solution is to use readlink with parameter -f (requires package coreutils, and not portable on Mac OSX). The following works even if the script is itself a symlink.

BASEDIR=$(dirname "$(readlink -nf "${BASH_SOURCE[0]}")")

On Mac OSX, we have to use the more complex solution [24]:

SOURCE="${BASH_SOURCE[0]}"
while [ -h "$SOURCE" ] ; do SOURCE="$(readlink "$SOURCE")"; done
BASEDIR="$( cd -P "$( dirname "$SOURCE" )" && pwd )"

Some bad or limited solutions:

# BAD - Does not work if script is a symlink; only give a RELATIVE path.
BASEDIR="$(dirname "${BASH_SOURCE[0]}")"

# BAD - Does not work if script is a symlink.
BASEDIR="$(cd "$(dirname "${BASH_SOURCE[0]}" )" && pwd )"

# BAD - Does not work if script is a symlink.
# Dereference all paths, except script itself.
BASEDIR="$(cd -P "$(dirname "${BASH_SOURCE[0]}" )" && pwd )"

# BAD - Works only if script is executable and within PATH
PROGDIRNAME=$(dirname $(which "$0"))

Detect spaces in file name

Some script-fu of mine:

if [ "$(wc -w <<< "$FILENAME")" -eq 1 ]; then echo no spaces; else echo space found in filename; fi
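
The same check can be done without spawning wc, using pattern matching in [[ ]]. A minimal sketch follows; the helper name has_space and the sample file names are hypothetical.

```bash
#!/bin/bash
# has_space: hypothetical helper, true if the argument contains any
# whitespace character (space, tab, newline...).
has_space() {
    [[ $1 == *[[:space:]]* ]]
}

for name in "report.txt" "my report.txt"; do
    if has_space "$name"; then
        echo "space found in '$name'"
    else
        echo "no spaces in '$name'"
    fi
done
```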

Get SSH hostname from given host name

Say we have the following .ssh/config:

Host myhost
    UserName    myuser
    HostName    myhost.domain.com

[...]

We want to get the HostName corresponding to myhost:

#First pre-process ssh config file, only keeping lines of the form "host xxx yyy hostname zzz"
SSH_CONFIG="$(< ~/.ssh/config sed -rn 's/#.*//; s/ +/ /g; s/[hH]ost/host/; s/[nN]ame/name/; /host |hostname/p'|sed -r ':a /host/N; /hostname/!b a; {s/\n *hostname/ hostname/; p; d}')"

NAME="myhost"
echo "$SSH_CONFIG" | perl -lne 'print for / '"$NAME"' .*hostname +(.*)/g'     # Print the HostName found for $NAME

String and path manipulation

Echo first word in a space-separated list:

make="/usr/bin/make -r --no-print-directory -j 2"

# Using array
words=($make)
echo $words                  # $words same as ${words[0]}

# Using suffix matching
echo ${make%% *}             # %% removes the LONGEST suffix matching ' *' (a single % would only drop the last word)

# Using pattern matching
echo ${make/ */}

Replace a folder name within a path (i.e. a component in the middle of the path, not the last one):

FILE=/foobar/bar/foobar.txt
echo ${FILE/\/bar//fuu}      # We *must* escape first /, but 2nd can be as-is.
echo ${FILE//bar//fuu}       # WRONG. Will replace *all* occurrences of "bar" with "/fuu"
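
The prefix / suffix removal operators also cover the usual dirname / basename operations without spawning a process. A minimal sketch, reusing the FILE value from above (the other variable names are illustrative):

```bash
#!/bin/bash
FILE=/foobar/bar/foobar.txt

dir=${FILE%/*}          # remove shortest suffix matching "/*"  (like dirname)
base=${FILE##*/}        # remove longest prefix matching "*/"   (like basename)
stem=${base%.*}         # remove shortest suffix matching ".*"  (drop extension)
ext=${base##*.}         # remove longest prefix matching "*."   (keep extension)

echo "$dir | $base | $stem | $ext"
```

These expansions do not handle corner cases such as a path without any slash the way dirname does, so treat them as a shortcut for well-formed paths.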

Use `if [[ ... =~ pattern ]]` instead of `if ( ... | grep ... )`

Constructs like if ( ... | grep ... ) spawn 2 processes, and are therefore inefficient (in particular on Cygwin).

if ( ps aux | grep ssh-agent ); then echo ssh-agent found; fi    # NOT EFFICIENT, 2 processes spawn
 
if [[ $(ps aux) =~ ssh-agent ]]; then echo ssh-agent found; fi   # BETTER!!!
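
The =~ operator also gives access to the matched groups through the BASH_REMATCH array, which often replaces a grep | sed pipeline. A minimal sketch on a made-up log line:

```bash
#!/bin/bash
line="Accepted publickey for alice from 10.0.0.7 port 51234"   # made-up sample text

# Keep the regex in a variable: this avoids quoting surprises inside [[ ]].
re='for ([a-z]+) from ([0-9.]+)'

if [[ $line =~ $re ]]; then
    user=${BASH_REMATCH[1]}     # first parenthesized group
    addr=${BASH_REMATCH[2]}     # second parenthesized group
    echo "user=$user addr=$addr"
fi
```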

Test whether a variable is set/defined/unset/empty

One can use the rich parameter expansion possibilities:

echo ${VAR:-word}	Use Default Values — (expansion of) word if VAR is unset or null; `$VAR` otherwise
echo ${VAR-word}	Use Default Values — (expansion of) word if VAR is unset; `$VAR` otherwise
echo ${VAR:+word}	Use Alternate Values — nothing if VAR is unset or null; (expansion of) word otherwise
echo ${VAR+word}	Use Alternate Values — nothing if VAR is unset; (expansion of) word otherwise

We have:

unset U
E=""
S="s e t"
echo U${U+x} E${E+x} S${S+x} U${U:+x} E${E:+x} S${S:+x}
#    U       Ex      Sx      U        E        Sx

echo U${U-x} E${E-x} S${S-x} U${U:-x} E${E:-x} S${S:-x}
#    Ux      E       Ss e t  Ux       Ex       Ss e t

So one can test whether var is unset, set or empty with (the quotes, where present, are necessary in the test):

[ -z ${var+x} ] && echo "unset" || echo "set to '$var'"
[ -n "${var+x}" ] && echo "set to '$var'" || echo "unset"
[ -z "${var-x}" ] && echo "empty" || echo "non-empty or unset"
[ -n "${var:+x}" ] && echo non-empty || echo empty or unset

If we want to test that a set of variables are defined, we can use indirect expansion:

REFS="FOO BAR[0] BAR[1]"
for refs in $REFS; do
    [ -n "${!refs+defined}" ] || echo "Variable '$refs' is NOT defined"
done

As we see it also works nicely with arrays!

Alternatively, at an interactive prompt, type echo $VAR and press the Tab key: Bash appends a space if VAR is set (even if empty).
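
Recent versions of Bash (4.2 and later, as far as we know) also provide the -v test, which takes a variable name. A minimal sketch; the helper name is_set is hypothetical.

```bash
#!/bin/bash
# is_set: hypothetical helper, true if the variable whose NAME is given
# as $1 is set (even if it is empty).
is_set() {
    [[ -v $1 ]]
}

unset U
E=""
for name in U E; do
    if is_set "$name"; then echo "$name is set"; else echo "$name is unset"; fi
done
```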

Use `sponge` to easily modify a file inplace

sponge is part of package moreutils. It can be used to easily edit file in-place:

sed -r '...' FILE | grep ... | sponge FILE                   # Sponge soaks its full input before creating output file

Use auto-complete with command starting with 'sudo'

Just add to .bashrc ([25]):

if [ "$PS1" ]; then
    complete -cf sudo            
fi

Test if a directory is empty

From [26]:

$ [ "$(ls -A /tmp)" ] && echo "Not Empty" || echo "Empty"
# OR
if [ "$(ls -A /tmp)" ]; then
    echo "Not Empty"
else
    echo "Empty"
fi

A solution that does not invoke a sub-shell [27]:

shopt -s nullglob
shopt -s dotglob # To include hidden files
files=(/some/dir/*)
if [ ${#files[@]} -gt 0 ]; then echo "huzzah"; fi
shopt -u nullglob dotglob
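
The snippet above changes and then resets global shell options. A minimal sketch of a wrapper follows (is_empty_dir is a hypothetical name): its body is written between parentheses, so the shopt settings do not leak into the caller, at the cost of one subshell.

```bash
#!/bin/bash
# is_empty_dir: hypothetical helper, true if directory $1 has no entry
# (hidden files included). The ( ) body runs in a subshell.
is_empty_dir() (
    shopt -s nullglob dotglob
    files=("$1"/*)
    (( ${#files[@]} == 0 ))
)

dir=$(mktemp -d)
is_empty_dir "$dir" && echo "empty"
touch "$dir/.hidden"
is_empty_dir "$dir" || echo "not empty"
rm -rf "$dir"
```

Note that a non-existing directory is also reported as empty by this sketch.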

Be more efficient with Bash console

Use Alt-. to insert the last argument of the previous command.

$ cd mydirectory
bash: cd: mydirectory: No such file or directory
$ mkdir Alt-.

Use !! to repeat the previous command. Very handy for:

$ apt-get install package
E: Could not open lock file /var/lib/dpkg/lock - open (13: Permission denied)
E: Unable to lock the administration directory (/var/lib/dpkg/), are you root?
$ sudo !!

Sum integers, one per line?

From stackoverflow.com

awk '{s+=$1} END {print s}' mydatafile
awk '{s+=$1} END {printf "%.0f", s}' mydatafile       # To avoid 2^31 overflow in some version of awk

Test existence of an array index or key

We find the following solution on stackoverflow.com

[ ${array[key]+abc} ] && echo "exists"

We can extend the solution. For instance, say we want to return a default key if a given key is not found:

read -p "enter key" key
echo "Value for key $key is ${array[$key]:-${array[default]}}"      # Will print value for $key, or the value of key 'default' if not found (or empty)
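
Put together, this could look as follows. This is only a sketch: the array content and the helper name lookup are made up for illustration.

```bash
#!/bin/bash
declare -A color=( [apple]=red [lemon]=yellow [default]=unknown )   # sample data

# lookup: hypothetical helper printing the value stored for a key, or the
# value stored under 'default' when the key does not exist.
lookup() {
    local key=$1
    if [ "${color[$key]+abc}" ]; then
        echo "${color[$key]}"
    else
        echo "${color[default]}"
    fi
}

lookup apple
lookup kiwi
```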

How to detect if a script is being sourced

This is a tough question, see stackoverflow for details [28].

The best solution, if your Bash supports BASH_SOURCE:

[[ "${BASH_SOURCE[0]}" != "${0}" ]] && echo "script ${BASH_SOURCE[0]} is being sourced ..."

The following solution is portable between Bash and Korn:

[[ $_ != $0 ]] && echo "Script is being sourced" || echo "Script is a subshell"

Get ip address of local host / remote host

Remote host:

getent hosts remotehost | awk '{ print $1; exit }'
dig +short remotehost | head -n 1

local host:

hostname -I | awk '{ print $1 }'     # awk because might have several ip address

Expand tilde `~` in variables

The simplest [29]:

var="${var/#\~/$HOME}"             # If var contains a single file name, var="~/myfile"
var="${var//\~/$HOME}"             # If var contains several file names, var="~/myfile1 ~/myfile2"

DO NOT USE eval. Using eval is not safe if applied without safeguard (variable could eval to rm -rf $HOME).

Run a command when a file changes

Easiest solution is to use entr:

find . -name '*.c' | entr make     # Quote the pattern, or the shell may expand it before find sees it

Alternatively, use inotifywait or script sleep_until_modified.sh [30].

Remove CRLF and trailing whitespace in text files

Using ack-grep:

# Convert CRLF to LF (2x to get rid of CRCRLF)
ack-grep -f --text --print0 | xargs -0 dos2unix
ack-grep -f --text --print0 | xargs -0 dos2unix
# Convert CR to LF
ack-grep -f --text --print0 | xargs -0 mac2unix
# Remove trailing blanks/tabs
ack-grep -f --text --print0 | xargs -0 sed -ri 's/[ \t]+$//'

Using ag:

# Convert CRLF to LF (2x to get rid of CRCRLF)
ag -lt0 | xargs -0 dos2unix       # or 'ag --files-with-matches --all-text --print0 ...'
ag -lt0 | xargs -0 dos2unix
# Convert CR to LF
ag -lt0 | xargs -0 mac2unix
# Remove trailing blanks/tabs
ag -lt0 | xargs -0 sed -ri 's/[ \t]+$//'

Using find to restrict to some extensions:

# Convert CRLF to LF (2x to get rid of CRCRLF)
find -type f -regex ".*\.\(c\|h\|cpp\|hpp\)" -print0 | xargs -0 dos2unix
find -type f -regex ".*\.\(c\|h\|cpp\|hpp\)" -print0 | xargs -0 dos2unix
# Convert CR to LF
find -type f -regex ".*\.\(c\|h\|cpp\|hpp\)" -print0 | xargs -0 mac2unix
# Remove trailing blanks/tabs
find -type f -regex ".*\.\(c\|h\|cpp\|hpp\)" -print0 | xargs -0 sed -ri 's/[ \t]+$//'

Detect if script redirected through pipe

From stackoverflow.com:

if [ -t 1 ] ; then echo terminal; else echo "not a terminal"; fi
# terminal
(if [ -t 1 ] ; then echo terminal; else echo "not a terminal"; fi) | cat
# not a terminal

Try running a program until it succeeds

This is typically useful for cron scripts. From StackExchange:

#!/bin/sh
# Check to see if this is already running from some other day
mkdir /tmp/lock || exit 1
while ! command-to-execute-until-succeed; do
    # Wait 30 seconds between successive runs of the command
    sleep 30
done
rmdir /tmp/lock
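
When the command may never succeed, it is safer to bound the number of attempts. The helper below is a hypothetical sketch (name and parameters are ours), not part of the StackExchange answer.

```bash
#!/bin/bash
# retry MAX DELAY COMMAND...: hypothetical helper running COMMAND until it
# succeeds, at most MAX times, sleeping DELAY seconds between attempts.
retry() {
    local max=$1 delay=$2 attempt=1
    shift 2
    until "$@"; do
        if (( attempt >= max )); then
            return 1                    # give up after MAX failed attempts
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

retry 3 0 true && echo "succeeded"
retry 2 0 false || echo "gave up"
```

In a cron script one would typically call it as retry 10 30 command-to-execute-until-succeed, between the mkdir and rmdir of the lock directory.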

Infinite wait in Bash

From SO:

#! /bin/bash

trap 'trap - INT TERM EXIT; rm -f mypipe; exit $?' INT TERM EXIT
mkfifo mypipe

while : ; do
    read S <mypipe
    case "$S" in
      *EXIT*)
        >&2 echo "Got EXIT."
        break
        ;;
      *)
        >&2 echo "Signal '$S' not supported."
        ;;
    esac
done

exit 0

The only drawback: the source process writing to the fifo blocks until the sink process starts reading the fifo again. See SO again for ftee, a tee-like clone that can pipe to a fifo without blocking.

Functions to manipulate IP addresses

ip_to_int()
{
    local IP=$1
    echo $(( $(echo $IP | sed -r 's/^/(((/; s/\./)*256+/g') ))
}

cidr_to_int()
{
    local CIDR=$1
    echo $(( (0xFFFFFFFF << (32-CIDR)) & 0xFFFFFFFF ))
}

int_to_ip()
{
    local INT=$1
    local IP3=$(( (INT >> 24) & 0xFF ))
    local IP2=$(( (INT >> 16) & 0xFF ))
    local IP1=$(( (INT >> 8) & 0xFF ))
    local IP0=$(( INT & 0xFF ))
    echo "$IP3.$IP2.$IP1.$IP0"
}

cidr_to_mask()
{
    local CIDR=$1
    int_to_ip $(cidr_to_int $CIDR)
}

ip_cidr_to_subnet()
{
    local IP_INT=$(ip_to_int $1)
    local CIDR_INT=$(cidr_to_int $2)
    int_to_ip $((IP_INT & CIDR_INT))
}

Example of use:

cidr_to_mask 24
# 255.255.255.0
ip_cidr_to_subnet 192.168.10.15 24
# 192.168.10.0
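
The conversion from dotted quad to integer can also be done without sed. The function below is a hypothetical variant (ip_to_int_pure is our name); it assumes a well-formed address without leading zeros, since a leading 0 would be read as octal by the arithmetic expansion.

```bash
#!/bin/bash
# ip_to_int_pure: hypothetical sed-free variant of ip_to_int.
# Assumes a well-formed dotted quad without leading zeros.
ip_to_int_pure() {
    local a b c d
    IFS=. read -r a b c d <<< "$1"      # split on dots, for this read only
    echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

ip_to_int_pure 192.168.10.15
```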

Check if a program exists from a Bash script

From SO.

Ideally use

hash (Bash shell)

Or either

command (POSIX compatible).
type (Bash shell)

hash foo 2>/dev/null || { echo >&2 "I require foo but it's not installed.  Aborting."; exit 1; }
command -v foo >/dev/null 2>&1 || { echo >&2 "I require foo but it's not installed.  Aborting."; exit 1; }
type foo >/dev/null 2>&1 || { echo >&2 "I require foo but it's not installed.  Aborting."; exit 1; }

hash has the added advantages that the location of the command is cached (hashed) if it exists, and that aliases are ignored.

DO NOT USE which FOR TESTING! It spawns a process for doing little and is not guaranteed to return an error code.
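
When a script depends on several tools, the check can be wrapped in small helpers. The names have and require below are hypothetical; the sketch uses the POSIX-compatible command -v.

```bash
#!/bin/bash
# have: hypothetical helper, true if the given command can be found.
have() {
    command -v "$1" >/dev/null 2>&1
}

# require: hypothetical helper reporting every missing command at once.
require() {
    local cmd missing=0
    for cmd in "$@"; do
        have "$cmd" || { echo >&2 "I require $cmd but it's not installed."; missing=1; }
    done
    return "$missing"
}

require sh ls && echo "all tools found"
```

A script would typically call require at the top and abort when it fails, e.g. require foo bar || exit 1.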

Change a relative path into an absolute (aka full) path

The easiest is to use readlink from package coreutils:

RELATIVE=./src/my.c
echo $(readlink -e "$RELATIVE")          # $RELATIVE must exist
echo $(readlink -f "$RELATIVE")          # All path components but the last must exist
echo $(readlink -m "$RELATIVE")          # Works even if $RELATIVE is missing

Escape positional args for reuse in shell input

Say we write a script that takes a few parameters, and this script must pass along these parameters to another script on a remote machine through ssh. For instance, we would call the script with

local-exec "1st arg" '2nd (arg)'

Then we would like the script to run the ssh command

ssh user@remote remote-exec "1st arg" '2nd (arg)'

Again Stack Overflow comes to the rescue, which we summarize here:

Use $(printf " %q" "$@") (note the space before %).
Use ${*@Q} or "${*@Q}" (available since Bash 4.4).

Script local-exec

#! /bin/bash
#
#  local-exec

# 1st solution -- using printf and %q -- NOTE THE *SPACE* BEFORE %
# ssh user@server ./remote-exec "$(printf " %q" "$@")"

# 2nd solution:
ssh user@server ./remote-exec ${*@Q}

Script remote-exec on the remote machine:

#! /bin/bash

for arg; do
    echo "'$arg'"
done

This gives:

./local-exec "1st arg" '2nd (arg)'
# '1st arg'
# '2nd (arg)'

This works for passing command and command parameters with ssh, bash -c...
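
The effect of %q can be checked locally, with bash -c standing in for the remote shell. A minimal sketch; the helper name quote_args is hypothetical.

```bash
#!/bin/bash
# quote_args: hypothetical helper printing its arguments, each one preceded
# by a space and quoted so that it can be reused as shell input.
quote_args() {
    printf ' %q' "$@"
}

# Build the command line as ONE string, as ssh or bash -c would receive it.
cmdline="printf '<%s>\n'$(quote_args "1st arg" '2nd (arg)')"
bash -c "$cmdline"
```

Each argument comes out between angle brackets on its own line, which shows that spaces and parentheses survived the second round of shell parsing.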

Detect if scripts run on Linux or Windows

A simple one:

if [[ $OSTYPE == linux-* ]]; then
    echo "Running on Linux"
elif [[ $OS == Windows_NT ]]; then
    echo "Running on Windows"
else
    echo "Operating system not detected."
    return 1
fi

Start a new interactive bash sub-shell with some initial command

Again, StackExchange to the rescue:

bash -rcfile <(echo ". $HOME/.bashrc; FOO=foo; export BAR=bar; pwd")

On Debian, this will source /etc/bash.bashrc [31], then source ~/.bashrc, then execute some commands, and remain in the sub-shell. Use exit to leave the subshell.

On other system, /etc/bash.bashrc might need to be sourced explicitly:

bash -rcfile <(echo ". /etc/bash.bashrc; . $HOME/.bashrc; FOO=foo; export BAR=bar; pwd")

Note that the above is strictly equivalent to doing in a shell:

bash
FOO=foo
export BAR=bar
pwd

So, even non-exported variable will be part of the new sub-shell.

The following solution is more compact, but loses the non-exported variables:

bash -c 'FOO=foo; export BAR=bar; pwd; exec bash'

Find non-ascii characters

# Using grep
find -print0 | LANG=C LC_ALL=C xargs -0 grep -Pl "[\x80-\xff]"

# Using ag
ag -l "[\x80-\xff]"

Split list of words as separate lines / filter duplicate words

Say we have

FOO="foo bar baz foo"

We can easily split that into separate lines with xargs

echo $FOO | xargs -n1
# foo
# bar
# baz
# foo

For instance we can use that to filter duplicate words:

echo $FOO | xargs -n1 | sort -u
# bar
# baz
# foo
echo $FOO | xargs -n1 | sort -u | xargs
# bar baz foo

Set IFS / GLOBIGNORE for one assignment only

From SO:

IFS=$'\r\n' GLOBIGNORE='*' command eval  'XYZ=($(cat /etc/passwd))'

Using command eval, the first two variable assignments are only valid for the duration of the command. Without it, the line would be read as three variable assignments that persist in the script.
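
For a regular builtin such as read, the command eval trick is not needed: an assignment placed before the command applies to that command only. A minimal sketch on a made-up comma-separated string:

```bash
#!/bin/bash
csv="foo,bar baz,qux"        # sample input

# IFS is changed for this read only; the IFS of the shell is untouched.
IFS=, read -r -a fields <<< "$csv"

printf '%s\n' "${fields[@]}"
```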

Duplicate stdout to stderr

echo foo | tee /dev/stderr

Pick random line in a text file

We can use sort -R or shuf:

sort -R FILE | head -n 1
shuf -n 1 FILE

Store list of files in a directory in a array

From SO:

# Simpler
A=(*)                        # Also works when file names contain spaces
for f in "${A[@]}"; do ...   # Mind the "..."

# Simple + support empty directory
shopt -s nullglob
A=(*)
for f in "${A[@]}"; do ...

# Patterns
shopt -s nullglob
A=(*.h)
for f in "${A[@]}"; do ...

# More powerful patterns
shopt -s globstar nullglob   # Add dotglob to also scan dot dir
A=( **/*"$input"* )
for f in "${A[@]}"; do ...

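Using find also works. Use -print0 so that paths containing spaces (or even newlines) are handled correctly; the last example below, based on word splitting, only works if the paths do not contain spaces: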

# https://stackoverflow.com/questions/23356779/how-can-i-store-the-find-command-results-as-an-array-in-bash
readarray -d '' array < <(find . -name "$input" -print0)   # Bash 4.4+
array=()
while IFS=  read -r -d $'\0'; do
    array+=("$REPLY")
done < <(find . -name "${input}" -print0)                  # Bash 4.3 or before

# ... or using lastpipe to avoid process substitution
set +m
shopt -s lastpipe
array=()
find . -name "${input}" -print0 | while IFS=  read -r -d $'\0'; do array+=("$REPLY"); done

# Keep only files, starting with A
A=($(find . -type f -name 'A*')) # Only if no space! Quote the pattern so that the shell does not expand it
for f in "${A[@]}"; do ...       # quotes useless in fact...

Test if files with given pattern exists

No easy way with bash

# https://unix.stackexchange.com/questions/79301/test-if-there-are-files-matching-a-pattern-in-order-to-execute-a-script
shopt -s nullglob
set -- *.txt
if [ "$#" -gt 0 ]; then
  ./script "$@" # call script with that list of files.
fi
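
Another option that leaves the positional parameters and the shell options untouched is the compgen builtin with -G, which lists the files matching a glob pattern and fails when there are none. A minimal sketch; the helper name has_match and the sample files are hypothetical.

```bash
#!/bin/bash
# has_match: hypothetical helper, true if at least one file matches the
# glob pattern given as $1 (quote it so that compgen does the expansion).
has_match() {
    compgen -G "$1" > /dev/null
}

dir=$(mktemp -d)
touch "$dir/notes.txt"
has_match "$dir/*.txt" && echo "txt files found"
has_match "$dir/*.pdf" || echo "no pdf files"
rm -rf "$dir"
```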

Wait for several jobs in background to finish

We use the tip from SO:

set -e    # Exit on first error

job1 &
job2 &
# ...

# Waiting loop
while true; do
  wait -n || {
    code="$?"
    ([[ $code = "127" ]] && exit 0 || exit "$code")
    break
  }
done;

Alternatively, there is also parallel.
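
If we need the exit status of every job rather than stopping at the first failure, we can record the PIDs and wait for each of them. A minimal sketch, where short subshells stand in for the real jobs:

```bash
#!/bin/bash
pids=()
( exit 0 ) & pids+=("$!")       # stand-ins for real background jobs
( exit 3 ) & pids+=("$!")
( exit 0 ) & pids+=("$!")

failed=0
for pid in "${pids[@]}"; do
    # wait PID returns the exit status of that particular job
    wait "$pid" || { echo "job $pid failed with status $?"; failed=$((failed + 1)); }
done
echo "$failed job(s) failed"
```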

Find a file in a list / in an array (exclude pattern)

Without regular expression, space separated:

T="source/foo.c include/foo.h source/parrot.c"
T_EX="source/parrot.c include/parrot.h"
for f in $T; do 
  [[ " $T_EX " =~ " $f " ]] && echo "Excluding: $f" || echo "Processing: $f"
done

With regex + support for spaces:

declare -a T
T+=("source/f o o.c")
T+=("include/f o o.h")
T+=("source/par rot.c")
T_EX=("source/par.*")
T_EX+=("include/par rot.h")

match()
{
  f=$1
  while [ $# -gt 1 ]; do
    shift
    [[ $f =~ $1 ]] && return 0
  done
  return 1
}

for f in "${T[@]}"; do
    match "$f" "${T_EX[@]}" && echo "Ignoring   $f" && continue
    echo "Processing $f"
done

Functions for emulating multi-dimensional associative array

Building up on ideas from SO and SO.

#! /bin/bash

filter_values()
{
    local -n myvar="$1"
    local filter=$2
    local key
    for key in ${!myvar[@]}; do
        [[ $key =~ ^$filter, ]] && echo ${myvar[$key]}
    done
}

filter_keys()
{
    local -n myvar="$1"
    local filter=$2
    local key
    for key in ${!myvar[@]}; do
        [[ $key =~ ^$filter, ]] && echo ${key##$filter,}
    done
}

declare -Ax A=( [id,one]=ONE [id,two]=TWO [tgt,one]=TGTONE [tgt,two]=TGTTWO )

filter_keys A id
# two
# one
filter_values A id
# TWO
# ONE

filter_keys A tgt
# one
# two
filter_values A tgt
# TGTONE
# TGTTWO

Display MOTD in bash shell

Message-of-the-Day (motd) provides interesting information, like when firmware upgrades are available. By default, motd is only displayed in the console logins.

Add the snippet below in ~/.bashrc to show motd at least once every 24h.

# ~/.bashrc

#### MOTD
########################
touch -d "yesterday" ~/.yesterday
if [ ~/.last-motd -ot ~/.yesterday ]; then
	touch ~/.last-motd
	[ -f /etc/motd ] && cat /etc/motd
	[ -d /etc/update-motd.d ] && run-parts --lsbsysinit /etc/update-motd.d
fi
rm ~/.yesterday

Keep terminal width in piped command

Some commands adapt their output to the terminal width for nicer formatting. However, when the output is piped in another command (say cat), this property is lost.

To restore this behaviour even in case of piping, we can set the COLUMNS variable:

COLUMNS=$(tput cols) my_command | cat    # tell my_command how many columns are available

To also restore the number of lines, we can use LINES and stty size:

LINES=$(stty size | cut -d' ' -f1) COLUMNS=$(stty size | cut -d' ' -f2) my_command | cat

Modify an array in a function

Say we have an array, and we want to modify its content in a function.

An easy way is to pass the array by reference using a nameref (Bash 4.3+):

crop() {
    local -n array=$1              # This creates a nameref to the given array
    array=("${array[@]:1}")        # Remove the first element from the array
    echo "new array: ${array[*]}"
}

foo=(123 456 789 101 112)
bar=(234 567 890 123 345)
baz=(345 678 901 234 456)

crop foo
crop bar
crop baz

echo "foo: ${foo[@]}"
echo "bar: ${bar[@]}"
echo "baz: ${baz[@]}"

Pits

A list of frequent gotchas!

Space! - Don't forget to add spaces wherever necessary, in particular around braces in function definitions, or in test conditions for ifs.

if -space- [ -space- -f /etc/foo -space- ]; then ...
function myfunc() { -space- echo Hello, World!; }

Quote - Always quote parameters and variables passed to test in if ... then ... else:

if [ "$name" -eq 5 ]; then ...

For loops with files - Simply use * to list files in for loops, not `ls *`:

for file in *; do cat "$file"; done         # SUCCEEDS, even if white space
for file in `ls *`; do cat "$file"; done    # FAILS miserably

Incorrect variable definition - NO space around the equal sign: `var = val` is interpreted as command `var` with parameters `= val`. No dollar `$` prefix either! So it is MYVAR=value and not MYVAR= value.

srcDir = $1                            # WRONG - spaces around = sign
$srcDir=$1                             # WRONG - $ prefix
maxW= $(sed -rn '/$^/Q' myfile.txt)    # WRONG - SPACE!
srcDir=$1                              # CORRECT
srcDir="$1"                            # BEST

Semi-colon in find - Semi-colons in find commands must be escaped!

find . -exec echo {} ;                 # WRONG - semi-colon not escaped
find . -exec echo {} \;                # CORRECT

Using a bash built-in instead of external program - Bash built-in commands override external commands with the same name (e.g. kill and echo).

$ type kill
# kill is a shell builtin
$ type /bin/kill
# /bin/kill is /bin/kill
$ /bin/kill -v
# kill (cygwin) 1.14

Wrong redirection order

read pid < $PID_FILE 2> /dev/null      # WRONG - error msg if $PID_FILE doesn't exist
read pid 2> /dev/null < $PID_FILE      # CORRECT

Variable not exported outside parens

( read pid < $PID_FILE ) 2> /dev/null  # WRONG - var pid not kept
read pid 2> /dev/null < $PID_FILE      # CORRECT

Read and piping - Don't pipe to the read command, or use parens to stay in the same subshell! Better yet, use `set`.

echo "1 2 3" | read a b c; echo $a $b $c      # WRONG - subshell
echo "1 2 3" | (read a b c; echo $a $b $c)    # CORRECT - same subshell
set -- $(echo "1 2 3"); echo $1, $2, $3       # BETTER

Don't quote tilde ... nor the following slash!

if [ -a "~/bin/my file" ]; then echo found; fi    # WRONG
if [ -a ~/bin/"my file" ]; then echo found; fi    # CORRECT
export FOO=~"/foo bar"                            # WRONG
export FOO=~/"foo bar"                            # CORRECT

Need quoting when echoing a variable with embedded newlines - This is because the shell splits an unquoted expansion on newlines (like on any blank), so echo receives them as parameter separators. Moreover, command substitution always removes the trailing newlines, no matter what.

HEADER=$(sed -rn '/$^/Q' myfile.txt)
echo "$HEADER"                                           # CORRECT
echo $HEADER                                             # WRONG - newlines are removed
VAR=$'\n\n'; echo "$VAR"                                 # CORRECT, newlines are kept
VAR="$(echo; echo)"; echo "$VAR"                         # WRONG, trailing newlines stripped!
VAR="$(echo; echo; echo x)"; VAR=${VAR%x}; echo "$VAR"   # FIXED

Also when using `eval`:

eval $(somefunc foo bar)               # WRONG, if somefunc returns several lines
eval "$(somefunc foo bar)"             # CORRECT

Always append to /dev/stderr or use >&2 instead - The construct `ls >/dev/stderr` is wrong because if stderr was redirected to a file, then `>/dev/stderr` will overwrite the file content. Better use `ls >>/dev/stderr`, or best `ls >&2`. In the same way, never redirect stderr to a file, but instead to a process using Bash's process substitution trick, so as to prevent an undesired file reset.

sample() {
    echo "foo" >/dev/stderr
    echo "bar" >/dev/stderr
}
# REFERENCE:
sample                                 # both lines
# WRONG:
sample 2> foobar.txt
cat foobar.txt                         # Only last line
# FIX USING PROCESS SUBSTITUTION:
sample 2> >(cat >foobar.txt)
cat foobar.txt                         # both lines

Exit status of pipelines - A pipeline returns the status of its last step. Use the `PIPESTATUS` array to get the status of each step separately.

# WRONG - $? will return exit status of 'tee'
make | tee make.log
status=$?
# CORRECT
make | tee make.log
exit ${PIPESTATUS[0]}

`read` does not preserve spaces and backslashes by default.

# WRONG - Use read with default options
read -p "password: " passwd
echo "$passwd"
# CORRECT - Use IFS= and -r to keep blanks / backslashes
IFS= read -r -p "password: " passwd
echo "$passwd"

Do not give extra quotes in pattern matching. Use a `[[ ]]` block.

# WRONG - Extra quotes or wrong block
if [[ $NAME == "*.c" ]]; then mv $NAME src/; fi
if [ $NAME == *.c ]; then mv $NAME src/; fi
# CORRECT - Use [[ ]] and no extra quotes
if [[ $NAME == *.c ]]; then mv $NAME src/; fi

There are no local variables in bash - Variables modified in a child function also affect the parent function, even if the parent function uses the keyword `local`. A parent function can't prevent its children from modifying its variables. It is the opposite: by using the keyword `local`, a function avoids modifying the variable in the parent.

function b() {
    SRC=overwritten-$1
    echo $SRC
}
function a() {
    local SRC=$1                       # WRONG! what if fct. b redefines SRC?
    local MYSCRIPT_SRC=$1              # CORRECT. Use unique variable names
    b $SRC
    echo $SRC $MYSCRIPT_SRC
}

`local` absorbs the return status of any process called within.

local OUT=$(foo BAR)
local RC=$?                            # WRONG! $? will always be 0
local OUT
OUT=$(foo BAR)
local RC=$?                            # CORRECT

`set -e` has NO effect when used in a `||`, `&&` list, or any expression following `while`, `until`, `if`, `elif` [32], [33].

function fail() {
    set -e
    cat file                           # Here assume 'file' exists
                                       # WRONG! set -e will have no effect if fail
    # ...                              # is called in an AND-OR expr or alike.
    cat file_not_found_as_well
    set +e
}
fail && true

Not setting LANG=C LC_ALL=C when dealing with strings

sort myfile.txt                        # BAD - locale dependent
LC_ALL=C sort myfile.txt               # OK - Traditional sort
somelen=${#line}                       # BAD - Get "some" length
LANG=C LC_ALL=C bytlen=${#line}        # OK - Get byte length
find -print0 | LANG=C LC_ALL=C xargs -0 grep -Pl "[\x80-\xff]"    # Grep for non-ascii characters

Set variables in a pipeline - From the manpage: each command in a pipeline is executed as a separate process (i.e., in a subshell). We can use `shopt -s lastpipe` to let the last element run in the shell process, but job control must be disabled.

A=1; ( A=2 ) | ( A=3 ); echo $A        # BAD - A is 1!
