Awk: Difference between revisions

Revision as of 16:04, 30 May 2019

References

An Awk Primer (good tutorial on Awk)
gawk User guide
GAWK: Effective AWK Programming (gawk.pdf from package gawk-doc)

Awk Examples

ps al | awk '{print $2}'                                         # Print second field of ps output
arp -n 10.137.3.129|awk '/ether/{print $3}'                      # Print third field of arp output, if line contains 'ether' somewhere
getent hosts unix.stackexchange.com | awk '{ print $1 ; exit }'  # Print only first line, then exit
find /proc -type l | awk -F"/" '{print $3}'                      # Print second folder name (i.e. process pid)

Example of parsing an XML file (and comparing with perl):

cat FILE
#        <configuration buildProperties="" description="" id="some_id.1525790178" name="some_name" parent="some_parent">
awk -F "[= <>\"]+" '/<configuration / { if ($8 == "some_name") print $6 }' FILE
# some_id.1525790178
perl -lne 'print $1 if /<configuration .* id="([^"]*)" name="some_name"/' FILE
# some_id.1525790178
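
When a multi-character field separator such as the one above is used, it can be hard to guess which field number holds which value. A minimal sketch, assuming the same sample line as above (without the leading comment marker), that prints every field with its index:

```shell
# Sample line, copied from the example above.
line='        <configuration buildProperties="" description="" id="some_id.1525790178" name="some_name" parent="some_parent">'

# Print each field with its index to see where the separators cut the record.
fields=$(printf '%s\n' "$line" |
    awk -F '[= <>"]+' '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }')
printf '%s\n' "$fields"
```

In the listing, field 6 holds the id value and field 8 the name value, which is why the awk one-liner above tests $8 and prints $6.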

Language reference

Awk program structure

@include "script1"    # gawk extension
pattern {action}
pattern {action}
# ...
function name (args) { ... }

A rule consists of a pattern and an action. Either can be omitted: a missing pattern matches every input record, and a missing action prints the matching record.
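
A short sketch of both cases, using three made-up input records (the variable names are only for illustration):

```shell
# Three input records for the demonstration.
input=$(printf 'alpha\nbeta\ngamma\n')

# Action omitted: the default action prints the matching record.
only_pattern=$(printf '%s\n' "$input" | awk '/et/')

# Pattern omitted: the action runs for every record.
only_action=$(printf '%s\n' "$input" | awk '{ print NR ": " $0 }')

printf '%s\n' "$only_pattern" "$only_action"
```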

Patterns

/regular expression/ {  }   # matches when the input record matches the regular expression
expression           {  }   # matches when the expression is nonzero (or a non-empty string)
begpat, endpat       {  }   # range: from a record matching begpat through the next record matching endpat
BEGIN                {  }   # runs before any input is read. All BEGIN rules are merged.
END                  {  }   # runs after all input is read. All END rules are merged.
BEGINFILE            {  }   # gawk extension: runs before each input file (merged)
ENDFILE              {  }   # gawk extension: runs after each input file (merged)
                     {  }   # empty pattern: matches every input record
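
The portable pattern types can be tried together on a small input. This is only a sketch with invented data; BEGINFILE and ENDFILE are left out because they need gawk:

```shell
report=$(printf '1 start\n2 a\n3 b\n4 stop\n5 c\n' | awk '
    BEGIN           { print "begin" }          # before any input
    /start/, /stop/ { print "range: " $0 }     # from a start line through the next stop line
    $1 > 4          { print "expr: " $0 }      # expression is true (nonzero)
    END             { print "records: " NR }   # after the last record
')
printf '%s\n' "$report"
```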

Control statement

Blocks and sequences: statements are grouped with braces { ... } and separated by newlines or semicolons (;):

{ if (NR) { print NR; print "not null" } }

If statement

# multiline
if (x % 2 == 0)
    print "x is even"
else
    print "x is odd"

# single line
if (x % 2 == 0) print "x is even"; else print "x is odd"

While statement

i = 1; while (i <= 3) { print $i; i++ }

For statement

for (i = 1; i <= 3; i++) print $i
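
The three statements can be combined in one action. A small sketch on an invented record of three numbers, where sum, max and parity are illustrative variable names:

```shell
result=$(echo "3 8 5" | awk '{
    # for loop: add up all the fields of the record
    sum = 0
    for (i = 1; i <= NF; i++) sum += $i

    # while loop: find the largest field
    max = $1; i = 2
    while (i <= NF) { if ($i > max) max = $i; i++ }

    # if/else: sum % 2 is 0 for even numbers
    if (sum % 2 == 0) parity = "even"; else parity = "odd"

    print sum, max, parity
}')
echo "$result"
```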

How-To

Execute a system command and capture its output

To run a system command, use system("cmd"). To capture its output, use cmd | getline value [1]. The command must then be closed with close(cmd); otherwise awk may complain, will not re-execute the command on the next call, or will produce strange results:

Example of program:

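awk '
/\/\/ test password/ {
    cmd = "openssl rand -hex 16"
    cmd | getline r                          # read the first line of output into r
    gsub(/[0-9a-f][0-9a-f]/, "0x&, ", r)     # turn each hex byte "ab" into "0xab, "
    print "    { ", r, "}, // test password - DO NOT EDIT THIS COMMENT"
    close(cmd)                               # so the command runs again on the next match
    next
}
{ print }'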
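
The effect of close() can be observed with a harmless command. This sketch uses "echo hello" in place of openssl and records the return value of each getline (1 when a line was read, 0 at end of input):

```shell
out=$(awk 'BEGIN {
    cmd = "echo hello"

    # First read: runs the command and stores its first output line in a.
    r1 = (cmd | getline a)

    # Second read without close(): the pipe is already at end of input,
    # so getline returns 0 and b stays empty.
    r2 = (cmd | getline b)

    # After close(), the same command string starts a fresh process.
    close(cmd)
    r3 = (cmd | getline c)
    close(cmd)

    printf "%d[%s] %d[%s] %d[%s]\n", r1, a, r2, b, r3, c
}')
echo "$out"
```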

Tips

Defining environment variable

Using an Awk script and Bash builtin eval

eval $(awk 'BEGIN{printf "MY_VAR=value";}')
echo $MY_VAR
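
The same idea extends to several variables, one assignment per output line. A sketch, where MY_COUNT is an invented second variable; only use eval on output you fully control:

```shell
# awk prints shell assignments; eval executes them in the current shell.
# Quoting the command substitution keeps the output from being word-split.
eval "$(awk 'BEGIN { printf "MY_VAR=%s\nMY_COUNT=%d\n", "value", 6 * 7 }')"
echo "$MY_VAR $MY_COUNT"
```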

Hexadecimal conversion

Use strtonum (a gawk extension) to convert a string to a number:

{
    print strtonum($1);       # decimal, octal or hexadecimal (guessed from prefix)
    print strtonum("0"$2);    # To force octal
    print strtonum("0x"$3);   # To force hexadecimal
}
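
strtonum may be missing from awk implementations other than gawk. As a possible fallback, here is a sketch of a hand-written conversion in portable awk; hex2dec is a hypothetical helper name, not a built-in:

```shell
dec=$(echo "ff 0x1F" | awk '
    # hex2dec: hypothetical helper, converts a hex string (optional 0x prefix)
    function hex2dec(s,    n, i, d) {
        s = tolower(s)
        sub(/^0x/, "", s)
        n = 0
        for (i = 1; i <= length(s); i++) {
            d = index("0123456789abcdef", substr(s, i, 1)) - 1
            if (d < 0) break          # stop at the first non-hex character
            n = n * 16 + d
        }
        return n
    }
    { print hex2dec($1), hex2dec($2) }
')
echo "$dec"
```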

Using environment variables

Use ENVIRON["NAME"]:

{ print strtonum("0x"ENVIRON["STARTADDR"]); }
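
ENVIRON only contains variables that are exported to the awk process. A minimal sketch that sets STARTADDR for a single invocation (the value 1000 is arbitrary):

```shell
# STARTADDR is set in the environment of this one awk invocation only.
addr=$(STARTADDR=1000 awk 'BEGIN { print "start=" ENVIRON["STARTADDR"] }')
echo "$addr"
```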

Pass command-line parameters

Awk variables can be defined directly on the invocation line:

awk -v myvar=123 'BEGIN { printf "myvar is %d\n",myvar }'     # Use -v (before program text) for var used in BEGIN section
echo foo | awk '{ printf "myvar is %d\n",myvar }' myvar=123   # Otherwise specify var after program text
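
The difference between the two forms shows up in the BEGIN rule. A sketch that prints the variable in both BEGIN and the main rule:

```shell
# -v assigns the variable before BEGIN runs.
with_v=$(echo foo | awk -v myvar=123 'BEGIN { print "begin:" myvar } { print "main:" myvar }')

# An assignment after the program text is processed after BEGIN,
# so BEGIN sees an empty value and only the main rule sees 123.
after=$(echo foo | awk 'BEGIN { print "begin:" myvar } { print "main:" myvar }' myvar=123)

printf '%s\n' "$with_v" "$after"
```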

Read command-line arguments

Awk defines the variables ARGC and ARGV:

BEGIN {
    for (i = 0; i < ARGC; i++)
        print ARGV[i]
}
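
A runnable sketch with two made-up operands. The program has only a BEGIN rule, so awk does not try to open the operands as files:

```shell
args=$(awk 'BEGIN {
    # ARGV[0] is the name of the awk program itself; operands start at index 1.
    for (i = 1; i < ARGC; i++)
        printf "ARGV[%d]=%s\n", i, ARGV[i]
    print "ARGC=" ARGC
}' first second)
echo "$args"
```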

`$0` is the whole line

# Concatenate DNS
/^A\?/{print record; record=$0} 
/^A /{record=record " " $0;} 
END {print record}

String concatenation

Awk has no concatenation operator: simply write the strings next to each other.

print "The result is " result;
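
A small sketch of concatenation mixed with arithmetic. The parentheses are not strictly required here but make the grouping explicit:

```shell
msg=$(awk 'BEGIN {
    result = 6 * 7
    greeting = "The result is " result     # the number is converted to a string
    print greeting "!"
    print "sum: " (1 + 2)                  # parentheses make the grouping explicit
}')
echo "$msg"
```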

Next line on pattern match

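To apply only the first matching rule in a list of rules, end each action with next: awk then skips the remaining rules and reads the next input record.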

/PATTERN1/ {print $1; next}
/PATTERN2/ {print $2; next}
{print $3}
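
The behaviour can be checked on a few invented records, one of which matches both patterns:

```shell
out=$(printf 'PATTERN1 a b\nPATTERN2 c d\nPATTERN1 PATTERN2 e\nother f g\n' | awk '
    /PATTERN1/ { print "rule1: " $1; next }   # stop here, read the next record
    /PATTERN2/ { print "rule2: " $2; next }
               { print "rule3: " $3 }         # reached only when nothing matched above
')
printf '%s\n' "$out"
```

The third record contains both patterns but is handled by the first rule only.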

Force int conversion with `x+0`

Say we have a file where numbers are glued to non-digit characters:

( 1 2)
( 1 3)

We can force numeric conversion by applying an arithmetic operation such as adding 0; awk then uses the leading numeric part of the string:

awk '{print $3}' foo
# 2)
# 3)
awk '{print $3+0}' foo
# 2
# 3
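
The conversion is numeric rather than strictly integer. A sketch with a few literal strings, using int() where an integer is really wanted:

```shell
out=$(awk 'BEGIN {
    print "2)" + 0        # the leading digits are used
    print "3.5kg" + 0     # not limited to integers
    print "abc" + 0       # no leading number gives 0
    print int("3.5kg")    # int() truncates toward zero
}')
echo "$out"
```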

Pattern conversion

2014-01     2,277.40
2014-02     2,282.20
2014-03     3,047.90
2014-04     4,127.60
2014-05     5,117.60

Use gsub for regex replacement (here, to remove the commas used as thousands separators before summing the second column):

awk '{gsub(/,/,"",$2);sum+=$2}END{printf("%f",sum)}'
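
The one-liner reads standard input when no file is given. A sketch that feeds it the sample data above through a here-document and prints the total with two decimals:

```shell
total=$(awk '{ gsub(/,/, "", $2); sum += $2 } END { printf "%.2f\n", sum }' <<'EOF'
2014-01     2,277.40
2014-02     2,282.20
2014-03     3,047.90
2014-04     4,127.60
2014-05     5,117.60
EOF
)
echo "$total"
```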

Remove duplicates, keeping line order

A simple awk script to remove duplicate lines from a file, keeping original order [2] (https://iridakos.com/how-to/2019/05/16/remove-duplicate-lines-preserving-order-linux.html):

awk '!visited[$0]++' your_file > deduplicated_file
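
The expression visited[$0]++ returns the number of times the line has been seen before, so it is 0 (false) only on the first occurrence; the negation makes the pattern true and the default action prints the line. A sketch on inline data instead of a file:

```shell
unique=$(printf 'b\na\nb\nc\na\n' | awk '!visited[$0]++')
printf '%s\n' "$unique"
```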
