Gawk
gawk is the GNU implementation of the AWK programming language and is better thought of as the Swiss Army knife of shell scripting than as a simple command: it covers the functionality of many other commands (e.g. it can do most of what grep does) and can be used to accomplish almost any text-processing task once you know it thoroughly.
Just a few examples.
1. awk comprises the functionality of many commands (e.g. grep, to name one). The next box shows how awk can be used to mimic the behavior of grep and match a fairly long regular expression:
$ gawk '/^(test|sim)data_[[:digit:]]+\.dat$/ {print}'
testdata_1.dat
testdata_1.dat
testdata_.dat
simdata_3200.dat
simdata_3200.dat
foo testdata_1.dat
testdata_1.dat2
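The same pattern can be tried without typing the input by hand. The following is only a sketch: the sample names are piped in with printf, and the variable names (names, matches, count) are made up for the illustration. The second command counts the matching lines, roughly what grep -c does, using an END rule that runs once after all input has been read.

```shell
# Sample names, one per line; the quoted argument is a single input line
# that contains a space.
names=$(printf '%s\n' testdata_1.dat testdata_.dat simdata_3200.dat \
    'foo testdata_1.dat' testdata_1.dat2)

# Like grep -E: print only the lines matching the anchored pattern.
matches=$(printf '%s\n' "$names" |
    awk '/^(test|sim)data_[[:digit:]]+\.dat$/ {print}')
echo "$matches"

# Like grep -c: count the matching lines instead of printing them.
# n+0 forces a numeric 0 when nothing matched.
count=$(printf '%s\n' "$names" |
    awk '/^(test|sim)data_[[:digit:]]+\.dat$/ {n++} END {print n+0}')
echo "$count"
```

Only the names that match from the first to the last character are printed, because the pattern is anchored with ^ and $.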
2. awk can easily manipulate the columns of a line, referring to them as $1, $2, etc.:
$ awk 'NF==2 {print $1,"+",$2,"=",$1+$2}'
1 1
1 + 1 = 2
1.4 5.5
1.4 + 5.5 = 6.9
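As a small non-interactive sketch of the same idea, the input can come from a pipe. The input lines and the variable names (sums, swapped) are invented for the illustration. NF holds the number of fields on the current line, so a line with a single field is skipped by the NF==2 test.

```shell
# Three input lines; the last one has a single field, so NF==2 skips it.
sums=$(printf '1 1\n1.4 5.5\nlonely\n' |
    awk 'NF==2 {print $1,"+",$2,"=",$1+$2}')
echo "$sums"

# Fields can be printed in any order; here the two columns are swapped.
swapped=$(printf 'a b\nc d\n' | awk '{print $2,$1}')
echo "$swapped"
```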
Although learning to use awk (and thus grasping the basics of the AWK programming language), which means reading the first two chapters of the manual, can take some time, the effort pays off quickly and you will find yourself using it many times a day.
For this school, gawk will only occasionally be needed, to print different fields of a set of data.
For example, the file /proc/meminfo is nicely formatted in 3 different fields:
MemTotal: 2062312 kB
MemFree: 15776 kB
Buffers: 24016 kB
Cached: 1385160 kB
SwapCached: 0 kB
Active: 676464 kB
Inactive: 1222004 kB
SwapTotal: 3164764 kB
SwapFree: 3164764 kB
Dirty: 624 kB
Writeback: 0 kB
AnonPages: 489292 kB
Mapped: 122996 kB
Slab: 88864 kB
SReclaimable: 69892 kB
SUnreclaim: 18972 kB
PageTables: 20500 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
CommitLimit: 4195920 kB
Committed_AS: 1237240 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 49344 kB
VmallocChunk: 34359688695 kB
The first field is the name of the quantity, the second is the amount of memory in kilobytes and the third is the unit of measure (always kB). To print just the first and second columns, run this command:
$ cat /proc/meminfo | awk '{print $1,$2}'
MemTotal: 2062312
MemFree: 15932
Buffers: 24500
Cached: 1384292
SwapCached: 0
Active: 676672
Inactive: 1221524
SwapTotal: 3164764
SwapFree: 3164764
Dirty: 1128
Writeback: 0
AnonPages: 489528
(...)
To print the same information expressed in megabytes (labelled Mb below), a little more work is needed:
$ cat /proc/meminfo | awk '{print $1, $2/1024.0, "Mb"}'
MemTotal: 2013.98 Mb
MemFree: 16.3203 Mb
Buffers: 24.2852 Mb
Cached: 1350.59 Mb
SwapCached: 0 Mb
Active: 661.113 Mb
Inactive: 1191.91 Mb
SwapTotal: 3090.59 Mb
SwapFree: 3090.59 Mb
Dirty: 0.855469 Mb
(...)
awk also has a number of built-in functions that can be used to manipulate the data, e.g. to format your output better.
Here is an example using printf (which will be familiar to C programmers):
$ cat /proc/meminfo | awk '{printf("%15s%15.3f Mb\n",$1,$2/1024.0)}'
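Since the contents of /proc/meminfo change from one moment to the next, the effect of the format string is easier to see on fixed input. The sketch below feeds two lines copied from the listing above through the same printf call. The wrapper function format_mb and the variable formatted are names made up for the illustration. %15s right-aligns the name in a 15-character column and %15.3f prints the value with three decimals, also 15 characters wide.

```shell
# Wrap the one-liner in a function so it can be reused on any input.
format_mb() {
    awk '{printf("%15s%15.3f Mb\n",$1,$2/1024.0)}'
}

# Two sample lines copied from the /proc/meminfo listing.
formatted=$(printf 'MemTotal: 2062312 kB\nMemFree: 15776 kB\n' | format_mb)
echo "$formatted"
```

Both columns come out right-aligned, so the values line up regardless of the length of the name.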
awk can also perform different operations on a line depending on which pattern it matches. In the next example only the lines whose first field contains the string 'Mem' are printed with their value; every other line gets a short message instead:
$ cat /proc/meminfo | gawk '$1 ~ /Mem/ {
    printf("%15s%15.3f Mb\n",$1,$2/1024.0)
    next
}
$1 ~ /Dirty/ {
    print $1,"needs to be cleaned"
    next
}
{print $1,"not interesting"}'
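To see what each rule does, the same program can be run on three fixed lines, one per rule. This is only a sketch: the function name classify and the variable report are invented here, and the sample lines are copied from the /proc/meminfo listing. The next statement stops processing the current line, so a line handled by one of the first two rules never reaches the final catch-all rule.

```shell
classify() {
    awk '
    $1 ~ /Mem/ {
        printf("%15s%15.3f Mb\n",$1,$2/1024.0)
        next    # skip the remaining rules for this line
    }
    $1 ~ /Dirty/ {
        print $1,"needs to be cleaned"
        next
    }
    # No pattern: runs for every line that got this far.
    {print $1,"not interesting"}'
}

report=$(printf 'MemTotal: 2062312 kB\nDirty: 624 kB\nSlab: 88864 kB\n' | classify)
echo "$report"
```

Without the next statements, the MemTotal and Dirty lines would also be printed a second time by the last rule.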
Pattern matching is not the only test that can be performed to treat lines of input conditionally. Here we print only the lines where the second field is 0:
$ cat /proc/meminfo | awk '$2==0 {print}'
Writeback: 0 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
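Other comparisons work the same way, and a comparison can be combined with a pattern match using &&. The following sketch runs three such tests on a few fixed lines taken from the /proc/meminfo listing. The variable names (sample, zero, big, swapzero) and the 1000000 kB threshold are arbitrary choices for the illustration.

```shell
sample='MemTotal: 2062312 kB
Cached: 1385160 kB
SwapCached: 0 kB
Writeback: 0 kB'

# Lines whose second field is exactly 0.
zero=$(printf '%s\n' "$sample" | awk '$2==0 {print}')
echo "$zero"

# Names of the quantities larger than 1000000 kB.
big=$(printf '%s\n' "$sample" | awk '$2 > 1000000 {print $1}')
echo "$big"

# Both conditions at once: name starts with Swap and the value is 0.
swapzero=$(printf '%s\n' "$sample" | awk '$1 ~ /^Swap/ && $2==0 {print $1}')
echo "$swapzero"
```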
and so on...
The manual of gawk in different formats can be found at the GNU site:
http://www.gnu.org/software/gawk/manual/
Here you will find some very simple examples (taken directly from the gawk documentation) of what gawk can do (covering only the basics!):