# Awk

## Basics

    awk program file

A program is a list of pattern-action statements, separated by `;` or newlines.

    /pattern/ { action }
    # e.g. (single quotes so the shell doesn't interpolate variables)
    awk '/tt/ { print $1 }' awk.md

`$0` is the whole line; `$1`, `$2`, ... are its fields, split on `FS` (by
default, whitespace).

Set `FS` with `-F <sepstring>`. This can be a regular expression.
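
A quick sketch of field splitting with `-F`. The colon-separated input is made
up for illustration; the second command uses a regular expression as the
separator.

```shell
# Hypothetical input in the form name:uid:shell
printf 'alice:1001:/bin/sh\nbob:1002:/bin/bash\n' |
  awk -F: '{ print $1, $3 }'
# prints:
#   alice /bin/sh
#   bob /bin/bash

# Regex separator: split on either a comma or a colon
printf 'a,b:c\n' | awk -F'[,:]' '{ print $2 }'
# prints:
#   b
```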

Strictly speaking, the first part is any *condition*; a `/regex/` pattern is
just one kind. Another useful example: `awk '$2 > 100 { print $3 }'` prints the
third column of every line whose second column is greater than 100.
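
The same condition run against some invented three-column input, to show which
lines get through:

```shell
# Hypothetical data: label, size, device
printf 'disk 250 sda\nram 16 dimm0\ncache 512 nvme0\n' |
  awk '$2 > 100 { print $3 }'
# prints:
#   sda
#   nvme0
```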

## Flags

* `-F`: set the field separator, `FS`
* `-v`: set any variable
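
A small sketch of `-v`, passing a threshold in from the command line. The
variable name `limit` and the input are arbitrary choices for the example.

```shell
# limit is set before the program starts, so it is usable in any condition
printf 'a 5\nb 50\nc 500\n' |
  awk -v limit=40 '$2 > limit { print $1 }'
# prints:
#   b
#   c
```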

## Conditions

* Numerical comparisons: `==`, `!=`, `>`, `>=`, `<`, `<=`
* Regex matches: `~`, `!~`
* Logical operators: `&&`, `||`, `!`
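
These operators combine freely. A sketch on invented input (name, role, age):

```shell
# Regex match on one field AND a numeric comparison on another
printf 'alice admin 34\nbob guest 19\ncarol admin 17\n' |
  awk '$2 ~ /^adm/ && $3 >= 18 { print $1 }'
# prints:
#   alice

# The complement: role does not match, OR the age test fails
printf 'alice admin 34\nbob guest 19\ncarol admin 17\n' |
  awk '$2 !~ /^adm/ || !($3 >= 18) { print $1 }'
# prints:
#   bob
#   carol
```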

Special conditions:

* `BEGIN`: triggered before processing any lines
* `END`: triggered after processing all lines

If a line matches multiple conditions, the action for each of them is executed,
in program order. In particular, if several matching actions print, each one
prints. `next` is useful here: it skips all remaining conditions for the
current line.
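
A sketch of both behaviours on made-up numeric input: first every matching
action fires, then `next` is used to stop after the first match.

```shell
# 25 matches both conditions, so it is printed twice
printf '5\n15\n25\n' | awk '
  BEGIN   { print "start" }
  $1 > 10 { print $1, "is > 10" }
  $1 > 20 { print $1, "is > 20" }
  END     { print "done" }
'
# prints:
#   start
#   15 is > 10
#   25 is > 10
#   25 is > 20
#   done

# With next, a line stops at the first action that handles it
printf '5\n15\n25\n' | awk '
  $1 > 20 { print $1, "is big"; next }
  $1 > 10 { print $1, "is medium" }
'
# prints:
#   15 is medium
#   25 is big
```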

## Built-in Variables

* `NR`: number of lines processed so far
* `NF`: number of fields (columns) in current line
* `FS`: input field separator, used to split each line into fields.
* `OFS`: output field separator, i.e. what gets printed between comma-separated
  items in a `print` statement. Default: ' ' (space). Typical pattern: set it to
  something else in a BEGIN block or with the `-v` flag.
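
A sketch of these variables together, showing both ways of setting `OFS`. The
input is invented; `$NF` (the last field) is a common idiom not covered above.

```shell
# OFS set with -v: commas appear between the comma-separated print items
printf 'a b c\nd e\n' |
  awk -v OFS=, '{ print NR, NF, $1 }'
# prints:
#   1,3,a
#   2,2,d

# OFS set in a BEGIN block; $NF is the last field of the line
printf 'a b c\nd e\n' |
  awk 'BEGIN { OFS = "-" } { print NR, NF, $NF }'
# prints:
#   1-3-c
#   2-2-e
```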

Useful when processing multiple files:

* `FNR`: like `NR`, but counts lines within the current file (resets on each new file)
* `FILENAME`: name of currently processed file (`-` when STDIN)
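
A sketch comparing `FNR` and `NR` across two files. The file names `a.txt` and
`b.txt` are placeholders, created in a temporary directory just for the demo.

```shell
dir=$(mktemp -d)
printf 'x\ny\n' > "$dir/a.txt"
printf 'z\n'    > "$dir/b.txt"

# FNR restarts in b.txt, NR keeps counting
(cd "$dir" && awk '{ print FILENAME, FNR, NR, $0 }' a.txt b.txt)
# prints:
#   a.txt 1 1 x
#   a.txt 2 2 y
#   b.txt 1 3 z

rm -r "$dir"
```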

## Actions

* `next`: stop processing the current line, skipping all following conditions
* `exit`: finish processing (`END` will be executed)
* `printf`: formatted printing
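
A sketch of `exit` and `printf` together. The `STOP` marker line and the
counter `n` are invented for the example; note that `END` still runs after
`exit`.

```shell
printf 'apple 1.5\npear 0.25\nSTOP\nplum 9\n' | awk '
  $1 == "STOP" { exit }                         # stop reading input here
  { printf "%-6s %5.2f\n", $1, $2; n++ }        # left-aligned name, 2 decimals
  END { printf "%d rows\n", n }                 # runs even after exit
'
# prints:
#   apple   1.50
#   pear    0.25
#   2 rows
```

Unlike `print`, `printf` adds no newline on its own, hence the explicit `\n`.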

## Arrays

* They're really more like hashmaps.
* Index with `[]`.
* Iterate over keys: `for(x in arr) print x, arr[x]`
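
A minimal counting sketch using an array keyed by the first field. The input
and the array name `count` are made up. The iteration order of `for (k in ...)`
is not specified, so the output is piped through `sort` to make it stable.

```shell
printf 'get\npost\nget\nget\n' | awk '
  { count[$1]++ }                               # missing keys start out as 0
  END { for (k in count) print k, count[k] }
' | sort
# prints:
#   get 3
#   post 1
```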