Hi,
Sure, no problem. There are a few different parts to this line, so we'll take them in turn:
/usr/bin/awk '$0 ~ / Institution/ {sub(/\r$/,""); print $NF}' "$tmp"
$0 ~ / Institution/
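This is pattern-matching. What we're saying here is that we only want to consider the current input line (represented by $0) further if it matches the pattern between the slashes. The ~ is awk's match operator, and since the pattern isn't anchored to the start or end of the line, "matches" here amounts to "contains". So the line has to contain the exact string " Institution" (that's the word 'Institution' with three spaces in front of it). The match is case-sensitive. If that pattern-matching check passes, we move on to the next bit of the line.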
sub(/\r$/,"");
Now this was something I didn't actually expect to have to do. As it turns out, the example file you've provided has Windows-style end-of-lines (a carriage return followed by a line feed), rather than UNIX-style (a line feed only). This caught me out when trying to print the Institution ID numbers: being the last field on the line, they also included the trailing carriage return, and it messed with the output.
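So what this awk substitution command is doing is looking for a carriage return character at the very end of the line, and replacing that character (not the whole line) with nothing, so we only have the line feed character to mark the end of a line. This makes the end of line "normal", from the perspective of a UNIX-style system. Because sub() works on the whole line ($0) when it isn't given a third argument, awk re-splits the line into fields afterwards, so the last field no longer has the carriage return stuck to it.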
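If you'd like to see the carriage return problem for yourself, here's a small sketch. The input line is made up (I'm only assuming the ID is the last thing on the line), and od -c is just there to make the invisible characters visible. With awk's usual splitting on spaces and tabs, the first od output should show a \r after the digits, and the second should not:

```shell
# Made-up sample line with a Windows-style (CRLF) ending; not from the real file.
# Without the sub(), the carriage return stays attached to the last field.
without_sub=$(printf '   Institution ID: 12345\r\n' | awk '{print $NF}')

# With the sub(), the trailing carriage return is removed before printing.
with_sub=$(printf '   Institution ID: 12345\r\n' | awk '{sub(/\r$/,""); print $NF}')

# od -c prints each character, so a stray \r shows up explicitly.
printf '%s\n' "$without_sub" | od -c
printf '%s\n' "$with_sub" | od -c
```

A stray carriage return like this is why output can look fine in one place and garbled in another: the terminal jumps back to the start of the line when it hits the \r.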
Now that the line has been sanitised and stripped of the characters we don't need and that would interfere with our later output (after already being sure we've found a line with the exact string we're looking for), we move on to the last bit of the awk line.
print $NF
This is the easiest one of the bunch. NF is awk's count of the fields on the current line, so $NF refers to the last field, and this prints it (which in our case is the Institution Number).
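As a quick illustration of $NF on its own, here's a tiny sketch with a made-up input line (alpha, beta and gamma are just placeholder words). NF holds the number of fields and $NF picks out the last one:

```shell
# Placeholder input with three space-separated fields.
# NF is the field count; $NF is the contents of the last field.
last_field_demo=$(printf 'alpha beta gamma\n' | awk '{print NF, $NF}')

# Prints the field count followed by the last field.
printf '%s\n' "$last_field_demo"   # prints: 3 gamma
```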
So the full explanation of this awk line in English would be:
- Look for lines that contain the exact string " Institution"...
- and then strip them of Windows-style line-ends, leaving UNIX-style line ends...
- and finally print out the last field of the remaining line.
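If you want to try all three steps end to end, here's a rough sketch. The file contents are invented for the demo (I've guessed at a plausible layout, so it won't match your real file exactly), and I've written plain awk rather than /usr/bin/awk on the assumption that it's on your PATH:

```shell
# Build a throwaway file with Windows-style (CRLF) line endings.
# All of these lines are made-up sample data.
tmp=$(mktemp)
printf '%s\r\n' \
    'Demo export (made-up data)' \
    '   Institution ID: 12345' \
    'Contact: nobody@example.com' \
    '   Institution ID: 67890' > "$tmp"

# 1. keep only lines containing " Institution"
# 2. strip the trailing carriage return
# 3. print the last field
ids=$(awk '$0 ~ / Institution/ {sub(/\r$/,""); print $NF}' "$tmp")

printf '%s\n' "$ids"
rm -f "$tmp"
```

With this sample data, only the two ID lines get past the pattern check, so you should see the two numbers, one per line, with no carriage returns attached.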
Hope this helps.