Why do awk command line vars behave the way they do?


 
# 1  
Old 04-12-2013

This came up briefly in another thread. Can someone explain why awk (I happen to use gawk) behaves as follows:
Code:
$ cat file
aaa

Code:
$ awk 'BEGIN {print x}' x=1
[Blank line]

Code:
$ awk x=1 'BEGIN {print x}'
awk: fatal: cannot open file `BEGIN {print x}' for reading (No such file or directory)

Code:
$ awk -v x=1 'BEGIN {print x}'
1

Code:
$ awk 'BEGIN {print x} {print x}' x=1 file

1

Code:
$ awk 'BEGIN {print x} {print x}' file x=1
[Blank line]
[Blank line]

Code:
$ awk -v x=1 'BEGIN {print x} {print x}' file
1
1

Code:
$ awk -v x=1 'BEGIN {print x} {print x}' file x=2
1
1

Code:
$ awk -v x=1 'BEGIN {print x} {print x}' x=2 file
1
2

I understand this is "how it is". And it is clear to me what is happening. I'm interested in knowing "why". Any practical example to show the advantage or benefit of why it's set up like this?

The gawk man page says:
Quote:
Command line variable assignment is most useful for dynamically
assigning values to the variables AWK uses to control how
input is broken into fields and records. It is also useful for
controlling state if multiple passes are needed over a single data file.
That's a little helpful, but not much. And I have not previously used "multiple passes", so maybe I'm missing something there.
# 2  
Old 04-12-2013
Well, not being an awk guru, I'd like to comment on some of your questions, certainly not exhaustively:
Quote:
$ awk 'BEGIN {print x}' x=1
[Blank line]
The BEGIN section is executed before any command-line operands (file names and var=value assignments) are processed, so x is still unset when it is printed.
Quote:
$ awk x=1 'BEGIN {print x}'
awk: fatal: cannot open file `BEGIN {print x}' for reading (No such file or directory)
Without the -v option, awk takes x=1, the first non-option argument, to be the awk program. The next argument is then treated as a file to be opened and processed, unless it has the form of a variable assignment. Variable assignments are performed in the order they appear on the command line, each just before the next file is read, so they affect only the files that follow them.
This behaviour would also explain all your other examples.
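To make the second case concrete, here is a small sketch. The temporary directory is my own scaffolding; the data file just recreates the one-line file from the first post. It shows that x=1, when it lands in the program slot, really is run as a program: a pattern with no action, which is true and so prints every record.

```shell
# Scratch directory and sample file are assumptions for this demo only.
tmp=$(mktemp -d)
printf 'aaa\n' > "$tmp/file"

# No -v: the first non-option argument is the program text.
# The program "x=1" is an assignment used as a pattern; its value (1) is
# true, so the default action (print the record) runs for every line.
no_v=$(awk x=1 "$tmp/file")

# With -v: the assignment is made before BEGIN runs.
with_v=$(awk -v x=1 'BEGIN { print x }')

printf 'without -v: %s\n' "$no_v"
printf 'with -v:    %s\n' "$with_v"

rm -rf "$tmp"
```

In the original command there was no further argument after the program, so 'BEGIN {print x}' ended up in the file slot, hence the "cannot open file" error.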
# 3  
Old 04-12-2013
The VAR=1 way of setting variables is the old-fashioned (and arguably more portable) way of doing so. I also think it's more readable, so I try to use it when I can, but you've discovered its one disadvantage -- BEGIN runs first.

If you wish to set variables before any code runs, you can do this:

Code:
awk -v VAR=1 ...

However, the old way of setting variables has one advantage -- you can set variables on the fly. You can read a file, change a var, read a file, change a var, change a var...

Code:
awk '{A += $1 * N } END { print A }' N=1 file1 N=2 file2 N=3 file3

So for file1, it'd add $1 * 1, for file2 it'd add $1*2, and so forth.
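If you want to try that, here is a self-contained sketch. The file contents and the scratch directory are made up for illustration; only the awk command comes from the example above.

```shell
# Hypothetical sample data; file names follow the example above.
tmp=$(mktemp -d)
printf '1\n2\n' > "$tmp/file1"
printf '3\n'    > "$tmp/file2"
printf '4\n'    > "$tmp/file3"

# Each N=... operand is processed only when awk reaches it in the
# argument list, so it applies to the file(s) that follow it.
total=$(cd "$tmp" && awk '{ A += $1 * N } END { print A }' N=1 file1 N=2 file2 N=3 file3)

# (1+2)*1 + 3*2 + 4*3 = 21
echo "$total"

rm -rf "$tmp"
```

Doing the same with -v would need three separate awk runs, or logic inside the script that keys off FILENAME.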
# 4  
Old 04-12-2013
Corona688's final example can be extended to include an assignment after the final file. Such an assignment will take effect after the last line of the last file but before entering the END section. If that last (or only) file is a pipe or stdin, use the filename -.
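A quick sketch of that trailing assignment, for anyone who wants to check it on their own awk. The file name and contents are invented for the demo; the second command reads from a pipe and names it with -.

```shell
# Scratch data; names and contents are assumptions for this demo.
tmp=$(mktemp -d)
printf 'a\nb\n' > "$tmp/data"

# x=2 comes after the last file, so the main rules never see it,
# but it is in effect by the time END runs.
out=$(awk '{ print "body:", x } END { print "end:", x }' x=1 "$tmp/data" x=2)
printf '%s\n' "$out"

# Same idea with piped input: "-" stands for stdin in the operand list.
out_stdin=$(printf 'a\n' | awk '{ print "body:", x } END { print "end:", x }' x=1 - x=2)
printf '%s\n' "$out_stdin"

rm -rf "$tmp"
```

In both runs the body lines see x as 1 and the END line sees 2.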

Regards,
Alister
# 5  
Old 04-12-2013
Thanks for the responses. They don't really answer my question as to "why" it's set up this particular way. But I'm perfectly willing to accept that's "the way it is", and make best use of it.

Continuing that thought, from the man page, related to setting variables on the command line:
Quote:
Command line variable assignment is most useful for dynamically
assigning values to the variables AWK uses to control how input
is broken into fields and records. It is also useful for controlling
state if multiple passes are needed over a single data file.
The first sentence is obviously true. For the second sentence, does anyone have experience doing multiple passes over a single data file? It sounds like it could be useful, but I don't remember ever seeing multiple passes, or command line variables being used to control state between them.
# 6  
Old 04-12-2013
I've demonstrated how it works, and demonstrated that it has some abilities that the more straightforward way doesn't; what more do you want? What it's "for" is whatever you want to do with it, it's just a feature.

I've done multiple passes on the same file on occasion for some difficult comparisons or sorting. Standard deviation is another case where two passes are helpful: you get the average and count on the first pass, then use those to get the deviation on the second pass.
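Roughly, the standard deviation case could look like the sketch below. The variable name pass, the sample numbers, and the choice of population (divide by n) rather than sample deviation are all my assumptions, not something from a real script.

```shell
# Hypothetical data set: mean is 5, population standard deviation is 2.
tmp=$(mktemp -d)
printf '2\n4\n4\n4\n5\n5\n7\n9\n' > "$tmp/nums"

# The same file is named twice; pass=1 / pass=2 tell the script which
# pass it is on. Pass 1 collects sum and count, pass 2 uses the mean.
sd=$(awk '
    pass == 1 { sum += $1; n++ }
    pass == 2 { d = $1 - sum / n; ss += d * d }
    END       { printf "%.2f\n", sqrt(ss / n) }
' pass=1 "$tmp/nums" pass=2 "$tmp/nums")

echo "$sd"

rm -rf "$tmp"
```

This only works on a real file, since a pipe can't be read twice.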
# 7  
Old 04-12-2013
Here is an example:
Code:
awk 'f==1{ print "first pass: " $0} f==2 { print "second pass: " $0 }' f=1 infile f=2 infile
