Deciphering AWK code

# 1  
Dear experts,
I am a relative novice in Unix and came across a very useful piece of code that I use regularly in my research without understanding it. Could any of the experienced members briefly explain what the code actually does?

Many thanks in advance

The script is

awk '!(a[$1]) {a[$1]=$0; next} a[$1] {w=$1; $1=""; a[w]=a[w] $0} END {for (i in a) print a[i]}' FS="\t" OFS="\t" A.txt

The original title is: Find duplicates in column 1 and merge their lines (awk?)
# 2  
Here you are:
awk '
!(a[$1])        {a[$1]=$0                       # if array element indexed by $1 is unset, empty or 0, set it to
                                                # the line (i.e. collect first occurrences of $1)
                 next                           # continue with the next input line
                }
a[$1]           {w    = $1                      # if set, save $1 in temp variable
                 $1   = ""                      # and empty it (the rebuilt $0 keeps the leading OFS)
                 a[w] = a[w] $0                 # then append line to resp. array element
                }
END             {for (i in a) print a[i]        # print all elements containing collected lines
                                                # be aware that the order of elements is unspecified
                }
' FS="\t" OFS="\t" file

Please note how consistent structuring (e.g. indentation) of the code helps in reading / understanding / seeing patterns in it.
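
To see the effect on data, here is a minimal sketch run on a made-up sample. The three input lines, the variable names `sample` and `merged`, reading from standard input via `-`, and the trailing `sort` are all assumptions for illustration; `sort` is only there because the `for (i in a)` order is unspecified.

```shell
# Hypothetical sample input: tab-separated, key in column 1, key k1 occurs twice
sample=$(printf 'k1\ta\tb\nk2\tc\td\nk1\te\tf\n')

# The one-liner from post # 1, reading stdin ("-") instead of A.txt;
# sort makes the otherwise unspecified output order deterministic
merged=$(printf '%s\n' "$sample" |
  awk '!(a[$1]) {a[$1]=$0; next} a[$1] {w=$1; $1=""; a[w]=a[w] $0} END {for (i in a) print a[i]}' FS="\t" OFS="\t" - |
  sort)

printf '%s\n' "$merged"
```

On this sample the two k1 lines come out merged into a single tab-separated line (k1 a b e f), while the k2 line is printed unchanged.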
# 3  
A comment:
the existence test a[$1] can give different results in different awk versions, and merely referencing a[$1] also creates an empty array element if there was none.
The test ($1 in a) is better, because it checks for the index without creating it.
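
The side effect is easy to observe in isolation. The following is a small sketch, not part of the thread's scripts; the index "x" and the variable names `created` and `untouched` are made up for the demonstration.

```shell
# Referencing a["x"] in a test silently creates the element ...
created=$(awk 'BEGIN { if (a["x"]) print "set"; n = 0; for (k in a) n++; print n }')

# ... whereas the membership test ("x" in a) leaves the array untouched
untouched=$(awk 'BEGIN { if ("x" in a) print "set"; n = 0; for (k in a) n++; print n }')

printf 'elements after a["x"]: %s\nelements after ("x" in a): %s\n' "$created" "$untouched"
```

The first count is 1 and the second is 0: the plain reference adds an empty element, the `in` test does not.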
I think one should recode the whole thing:
awk '{i=$1; $1=""; a[i]=(a[i] $0)} END {for (i in a) print (i a[i])}' FS="\t" OFS="\t" A.txt

This version stores $1 (field #1) only as an index, not as part of the value. Therefore, in the END block the index is printed in front of the value.
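
As a quick sanity check, the original and the recoded script can be compared on the same input. This is only a sketch on a made-up sample (the data and the names `sample`, `old` and `new` are assumptions), so it shows agreement for this input, not in general.

```shell
# Hypothetical sample input: tab-separated, key k1 occurs twice
sample=$(printf 'k1\ta\tb\nk2\tc\td\nk1\te\tf\n')

# Original script from post # 1 (key stored as part of the value)
old=$(printf '%s\n' "$sample" |
  awk '!(a[$1]) {a[$1]=$0; next} a[$1] {w=$1; $1=""; a[w]=a[w] $0} END {for (i in a) print a[i]}' FS="\t" OFS="\t" - |
  sort)

# Recoded version (key kept only as index and prepended again in END)
new=$(printf '%s\n' "$sample" |
  awk '{i=$1; $1=""; a[i]=(a[i] $0)} END {for (i in a) print (i a[i])}' FS="\t" OFS="\t" - |
  sort)

if [ "$old" = "$new" ]; then echo "same output"; else echo "different output"; fi
```

Both outputs are piped through `sort` because neither script guarantees an output order.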
# 4  
In for some golf?
awk '{$1=A[i=$1]; A[i]=$0} END{for(i in A) print i A[i]}' FS="\t" OFS="\t" A.txt
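
The golfed version is terse, so a trace may help. `A[i=$1]` saves the key in `i` and fetches what has been collected for it so far; assigning that to `$1` makes awk rebuild `$0` as the collected text, then OFS, then the remaining fields, and the result is stored back in `A[i]`. The sketch below prints that stored value after every record; the two input lines, the added `print`, and the variable name `steps` are assumptions for illustration.

```shell
# Two hypothetical records with the same key; print the accumulated value after each one
steps=$(printf 'k1\ta\tb\nk1\te\tf\n' |
  awk '{$1=A[i=$1]; A[i]=$0; print NR ": [" A[i] "]"}' FS="\t" OFS="\t" -)

printf '%s\n' "$steps"
```

After record 1 the brackets hold a leading tab followed by a and b; after record 2 the fields e and f have been appended to that. The key itself is never stored in the value, which is why the END block prints `i` in front of `A[i]`.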

# 5  
Many thanks, everybody!
Your help is highly appreciated.