awk command


 
# 1  
Old 06-17-2014

Can anyone please help me produce the output shown below?
Input:
Code:
111,201,111156,10,R,0101,2345,456
111,201,111157,10,R,0101,2346,458
111,201,111156,10,x,0101,2345,456
111,201,111158,10,R,0101,2345,456
112,201,111159,10,x,0101,2344,417
112,201,111159,10,R,0101,2344,417
112,201,111149,10,R,0101,2312,487


output:
Code:
111,201,111157,10,R,0101,2346,458
111,201,111158,10,R,0101,2345,456
112,201,111149,10,R,0101,2312,487

If column 5 is 'X', look for the corresponding matching record (all columns except column 5 must match) that has 'R' in column 5, and then remove both records.

Moderator's Comments:
Please use CODE tags when displaying sample input, output, and code.

Last edited by Don Cragun; 06-17-2014 at 01:31 AM.. Reason: Add CODE tags.
# 2  
Old 06-17-2014
Your text talks about matching an uppercase X, but your sample data contains a lowercase x. I am assuming that you want a case-insensitive match on field 5, that you don't care about the output order of unmatched records, and that it is OK for field 5 to be printed as uppercase R or X in unmatched lines even if the input had lowercase r or x. Under those assumptions, the following seems to do what you want:
Code:
awk '
BEGIN {	FS = OFS = ","
}
{	v5 = toupper($5)
	if(v5 == "X") {
		$5 = "@"
		x[$0]
	} else if(v5 == "R") {
		$5 = "@"
		r[$0]
	} else	# Print non-R, non-X records.
		print
}
END {	for(i in x) {
		if(i in r)
			# Match in r[] and x[].
			delete r[i]
		else {	# Print X record with no matching R record.
			sub(/@/, "X", i)
			print i
		}
	}
	for(i in r) {	
		# Print R record with no matching X record.
		sub(/@/, "R", i)
		print i
	}
}' Input

If you want to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk, /usr/xpg6/bin/awk, or nawk.
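If the output order does matter, one possible variation is sketched below. It is only a sketch under a few assumptions: the function name drop_pairs_keep_order is made up for illustration, records are printed exactly as they were read (so a lowercase x stays lowercase), and duplicate unmatched records are printed as often as they occur, which differs from the script above.

```shell
# Hypothetical order-preserving variant; the function name is illustrative.
drop_pairs_keep_order() {
	awk 'BEGIN { FS = OFS = "," }
	{	line[NR] = $0		# Remember the original record.
		v5 = toupper($5)
		if(v5 == "X" || v5 == "R") {
			$5 = "@"	# Mask field 5 to build the comparison key.
			key[NR] = $0
			if(v5 == "X")
				x[$0]
			else
				r[$0]
		}
	}
	END {	for(n = 1; n <= NR; n++) {
			if(n in key) {
				k = key[n]
				# Skip records whose key occurred with both X and R.
				if(k in x && k in r)
					continue
			}
			print line[n]
		}
	}'
}

# Sample input from the first post.
printf '%s\n' \
	'111,201,111156,10,R,0101,2345,456' \
	'111,201,111157,10,R,0101,2346,458' \
	'111,201,111156,10,x,0101,2345,456' \
	'111,201,111158,10,R,0101,2345,456' \
	'112,201,111159,10,x,0101,2344,417' \
	'112,201,111159,10,R,0101,2344,417' \
	'112,201,111149,10,R,0101,2312,487' | drop_pairs_keep_order
```

With the sample input this prints the three requested records in their original order. The price is that every record is held in memory until the END block runs.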

Last edited by Don Cragun; 06-17-2014 at 02:38 AM.. Reason: Fix missing comma.
# 3  
Old 06-17-2014

Thank you so much, Don! It's working fine.
Can you please explain the two lines below?
$5 = "@"
sub(/@/, "X", i)
# 4  
Old 06-17-2014
Sure. You have lines like:
Code:
111,201,111156,10,R,0101,2345,456
111,201,111156,10,x,0101,2345,456

and you want to check whether all fields except field 5 are equal. We could do that by comparing fields 1 through 4 and fields 6 through 8 separately (and then have to change the code if more fields are added later), or we can set field 5 to a common value and compare the whole line. I used the at-sign character as that common value ($5 = "@"). After throwing away matching X and R lines, I changed the at-sign back to X (sub(/@/, "X", i)) or R (sub(/@/, "R", i)) before printing the lines that didn't have matches. Note that sub() replaces only the first @ on the line, so if @ is a character that can appear as data anywhere in your file, pick any other character that can't appear in your data and use it instead.
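To make the placeholder step concrete, here is a small sketch using the two sample lines above. The function name mask_field5 is only an illustration and is not part of the script; it just prints each line after field 5 has been overwritten.

```shell
# Hypothetical helper (not part of the original script): print each input
# line with field 5 replaced by the "@" placeholder.
mask_field5() {
	awk 'BEGIN { FS = OFS = "," }
	{	$5 = "@"	# Assigning to a field rebuilds $0 using OFS.
		print
	}'
}

printf '%s\n' \
	'111,201,111156,10,R,0101,2345,456' \
	'111,201,111156,10,x,0101,2345,456' | mask_field5
```

Both lines come out as 111,201,111156,10,@,0101,2345,456, so they produce the same array index in x[] and r[]. That is why a plain "i in r" test is enough to find the matching pair. OFS has to be set to a comma as well as FS, because otherwise the rebuilt line would be joined with spaces.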
# 5  
Old 06-17-2014

Don,

Thank you so much for the detailed explanation. I need one more bit of help.
Currently we check that all fields except field 5 are equal. If instead I want to check that only a few fields are equal (e.g., fields 1, 2, 6, and 7), how do I do this?
# 6  
Old 06-18-2014
I am making several assumptions: that you still want X and R in field 5 to cancel each other out, and that if there are multiple X lines (or multiple R lines) with the same values in fields 1, 2, 6, and 7, you are only interested in the last occurrence of each matching set. With those assumptions, the following might do what you want:
Code:
awk -F, '
{	v5 = toupper($5)
	if(v5 == "X")
		x[$1,$2,$6,$7] = $0
	else if(v5 == "R")
		r[$1,$2,$6,$7] = $0
	else	print
}
END {	for(i in x)
		if(i in r)
			delete r[i]
		else
			print x[i]
	for(i in r)
		print r[i]
}' Input

Of course, with no sample input and no sample output, I have absolutely no idea whether or not this does what you want. With the input you provided before, it produces the output:
Code:
111,201,111157,10,R,0101,2346,458
112,201,111149,10,R,0101,2312,487
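The "last occurrence" assumption follows from how the arrays are indexed. As a rough illustration (the function name last_per_key and the printed layout are made up for this sketch), the following stores records under the same four-field key and shows which one survives:

```shell
# Hypothetical helper: show which record is kept for each key built from
# fields 1, 2, 6, and 7.
last_per_key() {
	awk -F, '
	{	# awk joins the four subscripts with SUBSEP into one string index;
		# a later record with the same key overwrites the earlier one.
		last[$1,$2,$6,$7] = $0
	}
	END {	for(k in last) {
			split(k, part, SUBSEP)	# Recover the four key fields.
			print part[1] "-" part[2] "-" part[3] "-" part[4] " => " last[k]
		}
	}'
}

printf '%s\n' \
	'111,201,111156,10,R,0101,2345,456' \
	'111,201,111158,10,R,0101,2345,456' | last_per_key
```

This prints 111-201-0101-2345 => 111,201,111158,10,R,0101,2345,456. The 111156 record has been overwritten by the 111158 record, and in the script above that remaining R entry is then cancelled by the x line with the same key, which is why 111158 no longer appears in the output.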

# 7  
Old 06-18-2014

Don,
That's correct. I have tested the code and it's working perfectly.
Thanks a lot for your help !