New help tweaking awk...


 
# 1  
Old 03-13-2009

Guys,

My awk is not very good and I'm kind of stuck.
I have a file like so:

1,a,a,ab1234,ab1234,e,f
2,a,b,cd1234,ef5678,e,f
3,a,a,cd3456,gh5678,g,h
4,a,b,ef5678,ef1234,g,h
5,a,a,cd7890,ab5678,e,f
6,a,b,cd7890,jk1234,i,l

I don't care about any columns other than col4 and col5.

#1 Col4 must not equal col5 on the same line. Such a line has to be removed from the input file and written to 'badrec.txt' (the first record in my example).
#2 Similarly, a col4 value on one line must not equal a col5 value on a different line. Both lines have to be removed from the input file and written to badrec.txt (rec #2 and #4 in my example).
#3 A col4 value must not be paired with two different col5 values; all such lines must be rejected (rec #5 and #6 are both bad).

I'm trying to work with this syntax:
awk -F"," '!x[$1,$2,$3,$6,$7]++ {if ( $4 == $5 ) { print $4,$5 } else { print $0 } }' inputfile.txt

My problem is writing the whole line out to a file: I keep getting "awk: The statement cannot be correctly parsed." The { print $4,$5 } is just temporary. This gives me half the solution for #1, but it does nothing for #2, where the matching value is on another line.

Can anyone tweak what I have to solve any of my problems?

Thanks.
Gianni
# 2  
Old 03-13-2009
So far this will do #1. I had to put badrec.txt in double quotes:

awk -F"," '!x[$1,$2,$3,$6,$7]++ {if ( $4 == $5 ) { print $0 > "badrec.txt" } else
{ print $0 } }' inputfile.txt > outputfile.txt

Trying to figure out #2 and #3 still.
Thanks.
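
A note on why the quotes matter, as far as I understand it: in awk the target of `>` is an expression, so an unquoted badrec.txt is not taken as a file name. `badrec` is read as an (empty) variable and the `.txt` after it does not parse, which would be consistent with the "cannot be correctly parsed" message. A minimal sketch of rule #1 only, assuming a POSIX awk; the scratch directory from mktemp and the three sample lines are just for the demo:

```shell
#!/bin/sh
# Sketch: split rule-#1 violations (col4 == col5) into badrec.txt.
# Assumes a POSIX awk; the scratch directory is only for the demo.
dir=$(mktemp -d) && cd "$dir" || exit 1

cat > inputfile.txt <<'EOF'
1,a,a,ab1234,ab1234,e,f
2,a,b,cd1234,ef5678,e,f
3,a,a,cd3456,gh5678,g,h
EOF

# The redirect target is a string expression, so it has to be a quoted
# string or a variable holding one (here: bad, set with -v).
awk -F, -v bad="badrec.txt" '
  $4 == $5 { print > bad; next }   # same-line match: reject
           { print }               # everything else: keep
' inputfile.txt > outputfile.txt

cat badrec.txt      # the rejected record
cat outputfile.txt  # the surviving records
```

Passing the name in with -v keeps it out of the awk program text, so it can be changed without touching the quoting.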
# 3  
Old 03-13-2009
Here's a solution for #1 and #2:

nawk -f gia.awk myFile

gia.awk:
Code:
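BEGIN {
  FS=OFS=","
  badFile="badRec.txt"
}

# rule #1: col4 equals col5 on the same line
$4 == $5 {print > badFile; next}

# rule #2: this col4 was seen earlier as a col5; reject both lines
$4 in col5 {
  print > badFile ; print col5[$4] > badFile
  delete col5[$4]
  next;
}

# otherwise remember the line, keyed by its col5
# (a later line with the same col5 replaces an earlier one)
{ col5[$5]=$0 }

# print the survivors (for-in order is unspecified)
END {
  for (i in col5)
     print col5[i]
}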

# 4  
Old 03-13-2009
Wow. This is awesome. I was going to handle each rule one by one but this worked very well. I guess #3 has to be done separately then, but this is neat.

Thank you.
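
For #3 on its own, one possible approach is to read the file twice: the first pass counts how many distinct col5 values each col4 has, the second pass rejects every line whose col4 has more than one. This is only a sketch, not something posted in the thread. It assumes a POSIX awk, that record 6 of the sample is comma-separated throughout, and it ignores rules #1 and #2 entirely; the mktemp scratch directory is just for the demo.

```shell
#!/bin/sh
# Sketch for rule #3 only: a col4 paired with two different col5 values.
# Assumes a POSIX awk; record 6 is assumed to be fully comma-separated.
dir=$(mktemp -d) && cd "$dir" || exit 1

cat > inputfile.txt <<'EOF'
1,a,a,ab1234,ab1234,e,f
2,a,b,cd1234,ef5678,e,f
3,a,a,cd3456,gh5678,g,h
4,a,b,ef5678,ef1234,g,h
5,a,a,cd7890,ab5678,e,f
6,a,b,cd7890,jk1234,i,l
EOF

# The input file is named twice, so NR == FNR is true only on pass 1.
awk -F, -v bad="badrec.txt" '
  NR == FNR {                       # pass 1: count distinct col5 per col4
    if (!(($4, $5) in seen)) { seen[$4, $5] = 1; n[$4]++ }
    next
  }
  n[$4] > 1 { print > bad; next }   # pass 2: col4 has several col5 values
            { print }               # col4 maps to a single col5: keep
' inputfile.txt inputfile.txt > outputfile.txt

cat badrec.txt      # records 5 and 6
```

Reading the file twice keeps the original line order in both output files, at the cost of a second pass over the input.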
# 5  
Old 03-13-2009
Actually, that will only work if you have ONE pair of matching records (for #2), but you might have more than ONE pair, so this version marks matched entries instead of deleting them:
Code:
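BEGIN {
  FS=OFS=","
  badFile="badRec.txt"
}

# rule #1: col4 equals col5 on the same line
$4 == $5 {print > badFile; next}

# rule #2: this col4 was seen earlier as a col5; reject both lines
$4 in col5 {
  print > badFile ; print col5[$4] > badFile
  #delete col5[$4]
  col5del[$4]          # mark as rejected, but keep it for later matches
  next;
}

# otherwise remember the line, keyed by its col5
{ col5[$5]=$0 }

# print the survivors that were never marked (for-in order is unspecified)
END {
  for (i in col5)
     if (!(i in col5del)) print col5[i]
}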


Last edited by vgersh99; 03-13-2009 at 07:58 PM.. Reason: ooops - wrong index
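
As far as I can tell, the single-pass version only catches a #2 pair when the line holding the value in col5 comes before the line holding it in col4. If the order can go either way, a two-pass variant is one option. The sketch below is an assumption-laden alternative, not the script above: it assumes a POSIX awk, treats rule-#1 lines as not taking part in rule #2 (as the script above does), assumes record 6 is comma-separated throughout, and does not attempt #3.

```shell
#!/bin/sh
# Sketch: rules #1 and #2, independent of line order, using two passes.
# Assumes a POSIX awk; the scratch directory is only for the demo.
dir=$(mktemp -d) && cd "$dir" || exit 1

cat > inputfile.txt <<'EOF'
1,a,a,ab1234,ab1234,e,f
2,a,b,cd1234,ef5678,e,f
3,a,a,cd3456,gh5678,g,h
4,a,b,ef5678,ef1234,g,h
5,a,a,cd7890,ab5678,e,f
6,a,b,cd7890,jk1234,i,l
EOF

awk -F, -v bad="badrec.txt" '
  NR == FNR {                        # pass 1: remember every col4 and col5
    if ($4 != $5) { c4[$4] = 1; c5[$5] = 1 }
    next
  }
  $4 == $5                 { print > bad; next }  # rule 1
  ($4 in c5) || ($5 in c4) { print > bad; next }  # rule 2, either order
                           { print }              # keep
' inputfile.txt inputfile.txt > outputfile.txt

cut -d, -f1 badrec.txt      # record numbers that were rejected
cut -d, -f1 outputfile.txt  # record numbers that were kept
```

On the sample this should reject records 1, 2 and 4 and keep 3, 5 and 6, with both files in input order.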
# 6  
Old 03-16-2009
Code:
#!/usr/bin/perl
open $fh,"<","a.txt";
my (%h1,%h2);
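while(<$fh>){
	chomp;
	my @tmp=split(",",$_);
	# rule #1: skip lines where col4 equals col5
	next if $tmp[3] eq $tmp[4];
	# keep the first line seen for each col4; blank it if col4 repeats
	if(not exists $h1{$tmp[3]}){
		$h1{$tmp[3]}=$_;
	}
	else{
		$h1{$tmp[3]}="";
	}
	# map each col5 back to the col4 of its line
	$h2{$tmp[4]}=$tmp[3];
}
# rule #2: a col4 that also occurs as a col5 removes both lines
foreach my $k1 (keys %h1){
	if(exists $h2{$k1}){
		delete $h1{$k1};
		delete $h1{$h2{$k1}};
	}
}
# print whatever survived
foreach my $key (keys %h1){
	print $h1{$key},"\n" if $h1{$key} ne "";
}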

# 7  
Old 03-18-2009
Thanks. I tried the Perl script to see what it does (which of the rules it covers) and got:

Too many arguments for open at a.pl line 2, near ""a.txt";"
Execution of a.pl aborted due to compilation errors.

So I tried open(fh, "< a.txt") or die "Can't open input file\n";
It appeared to run, but I see no output of any kind and no errors.
I'm not a Perl programmer, so I'm not sure what could be wrong.
Thank you.