The UNIX and Linux Forums  
#1  03-13-2009  giannicello
New help tweaking awk...

Guys,

My awk is not very good and I'm kind of stuck.
I have a file like so:

1,a,a,ab1234,ab1234,e,f
2,a,b,cd1234,ef5678,e,f
3,a,a,cd3456,gh5678,g,h
4,a,b,ef5678,ef1234,g,h
5,a,a,cd7890,ab5678,e,f
6,a,b,cd7890,jk1234,il

I only care about col4 and col5; the other columns don't matter.

#1 I can't have col4 = col5 on the same line, so I need to remove the line from the input file and write the bad record to 'badrec.txt' (the first record in my example).
#2 Similarly, I can't have a col4 value on one line equal to a col5 value on a different line. I need to remove both lines from the input file and write both to badrec.txt (rec #2 and #4 in my example).
#3 I can't have the same col4 value paired with two different col5 values; all such lines must be rejected (rec #5 and #6 are both bad).

I'm trying to work with this syntax:
awk -F"," '!x[$1,$2,$3,$6,$7]++ {if ( $4 == $5 ) { print $4,$5 } else { print $0 } }' inputfile.txt

My problem is writing the whole line out to a file: I keep getting "awk: The statement cannot be correctly parsed." The { print $4,$5 } is just temporary. This gives me half the solution for #1, but it doesn't handle #2, where the match is on another line.

Can anyone tweak what I have to solve any of my problems?

Thanks.
Gianni

#2  03-13-2009  giannicello
So far this will do #1. I had to put badrec.txt in double quotes:

awk -F"," '!x[$1,$2,$3,$6,$7]++ {if ( $4 == $5 ) { print $0 > "badrec.txt" } else
{ print $0 } }' inputfile.txt > outputfile.txt

Trying to figure out #2 and #3 still.
Thanks.
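
A minimal sketch of why the quotes matter, assuming a POSIX shell and the first two sample lines from the first post. awk treats whatever follows > in a print statement as an expression, so a literal file name has to be a string constant. Left unquoted, badrec.txt is read as awk source rather than as a name, which is a likely cause of the "cannot be correctly parsed" message.

```shell
# Two sample records from the first post.
printf '%s\n' '1,a,a,ab1234,ab1234,e,f' '2,a,b,cd1234,ef5678,e,f' > inputfile.txt

# The target of > inside awk is an expression, so the file name is quoted.
# Lines with col4 equal to col5 go to badrec.txt; the rest go to stdout.
awk -F',' '
$4 == $5 { print > "badrec.txt"; next }
{ print }
' inputfile.txt > outputfile.txt
```

With these two records, badrec.txt should end up holding record 1 and outputfile.txt record 2.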

#3  03-13-2009  vgersh99 (Moderator)
Here's #1 & #2:

nawk -f gia.awk myFile

gia.awk:
Code:
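BEGIN {
  FS=OFS=","
  badFile="badRec.txt"
}

# rule #1: col4 equals col5 on the same line
$4 == $5 {print > badFile; next}

# rule #2: col4 matches the col5 of an earlier line; reject both lines
$4 in col5 {
  print > badFile ; print col5[$4] > badFile
  delete col5[$4]
  next;
}

# remember the line, keyed by its col5 value
{ col5[$5]=$0 }

# print the lines that were never rejected (loop order is unspecified)
END {
  for (i in col5)
     print col5[i]
}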

#4  03-13-2009  giannicello
Wow. This is awesome. I was going to handle each rule one by one but this worked very well. I guess #3 has to be done separately then, but this is neat.

Thank you.

#5  03-13-2009  vgersh99 (Moderator)
Actually, this will only work if a given value has ONE pair of matching records (for #2), but the same value might match more than once. So instead of deleting the entry, flag it:
Code:
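BEGIN {
  FS=OFS=","
  badFile="badRec.txt"
}

# rule #1: col4 equals col5 on the same line
$4 == $5 {print > badFile; next}

# rule #2: col4 matches the col5 of an earlier line
$4 in col5 {
  print > badFile
  # write the earlier line only once, even if several lines match it
  if (!($4 in col5del)) print col5[$4] > badFile
  #delete col5[$4]
  col5del[$4]     # referencing the element flags this key as rejected
  next;
}

# remember the line, keyed by its col5 value
{ col5[$5]=$0 }

# print the lines that were never flagged (loop order is unspecified)
END {
  for (i in col5)
     if (!(i in col5del)) print col5[i]
}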

Last edited by vgersh99; 03-13-2009 at 06:58 PM. Reason: oops, wrong index
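
Rule #3 is still open at this point in the thread. One possible way to cover all three rules is to read the input twice: the first pass collects the col4 and col5 values, the second pass decides line by line. This is only a sketch under a few assumptions: record 6 of the sample is meant to start with "6,a" (so that its col4 is cd7890), lines that fail rule #1 are left out of the cross-line checks (as in the script above), and the array names seen4, seen5, first5 and multi are made up for this example.

```shell
# Sample data from the first post (record 6 written with a comma after the 6).
cat > inputfile.txt <<'EOF'
1,a,a,ab1234,ab1234,e,f
2,a,b,cd1234,ef5678,e,f
3,a,a,cd3456,gh5678,g,h
4,a,b,ef5678,ef1234,g,h
5,a,a,cd7890,ab5678,e,f
6,a,b,cd7890,jk1234,il
EOF

rm -f badrec.txt

awk -F',' '
# Pass 1 (first read of the file): collect col4/col5 values.
NR == FNR {
  if ($4 != $5) {                  # rule #1 lines are handled in pass 2
    seen4[$4]; seen5[$5]           # referencing an element creates the key
    if (!($4 in first5)) {
      first5[$4] = $5              # first col5 seen for this col4
    } else if (first5[$4] != $5) {
      multi[$4]                    # rule #3: a second, different col5
    }
  }
  next
}
# Pass 2 (second read): a line is bad if any rule applies to it.
{
  bad = ($4 == $5) || ($4 in seen5) || ($5 in seen4) || ($4 in multi)
  if (bad) {
    print > "badrec.txt"
  } else {
    print
  }
}
' inputfile.txt inputfile.txt > outputfile.txt
```

With the sample above, only record 3 should reach outputfile.txt, while records 1, 2, 4, 5 and 6 go to badrec.txt. Because the decision is made in the second pass, the result does not depend on which of two matching lines comes first, and both files keep the input order.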

#6  03-16-2009  summer_cherry (Forum Advisor)
Code:
#!/usr/bin/perl
open $fh,"<","a.txt";
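my (%h1,%h2);
while(<$fh>){
	chomp;
	my @tmp=split(",",$_);
	# rule #1: skip lines where col4 equals col5
	next if $tmp[3] eq $tmp[4];
	# %h1: col4 => line; blanked when the same col4 shows up again (rule #3)
	if(not exists $h1{$tmp[3]}){
		$h1{$tmp[3]}=$_;
	}
	else{
		$h1{$tmp[3]}="";
	}
	# %h2: col5 => col4 of the same line
	$h2{$tmp[4]}=$tmp[3];
}
# rule #2: a col4 that also occurs as a col5 rejects both lines
foreach my $k1 (keys %h1){
	if(exists $h2{$k1}){
		delete $h1{$k1};
		delete $h1{$h2{$k1}};
	}
}
# print the surviving lines (hash order, so not the input order)
foreach my $key (keys %h1){
	print $h1{$key},"\n" if $h1{$key} ne "";
}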

#7  03-18-2009  giannicello
Thanks. I tried the Perl script to see which of the rules it handles, and I received:

Too many arguments for open at a.pl line 2, near ""a.txt";"
Execution of a.pl aborted due to compilation errors.

So I tried open(fh, "< a.txt") or die "Can't open input file\n";
It appeared to run, but I see no output of any kind and no errors.
I'm not a Perl programmer, so I'm not sure what could be wrong.
Thank you.