validating a file based on conditions


 
# 1  
Old 12-31-2008

I have a file on Unix in which the records look like this:

aaa 123 233
aaa 234 222
aaa 242 222
bbb 122 111
bbb 122 123
ccc 124 222

In the output I want only the following values:

aaa
ccc

The validation logic considers the 1st and 2nd columns:
if, among the records that share the same 1st column value, the 2nd column values are all different,
then that 1st column value needs to be picked up.

If any records have the same 1st and 2nd column values, those records need to be dropped.

Please let me know how to do this validation.
# 2  
Old 12-31-2008
Use nawk or /usr/xpg4/bin/awk on Solaris:
Code:
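awk '
# For each record: flag $1 as 1 when this ($1,$2) pair is new,
# otherwise assign x (an unset variable, false in boolean context).
{ u[$1] = k[$1,$2]++ ? x : 1 }
# After all input: print every first-column value whose flag is true.
END {
  for (_ in u) if (u[_])
    print _
}' infile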


Last edited by radoulov; 12-31-2008 at 09:44 AM.. Reason: refactored
# 3  
Old 12-31-2008
Hi radoulov,

Another one of your astonishing awk scripts. Could you explain in
some detail how it works?

Regards

Chris
# 4  
Old 12-31-2008
Quote:
Originally Posted by Christoph Spohr
Hi radoulov,

Another one of your astonishing awk scripts. Could you explain in
some detail how it works?

Regards

Chris
Astonishing :),
thank you!

I'll try to explain.
Code followed by comments.

Code:
{
  u[$1] = k[$1,$2]++ ? x : 1
}

While reading the input, build an associative array named u (unique) keyed by $1. The values are built/chosen based on the following expression:

Code:
k[$1,$2]++ ? x : 1

If the value of the auto-incremented associative array k, built along the way with $1 SUBSEP $2 as keys, is different from 0 (i.e. true in boolean context), the pair has already been seen (remember Ed Morton's !arr[val]++?).
In that case the expression returns and assigns the value of the variable x, which is never set anywhere and is therefore auto-initialized to null (0 in numeric context, false in boolean context); if I had written 0, it would have been clearer :). Otherwise it returns and assigns the value 1 (the opposite of the previous).
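
The !arr[val]++ idiom mentioned above can be tried on its own. A minimal sketch, assuming three of the sample records from post #1 as input; the function name first_pairs and the array name seen are only illustrative:

```shell
# Print each distinct (column 1, column 2) pair the first time it appears.
# seen[$1,$2]++ returns 0 (false) on the first sighting and a positive
# count afterwards, so the negated expression is true exactly once per pair.
first_pairs() {
  awk '!seen[$1,$2]++ { print $1, $2 }' "$@"
}

printf '%s\n' 'bbb 122 111' 'bbb 122 123' 'ccc 124 222' | first_pairs
```

The second bbb 122 record is suppressed because its pair was already counted.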

Code:
END {
  for (_ in u) if (u[_])
    print _
}

After reading all the input, print only those u keys whose values are true when evaluated in boolean context (i.e. those equal to 1).
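
One caveat: u[$1] is overwritten on every record, so a key whose duplicated pair is followed by a new pair (for example bbb 122, bbb 122, bbb 125) ends up flagged 1 again. A sketch of a variant that remembers duplicates permanently, assuming any repeated pair should disqualify the key; the names unique_keys, keys, seen and dup are illustrative, and the sort is only there to make the output order predictable:

```shell
# Print every first-column value that has no repeated (column 1, column 2) pair.
unique_keys() {
  awk '
    { keys[$1] }                    # referencing the element registers the key
    seen[$1,$2]++ { dup[$1] = 1 }   # pair seen before: disqualify the key for good
    END {
      for (key in keys)
        if (!(key in dup)) print key
    }
  ' "$@" | sort
}

printf '%s\n' 'aaa 123 233' 'aaa 234 222' 'bbb 122 111' \
              'bbb 122 123' 'bbb 125 000' 'ccc 124 222' | unique_keys
```

With this input bbb stays excluded even though its last record introduces a new pair.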

Happy holidays!

Last edited by radoulov; 12-31-2008 at 01:18 PM..
# 5  
Old 01-02-2009
validating an input file in unix

Hi,
thanks for the response.
Actually, as in my message, if there are 3 records like

aaa 123 233
aaa 234 222
aaa 242 222

then only ONE aaa

needs to be printed,
but the output is showing all 3 values.

Actually, my input file contains nearly 10 fields, each separated by a pipe symbol.
Will this solution work in that case (by replacing k[$1,$2]++ with all the fields, like $3...), or do I have to use another approach?
I only have to consider the first 2 fields for validation; the remaining fields I can leave as they are.


Awaiting your reply.

thanks
# 6  
Old 01-02-2009
Hi, you may try the Perl script below:

Code:
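#!/usr/bin/perl
use strict;
use warnings;

my %hash;	# column 1 => space-padded list of column 2 values seen so far, or 'DUP'
open my $fh, '<', 'a.txt' or die "Cannot open a.txt: $!";
while (<$fh>) {
	my @tmp = split ' ', $_;
	next unless @tmp >= 2;	# skip blank or short lines
	if (!exists $hash{$tmp[0]}) {
		$hash{$tmp[0]} = " $tmp[1] ";	# padded so that only whole values match
		next;
	}
	next if $hash{$tmp[0]} eq 'DUP';
	# Mark the key as DUP when this column 2 value was already seen for it.
	$hash{$tmp[0]} = ($hash{$tmp[0]} =~ m/ \Q$tmp[1]\E /) ? 'DUP' : $hash{$tmp[0]} . "$tmp[1] ";
}
close $fh;
print join("\n", grep { $hash{$_} ne 'DUP' } keys %hash), "\n";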

# 7  
Old 01-02-2009
Hi,
I have to use a shell script.
Please suggest some logic in shell script itself;
I haven't used Perl.

thanks
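
For pipe-delimited input with about ten fields, only the field separator should need to change; the array key stays $1,$2 because only the first two fields take part in the validation. A sketch wrapped in a shell function, assuming awk is acceptable inside a shell script (nawk or /usr/xpg4/bin/awk on Solaris); the function name validate_keys and the sample records are hypothetical:

```shell
# Print each first-field value once, unless one of its (field 1, field 2)
# pairs occurs more than once. Fields beyond the second are ignored.
validate_keys() {
  awk -F'|' '
    { keys[$1] }                    # register every first-field value
    seen[$1,$2]++ { dup[$1] = 1 }   # repeated pair: drop this key
    END {
      for (key in keys)
        if (!(key in dup)) print key
    }
  ' "$@" | sort
}

printf '%s\n' 'aaa|123|x|y' 'aaa|234|x|y' 'bbb|122|x|y' \
              'bbb|122|z|w' 'ccc|124|x|y' | validate_keys
```

Pass the real file name as an argument (validate_keys infile) instead of piping sample data; each qualifying key is printed once, however many records carry it.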