Sponsored Content
Top Forums Shell Programming and Scripting Nawk script to compare records of a file based on a particular column. Post 302550294 by usha rao on Thursday 25th of August 2011 08:27:21 AM
Old 08-25-2011
Hi jayan,

This works only if i have one repeating entry in column 1.
Although it prints the rows where first column is repeating which is having sixth column as NON empty.
But its not printing the rows where the first column is not repeating and sixth column is empty.


Suppose my input file is like this :

Code:
 
UM113568840|24-AUG-11|4|man1|Homepage and News: Urge|
UM113570248|24-AUG-11|4|man1|s detected. UMmediate recycle and validation require|
UM113569599|24-AUG-11|4|man1|Rractivequote6|
UM113570248|24-AUG-11|4|man1|s detected. UMmediate recycle and validation require|24-AUG-11
UM113570624|24-AUG-11|4|man1|Alert: Pattern 'E_DCLeDAOException' |24-AUG-11
UM113568737|24-AUG-11|4|man1| Pattern 'E_DCLeDAOException' |
UM113569231|24-AUG-11|4|man1|is detected. UMmediate recycle and validation require|
UM113569231|24-AUG-11|4|man1|is detected. UMmediate recycle and validation require|24-AUG-11
UM113569346|24-AUG-11|4|man1|FEBMARKETAPP321: Receving multiple alerts for e.log|
UM113571032|24-AUG-11|4|man1|FEBMARKETAPP321: LogFile Alert: Pattern '|
UM113568626|24-AUG-11|3|man1-test|Brokerage, Annuity and Mutual fund Account Details are not been displayed on QA4|
UM113570137|24-AUG-11|4|man1|C - PROD - Equity - Pricing - window does not display a lin|
UM113570659|24-AUG-11|4|man1|-LOGIN-CORADIANT: 4 On 2011-0|
UM113569599|24-AUG-11|4|man1|Rractivequote6|24-AUG-11
UM113570624|24-AUG-11|4|man1|Alert: Pattern 'E_DCLeDAOException'|
UM113570841|24-AUG-11|4|man1|t: SWAP Utilization (91%) i|
UM113568909|24-AUG-11|4|man1|OMEPAGE-AND-NEWS-CORADIANT: Coradiant Alert:  Urge|
UM113571191|25-AUG-11|4|man1|ert: Pattern 'E_DCLeDAOException' found in |


This particular statement --->

Code:
 
nawk -F"|" '{ if (!a[$1]) {a[$1]; if ( $6!= "" ) { print $0 } } } ' infile

is giving me the output like this --->

Code:
 
UM113570248|24-AUG-11|4|man1|s detected. UMmediate recycle and validation require|24-AUG-11
UM113570624|24-AUG-11|4|man1|Alert: Pattern 'E_DCLeDAOException' |24-AUG-11
UM113569231|24-AUG-11|4|man1|is detected. UMmediate recycle and validation require|24-AUG-11
UM113569599|24-AUG-11|4|man1|Rractivequote6|24-AUG-11


whereas the desired output is this ---->



Code:
 
UM113568840|24-AUG-11|4|man1|Homepage and News: Urge|
UM113570248|24-AUG-11|4|man1|s detected. UMmediate recycle and validation require|24-AUG-11
UM113570624|24-AUG-11|4|man1|Alert: Pattern 'E_DCLeDAOException' |24-AUG-11
UM113568737|24-AUG-11|4|man1| Pattern 'E_DCLeDAOException' |
UM113569231|24-AUG-11|4|man1|is detected. UMmediate recycle and validation require|24-AUG-11
UM113569346|24-AUG-11|4|man1|FEBMARKETAPP321: Receving multiple alerts for e.log|
UM113571032|24-AUG-11|4|man1|FEBMARKETAPP321: LogFile Alert: Pattern '|
UM113568626|24-AUG-11|3|man1-test|Brokerage, Annuity and Mutual fund Account Details are not been displayed on QA4|
UM113570137|24-AUG-11|4|man1|C - PROD - Equity - Pricing - window does not display a lin|
UM113570659|24-AUG-11|4|man1|-LOGIN-CORADIANT: 4 On 2011-0|
UM113569599|24-AUG-11|4|man1|Rractivequote6|24-AUG-11
UM113570841|24-AUG-11|4|man1|t: SWAP Utilization (91%) i|
UM113568909|24-AUG-11|4|man1|OMEPAGE-AND-NEWS-CORADIANT: Coradiant Alert:  Urge|
UM113571191|25-AUG-11|4|man1|ert: Pattern 'E_DCLeDAOException' found in |


for a duplicate entry in column 1 there are two rows ---> one having entry in column 6 and other empty column 6.
I want to print only the row having NON empty column 6 and the the having duplicate entry in column 1 and empty column 6 should be discarded.

all other rows having distinct entries in column 1 and empty entries in column 6 should be retained.


Kindly let me know if i am unable to describe the problem.

Thanks
Usha
 

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

Select records based on search criteria on first column

Hi All, I need to select only those records having a non zero record in the first column of a comma delimited file. Suppose my input file is having data like: "0","01/08/2005 07:11:15",1,1,"Created",,"01/08/2005" "0","01/08/2005 07:12:40",1,1,"Created",,"01/08/2005"... (2 Replies)
Discussion started by: shashi_kiran_v
2 Replies

2. UNIX for Dummies Questions & Answers

Filtering records of a file based on a value of a column

Hi all, I would like to extract records of a file based on a condition. The file contains 47 fields, and I would like to extract only those records that match a certain value in one of the columns, e.g. COL1 COL2 COL3 ............... COL47 1 XX 45 ... (4 Replies)
Discussion started by: risk_sly
4 Replies

3. Shell Programming and Scripting

Compare files column to column based on keys

Here is my situation. I need to compare two tab separated files (diff is not useful since there could be known difference between files). I have found similar posts , but not fully matching.I was thinking of writing a shell script using cut and grep and while loop but after going thru posts it... (2 Replies)
Discussion started by: blackjack101
2 Replies

4. Shell Programming and Scripting

Removing duplicate records in a file based on single column

Hi, I want to remove duplicate records including the first line based on column1. For example inputfile(filer.txt): ------------- 1,3000,5000 1,4000,6000 2,4000,600 2,5000,700 3,60000,4000 4,7000,7777 5,999,8888 expected output: ---------------- 3,60000,4000 4,7000,7777... (5 Replies)
Discussion started by: G.K.K
5 Replies

5. Emergency UNIX and Linux Support

[Solved] Extract records based on a repeated column value

Hi guys, I need help in making a command to find some data. I have multiple files in which multiple records are present.. Each record is separated with a carriage return and in each record there are multiple fields with each field separated by "|" what i want is that I want to extract... (1 Reply)
Discussion started by: m_usmanayub
1 Replies

6. Shell Programming and Scripting

Removing duplicate records in a file based on single column explanation

I was reading this thread. It looks like a simpler way to say this is to only keep uniq lines based on field or column 1. https://www.unix.com/shell-programming-scripting/165717-removing-duplicate-records-file-based-single-column.html Can someone explain this command please? How are there no... (5 Replies)
Discussion started by: cokedude
5 Replies

7. Shell Programming and Scripting

Compare based on column value

Hi Experts, I want to compare 2 text files based on their column values text1 is like prd-1234 yes no yes yes prd-2345 no no no yes prd-6475 yes yes yes no and test 2 is prd-1234 no no no yes prd-2345 yes no no no desired out put as follows prd-1234 1 3 prd-235 1 4 basically it shows... (5 Replies)
Discussion started by: tijomonmathew
5 Replies

8. Shell Programming and Scripting

Compare two files based on column

Hi, I have two files roughly 1200 fields in length for each row, sorted on the 2nd field. I need to compare based on that 2nd column between file1 and file2 and print lines that exist in both files into separate files (I can't guarantee that every line in file1 is in file2). Example: File1: ... (1 Reply)
Discussion started by: origon
1 Replies

9. Shell Programming and Scripting

Compare two csv's with column based

Hi, I am having below two CSV's col_1,col_2,col_3 1,2,4 1,3,6 col_1,col_3,col2,col_5,col_6 1,2,3,4,5 1,6,3,,, I need to compare based on the columns where the mismatch is expected output col_1,col_2,col_3 1,2,4 (3 Replies)
Discussion started by: rohit_shinez
3 Replies

10. UNIX for Beginners Questions & Answers

Filtering records of a csv file based on a value of a column

Hi, I tried filtering the records in a csv file using "awk" command listed below. awk -F"~" '$4 ~ /Active/{print }' inputfile > outputfile The output always has all the entries. The same command worked for different users from one of the forum links. content of file I was... (3 Replies)
Discussion started by: sunilmudikonda
3 Replies
All times are GMT -4. The time now is 12:58 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy