finding first instance


 
# 8  
Old 08-12-2002
Yeah, now it's working, Peter. The problem was that the file I used, following your first post, had "," as the delimiter.

But now, as per your reply, I changed it to "|" and it's working fine.

My awk is this. See if you can throw some light on it.


BEGIN {
    max = 11    # max width of the record
    while (getline > 0)
    {
        line = substr($0, 1, max);    # substr() counts from 1
        firstcol = substr(line, 1, 3);
        secondcol = substr(line, 5, 2);
        thirdcol = substr(line, 6, 1);
        print (firstcol secondcol thirdcol);
    }
}


In the above piece of code, firstcol, secondcol and thirdcol extract the three columns of the file. After that I need a comparison between the fields, and I need to use a loop.
And here is where I am stuck.

Probably you can execute this code and see if you can help me.


Thanks,
Nisha
# 9  
Old 08-12-2002
This will do what you need in less code. I assume you want this data from every row in the file?

awk '{ max = 11; line = substr($0, 1, max);
firstcol = substr(line, 1, 3); secondcol = substr(line, 5, 2); thirdcol = substr(line, 6, 1);
print firstcol secondcol thirdcol }' input_filename > output_filename

So that will get you the 6-character string. What do you want to do with it once you have it?

Compare between rows - or between columns in a row?
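
A side note on the offsets: awk's substr(s, start, len) counts characters from 1, so the first three characters are substr(s, 1, 3). A minimal sketch, assuming a 9-character record laid out as aaa|bbb|1 (that layout and the split_cols name are assumptions for illustration, not something stated so far):

```shell
# Hypothetical helper: split one fixed-width record into three columns.
# Assumed layout: 3 chars, '|', 3 chars, '|', 1 char.
split_cols() {
    awk '{
        firstcol  = substr($0, 1, 3)   # characters 1-3
        secondcol = substr($0, 5, 3)   # characters 5-7
        thirdcol  = substr($0, 9, 1)   # character 9
        print firstcol, secondcol, thirdcol
    }'
}

printf 'aaa|bbb|1\n' | split_cols
```

If the record layout differs, the start positions and lengths have to be adjusted to match.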

Last edited by peter.herlihy; 08-12-2002 at 06:05 AM..
# 10  
Old 08-12-2002
I want to compare columns between rows, something like this:
compare the first col of the first and second rows, and the second col of the first and second rows. If they are the same, then take the first occurrence of the entire row.

Am I right?



Thanks,
Nisha
# 11  
Old 08-12-2002
Okay, so if I have this right, it's basically the same as the original question I asked. This will give you the first row for each distinct combination of col1 and col2.

i.e.

a,3,xxx
a,3,yyy
a,4,ccc
a,5,ccc
a,5,fff

Will give you:

a,3,xxx
a,4,ccc
a,5,ccc

Code:
awk '
{
    max = 11;
    line = substr($0, 1, max);        # substr() counts from 1
    firstcol = substr(line, 1, 3);
    secondcol = substr(line, 5, 2);
    thirdcol = substr(line, 6, 1)
}
{
    # print only when the col1/col2 pair differs from the last one printed
    if ( firstcol != field1 || secondcol != field2 )
    { print firstcol secondcol thirdcol; field1 = firstcol; field2 = secondcol }
}
' input_file > output_file

Basically field1 and field2 start empty and get set with the first pairing of col1 and col2, and the three cols are printed. Then each following row is compared, and wherever either column is different the three cols are printed.

I haven't tried this code but it should be good.
# 12  
Old 08-13-2002
Hey Peter,

With a small modification here my code also works.

But I have some very important doubts to be cleared.

Here is the code:

{
    max = 11;
    line = substr($0, 1, max);        # substr() counts from 1
    firstcol = substr(line, 1, 3);
    secondcol = substr(line, 5, 3);
    thirdcol = substr(line, 9, 1);

    if ( firstcol != field1 || secondcol != field2 )
    { print firstcol "," secondcol "," thirdcol; field1 = firstcol; field2 = secondcol }
}

And I executed it as awk -f cmp_rec test.dat > log

Of course, I can execute it the way you said too.

First let me check with you whether my understanding is right.

It is a col-to-col comparison, right?

for example,

aaa|bbb|1
aaa|bbb|2
xxx|yyy|3
xxx|yyy|4
zzz|rrr|5

would give me

aaa|bbb|1
xxx|yyy|3
zzz|rrr|5

Because it compares aaa and bbb of the first row with aaa and bbb of the second row.

If the first col and the second col of a row are not equal to the first and second col of the row before it, then that row is a first instance and is printed.

Now coming to the doubt:

In the above code, why should I assign

field1=firstcol; field2=secondcol

Are field1 and field2 keywords, meaning system-defined?

Can you explain?


Thanks,
Nisha
# 13  
Old 08-13-2002
You understand it perfectly, and I would guess you can see the importance of sorting the file before you start this too.

field1 and field2 are just variables (call them anything you want). At the start of the awk run they are empty, so for the very first row of the file, when firstcol and secondcol are compared to them, they won't be equal. That triggers the columns to be printed AND triggers field1 and field2 to be set to the values of firstcol and secondcol respectively.

With field1 and field2 now holding the values you just printed, when you compare against the next line you're looking to see if the new firstcol is different from the variable field1 OR if the new secondcol is different from the variable field2.

If they are BOTH the same, the if condition is false and the print is not executed; awk then moves on to the next row and compares again. The condition stays false until it finds a row where col1 and col2 do not match the variables, meaning you have a different combination. That row's columns are printed and the variables are updated again with the new values.

In logic terms it's simply

x=''
y=''
if (firstcol is not = x OR second col is not = y ) then print your columns, and make x = firstcol and y = secondcol.
Then repeat for the next line.

so for the example again....

aaa|bbb|1
aaa|bbb|2
xxx|yyy|3
xxx|yyy|4
zzz|rrr|5

would give me

aaa|bbb|1
xxx|yyy|3
zzz|rrr|5

first time through (field1 and field2 are empty):
aaa != field1(empty) and bbb != field2(empty) => so PRINT and make field1 = aaa and field2 = bbb
second row:
aaa = field1(aaa) and bbb = field2(bbb) => so do nothing and move on to the next row.
third row:
xxx != field1(aaa) and yyy != field2(bbb) => so PRINT and make field1 = xxx and field2 = yyy

and so on for the remaining rows.
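
The walk-through above can be reproduced with a small tracing sketch. It assumes the |-delimited sample and splits on the delimiter rather than on fixed offsets; trace_rows, prev1 and prev2 are made-up names standing in for the field1/field2 logic:

```shell
# Hypothetical helper: report, row by row, whether the row would be printed.
# prev1/prev2 play the role of field1/field2 from the explanation.
trace_rows() {
    awk -F'|' '{
        f1 = $1 ""; f2 = $2 ""          # force plain string comparison
        if (NR == 1 || f1 != prev1 || f2 != prev2) {
            print "PRINT " $0
            prev1 = f1; prev2 = f2      # remember the pair just printed
        } else {
            print "skip  " $0
        }
    }'
}

printf '%s\n' 'aaa|bbb|1' 'aaa|bbb|2' 'xxx|yyy|3' 'xxx|yyy|4' 'zzz|rrr|5' | trace_rows
```

Rows 1, 3 and 5 come out marked PRINT and rows 2 and 4 marked skip, which matches the steps described above.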
# 14  
Old 08-13-2002
Yeah, perfect. As you said, the file has to be sorted first, otherwise the results would be ambiguous.
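
To make the sorting point concrete, here is a rough sketch that sorts on the first two fields before doing the comparison. It assumes |-delimited input; first_per_group is a made-up name, and -s (stable sort) is a GNU/BSD sort option rather than something every sort is guaranteed to have:

```shell
# Hypothetical helper: sort on the first two '|' fields, then keep the first
# row of each (col1, col2) group. -s keeps rows with equal keys in their
# original order, so "first" still means first in the input file.
first_per_group() {
    sort -s -t'|' -k1,1 -k2,2 | awk -F'|' '{
        key = $1 FS $2                  # string key built from both columns
        if (NR == 1 || key != prev) print
        prev = key
    }'
}

printf '%s\n' 'xxx|yyy|3' 'aaa|bbb|1' 'xxx|yyy|4' 'aaa|bbb|2' | first_per_group
```

Without the sort step, every row of this sample differs from the row before it, so all four rows would be printed.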

But did you notice something? Unlike other languages, in awk there is no need to declare a variable. field1 and field2 here are an example, and no initialization is required either.
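
A tiny sketch of that behaviour, with made-up variable names (never_set and count are not from the thread): a variable that was never assigned acts as an empty string where a string is expected and as 0 where a number is expected.

```shell
# Hypothetical demo of unassigned awk variables.
uninit_demo() {
    awk 'BEGIN {
        print "length: " length(never_set)      # empty string, so length 0
        print "as number: " (never_set + 0)     # numeric value is 0
        count++                                 # counting starts from 0
        print "after ++: " count
    }'
}

uninit_demo
```

That is why the comparison against field1 and field2 on the very first row is a comparison against empty strings.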


Too good a program. In the program that I posted, I need to change the max variable every time the width of the record changes, and the same goes for the offsets of firstcol, secondcol and thirdcol.

So I think the one that you sent would be the efficient one.
Thanks a lot for your time and explanation.
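
For reference, a minimal sketch of a version that does not depend on record width at all, because it splits on the delimiter instead of using offsets. This is only an illustration under the assumption that the input is already sorted on the first two fields; first_instance is a made-up name and this is not necessarily the code from the earlier post:

```shell
# Hypothetical helper: keep the first row of each run of rows that share the
# same first two fields. The delimiter is the function argument (the shell $1
# in double quotes); the $1 and $2 inside single quotes are awk fields.
first_instance() {
    awk -F"$1" '{
        key = $1 FS $2                  # compare both columns as one string
        if (NR == 1 || key != prev) print
        prev = key
    }'
}

printf '%s\n' 'aaa|bbb|1' 'aaa|bbb|2' 'xxx|yyy|3' 'xxx|yyy|4' 'zzz|rrr|5' | first_instance '|'
```

The same function should work for a comma-delimited file by calling it as first_instance ','.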

Thanks,
Nisha