How to extract elements using Awk


 
# 1  
Old 07-21-2008

Hi,

I have a typical extraction problem in awk, involving three input files.

i) The first file acts as a kind of oracle, mapping a name to a comma-separated list of values:
foo 12,23,24
bla 11,34
car 35

ii) The second file gives the score for each value in the second field of the first file. Its first column is the position (rank) of that score, the second column is the score, and the third column is the value.

1 0.345 24
2 0.231 1
3 0.220 24
4 0.1090 12

iii) The third file contains further information about the values in the second field of the first file:
Line Poll NotPoll
10 3 1
12 1 2
23 3 1
24 2 4

Initially, for each line of file 1 I use getline to look up the position (first field of the second file) of every listed value, and I keep only the smallest position. For example, when the script reads the first line of file 1, it checks the positions returned for 12, 23 and 24 in the second file.

From the second file it picks only the smallest position (in this case 1, which belongs to "24"). This part is done. Now I would like to link that "24" to the third file by matching it against the third file's first field, so that I can extract the Poll and NotPoll values (2 and 4 respectively).

The code below only prints the smallest position, not the corresponding element. Please advise. I appreciate your help.

Code:
#!/usr/bin/awk -f 
{
repo = $1
split($2, search_vals, ",")
delete found

while (getline < (repo "/file2.txt")) {
	min=0;
	max=0;
	sum=0;
	scores[$1]=$2;

	for (k in search_vals) {
		if ($3 == search_vals[k]) {
	       		found[$3] = $1;
	       		flag=1;	
			break;
 		}
	}
}

asort(found)

pos=found[1]

print pos;

while (getline < (repo "/file2.txt"))  {
     if(pos==$1){
        lineno=$3;
	break;
     }
}

close(repo "/file2.txt")

print lineno;

if(flag){
score=scores[found[1]]

for(i in scores){
  if(scores[i]==score)
    newscores[i]=scores[i];
}
	min=found[1];	

   for(i in newscores){
      if(newscores[i]== score)
       {
         if(int(i)<int(min))
            min=i
         if(int(i)>int(max))
            max=i
       }
  }     

	gap=int(max)-int(min)+1;
	
	for(x=int(min); x<=int(max); x++){
   		sum+=x;
   		
	}
	if(gap==1){
	   c=min;

	}
	else{
           c=int(0.5+sum/gap);
    	}
    	
   split("", scores) 
   split("", newscores) 
} 

while (getline < (repo "/3rdfile.txt")) {
  
 
  last=$1;
}

m=c/last*100;

print repo,found[1]>"test.txt"

}

In the above code I try to implement what is described above, but the print lineno line that follows the second while loop (shown in bold in the original post) does not work. I added that extra while loop to extract the corresponding value in addition to the position obtained earlier, i.e. besides the position it should return the matching third field. However, it returns nothing.

Please advise. I appreciate it a lot.

Many thanks.
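
A likely reason the second loop in the script above yields nothing: the first while (getline < file) loop has already read file2.txt to end-of-file, and the file is only closed after the second loop, so the second loop resumes at end-of-file and its body never runs. The sketch below illustrates this with the sample file2 data; the function name demo_reread and the scratch directory are made up for the demonstration.

```sh
#!/bin/sh
# demo_reread is a hypothetical name used only for this demonstration.
demo_reread() {
    dir=$(mktemp -d) || return 1
    # Sample file2 data from the post: position, score, item.
    printf '%s\n' '1 0.345 24' '2 0.231 1' '3 0.220 24' '4 0.1090 12' > "$dir/file2.txt"

    awk -v f="$dir/file2.txt" 'BEGIN {
        # First pass reads the file to end-of-file.
        while ((getline line < f) > 0) n1++
        # No close() yet: this loop resumes at end-of-file and reads nothing.
        while ((getline line < f) > 0) n2++
        # close() makes the next getline reopen the file from the top.
        close(f)
        while ((getline line < f) > 0) n3++
        close(f)
        printf "first=%d second=%d after_close=%d\n", n1, n2, n3
    }'
    rm -rf "$dir"
}

demo_reread
```

In the posted script this would mean calling close(repo "/file2.txt") before the second while loop rather than after it. Comparing the return value with > 0 also guards against an endless loop when the file cannot be opened, since getline returns -1 in that case.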
# 2  
Old 07-21-2008
Could you post the desired result given your example data files?
# 3  
Old 07-21-2008
Hi,

The desired output in test.txt would be:

foo, 1,2,4

In this case we assume that no other line has the same score, so only the minimum position is checked. The 2,4 refers to the line 24 2 4 in the third file.

I run the script as ./awkscript file1.txt.

Please advise. Thanks.
# 4  
Old 07-22-2008
Something like this?

Code:
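awk 'NR == FNR {
  # file2: track the smallest position and the item (last field) that has it
  if (!(item && pos)) {
    item = $NF
    pos = $1
  }
  if ($1 < pos) {
    pos = $1
    item = $NF
  }
  next
}
NR > FNR && $1 == item {
  # file3: remember Poll and NotPoll for that item
  poll = $2 OFS $3
  next
}
last {
  # file1: print the first line whose list contains the item
  n = split($2, t, ",")
  for (i = 1; i <= n; i++)   # restart at 1 for every line
    if (t[i] == item) {
      print $1, pos, poll
      exit
    }
}' OFS=, file2 file3 last=1 file1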
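
The one-liner above takes the smallest position over the whole of file2. If each line of file1 should instead be resolved against only its own listed values, as the first post describes, a variant could look like the sketch below. The function name resolve_min_pos, the pass variable, and the choice to print nothing for names with no match are assumptions made for illustration, not something stated in the thread.

```sh
#!/bin/sh
# Hypothetical helper: resolve_min_pos FILE1 FILE2 FILE3
resolve_min_pos() {
    awk '
    pass == 1 {                 # file2: position score item
        # keep the smallest position seen for each item
        if (!($3 in best) || $1 + 0 < best[$3] + 0) best[$3] = $1
        next
    }
    pass == 2 {                 # file3: Line Poll NotPoll
        if (FNR > 1) poll[$1] = $2 OFS $3    # skip the header row
        next
    }
    {                           # file1: name and comma-separated items
        n = split($2, t, ",")
        pos = ""; item = ""
        for (i = 1; i <= n; i++)
            if ((t[i] in best) && (pos == "" || best[t[i]] + 0 < pos + 0)) {
                pos = best[t[i]]
                item = t[i]
            }
        # print only when a position and the matching file3 row were found
        if (pos != "" && (item in poll)) print $1, pos, poll[item]
    }' OFS=, pass=1 "$2" pass=2 "$3" pass=3 "$1"
}

# Sample data from the first post, written to a scratch directory.
d=$(mktemp -d) || exit 1
printf '%s\n' 'foo 12,23,24' 'bla 11,34' 'car 35' > "$d/file1"
printf '%s\n' '1 0.345 24' '2 0.231 1' '3 0.220 24' '4 0.1090 12' > "$d/file2"
printf '%s\n' 'Line Poll NotPoll' '10 3 1' '12 1 2' '23 3 1' '24 2 4' > "$d/file3"
resolve_min_pos "$d/file1" "$d/file2" "$d/file3"
rm -rf "$d"
```

With the sample data this prints foo,1,2,4. The lines for bla and car produce no output because none of their values appear in file2.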
