The UNIX and Linux Forums  
#1  07-21-2008  ahjiefreak
How to extract elements using Awk

Hi,

I have this typical extraction problem in AWK.

I have 3 input files.

i) The first one is a kind of "oracle" (lookup list):
foo 12,23,24
bla 11,34
car 35

ii) The second file gives the score for each value listed in the second field of the first file. Its first column is the position (rank) of that score, and its third column is the value itself.

1 0.345 24
2 0.231 1
3 0.220 24
4 0.1090 12

iii) The third file contains further information about the values in the second field of the first file.
Line Poll NotPoll
10 3 1
12 1 2
23 3 1
24 2 4

First, for each line of file 1, I use getline to look up the position (1st field of the second file) of each of its values, and I keep only the smallest position. In this case, when it scans the first line of file 1, it checks the positions returned for 12, 23 and 24 in the second file.

From the second file, it picks only the smallest position (in this case 1, which belongs to "24"). This part is done. However, I would now like to link the "24" to the third file by matching it against the first field of the third file, so that I can extract the Poll and NotPoll values (2 and 4 respectively).

The code below is only able to print the smallest position, not the corresponding element. Please advise. Appreciate your help.

Code:
#!/usr/bin/awk -f 
{
repo = $1
split($2, search_vals, ",")
delete found

while (getline < (repo "/file2.txt")) {
	min=0;
	max=0;
	sum=0;
	scores[$1]=$2;

	for (k in search_vals) {
		if ($3 == search_vals[k]) {
	       		found[$3] = $1;
	       		flag=1;	
			break;
 		}
	}
}

asort(found)

pos=found[1]

print pos;

while (getline < (repo "/file2.txt"))  {
     if(pos==$1){
        lineno=$3;
	break;
     }
}

close(repo "/file2.txt")

print lineno;

if(flag){
score=scores[found[1]]

for(i in scores){
  if(scores[i]==score)
    newscores[i]=scores[i];
}
	min=found[1];	

   for(i in newscores){
      if(newscores[i]== score)
       {
         if(int(i)<int(min))
            min=i
         if(int(i)>int(max))
            max=i
       }
  }     

	gap=int(max)-int(min)+1;
	
	for(x=int(min); x<=int(max); x++){
   		sum+=x;
   		
	}
	if(gap==1){
	   c=min;

	}
	else{
           c=int(0.5+sum/gap);
    	}
    	
   split("", scores) 
   split("", newscores) 
} 

while (getline < (repo "/3rdfile.txt")) {
  
 
  last=$1;
}

m=c/last*100;

print repo,found[1]>"test.txt"

}
In the above code, I try to implement what is described above, but the "print lineno" line is not working. I added an extra while loop that reads file2.txt a second time, to extract the exact element that belongs to the position obtained earlier (i.e. apart from returning the position, it should return the corresponding 3rd field). However, it doesn't return anything.

Please advise. Appreciate it a lot.


Many thanks.
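
A likely reason the second loop prints nothing: the first `while (getline < file)` loop has already read file2.txt to the end, and awk keeps that file open, so the next `getline` from the same name returns 0 straight away. Calling `close()` between the two loops makes awk reopen the file from the start. A minimal sketch, assuming a throwaway copy of the sample file2.txt in a temp directory (the `reread` helper and the temp path are made up for illustration):

```shell
# Hypothetical sample data, copied from the second file in the post.
tmp=$(mktemp -d)
printf '1 0.345 24\n2 0.231 1\n3 0.220 24\n4 0.1090 12\n' > "$tmp/file2.txt"

reread() {
  # $1 = 1 to close the file between the two read loops, 0 to leave it open
  awk -v f="$tmp/file2.txt" -v do_close="$1" 'BEGIN {
    # First pass: read to end of file.
    while ((getline line < f) > 0) n1++
    if (do_close) close(f)
    # Second pass: only sees data if the file was closed (and thus reopened).
    while ((getline line < f) > 0) n2++
    print n1 + 0, n2 + 0
  }'
}

reread 0   # prints: 4 0
reread 1   # prints: 4 4
```

The explicit `> 0` test also avoids an endless loop when the file cannot be opened, since `getline` returns -1 in that case.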
#2  07-21-2008  radoulov (Forum Staff)
Could you post the desired result given your example data files?
#3  07-21-2008  ahjiefreak
Hi,

The desired result in test.txt would be:

foo, 1,2,4

In this case we don't consider whether the same score also appears on other lines, so it only checks the minimum position. The 2,4 refers to the line "24 2 4" in the third file.

Basically, I'm executing ./awkscript file1.txt.

Please advise. Thanks.
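
The lookup described in this thread (smallest position per file 1 line, then Poll/NotPoll for the matching element) can be sketched in one awk pass over the three files. This is only a sketch under assumptions: the file names file1, file2, file3 in a temp directory and the `lookup` helper are made up, the comma-only output format is a guess at the "foo, 1,2,4" line above, and lines of file 1 with no element in file 2 are simply skipped.

```shell
# Hypothetical copies of the three sample files from the thread.
tmp=$(mktemp -d)
printf 'foo 12,23,24\nbla 11,34\ncar 35\n' > "$tmp/file1"
printf '1 0.345 24\n2 0.231 1\n3 0.220 24\n4 0.1090 12\n' > "$tmp/file2"
printf 'Line Poll NotPoll\n10 3 1\n12 1 2\n23 3 1\n24 2 4\n' > "$tmp/file3"

lookup() {
  awk -v OFS=, '
    # file2: remember the smallest position seen for each element (3rd field)
    FILENAME == ARGV[1] {
      if (!($3 in best) || $1 + 0 < best[$3]) best[$3] = $1 + 0
      next
    }
    # file3: remember Poll and NotPoll per element, skipping the header line
    FILENAME == ARGV[2] {
      if (FNR > 1) poll[$1] = $2 OFS $3
      next
    }
    # file1: among the comma-separated elements, pick the one with the smallest position
    {
      n = split($2, vals, ",")
      item = ""
      for (i = 1; i <= n; i++)
        if ((vals[i] in best) && (item == "" || best[vals[i]] < best[item]))
          item = vals[i]
      if (item != "" && (item in poll))
        print $1, best[item], poll[item]
    }
  ' "$tmp/file2" "$tmp/file3" "$tmp/file1"
}

lookup   # prints: foo,1,2,4
```

Loading file 2 and file 3 into arrays first means file 1 is scanned once and no getline loops or close() calls are needed.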
#4  07-22-2008  radoulov (Forum Staff)
Something like this?

Code:
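awk 'NR == FNR {
  # file2: track the smallest position and the item (last field) it belongs to
  if (!(item && pos)) {
    item = $NF
    pos = $1
  }
  if ($1 < pos) {
    pos = $1
    item = $NF
  }
  next
}
NR > FNR && $1 == item {
  # file3: save Poll and NotPoll for that item
  poll = $2 OFS $3
  next
}
last {
  # file1: print the first line whose list contains the item
  n = split($2, t, ",")
  for (i = 1; i <= n; i++)   # restart at 1 for every line
    if (t[i] == item) {
      print $1, pos, poll
      exit
    }
}' OFS=, file2 file3 last=1 file1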
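
The `last=1` between the file operands in the command above works because awk processes `var=value` operands in order, at the point where it would otherwise open the next file. So `last` is still unset while file2 and file3 are read and becomes 1 just before file1. A small sketch of that behaviour, assuming two throwaway files (the names `one`, `two`, the variable `flag` and the `show` helper are made up for illustration):

```shell
# Two hypothetical one-line files.
tmp=$(mktemp -d)
printf 'a\n' > "$tmp/one"
printf 'b\n' > "$tmp/two"

show() {
  # flag=1 is an operand, so it is assigned only after the first file is done
  awk '{ print $0, (flag ? "flag set" : "flag unset") }' "$tmp/one" flag=1 "$tmp/two"
}

show
# prints:
# a flag unset
# b flag set
```

The same ordering is why `OFS=,` is placed before file2 in the answer: it has to be in effect before the Poll and NotPoll fields are joined.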