filtering one file based on results from other- AGAIN


 
# 8  
Old 12-15-2008
Do you want the previous output plus the total, or only the total?

This will give you only the total:

Code:
awk 'END {
  # print every key that matched more than once, with its running total
  while (++i <= c)
    if (u[o[i]] > 1)
      printf "%s %.2f\n", o[i], t[o[i]]
  }
NR == FNR {
  # first file: remember the cutoff value for each key
  _[$1] = $2
  next
  }
$1 in _ && $4 <= _[$1] {
  # second file: keep the keys in order of first appearance and total the last field
  o[u[$1]++ ? c : ++c] = $1
  t[$1] += $NF
}' file1 file2
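
In case the two-file idiom above is unfamiliar: NR == FNR is true only while awk reads the first file, so that block loads file1 into an array and the remaining rules run only against file2. A minimal stand-alone sketch of the same pattern (using the same file1 and file2 as above):

Code:
# pass 1 (file1): store a limit per key; pass 2 (file2): print the rows within that limit
awk 'NR == FNR { max[$1] = $2; next }
     $1 in max && $4 <= max[$1]' file1 file2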


Last edited by radoulov; 12-15-2008 at 09:26 AM..
# 9  
Old 12-15-2008
Rephrase

Actually, let me rephrase my question. I hope I am not annoying you with this; I put my question wrongly last time.

What I actually want, for each value in the first column, is the difference between the first and the last value in the fourth column. It does not have to be in the same script you originally wrote; it can be a second script that I run after your earlier script that did the magic for me. That script outputs the result as follows:

DATA_444_0 299659.88 2686034.50 -5222.89
DATA_444_0 299646.31 2686026.00 -5226.55
DATA_444_0 299634.50 2686018.50 -5229.11
DATA_444_0 299622.41 2686010.75 -5230.46
DATA_451_0 299369.53 2684876.00 -5191.90
DATA_451_0 299357.28 2684869.25 -5194.87
DATA_451_0 299332.78 2684855.50 -5197.94


I want the difference between the last and the first value in the last (4th) column, grouped by the first column. So the second script (if you can write it as a separate one) should give me the following result:

DATA_444_0 -7.57 # (-5230.46 - (-5222.89))
DATA_451_0 -6.04 # (-5197.94 - (-5191.90))

............
............


That is, each time the value in the first column changes, I want the difference between the last and the first record's 4th-column value for that group, output as above.
I hope I have described the question appropriately. I will remember you in my prayers for helping me with this.
# 10  
Old 12-15-2008
No problem.
If you want to run two separate scripts:

Code:
awk 'END {
  # print the last group: key, last value minus first value
  print k, l - f
  }
!_[$1]++ {
  # a new key starts: print the previous group, if any
  if (k && l) print k, l - f
  k = $1; f = $NF
  }
{ l = $NF }'

So, given your sample data, it would be something like this.

- the first one:

Code:
$ awk 'END {
  for (i = 1; i <= c; i++) {
    split(r[i], t)
    if (u[t[1]] > 1)
      print r[i]
    }
  }
NR == FNR {
  _[$1] = $2
  next
  }
$1 in _ && $4 <= _[$1] {
  r[++c] = $0
  u[$1]++
}' file1 file2
DATA_444_0 299659.88 2686034.50 -5222.89
DATA_444_0 299646.31 2686026.00 -5226.55
DATA_444_0 299634.50 2686018.50 -5229.11
DATA_444_0 299622.41 2686010.75 -5230.46
DATA_451_0 299369.53 2684876.00 -5191.90
DATA_451_0 299357.28 2684869.25 -5194.87
DATA_451_0 299332.78 2684855.50 -5197.94

- both:

Code:
$ awk 'END {
  for (i = 1; i <= c; i++) {
    split(r[i], t)
    if (u[t[1]] > 1)
      print r[i]
    }
  }
NR == FNR {
  _[$1] = $2
  next
  }
$1 in _ && $4 <= _[$1] {
  r[++c] = $0
  u[$1]++
}' file1 file2 | awk 'END {
  print k, l - f
  }
!_[$1]++ {
  if (l) print k, l - f
  k = $1; f = $NF
  }
{ l = $NF }'
DATA_444_0 -7.57
DATA_451_0 -6.04

And of course, you can put all in one script.
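
Something like this (untested, and assuming the records for each key are grouped together, as in your sample) would combine the two steps into a single awk invocation:

Code:
awk 'NR == FNR { _[$1] = $2; next }              # pass 1: limits from file1
$1 in _ && $4 <= _[$1] {                         # pass 2: matching records from file2
  if (!u[$1]++) { o[++c] = $1; f[$1] = $NF }     # first match: remember key order and first value
  l[$1] = $NF                                    # always remember the latest value
  }
END {
  for (i = 1; i <= c; i++)
    if (u[o[i]] > 1)
      print o[i], l[o[i]] - f[o[i]]
}' file1 file2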

Last edited by radoulov; 12-16-2008 at 09:40 AM.. Reason: corrected
# 11  
Old 12-16-2008
Thank you AWK GURU

Dear RADOULOV,

If awk has a GOD, it is you...
How can I thank you for helping me? I don't have words.
Will you kill me or just ignore me if I ask you to help me with one last formatting request? Please...

The combined code you wrote, i.e. the one that does both jobs in one go, works perfectly. Unfortunately I overlooked something and forgot to ask: the output has to be changed slightly. The first part of your script (when using the combined version) outputs the following result.

DATA_444_0 299659.88 2686034.50 -5222.89
DATA_444_0 299646.31 2686026.00 -5226.55
DATA_444_0 299634.50 2686018.50 -5229.11
DATA_444_0 299622.41 2686010.75 -5230.46 <= 2nd and 3rd col from the last row are required in the final result
DATA_451_0 299369.53 2684876.00 -5191.90
DATA_451_0 299357.28 2684869.25 -5194.87
DATA_451_0 299332.78 2684855.50 -5197.94 <= 2nd and 3rd col from the last row are required in the final result

The second part of your code produces this result, which is perfect (based on my previous request):

DATA_444_0 -7.57 # (-5230.46 - (-5222.89))
DATA_451_0 -6.04 # (-5197.94 - (-5191.90))

However, I want to include the last row's 2nd and 3rd columns in this final output for every unique value in the first column. So instead of the above, I would want the result to look like this:

DATA_444_0 299622.41 2686010.75 -7.57 # (2nd and 3rd columns come from the last row of each unique value in the first column)
DATA_451_0 299332.78 2684855.50 -6.04 # (2nd and 3rd columns come from the last row of each unique value in the first column)

Note that the 2nd and 3rd columns in the above result come from the last row of each unique value in the first column. Also, it would be great if I could have the final result formatted, though that is not necessary as I can do it after I get the above result... that is all the awk I know. I hope you will help me as you have done before, and I will pray for a healthy, safe and prosperous life for you.

A complete newbie to Linux / Unix...
# 12  
Old 12-16-2008
If I'm not missing something:

Code:
awk 'END {
  # last group: split its final record, print key, col 2, col 3 and the last-first difference
  split(l, t)
  print k, t[2], t[3], t[4] - f
  }
!_[$1]++ {
  # a new key starts: report the previous group, if any
  if (l) {
    split(l, t)
    print k, t[2], t[3], t[4] - f
    }
  k = $1; f = $NF
  }
{ l = $0 }'

So the result would be:

Code:
$ awk 'END {
  for (i = 1; i <= c; i++) {
    split(r[i], t)
    if (u[t[1]] > 1)
      print r[i]
    }
  }
NR == FNR {
  _[$1] = $2
  next
  }
$1 in _ && $4 <= _[$1] {
  r[++c] = $0
  u[$1]++
}' file1 file2 | awk 'END {
  split(l, t)
  print k, t[2], t[3], t[4] - f
  }
!_[$1]++ {
  if (l) {
    split(l, t)
    print k, t[2], t[3], t[4] - f
    }
  k = $1; f = $NF
  }
{ l = $0 }'
DATA_444_0 299622.41 2686010.75 -7.57
DATA_451_0 299332.78 2684855.50 -6.04
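
If you do want the final result formatted (you mentioned it is optional), the two print statements in the second awk can be switched to printf; a small sketch of that variant, assuming two decimals are wanted for the numeric columns:

Code:
awk 'END {
  split(l, t)
  printf "%s %.2f %.2f %.2f\n", k, t[2], t[3], t[4] - f
  }
!_[$1]++ {
  if (l) {
    split(l, t)
    printf "%s %.2f %.2f %.2f\n", k, t[2], t[3], t[4] - f
    }
  k = $1; f = $NF
  }
{ l = $0 }'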

# 13  
Old 12-17-2008
Thank you very much

Thanks a lot. Your script does the trick for me. I greatly appreciate your patience in helping me out. Once again, you are a real awk guru and a really nice person. I wish you all the success in life. God bless.
If you don't mind, may I ask where in the world you are located?
# 14  
Old 12-17-2008
Thank you for the nice words, but I'm not an AWK guru.
I know programmers who know and use AWK far better than I do.

Look at the Location above: I'm Bulgarian, but I live and work in Italy.

Regards