

Subtract column based on some criteria


 
# 1  
Old 03-18-2010

Please advise if you know how to solve this.

I have a tab-delimited input file in which records are separated by lines of -----:
Code:
-----
ABC      4935402        4936680          Pattern=Cheers07080.1
ABC      4932216        4932368          Pattern=Cheers07080.1
ABC      4931932        4932122          Pattern=Cheers07080.1
-----
ABC      4675209        4676057          Pattern=Cheers06520.1
ABC      4676269        4676713          Pattern=Cheers06520.1
ABC      4682346        4682510          Pattern=Cheers06520.1
ABC      4682606        4682796          Pattern=Cheers06520.1
-----
ABC      48341587       48344548         Pattern=Cheers45590.1
-----
ABC      34297519       34298743         Pattern=Cheers31410.1
ABC      34298957       34299678         Pattern=Cheers31410.1
-----

The required output file is:
Code:
-----
Xyz    (4935402-4932368)-1    Pattern=Cheers07080.1
Xyz    (4932216-4932122)-1    Pattern=Cheers07080.1
-----
Xyz    (4676269-4676057)-1    Pattern=Cheers06520.1
Xyz    (4682346-4676713)-1    Pattern=Cheers06520.1
Xyz    (4682606-4682510)-1    Pattern=Cheers06520.1
-----
Xyz    0    Pattern=Cheers45590.1
-----
Xyz    (34298957-34298743)-1    Pattern=Cheers31410.1
-----

The output is based on these criteria:

Within a record, if column 2 of row 1 is greater than column 2 of row 2, subtract column 3 of row 2 from column 2 of row 1, and continue in the same way for each following pair of rows. If column 2 of row 1 is less than column 2 of row 2, subtract column 3 of row 1 from column 2 of row 2, and so on.
If a record has only one row, print 'Xyz 0' followed by the value of column 4.

Expressions such as (4935402-4932368)-1 are written out only for clarity; the output should contain their computed values.

Thanks in advance.

Last edited by Franklin52; 03-18-2010 at 06:14 AM.. Reason: Please use code tags!
# 2  
Old 03-18-2010
Something like this?
Code:
$ cat file
-----
ABC      4935402        4936680          Pattern=Cheers07080.1
ABC      4932216        4932368          Pattern=Cheers07080.1
ABC      4931932        4932122          Pattern=Cheers07080.1
-----
ABC      4675209        4676057          Pattern=Cheers06520.1
ABC      4676269        4676713          Pattern=Cheers06520.1
ABC      4682346        4682510          Pattern=Cheers06520.1
ABC      4682606        4682796          Pattern=Cheers06520.1
-----
ABC      48341587       48344548         Pattern=Cheers45590.1
-----
ABC      34297519       34298743         Pattern=Cheers31410.1
ABC      34298957       34299678         Pattern=Cheers31410.1
-----
$
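$ awk '/-----/{
  if(f){                          # the previous record had only one row
    print "Xyz\t0"  "\t" s
  }
  print; getline                  # print the separator, then read the first row of the next record
  a=$2; s=$NF; f=1                # a = column 2 of that row, s = its last column
  next
}
/ABC/{
  print "Xyz\t" a-$3-1 "\t" $NF   # previous column 2 - current column 3 - 1
  a=$2; f=0
}' file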
-----
Xyz     3033    Pattern=Cheers07080.1
Xyz     93      Pattern=Cheers07080.1
-----
Xyz     -1505   Pattern=Cheers06520.1
Xyz     -6242   Pattern=Cheers06520.1
Xyz     -451    Pattern=Cheers06520.1
-----
Xyz     0       Pattern=Cheers45590.1
-----
Xyz     -2160   Pattern=Cheers31410.1
-----
$

# 3  
Old 03-18-2010
Thanks for your response, Franklin. I'll take care of the text formatting. However, the output contains negative values, whereas the smaller number should be subtracted from the larger one each time.
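
The negative values come from the order of subtraction: the script in post #2 always computes the previous row's column 2 minus the current row's column 3, which only fits records whose rows run in descending order. A small sketch of both orders on the first two rows of the second record (the function name `show_both_orders` and the variable names are made up for illustration):

```shell
show_both_orders() {
    # Columns 2 and 3 of the first two rows of the second record in post #1.
    awk 'BEGIN {
        s1 = 4675209; e1 = 4676057   # row 1
        s2 = 4676269; e2 = 4676713   # row 2
        print "row1.col2 - row2.col3 - 1 = " (s1 - e2 - 1)
        print "row2.col2 - row1.col3 - 1 = " (s2 - e1 - 1)
    }'
}
show_both_orders
```

The first line reproduces the -1505 from the output above; the second gives 211, the positive value expected for an ascending pair.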
# 4  
Old 03-18-2010
Can you post the desired output from the given input file?
# 5  
Old 03-19-2010
The desired output file is:
Code:
-----
Xyz    3033    Pattern=Cheers07080.1
Xyz    93        Pattern=Cheers07080.1
-----
Xyz    211       Pattern=Cheers06520.1
Xyz    5632    Pattern=Cheers06520.1
Xyz    95        Pattern=Cheers06520.1
-----
Xyz    0          Pattern=Cheers45590.1
-----
Xyz    213      Pattern=Cheers31410.1
-----

The records differ in that the numbers in column 2 are in either descending or ascending order, and the subtraction varies accordingly.

Thanks.
# 6  
Old 03-19-2010
If I understand your question correctly, these should be the criteria:

Quote:
If column2(row1) > column2(row2) then print row1(column2)-row2(column3)-1
If column2(row1) < column2(row2) then print row2(column2)-row1(column3)-1
In that case you can't get the desired output you posted.

This command uses the criteria above:
Code:
awk '/-----/{
  if(f){
    print "Xyz\t0"  "\t" s
  }
  print; getline
  a=$2; b=$3; s=$NF; f=1            # a = column 2, b = column 3 of the first row of the record
  next
}
/ABC/{
  if(a>$2){                         # previous column 2 > current column 2
    print "Xyz\t" a-$3-1 "\t" $NF   # print previous column 2 - current column 3 - 1
  }
  else {
    print "Xyz\t" $2-b-1 "\t" $NF   # else print current column 2 - b - 1 (b stays at the first row)
  }

  a=$2; f=0                         # only a is updated here; b is not
}' file

and the output is:

Code:
-----
Xyz     3033    Pattern=Cheers07080.1
Xyz     93      Pattern=Cheers07080.1
-----
Xyz     211     Pattern=Cheers06520.1
Xyz     6288    Pattern=Cheers06520.1
Xyz     6548    Pattern=Cheers06520.1
-----
Xyz     0       Pattern=Cheers45590.1
-----
Xyz     213     Pattern=Cheers31410.1
-----

Regards
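
For comparison, here is a minimal sketch of the same idea in which column 3 of the previous row is remembered as well, so that each row is compared with the row directly above it rather than with the first row of the record. This is one reading of the criteria, not a confirmed solution. The names `record_gaps`, `prev_start`, `prev_end` and `tag` are made up for illustration, and the input is assumed to end with a ----- line.

```shell
record_gaps() {
    awk '
    /^ *-----/ {
        # A record that held exactly one row gets a zero line.
        if (rows == 1) print "Xyz\t0\t" tag
        print "-----"
        rows = 0
        next
    }
    NF >= 4 {
        if (rows > 0) {
            # Column 2 of the higher row minus column 3 of the lower row, minus 1.
            if ($2 + 0 > prev_start) gap = $2 - prev_end - 1
            else                     gap = prev_start - $3 - 1
            print "Xyz\t" gap "\t" $NF
        }
        # Remember both columns of this row for the next comparison.
        prev_start = $2 + 0; prev_end = $3 + 0; tag = $NF
        rows++
    }'
}

# Sample input from post #1 (awk splits on spaces and tabs alike).
result=$(record_gaps <<'EOF'
-----
ABC 4935402 4936680 Pattern=Cheers07080.1
ABC 4932216 4932368 Pattern=Cheers07080.1
ABC 4931932 4932122 Pattern=Cheers07080.1
-----
ABC 4675209 4676057 Pattern=Cheers06520.1
ABC 4676269 4676713 Pattern=Cheers06520.1
ABC 4682346 4682510 Pattern=Cheers06520.1
ABC 4682606 4682796 Pattern=Cheers06520.1
-----
ABC 48341587 48344548 Pattern=Cheers45590.1
-----
ABC 34297519 34298743 Pattern=Cheers31410.1
ABC 34298957 34299678 Pattern=Cheers31410.1
-----
EOF
)
printf '%s\n' "$result"
```

On the sample input this prints 3033 and 93 for the first record, 211, 5632 and 95 for the second, 0 for the third and 213 for the fourth. The last value of the second record is (4682606-4682510)-1.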
# 7  
Old 03-19-2010
I have tried to simplify my problem. Please see if you can help.
Now the columns contain only increasing numbers.

INPUT FILE
Code:
-----
ABC      4675209        4676057          Pattern01
ABC      4676269        4676713          Pattern01
ABC      4682346        4682510          Pattern01
ABC      4682606        4682796          Pattern01
-----
ABC      48341587       48344548         Pattern09
-----
ABC      34297519       34298743         Pattern10
ABC      34298957       34299678         Pattern10
-----

OUTPUT FILE
Code:
-----
Xyz    212 [4676269 - 4676057]	  Pattern01
Xyz    5633 [4682346 - 4676713]	  Pattern01
Xyz    96 [4682606 - 4682510]      Pattern01
-----
Xyz    0           Pattern09
-----
Xyz    214 [34298957 - 34298743]   Pattern10
-----

The values in [ ] are shown only for explanation.

Thanks in advance.
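
A possible sketch for this simplified version, assuming the rows in every record are ascending and the wanted value is simply column 2 of the current row minus column 3 of the previous row (without the extra -1, as in the bracketed examples). The names `simple_gaps`, `prev_end` and `tag` are made up for illustration, and the input is assumed to end with a ----- line.

```shell
simple_gaps() {
    awk '
    /^ *-----/ {
        # A record that held exactly one row gets a zero line.
        if (rows == 1) print "Xyz\t0\t" tag
        print "-----"
        rows = 0
        next
    }
    NF >= 4 {
        # From the second row on: current column 2 minus previous column 3.
        if (rows > 0) print "Xyz\t" ($2 - prev_end) "\t" $NF
        prev_end = $3; tag = $NF
        rows++
    }'
}

result=$(simple_gaps <<'EOF'
-----
ABC 4675209 4676057 Pattern01
ABC 4676269 4676713 Pattern01
ABC 4682346 4682510 Pattern01
ABC 4682606 4682796 Pattern01
-----
ABC 48341587 48344548 Pattern09
-----
ABC 34297519 34298743 Pattern10
ABC 34298957 34299678 Pattern10
-----
EOF
)
printf '%s\n' "$result"
```

With the input above this prints 212, 5633 and 96 for Pattern01, 0 for Pattern09 and 214 for Pattern10.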
