Delete records within a file upon a condition


 
# 1  
Old 02-28-2013

Hi Friends,

I have the following file,

Code:
cat input

chr1 1000 2000
chr1 600 699
chr1 701 1000
chr1 600 1710
chr2 900 1800

Now, I would like to see the difference of

Code:
Record1.Col2 - Record2.Col2
Record1.Col2 - Record2.Col3
Record1.Col3 - Record2.Col2
Record1.Col3 - Record2.Col3

So each record's col2 and col3 values are subtracted from the col2 and col3 values of the other records in the file whose first column matches.
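As a rough illustration of the four differences for a single pair of records, here is a small sketch. The function name `pair_diffs` and its argument order (Record1.Col2, Record1.Col3, Record2.Col2, Record2.Col3) are made up for this example, and it assumes the sign of each difference is ignored.

```shell
# Hypothetical helper: print the four absolute differences for one pair.
# Arguments: Record1.Col2 Record1.Col3 Record2.Col2 Record2.Col3
pair_diffs() {
    awk -v a2="$1" -v a3="$2" -v b2="$3" -v b3="$4" '
    function abs(x) { return x < 0 ? -x : x }
    BEGIN {
        # Same order as the list: Col2-Col2, Col2-Col3, Col3-Col2, Col3-Col3
        print abs(a2 - b2), abs(a2 - b3), abs(a3 - b2), abs(a3 - b3)
    }'
}
```

For example, `pair_diffs 1000 2000 701 1000` prints `299 0 1299 1000`.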

If any of these differences is less than 300 (ignoring the sign), remove the matched records.

Now, my output will be

Code:
cat output
chr1 600 699
chr2 900 1800

Script Flowchart:

Each record against each record will give me

Code:
chr1 1000 2000 chr1 600 699
chr1 1000 2000 chr1 701 1000
chr1 1000 2000 chr1 600 1710
chr1 1000 2000 chr2 900 1800
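The pairing shown above (the first record next to each later record) could be produced with something like the following sketch. The function name `pair_with_first` is made up here, and it only pairs against the first record of the file, which is what the listing shows.

```shell
# Hypothetical helper: pair the first record of a file with every later record.
pair_with_first() {
    awk '
    NR == 1 { first = $0; next }   # remember the first record, print nothing yet
            { print first, $0 }    # first record followed by the current record
    ' "$1"
}
```

Called as `pair_with_first input`, each output line has six columns, so columns 1 to 3 belong to the first record and columns 4 to 6 to the record it is paired with.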

Now, if we do the subtraction between
Code:
$2-$5, $2-$6, $3-$5 and $3-$6

after matching on column1 and column4, we get

Code:
chr1 1000 2000 chr1 600(400&1400) 699(301&1301) - This one will qualify, because all the values are greater than 300.
chr1 1000 2000 chr1 701(299&1299) 1000(0&1000) - This record should be deleted, because some of the values are less than 300.
chr1 1000 2000 chr1 600(400&1400) 1710(-710&290) - Same as above: it should be deleted, because 290 is less than 300. You can ignore the negative sign while calculating.
chr1 1000 2000 chr2 900 1800 - This one will remain because the col1 and col4 don't match.
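The decision for each six-column pair line could be sketched roughly as below. The function name `classify_pairs` and the `limit` variable are made up for this example. It assumes a difference of exactly 300 also counts as a match, which the description above does not settle.

```shell
# Hypothetical helper: append "keep" or "delete" to each six-column pair line.
classify_pairs() {
    awk -v limit=300 '
    function abs(x) { return x < 0 ? -x : x }
    $1 != $4 { print $0, "keep"; next }   # column 1 and column 4 differ
    {
        # A pair matches when any of the four absolute differences is <= limit
        near = (abs($2 - $5) <= limit || abs($2 - $6) <= limit ||
                abs($3 - $5) <= limit || abs($3 - $6) <= limit)
        print $0, (near ? "delete" : "keep")
    }' "$1"
}
```

Run on the four pair lines from the flowchart, this marks the first and last pairs as keep and the two middle pairs as delete.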

If two records match, I would like to delete both records, not just one.
# 2  
Old 02-28-2013
Code:
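awk ' NR == 1 {
                # First record: store it as the reference; it is never printed
                c1 = $2
                c2 = $3
                p  = $1
                next

} p == $1 {
                # Same column 1 as the reference: compute the four differences
                d1 = c1 - $2
                d2 = c1 - $3

                d3 = c2 - $2
                d4 = c2 - $3

                # Ignore the sign (absolute value)
                d1 = (d1 < 0)?d1*-1:d1
                d2 = (d2 < 0)?d2*-1:d2
                d3 = (d3 < 0)?d3*-1:d3
                d4 = (d4 < 0)?d4*-1:d4

                # Keep the record only if every difference is greater than 300
                if ( d1 > 300 && d2 > 300 && d3 > 300 && d4 > 300 )
                        print
} p != $1 {
                # New column 1 value: this record becomes the reference and is printed
                c1 = $2
                c2 = $3
                p  = $1
                print
} ' input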
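
Note that the script above never prints the very first record and always prints the first record of each later group, whether or not anything matches it. If the intent is that a reference record is removed only when some later record in its group matches it, a variant along these lines might work. It is only a sketch under that assumption: the function name `filter_by_first_record` and the `limit` variable are made up, each record is compared only with the first record of its column 1 group, and a difference of exactly 300 counts as a match, like the `> 300` test above.

```shell
# Hypothetical variant: compare each record with the first record of its
# column 1 group and drop both records of every matching pair.
filter_by_first_record() {
    awk -v limit=300 '
    function abs(x) { return x < 0 ? -x : x }
    {
        line[NR] = $0
        if (!($1 in ref_row)) {      # first record of this column 1 group
            ref_row[$1] = NR
            c1[$1] = $2
            c2[$1] = $3
            next
        }
        # Any absolute difference <= limit marks this record and its reference
        if (abs(c1[$1] - $2) <= limit || abs(c1[$1] - $3) <= limit ||
            abs(c2[$1] - $2) <= limit || abs(c2[$1] - $3) <= limit) {
            drop[NR] = 1
            drop[ref_row[$1]] = 1
        }
    }
    END {
        # Print the surviving records in their original order
        for (i = 1; i <= NR; i++)
            if (!(i in drop)) print line[i]
    }' "$1"
}
```

Called as `filter_by_first_record input` on the sample file, this should give the same two lines as the expected output in post #1. It holds the whole file in memory, so it may not suit very large inputs.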

This User Gave Thanks to Yoda For This Post: