Delete complete row according to condition


 
# 1  
Old 08-16-2014

Gents,

Please can you help me.

The values in columns 4-24 are sometimes duplicated. I would like to delete the first occurrence and keep only the last. The file is not sorted, and I can't sort it because from column 75 to the end the file increases by time.

I have a file like this:
Code:
S  21301.00  21481.00  2               0       915802.1 1846679.3  48.1227 23141
S  21083.00  21397.00  1               0       916712.0 1840909.0  55.7227 42035
S  21081.00  21619.00  2               0       921533.2 1843642.2  72.2227 52203
S  21299.00  22041.00  2               0       927954.1 1853627.3  96.7227 65151
S  21309.00  21861.00  2               0       923928.7 1851604.8  77.3227  2105
S  21313.00  21353.00  2               0       912876.9 1845343.2  36.2227 30120
S  21095.00  21469.00  4               0       918111.1 1842071.9  55.0227 44452
S  21309.00  21861.00  2               0       923411.6 1851708.4  79.2227    40
S  21115.00  21869.00  1               0       926530.0 1847499.1  82.3227    58
S  21321.00  21845.00  1               0       923431.7 1851669.1  79.1227   135
S  21115.00  21871.00  1               0       926560.4 1847521.8  83.3227   153
S  21113.00  21871.00  1               0       926596.1 1847485.5  83.3227   251
S  21115.00  21871.00  1               0       923473.9 1851689.8  77.9227   309
S  21113.00  21873.00  1               0       926640.2 1847501.4  83.2227   403
S  21323.00  21847.00  1               0       923455.8 1851729.7  78.0227   439

and I would like to delete the following lines:

Code:
S  21309.00  21861.00  2               0       923928.7 1851604.8  77.3227  2105
S  21115.00  21871.00  1               0       926560.4 1847521.8  83.3227   153

So, my output file should be like this.

Code:
S  21301.00  21481.00  2               0       915802.1 1846679.3  48.1227 23141
S  21083.00  21397.00  1               0       916712.0 1840909.0  55.7227 42035
S  21081.00  21619.00  2               0       921533.2 1843642.2  72.2227 52203
S  21299.00  22041.00  2               0       927954.1 1853627.3  96.7227 65151
S  21313.00  21353.00  2               0       912876.9 1845343.2  36.2227 30120
S  21095.00  21469.00  4               0       918111.1 1842071.9  55.0227 44452
S  21309.00  21861.00  2               0       923411.6 1851708.4  79.2227    40
S  21115.00  21869.00  1               0       926530.0 1847499.1  82.3227    58
S  21321.00  21845.00  1               0       923431.7 1851669.1  79.1227   135
S  21113.00  21871.00  1               0       926596.1 1847485.5  83.3227   251
S  21115.00  21871.00  1               0       923473.9 1851689.8  77.9227   309
S  21113.00  21873.00  1               0       926640.2 1847501.4  83.2227   403
S  21323.00  21847.00  1               0       923455.8 1851729.7  78.0227   439

Thanks in advance.
# 2  
Old 08-16-2014
How much is unique in the records you want to delete? As soon as you have a unique string, you can use grep to remove them.
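As a rough sketch of the grep idea, assuming the coordinate pairs of the two unwanted lines really are unique in the file: `-F` treats each pattern as a fixed string, `-v` inverts the match, and each `-e` adds one pattern. The file names `file` and `filtered` are only placeholders, and the sample is cut down to four lines from the question.

```shell
# Four lines taken from the question: two to delete, two to keep.
cat > file <<'EOF'
S  21309.00  21861.00  2               0       923928.7 1851604.8  77.3227  2105
S  21309.00  21861.00  2               0       923411.6 1851708.4  79.2227    40
S  21115.00  21871.00  1               0       926560.4 1847521.8  83.3227   153
S  21115.00  21871.00  1               0       923473.9 1851689.8  77.9227   309
EOF

# Drop every line that contains one of the two fixed strings.
grep -F -v -e '923928.7 1851604.8' -e '926560.4 1847521.8' file > filtered

cat filtered
```

This only helps when the lines to drop are known in advance. It does not by itself find "all but the last occurrence" of a key.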

Have a go and let us know how you get on so we can assist more if needed.



Robin
# 3  
Old 08-16-2014
Your directions aren't clear as to what is supposed to happen if the 21 characters starting in column 4 appear in more than two lines. Assuming you just want to keep the last one, this seems to do what you want:
Code:
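awk '
# First pass (FNR == NR): count how often each 21-character key occurs.
FNR == NR {
	c[substr($0, 4, 21)]++
	next
}
# Second pass: print a line only when it is the last remaining occurrence.
c[substr($0, 4, 21)]-- == 1' file file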

If you want to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk, /usr/xpg6/bin/awk, or nawk.
This User Gave Thanks to Don Cragun For This Post:
# 4  
Old 08-17-2014
Don Cragun,
I tried it like this:
Code:
awk 'FNR == NR {c[substr($0, 4, 21)]++; next} c[substr($0, 4, 21)]-- == 1' file newfile

But it didn't give me any output.

Last edited by jiam912; 08-17-2014 at 12:28 PM..
# 5  
Old 08-17-2014
You have to supply the original file twice: the proposal needs to run through it once to count the repetitions, and once more to print lines and skip the repeated ones. If you want or need to, redirect stdout to a new file.
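A minimal sketch of that invocation, using a made-up five-line sample in which only the key in character positions 4-24 matters (the trailing words `first`, `last`, and `only` are just labels for illustration):

```shell
# Hypothetical sample: two keys appear twice, one key appears once.
cat > file <<'EOF'
S  21309.00  21861.00  2   first
S  21115.00  21871.00  1   first
S  21309.00  21861.00  2   last
S  21113.00  21871.00  1   only
S  21115.00  21871.00  1   last
EOF

# The same input file is named twice; stdout is redirected to newfile.
# Pass 1 counts each key, pass 2 prints a line only when its count has
# dropped to 1, which is the last occurrence of that key.
awk 'FNR == NR {c[substr($0, 4, 21)]++; next}
     c[substr($0, 4, 21)]-- == 1' file file > newfile

cat newfile
```

With this sample, `newfile` should contain the three lines labelled `last`, `only`, and `last`, in their original order.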
This User Gave Thanks to RudiC For This Post:
# 6  
Old 08-17-2014
Slight variation:
Code:
awk '{i=substr($0,4,21)} NR==FNR{P[i]=FNR; next} P[i]==FNR' file file

or perhaps:
Code:
awk '{i=$2 FS $3} NR==FNR{P[i]=FNR; next} P[i]==FNR' file file

or
Code:
awk '{i=$2 FS $3} P[i]==FNR; NR==FNR{P[i]=FNR}' file file

This User Gave Thanks to Scrutinizer For This Post:
# 7  
Old 08-17-2014
@Scrutinizer: nice approach! But shouldn't you include $4 as well, since the requestor said character positions 4-24 are the key?
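To illustrate the difference, here is a sketch of the field-based variant with `$4` added to the key, next to the two-field version. The three-line sample and the output names `by_two` and `by_three` are made up for the comparison; in the data posted in the thread the duplicated rows have the same `$4`, so both keys would select the same lines there.

```shell
# Hypothetical sample: same $2 and $3 everywhere, but $4 differs on line 2.
cat > file <<'EOF'
S  21115.00  21871.00  1   a
S  21115.00  21871.00  2   b
S  21115.00  21871.00  1   c
EOF

# Key on $2 and $3 only: all three lines share one key, so only the last stays.
awk '{i=$2 FS $3} NR==FNR{P[i]=FNR; next} P[i]==FNR' file file > by_two

# Key on $2, $3 and $4: line 2 has its own key and is kept as well.
awk '{i=$2 FS $3 FS $4} NR==FNR{P[i]=FNR; next} P[i]==FNR' file file > by_three

echo "key = \$2 \$3:";    cat by_two
echo "key = \$2 \$3 \$4:"; cat by_three
```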