Script to delete rows in a file


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Script to delete rows in a file
# 1  
Old 11-20-2013
Script to delete rows in a file

Hi All,

I am new to UNIX . Please help me in writing code to delete all records from the file where all columns after cloumn 5 in file is either 0, #MI or NULL.
Initial 5 columns are string

e.g.
Code:
"alsod" "1FEV2" "wjwroe" " wsse" "hd3" 1 2 34 #Mi
"malasl" "wses" "trwwwe" " wsse" "hd3" 1 2 0 #Mi
"alsod" "1FEV2" "asd" " wsse" "hd3" #Mi #Mi
"alsod" "1FEV2" "asd" " wsse" "hd3" 0 0 0
"alsod" "1FEV2" "asd" " wsse" "hd3" 1 2 3 4 5


expected output is
Code:
"alsod" "1FEV2" "wjwroe" " wsse" "hd3" 1 2 34 #Mi
"malasl" "wses" "trwwwe" " wsse" "hd3" 1 2 0 #Mi
"alsod" "1FEV2" "asd" " wsse" "hd3" 1 2 3 4 5


The file is around 1-2 GB large.
I have written a code but it is taking 45-50 min to execute the script.
Code:
grep -EHv ([1-9]/s) file.txt > file2.txt

can some one please suggest alternate code where we are selectively deleting the records containing 0/#Mi/NULL after column 5

Thanks

Last edited by Franklin52; 11-20-2013 at 03:16 PM.. Reason: Please use code tags per the Forum Rules
# 2  
Old 11-20-2013
I don't think that grep you show is correct. For example the -H on the grep command you show prints the filename per match. I don't understand why you have included it. I ran your example and it does not work. Use awk.
# 3  
Old 11-20-2013
Quote:
Originally Posted by alok2082
Hi All,

I am new to UNIX . Please help me in writing code to delete all records from the file where all columns after cloumn 5 in file is either 0, #MI or NULL.
Initial 5 columns are string

e.g.

"alsod" "1FEV2" "wjwroe" " wsse" "hd3" 1 2 34 #Mi
"malasl" "wses" "trwwwe" " wsse" "hd3" 1 2 0 #Mi
"alsod" "1FEV2" "asd" " wsse" "hd3" #Mi #Mi
"alsod" "1FEV2" "asd" " wsse" "hd3" 0 0 0
"alsod" "1FEV2" "asd" " wsse" "hd3" 1 2 3 4 5


expected output is

"alsod" "1FEV2" "wjwroe" " wsse" "hd3" 1 2 34 #Mi
"malasl" "wses" "trwwwe" " wsse" "hd3" 1 2 0 #Mi
"alsod" "1FEV2" "asd" " wsse" "hd3" 1 2 3 4 5


The file is around 1-2 GB large.
I have written a code but it is taking 45-50 min to execute the script.

grep -EHv ([1-9]/s) file.txt > file2.txt

can some one please suggest alternate code where we are selectively deleting the records containing 0/#Mi/NULL after column 5

Thanks
Try :

Code:
$ cat file
"alsod" "1FEV2" "wjwroe" " wsse" "hd3" 1 2 34 #Mi
"malasl" "wses" "trwwwe" " wsse" "hd3" 1 2 0 #Mi
"alsod" "1FEV2" "asd" " wsse" "hd3" #Mi #Mi
"alsod" "1FEV2" "asd" " wsse" "hd3" 0 0 0
"alsod" "1FEV2" "asd" " wsse" "hd3" 1 2 3 4 5

$ awk '$7~/[0-9]/ && $7 !=0' file
"alsod" "1FEV2" "wjwroe" " wsse" "hd3" 1 2 34 #Mi
"malasl" "wses" "trwwwe" " wsse" "hd3" 1 2 0 #Mi
"alsod" "1FEV2" "asd" " wsse" "hd3" 1 2 3 4 5

# 4  
Old 11-20-2013
Hi,
With sed:
Code:
$ sed '/"[^"]*[1-9][^"]*$/!d' file.txt
"alsod" "1FEV2" "wjwroe" " wsse" "hd3" 1 2 34 #Mi
"malasl" "wses" "trwwwe" " wsse" "hd3" 1 2 0 #Mi
"alsod" "1FEV2" "asd" " wsse" "hd3" 1 2 3 4 5

Regards.
# 5  
Old 11-20-2013
Your requirement says after 5th column, but it looks like it is after the last " character.
Further it looks like a non-zero digit should be reason enough to print.
Then with awk it becomes
Code:
awk '{for (f=NF; f>=1 && $f!~/"/; f--) if ($f~/[1-9]/) {print; next}}' file

like was done in the previous sed solution.
The advantage of awk is, you have more means to modify your search.

---------- Post updated at 12:07 PM ---------- Previous update was at 12:01 PM ----------

While the previous sed could be abbreviated
Code:
sed '/[1-9][^"]*$/!d' file

that is equivalent to
Code:
grep '[1-9][^"]*$' file

This User Gave Thanks to MadeInGermany For This Post:
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. UNIX and Linux Applications

Script to delete few rows from a file and then update header

HJKL1Name00014300010800000418828124201 L201207022012070228XAM 00000000031795404 001372339540000000000000000000000 COOLTV KEYA Zx00 xI-50352202553 00000000 00000000 G000000000000 00000000 ... (10 Replies)
Discussion started by: mirwasim
10 Replies

2. Shell Programming and Scripting

Delete unique rows - optimize script

Hi all, I have the following input - the unique row key is 1st column cat file.txt A response C request C response D request C request C response E request The desired output should be C request (7 Replies)
Discussion started by: varu0612
7 Replies

3. Shell Programming and Scripting

Delete rows from big file

Hi all, I have a big file (about 6 millions rows) and I have to delete same occurrences, stored in a small file (about 9000 rews). I have tried this: while read line do grep -v $line big_file > ok_file.tmp mv ok_file.tmp big_file done < small_file It works, but is very slow. How... (2 Replies)
Discussion started by: Tibbeche
2 Replies

4. Shell Programming and Scripting

Delete rows in text file

Hi I do have a text file with 1000's of lines with 1 row and column with a specific pattern. 1102 1 1 1 1 1234 1 1 1 1 1009 1 1 1 1 1056 1 (3 Replies)
Discussion started by: Lucky Ali
3 Replies

5. UNIX for Advanced & Expert Users

Delete rows from a file...!!

Say i have a file with X rows and Y columns....i see that in some of the rows,some columns are blank (no value set)...i wish to delete such rows....how can it be done? e.g 181766 100 2009-06-04 184443 2009-06-04 10962 151 2009-06-04 161 2009-06-04... (7 Replies)
Discussion started by: ak835
7 Replies

6. Shell Programming and Scripting

delete rows in a file based on the rows of another file

I need to delete rows based on the number of lines in a different file, I have a piece of code with me working but when I merge with my C application, it doesnt work. sed '1,'\"`wc -l < /tmp/fileyyyy`\"'d' /tmp/fileA > /tmp/filexxxx Can anyone give me an alternate solution for the above (2 Replies)
Discussion started by: Muthuraj K
2 Replies

7. Shell Programming and Scripting

[HELP] - Delete rows on a CSV file

Hello to all members, I am very new in unix stuff (shell scripting), but a want to learn a lot. I am a ex windows user but now i am absolutely Linux super user... :D So i am tryng to made a function to do this: I have two csv files only with numbers, the first one a have: 1 2 3 4 5... (6 Replies)
Discussion started by: Sadarrab
6 Replies

8. Shell Programming and Scripting

how to delete duplicate rows in a file

I have a file content like below. "0000000","ABLNCYI","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","","" "0000000","ABLNCYI","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","","" "0000000","ABLNCYI","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","",""... (5 Replies)
Discussion started by: vamshikrishnab
5 Replies

9. Shell Programming and Scripting

How to delete particular rows from a file

Hi I have a file having 1000 rows. Now I would like to remove 10 rows from it. Plz give me the script. Eg: input file like 4 1 4500.0 1 5 1 1.0 30 6 1 1.0 4500 7 1 4.0 730 7 2 500000.0 730 8 1 785460.0 45 8 7 94255.0 30 9 1 31800.0 30 9 4 36000.0 30 10 1 15000.0 30... (5 Replies)
Discussion started by: suresh3566
5 Replies

10. Shell Programming and Scripting

Delete repeated rows from a file

Hi everybody: Could anybody tell me how I can delete repeated rows from a file?, this is, for exemple I have a file like this: 0.490 958.73 281.85 6.67985 0.002481 0.490 954.833 283.991 8.73019 0.002471 0.590 950.504 286.241 6.61451 0.002461 0.690 939.323 286.112 6.16451 0.00246 0.790... (8 Replies)
Discussion started by: tonet
8 Replies
Login or Register to Ask a Question