Script to delete rows in a file

11-20-2013

Hi All,

I am new to UNIX. Please help me write code to delete all records from a file where every column after column 5 is either 0, #Mi or NULL.
The first 5 columns are strings.

e.g.

Code:

"alsod" "1FEV2" "wjwroe" " wsse" "hd3" 1 2 34 #Mi
"malasl" "wses" "trwwwe" " wsse" "hd3" 1 2 0 #Mi
"alsod" "1FEV2" "asd" " wsse" "hd3" #Mi #Mi
"alsod" "1FEV2" "asd" " wsse" "hd3" 0 0 0
"alsod" "1FEV2" "asd" " wsse" "hd3" 1 2 3 4 5

expected output is

Code:

"alsod" "1FEV2" "wjwroe" " wsse" "hd3" 1 2 34 #Mi
"malasl" "wses" "trwwwe" " wsse" "hd3" 1 2 0 #Mi
"alsod" "1FEV2" "asd" " wsse" "hd3" 1 2 3 4 5

The file is around 1-2 GB.
I have written some code, but the script takes 45-50 minutes to run.

Code:

grep -EHv ([1-9]/s) file.txt > file2.txt

Can someone please suggest alternative code that selectively deletes the records containing only 0/#Mi/NULL after column 5?

Thanks

Last edited by Franklin52; 11-20-2013 at 03:16 PM.. Reason: Please use code tags per the Forum Rules

alok2082

11-20-2013

I don't think the grep command you show is correct. For example, the -H option prints the filename with each matching line, and I don't understand why you included it. I ran your example and it does not work. Use awk.

blackrageous

11-20-2013

Try:

Code:

$ cat file
"alsod" "1FEV2" "wjwroe" " wsse" "hd3" 1 2 34 #Mi
"malasl" "wses" "trwwwe" " wsse" "hd3" 1 2 0 #Mi
"alsod" "1FEV2" "asd" " wsse" "hd3" #Mi #Mi
"alsod" "1FEV2" "asd" " wsse" "hd3" 0 0 0
"alsod" "1FEV2" "asd" " wsse" "hd3" 1 2 3 4 5

$ awk '$7~/[0-9]/ && $7 !=0' file
"alsod" "1FEV2" "wjwroe" " wsse" "hd3" 1 2 34 #Mi
"malasl" "wses" "trwwwe" " wsse" "hd3" 1 2 0 #Mi
"alsod" "1FEV2" "asd" " wsse" "hd3" 1 2 3 4 5
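
A note on the command above: it tests only field 7, which in the sample is the first value after the quoted strings (the space inside " wsse" makes awk split that string into two fields). The sketch below uses a hypothetical row that is not in the original sample, where the first value is 0 but a later one is not. Whether such rows occur in the real file is an assumption; if they do, a test on a single field would drop them, while a scan over all trailing fields keeps them.

```shell
# Hypothetical row (not from the thread): first value is 0, a later one is not.
row='"alsod" "1FEV2" "asd" " wsse" "hd3" 0 0 7'

# The single-field test looks only at $7 (here: 0), so the row is dropped.
by_field=$(printf '%s\n' "$row" | awk '$7 ~ /[0-9]/ && $7 != 0')

# Scanning every field after the last quoted one finds the 7 and keeps the row.
by_scan=$(printf '%s\n' "$row" |
    awk '{for (f = NF; f >= 1 && $f !~ /"/; f--) if ($f ~ /[1-9]/) {print; next}}')

printf 'single-field test kept: [%s]\n' "$by_field"
printf 'full scan kept:         [%s]\n' "$by_scan"
```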

Akshay Hegde

11-20-2013

Hi,
With sed:

Code:

$ sed '/"[^"]*[1-9][^"]*$/!d' file.txt
"alsod" "1FEV2" "wjwroe" " wsse" "hd3" 1 2 34 #Mi
"malasl" "wses" "trwwwe" " wsse" "hd3" 1 2 0 #Mi
"alsod" "1FEV2" "asd" " wsse" "hd3" 1 2 3 4 5

Regards.

disedorgue

11-20-2013

Your requirement says "after the 5th column", but it looks like what you really mean is "after the last " character".
Further, it looks like a single non-zero digit should be reason enough to print the line.
With awk that becomes

Code:

awk '{for (f=NF; f>=1 && $f!~/"/; f--) if ($f~/[1-9]/) {print; next}}' file

as was done in the previous sed solution.
The advantage of awk is that you have more ways to refine your search.
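
For readers new to awk, here is the same one-liner spread over several lines with comments. It is only a sketch run against the sample rows from the first post; the temporary file and the variable name kept are made up for the illustration.

```shell
# Sample data from the first post, written to a temporary file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
"alsod" "1FEV2" "wjwroe" " wsse" "hd3" 1 2 34 #Mi
"malasl" "wses" "trwwwe" " wsse" "hd3" 1 2 0 #Mi
"alsod" "1FEV2" "asd" " wsse" "hd3" #Mi #Mi
"alsod" "1FEV2" "asd" " wsse" "hd3" 0 0 0
"alsod" "1FEV2" "asd" " wsse" "hd3" 1 2 3 4 5
EOF

kept=$(awk '
{
    # Walk backwards from the last field and stop at the first field
    # that contains a double quote (the end of the string columns).
    for (f = NF; f >= 1 && $f !~ /"/; f--) {
        # A non-zero digit in any value field is enough to keep the line.
        if ($f ~ /[1-9]/) {
            print
            next
        }
    }
    # Falling out of the loop means no value held a non-zero digit,
    # so the line is not printed.
}' "$tmp")

printf '%s\n' "$kept"
rm -f "$tmp"
```

The loop stops at the first match, so a line costs at most one pass over its trailing fields.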

The previous sed command could be abbreviated to

Code:

sed '/[1-9][^"]*$/!d' file

that is equivalent to

Code:

grep '[1-9][^"]*$' file
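
Since the original file is 1-2 GB, it may be worth running the grep in the C locale and redirecting the result to a new file. Forcing LC_ALL=C often speeds up grep because it can match bytes instead of multibyte characters, but that is an assumption about your grep build and was not measured in this thread. The sketch below uses the sample rows and temporary files in place of the real input and output files.

```shell
# Stand-ins for the real input and output files.
in_file=$(mktemp)
out_file=$(mktemp)
cat > "$in_file" <<'EOF'
"alsod" "1FEV2" "wjwroe" " wsse" "hd3" 1 2 34 #Mi
"malasl" "wses" "trwwwe" " wsse" "hd3" 1 2 0 #Mi
"alsod" "1FEV2" "asd" " wsse" "hd3" #Mi #Mi
"alsod" "1FEV2" "asd" " wsse" "hd3" 0 0 0
"alsod" "1FEV2" "asd" " wsse" "hd3" 1 2 3 4 5
EOF

# Keep lines with a non-zero digit somewhere after the last double quote.
# LC_ALL=C applies only to this one grep invocation.
LC_ALL=C grep '[1-9][^"]*$' "$in_file" > "$out_file"

lines_kept=$(wc -l < "$out_file" | tr -d ' ')
cat "$out_file"
rm -f "$in_file" "$out_file"
```

Timing the plain and the LC_ALL=C variants with `time` on a slice of the real file would show whether the locale makes a difference on your system.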

MadeInGermany
