Find and remove rows from a file where a character occurs too many times

11-28-2008

I have a '~'-delimited file of 6 to 7 million rows. Each row should contain 13 columns separated by 12 tildes. The data is alphanumeric, and occasionally a ~ ends up inside a descriptive field, where it acts as a delimiter and makes the row look like it has 14 columns instead of 13. Any row with 13 tildes needs to be removed. I have tried a combination of grep and awk, but it runs very slowly. I suspect the problem is the way I am using them.

I tried this to print the bad rows, with line numbers, to a file:
grep -n '~.*~.*~.*~.*~.*~.*~.*~.*~.*~.*~.*~.*~' inputfile | awk '{print}' > outputfile

I also tried this to create a file with only the good rows in it:

grep -v '~.*~.*~.*~.*~.*~.*~.*~.*~.*~.*~.*~.*~' inputfile > outputfile

Both are extremely slow. The input file is approximately 800 MB.
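
For illustration, the 13-column rule can also be expressed as a field count rather than a regular expression. The sketch below is one possible approach, not a measured fix: it builds a tiny hypothetical sample file, then lets awk split each row on ~ and compare NF to 13. The output names good_rows and bad_rows are assumptions made for the example, and the sketch treats any row whose field count is not exactly 13 as bad, which also catches rows with too few columns.

```shell
#!/bin/sh
# Hypothetical sample data: two good rows (12 tildes) and one bad row (13 tildes).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' \
  'a~b~c~d~e~f~g~h~i~j~k~l~m' \
  'a~b~c~d~e~f~g~h~i~j~k~l~m~n' \
  '1~2~3~4~5~6~7~8~9~10~11~12~13' > inputfile

# Split on "~" and let awk count the fields: NF == 13 means exactly 12 tildes.
# Good rows go to good_rows (hypothetical name); every other row goes to
# bad_rows (hypothetical name), prefixed with its line number.
# LC_ALL=C requests byte-wise processing, which is often cheaper on large files.
LC_ALL=C awk -F'~' '
    NF == 13 { print > "good_rows"; next }
             { print NR ":" $0 > "bad_rows" }
' inputfile
```

This reads the input once and produces both files in the same pass. If staying with grep is preferred, replacing each .* with [^~]* gives the matcher fewer ways to match a row and may help, though that has not been timed here.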

thanks

kpd

11-28-2008

Duplicate posting and cross-posting are not allowed; please read the rules.

Proceed here:

https://www.unix.com/unix-advanced-ex...#post302262764

Thread closed.

Franklin52
