remove bad records.


 
# 1  
Old 07-25-2012
remove bad records.

Hi, I have a problem with a file. The file was generated with wrong data in it.
Code:
 
 
MAL 005158UK473BBTICK1120722 A9999999ADASCD 1120722ADD_SECURIADD_SECURI
MAL 005158UK473BBU 1120722 A9999999FF000EA0B9C 1120722ADD_SECURIADD_SECURI
MAL 005158UK473ISN 1120722 A9999999US005158UK43 1120722ADD_SECURIADD_SECURI
MAL 005158UK473LC 1120722 A9999999005158UK4 1120722ADD_SECURIADD_SECURI
MAL 005158UK473TE 1120722 A999999915691087 1120722ADD_SECURIADD_SECURI
MCM 005158UK460001 A<FMRDMS> 002 A 003 AOTHER 004 A
MCM 005158UK460005 AGEN_INSTR 006 A 007 ASECURITY ADDED VIA BULK ADD 008 A
MCM 005158UK460009 A<FMRDMS> 010 A 011 AOTHER 012 A
MCM 005158UK460013 AGEN_TRD_ENT 014 A 015 ASECURITY ADDED VIA BULK ADD 016 A

This is a portion of the file. The lines starting with MCM are all wrong data, so I need to filter them out.

When I execute the following command

Code:
 
grep MCM file_name | wc -l

it gave me 2894, i.e. that is how many bad records there are.

I also need to remove the lines containing special characters like @, #, $, etc.

I need to take this file as input and generate a new file without the bad records. Can anyone suggest a command for this?
# 2  
Old 07-25-2012
Code:
 
awk '!/^MCM/ && !/\@/ && !/\#/ && !/\$/' input.txt > output.txt

# 3  
Old 07-25-2012
You can also use:

Code:
grep -Ev "^MCM|\\$|\\#|\\@" file_name > file_fixed

# 4  
Old 07-25-2012
Another awk solution:
Code:
awk '!/^MCM/ && !/[@$#]/' file
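
As a minimal sketch of how this filter behaves, the snippet below runs it on a three-line sample and redirects the result to a new file. The temporary file names and the record containing `#` are invented for illustration; the other two records come from the sample in post #1.

```shell
# Sample input; the record containing '#' is invented for illustration.
src=$(mktemp)
printf '%s\n' \
  'MAL 005158UK473ISN 1120722 A9999999US005158UK43 1120722ADD_SECURIADD_SECURI' \
  'MCM 005158UK460009 A<FMRDMS> 010 A 011 AOTHER 012 A' \
  'MAL 005158UK473ZZ 1120722 A9999999BAD#VALUE 1120722ADD_SECURIADD_SECURI' > "$src"

# Print a line only if it does not start with MCM and contains no @, $ or #.
awk '!/^MCM/ && !/[@$#]/' "$src" > "$src.out"

lines=$(wc -l < "$src.out" | tr -d ' ')   # number of records kept
first=$(cut -d' ' -f2 "$src.out")         # second field of the kept record
echo "records kept: $lines"
rm -f "$src" "$src.out"
```

Inside a bracket expression such as `[@$#]` the `$` is an ordinary character, so no backslashes are needed.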

# 5  
Old 07-25-2012
Thanks all for your replies.

Code:
 
grep -Ev "^MCM|\\$|\\#|\\@" file_name > file_fixed

My interpretation of the above command is:

^MCM matches lines that start with MCM.
| behaves as an "or".
Why are we including \\ ?
I just want to understand the command so that I can remember it.

---------- Post updated at 12:52 AM ---------- Previous update was at 12:19 AM ----------

Quote:
Originally Posted by Chubler_XL
You can also use:

Code:
grep -Ev "^MCM|\\$|\\#|\\@" file_name > file_fixed


Hi, when I execute the above command I get this error:
Code:
 
 
% grep -Ev "^MCM|\\$|\\#|\\@" as400.gensec_20120724_20120724184540.L3_BKP > my_file
Variable name must contain alphanumeric characters.

Could you please advise?
# 6  
Old 07-25-2012
Inside a double-quoted string, the shell replaces \\ with a single backslash. grep then sees that single backslash and uses it to remove any special meaning from the character that follows it in the extended regular expression.
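
A quick way to check this is to print each argument exactly as grep would receive it. This is only a sketch and assumes a POSIX-style shell such as sh or bash; csh and tcsh handle backslashes inside double quotes differently. The variable names `dq` and `sq` are arbitrary.

```shell
# Double quotes: the shell collapses each \\ into a single backslash.
dq=$(printf '%s\n' "^MCM|\\$|\\#|\\@")
# Single quotes: the shell passes the string through untouched.
sq=$(printf '%s\n' '^MCM|\$|#|@')

echo "double-quoted: $dq"   # prints: double-quoted: ^MCM|\$|\#|\@
echo "single-quoted: $sq"   # prints: single-quoted: ^MCM|\$|#|@
```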

That said, that's a terrible example to learn from (sorry, Chubler_XL). First of all, there's no need to use double quotes. Nothing within the string requires expansion (the source of the error message is almost certainly the unintended expansion of $|). Further, neither # nor @ is a special character in the regular expression, so they should not be backslashed. Technically, those sequences are undefined and a grep implementation is allowed to reject them as syntax errors (though most will just throw away the backslash).

An equivalent command:
Code:
grep -Ev '^MCM|\$|#|@' file_name > file_fixed
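
To try the single-quoted command before running it on the real file, here is a small self-contained sketch. The temporary file names and the last two sample records (the ones containing @ and $) are made up for illustration; the first three records come from the sample in post #1.

```shell
# Build a small sample file; the last two records are made up for illustration.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
MAL 005158UK473LC 1120722 A9999999005158UK4 1120722ADD_SECURIADD_SECURI
MCM 005158UK460001 A<FMRDMS> 002 A 003 AOTHER 004 A
MAL 005158UK473TE 1120722 A999999915691087 1120722ADD_SECURIADD_SECURI
MAL 005158UK473XX 1120722 A9999999BAD@VALUE 1120722ADD_SECURIADD_SECURI
MAL 005158UK473YY 1120722 A9999999BAD$VALUE 1120722ADD_SECURIADD_SECURI
EOF

# Keep only lines that do not start with MCM and contain none of $, # or @.
grep -Ev '^MCM|\$|#|@' "$tmp" > "$tmp.fixed"

kept=$(grep -c '' "$tmp.fixed")       # records that survived the filter
bad=$(grep -Ec '^MCM|\$|#|@' "$tmp")  # records the filter dropped
echo "kept=$kept dropped=$bad"        # prints: kept=2 dropped=3
rm -f "$tmp" "$tmp.fixed"
```

Anchoring the count with `^MCM` matters: the `grep MCM file_name | wc -l` in post #1 would also count any line that merely contains MCM somewhere in the middle.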

Regards,
Alister

Last edited by alister; 07-25-2012 at 03:07 AM..