07-10-2009
filtering records based on numeric field value in 8th position
I have a ";"-delimited file which contains a numeric field of up to 4 characters in the 8th position.
But some records have alphanumeric or other non-numeric values in that field, such as:
space, ";" , "," , "/" , "23-1" ,
"23 1" , "aqjhdj" , "jun-23" , "APR-04" , "4:00AM" , "-234" , "56784 ", "." , "+"
"_" , "&" , "*" , "^" , "%" , "!" , "45/3" , "78*a" etc
all of which cause a PL/SQL numeric error.
The actual value of the field should be between 1 and 9999 only.
How can I separate the records which are in the wrong format?
I need to split the original file into two files: one with the records in the proper format
and one with the records in the improper format.
I am able to filter some of the records with awk, as shown below.
awk -F";" '$8 !~ /[-A-Z ]/ { print $0 >>"proper.txt";next } { print $0 >>"improper.txt"} ' File_name.txt
Please see my input file.
C;4498;qwa;cghy;;;;40;;222122
C;4498;sample;city;;;;34 2;;222123
C;4498;qwe;xcbv;;;;34-2;;222124
C;4498;jj;sffz;;;;41;;222120
C;4498;jj;sffz;;;;41/a;;222120
C;4498;jj;;;;;30;;277789
C;4498;eert;qwq;;;;34 A;;222125
C;4498;;;;;;23;;22
C;4498;jj;szxzzd;;;;34*a;;222127
C;4498;jj;szxzzd;;;;a;;222127
C;4498;jj;szxzzd;;;;57864;;222127
C;4498;jj;szxzzd;;;;34;;222127
C;4498;jj;;;;;30+;;277789
C;4498;jj;;;;;4:00AM;;277789
C;4498;jj;;;;;JUN-04;;277789
C;4498;jj;szxzzd;;;;45;;222
C;4498;jj;szxzzd;;;;34.;;222127
C;4498;jj;sffz;;;;heruru;;222120
I need the output files as follows.
Proper file:
C;4498;qwa;cghy;;;;40;;222122
C;4498;jj;sffz;;;;41;;222120
C;4498;jj;;;;;30;;277789
C;4498;;;;;;23;;22
C;4498;jj;szxzzd;;;;34;;222127
C;4498;jj;szxzzd;;;;45;;222
Improper file:
C;4498;sample;city;;;;34 2;;222123
C;4498;qwe;xcbv;;;;34-2;;222124
C;4498;jj;sffz;;;;41/a;;222120
C;4498;eert;qwq;;;;34 A;;222125
C;4498;jj;szxzzd;;;;34*a;;222127
C;4498;jj;szxzzd;;;;a;;222127
C;4498;jj;szxzzd;;;;57864;;222127
C;4498;jj;;;;;30+;;277789
C;4498;jj;;;;;4:00AM;;277789
C;4498;jj;;;;;JUN-04;;277789
C;4498;jj;szxzzd;;;;34.;;222127
C;4498;jj;sffz;;;;heruru;;222120
Thanks a lot for your help.
Please provide the solution ASAP.
Last edited by indusri; 07-10-2009 at 04:41 AM..
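One possible way to do the split described above, as a minimal sketch rather than a definitive solution. It assumes that a "proper" 8th field consists of 1 to 4 digits and nothing else, with a numeric value of at least 1, so leading zeros such as 0040 are accepted (that is an assumption, not something stated in the post). The file name sample.txt and its few records are only for illustration; the real run would use File_name.txt.

```shell
# Small sample taken from the input above (illustrative file name).
cat > sample.txt <<'EOF'
C;4498;qwa;cghy;;;;40;;222122
C;4498;sample;city;;;;34 2;;222123
C;4498;qwe;xcbv;;;;34-2;;222124
C;4498;jj;szxzzd;;;;57864;;222127
C;4498;jj;;;;;30+;;277789
C;4498;jj;;;;;4:00AM;;277789
C;4498;;;;;;23;;22
EOF

# Remove old results so a rerun never mixes in stale records.
rm -f proper.txt improper.txt

# Field 8 is proper only if it is all digits, at most 4 characters long,
# and its numeric value is at least 1 (this rejects 0, 00, 0000).
# Everything else, including an empty or missing field, goes to improper.txt.
awk -F';' '
    $8 ~ /^[0-9]+$/ && length($8) <= 4 && $8 + 0 >= 1 { print > "proper.txt"; next }
    { print > "improper.txt" }
' sample.txt
```

The pattern is anchored with ^ and $ so that values like "34 2", "30+" or "4:00AM" fail as a whole instead of matching on a digit somewhere inside them, which is what the unanchored character-class test in the original attempt cannot guarantee. A value that itself contains ";" cannot be caught this way, because it shifts the field positions before awk ever sees field 8.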