looking for some AWK help... hmmm maybe SED...???


 
# 1  
Old 01-28-2009

Ladies and gents,

I'm trying to do a surgical search and replace through a 3 GB file:

any timestamp field that contains only spaces needs to be changed to null
any numeric field that contains only spaces needs to be changed to null


so...

say i have a flat file such as this

numeric, char, numeric, date, char

pk|fname|weight|bday|city

1|john| |10-JAN-08|raleigh
2|steph|12| |denver
3|jane|8|21-MAR-01|atlanta
4|eric||06-APR-82|atherton


for any numeric or date field that contains all spaces, i.e.:
row 1 field 3, row 2 field 4

replace the spaces to make a null, such as:

1|john||10-JAN-08|raleigh
2|steph|12||denver
3|jane|8|21-MAR-01|atlanta
4|eric||06-APR-82|atherton




i was thinking something like:

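BEGIN {
    FS = OFS = "|"    # split and rejoin on pipes; RS already defaults to newline
}

{
    # empty field 3 (numeric) and field 4 (date) when they hold only spaces
    if ($3 ~ /^ +$/) {
        $3 = ""
    }
    if ($4 ~ /^ +$/) {
        $4 = ""
    }
    print
}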


Is awk the way to go here, or should I use sed with a regex?


thanks much..
# 2  
Old 01-28-2009
Quote:
for any numeric or date field that contains all spaces, i.e.:
row 1 field 3, row 2 field 4
If it is ok to remove any blanks in any fields, you could simply use:
Code:
tr -d ' ' < infile > outfile

This might be fastest with a 3 GB input file.

If it is definitely only fields 3 and 4 that may be touched, try this:
Code:
awk -F"|" '{sub(/^ +$/,"",$3); sub(/^ +$/,"",$4); print}' OFS="|" infile > outfile
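
A minimal sketch of the field-restricted idea, assuming a field counts as blank only when it consists entirely of spaces. The function name blank_to_null and the "mary ann" row are made up for illustration; the first row comes from the original post.

```shell
# Hypothetical helper: empty fields 3 and 4 only when they hold nothing but spaces.
blank_to_null() {
    awk 'BEGIN { FS = OFS = "|" }
         {
             if ($3 ~ /^ +$/) $3 = ""   # numeric column
             if ($4 ~ /^ +$/) $4 = ""   # date column
             print
         }' "$@"
}

# Row 1 is from the first post; row 2 is invented to show that a blank
# inside another field is left alone.
printf '%s\n' '1|john| |10-JAN-08|raleigh' '2|mary ann|12|   |denver' | blank_to_null
# prints:
# 1|john||10-JAN-08|raleigh
# 2|mary ann|12||denver
```

Anchoring the pattern with ^ and $ means a field is changed only when every character in it is a space, so values that merely contain a blank pass through unchanged.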

# 3  
Old 01-28-2009
thanks zaxxon..

ill tinker around with your second idea today

much appreciated
# 4  
Old 01-28-2009
I don't see a problem with also setting blank character fields to null.
What would the problem with that be? A single space in a character field is
essentially the definition of no data.

So, with that premise, why wouldn't this work fine?

Code:
sed -e 's/| |/||/g' file_in > file_out
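
One caveat with this substitution, shown as a small sketch: the g flag does not rescan text that a previous match already consumed, so when two blank fields sit next to each other, the closing pipe of the first match cannot also open the second one. The sample row and the function name blank_all are invented for illustration.

```shell
# Invented row with two adjacent single-space fields (fields 3 and 4).
row='2|steph| | |denver'

# A single pass leaves the second blank field untouched.
printf '%s\n' "$row" | sed -e 's/| |/||/g'
# prints: 2|steph|| |denver

# Looping until nothing matches (label a, conditional branch ta) catches both;
# the pattern "space followed by zero or more spaces" also covers runs of spaces.
blank_all() {
    sed -e ':a' -e 's/|  *|/||/' -e 'ta'
}
printf '%s\n' "$row" | blank_all
# prints: 2|steph|||denver
```

Like the original one-liner, this only sees blank fields that have a pipe on both sides, so a blank first or last field is not touched.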

# 5  
Old 01-29-2009
What if a varchar field, for example, holds a name that consists of two words separated by a blank? The blank would be deleted and the value would simply be wrong.
We only have a short snippet of his data, so we can only guess. But he states that it is important to work only on fields 3 and 4, even though he already has a command that works on the whole line, so touching the whole line is apparently not wanted.
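
To make the concern concrete, here is a small sketch of what deleting every blank on the line does to a two-word value. The row and the wrapper name strip_all_spaces are invented for illustration.

```shell
# Hypothetical wrapper around the whole-line approach.
strip_all_spaces() {
    tr -d ' '
}

# Invented row: two-word values in the name and city fields.
printf '%s\n' '5|mary ann| |02-FEB-90|san jose' | strip_all_spaces
# prints: 5|maryann||02-FEB-90|sanjose
```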
# 6  
Old 01-29-2009
zaxxon..

i went with your suggestion and it worked great...

but the more i dug into the data, the more i saw how crappy it was..

lots of leading and trailing whitespace on most of the fields, so i ended up using this approach..

awk 'BEGIN { FS = OFS = "\t" } { for (i = NF; i > 0; i--) gsub(/^[ \t]+|[ \t]+$/, "", $i); print }' < SALES_data.dat > SALES_data_cleansed.dat
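
The command above splits on tabs. For the pipe-delimited layout shown in the first post, the same loop would need FS and OFS set to the pipe character. A sketch under that assumption follows; trim_fields is a hypothetical name and the sample row is invented.

```shell
# Hypothetical helper: trim leading and trailing blanks/tabs from every
# pipe-delimited field; fields that hold only whitespace become empty.
trim_fields() {
    awk 'BEGIN { FS = OFS = "|" }
         {
             for (i = 1; i <= NF; i++)
                 gsub(/^[ \t]+|[ \t]+$/, "", $i)
             print
         }' "$@"
}

printf '%s\n' ' 1 |  john smith|   | 10-JAN-08 |raleigh  ' | trim_fields
# prints: 1|john smith||10-JAN-08|raleigh
```

Because only the ends of each field are trimmed, blanks inside a value such as "john smith" survive, while all-space fields collapse to null.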

thanks for the help