Edit a file using awk ?


 
# 1  
Old 02-02-2012

Hey guys,

I'm trying to learn a bit of awk/sed from several different sites, and I think I'm starting to get confused (doesn't take much!).

Anyway, say I have a csv file which has something along the lines of the following in it:
Code:
"test","127.0.0.1","startup timestamp",,,,"1327702381482",
"test","127.0.0.1","cpu combined","cpu 0",,,,"0.0900"
"test","127.0.0.1","cpu idle","cpu 0",,,,"0.9100"
"test","127.0.0.1","cpu nice","cpu 0",,,,"0.0000"
"test","127.0.0.1","cpu sys","cpu 0",,,,"0.0360"
"test","127.0.0.1","cpu user","cpu 0",,,,"0.0540"
"test","127.0.0.1","cpu wait","cpu 0",,,,"0.0010"

Basically, what I want is to edit the file via a script rather than manually. On the first line, for instance, all I want left is startup timestamp,1327702381482, on the second line cpu combined,0.0900, and so on, so that the file ends up looking something like:
Code:
startup timestamp,1327702381482
cpu combined,0.0900
cpu idle,0.9100
cpu nice,0.0000
cpu sys,0.0360
cpu user,0.0540
cpu wait,0.0010

Anyway, while learning, I've tried various commands to do this. For the first line, for example, I tried the following (and it didn't work!):
Code:
 awk '{if (NR==1) print{"$19"} print{"$24"} print{"$26"}}' myfile.csv > mynewfile.csv

Something tells me I'm hopeless at this! Any help would be gratefully received before it drives me insane!

Also, I read that the O'Reilly sed & awk (second edition) book is worth buying. Would any of you guys recommend it? I looked it up on Amazon and it was published in 1997, but that seems to be the current edition. Or, if you could recommend another book on sed/awk, I'd be very thankful!

Cheers

Jim
# 2  
Old 02-02-2012
Try:
Code:
awk -F, -vOFS="," '{print $3,$8}' myfile.csv > mynewfile.csv

This User Gave Thanks to bartus11 For This Post:
# 3  
Old 02-02-2012
First off, awk doesn't know that it's supposed to split on commas unless you tell it. It won't know to print out commas unless you tell it, either.

Code:
# -F controls the input separator.  You want ,
# -v can set any variable inside the program before it starts running.
# This includes special variables like OFS, the output separator.
awk -F, -v OFS="," ...
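To see what those two settings change, here is a small sketch. The file name sample.csv is made up for the illustration; the line in it is taken from the question.

```shell
# sample.csv is a hypothetical file holding one line from the question.
printf '%s\n' '"test","127.0.0.1","cpu idle","cpu 0",,,,"0.9100"' > sample.csv

# Default FS: the line is split on spaces and tabs, so the only breaks
# are the spaces inside "cpu idle" and "cpu 0".
awk '{ print NF }' sample.csv                            # prints 3

# With -F, the same line is split on every comma.
awk -F, '{ print NF }' sample.csv                        # prints 8

# Default OFS: print joins its arguments with a space.
awk -F, '{ print $3, $8 }' sample.csv                    # "cpu idle" "0.9100"

# With OFS set, print joins them with a comma instead.
awk -F, -v OFS="," '{ print $3, $8 }' sample.csv         # "cpu idle","0.9100"
```

Note that the quotation marks are still part of each field; -F only decides where the line is cut.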

Second, instead of one big if statement, you can write several statements, each triggered by a different situation. Putting a condition in front of a code block controls when that block gets run. For each line, the statements are processed in order, top to bottom.

So:

Code:
$ awk -F, -v OFS="," '
# This statement runs first, every line.  Deletes all quotation marks.
{ for(N=1; N<=NF; N++) gsub(/"/,"", $N); }
# This statement runs only on the first line.  Print fields 3 and 7.
NR==1{ print $3, $7 }
# This only runs for every line thereafter.  Print fields 3 and 8.
NR>1 { print $3, $8 }' data

startup timestamp,1327702381482
cpu combined,0.0900
cpu idle,0.9100
cpu nice,0.0000
cpu sys,0.0360
cpu user,0.0540
cpu wait,0.0010

$
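The reason the first line needs field 7 while the others need field 8 is that the value sits in a different position on that line. A quick way to check this kind of thing (a debugging sketch, not part of the solution) is to have awk list which field numbers are non-empty. Here the file data is recreated with two lines from the question:

```shell
# Recreate two lines of the input under the file name used above.
printf '%s\n' \
  '"test","127.0.0.1","startup timestamp",,,,"1327702381482",' \
  '"test","127.0.0.1","cpu combined","cpu 0",,,,"0.0900"' > data

# For each record, list the numbers of the fields that are not empty.
awk -F, '{
    line = "record " NR ":"
    for (i = 1; i <= NF; i++)
        if ($i != "") line = line " " i
    print line
}' data
# record 1: 1 2 3 7
# record 2: 1 2 3 4 8
```

Both lines have 8 fields, but on the first one the trailing comma leaves field 8 empty and the timestamp lands in field 7.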

This User Gave Thanks to Corona688 For This Post:
# 4  
Old 02-02-2012
Thanx guys, that was exactly the sort of information I was after!

You wouldn't happen to know of any decent tutorials on the net about awk? I think the ones I've been reading might not be that great.

Jim
# 5  
Old 02-02-2012
The Linux manual page for it is a wealth of information, documenting a lot of syntax, all the special built-in variables, every built-in function, etc. It's far too much information to give someone who's never used it before, since it's gibberish out of context, but now that you know a little of the basics I think it'd be very helpful. It's also a good reference.

One thing I might want to clear up is that $ does NOT mean variable. Variables in awk are just names, like abc=32;. $ is actually an operator which means "turn a number into a field".

$3 turns the number 3 into field number 3. You could also set X=3 then use $X to get the third field. That's what I'm doing in my for loop, and why I'm using N in most places but $N in one particular place. You can even do expressions in it, like $(X+3), which would get you field 6 if X was 3. NF is the number of fields, and since fields start at 1, $NF is the very last field.
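A small sketch of $ as an operator, using a made-up input line of six letters so the field numbers are easy to follow:

```shell
printf 'a b c d e f\n' | awk '{
    X = 3
    print $3        # field 3 -> c
    print $X        # X holds 3, so also field 3 -> c
    print $(X+3)    # field 6 -> f
    print NF        # number of fields -> 6
    print $NF       # the last field -> f
}'
```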

You can write to fields, too. $1="asdf" is perfectly fine to do.

$0 means the entire current line. You can write to it too. Changes made to $1, etc turn up in $0 where you'd expect them to, and vice versa.
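A short illustration of both directions, with made-up input. The semicolon OFS is only there to make the rebuilt line easy to spot:

```shell
# Assigning to a field rebuilds $0, joining the fields with OFS.
printf 'one,two,three\n' | awk -F, -v OFS=";" '{ $1 = "asdf"; print $0 }'
# prints: asdf;two;three

# Assigning to $0 splits the new text into fields again, using FS.
printf 'one,two,three\n' | awk -F, '{ $0 = "x,y"; print NF, $2 }'
# prints: 2 y
```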

Technically, awk isn't "line-based", just "record-based". By default it uses \n as its record separator, but that's a special variable too, RS, which you can set as you please. Setting it blank makes it split upon blank lines.
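As a rough sketch of the blank-RS case, assume a hypothetical file records.txt in which each record is a block of lines and the blocks are separated by an empty line:

```shell
# records.txt is a made-up example: two records separated by a blank line.
printf 'name: a\nsize: 1\n\nname: b\nsize: 2\n' > records.txt

# With RS set to the empty string, each block is one record.
awk -v RS= '{ print NR ": " $2 " " $4 }' records.txt
# 1: a 1
# 2: b 2
```

In this mode the newlines inside a block also separate fields, which is why $2 and $4 pick up the two values.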

Last edited by Corona688; 02-02-2012 at 04:44 PM..
# 6  
Old 02-02-2012
Thanx Corona688, that helps a lot.

I'll check out the man page/s for it in a little while. I'm coming from a Windows background, and been learning Linux (to enhance career opportunities) and I wish I had converted sooner - Linux is so much more powerful and fascinating. These forums, to me, are invaluable.

One question about the awk script you posted above. Say, for instance, in another script there were more commas separating the data on a different line. Could I use an 'if NR>10 && NR<20' in there too, or would that not work? If it wouldn't work, what would be the best approach?

I've just been reading this blog article about cleaning csv data with awk, but it seems to be confusing me even more! Think I need sleep.

Cheers

Jim
# 7  
Old 02-02-2012
Quote:
Originally Posted by jimbob01
One question about the awk script you posted above. Say, for instance, in another script there were more commas separating the data on a different line. Could I use an 'if NR>10 && NR<20' in there too, or would that not work? If it wouldn't work, what would be the best approach?
It would work fine. You can put expressions of any complexity you want in there, including brackets, variables, and regular expressions.

You might want a >= or <= in there somewhere so you don't leave out line 10 or 20 by accident.
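To make the off-by-one point concrete, here is a sketch on a made-up 25-line file (lines.txt, where each line just holds its own line number). Each command reports how many lines matched, plus the first and last one:

```shell
# lines.txt is a hypothetical file containing the numbers 1 to 25, one per line.
awk 'BEGIN { for (i = 1; i <= 25; i++) print i }' > lines.txt

# Strict comparisons: lines 10 and 20 are left out.
awk 'NR>10 && NR<20 { n++; if (!first) first = $0; last = $0 }
     END { print n, first, last }' lines.txt
# prints: 9 11 19

# Inclusive comparisons: lines 10 and 20 are included.
awk 'NR>=10 && NR<=20 { n++; if (!first) first = $0; last = $0 }
     END { print n, first, last }' lines.txt
# prints: 11 10 20
```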

Quote:
I've just been reading this blog article about cleaning csv data with awk, but it seems to be confusing me even more! Think I need sleep.
Think of BEGIN as just another condition. The code block with BEGIN in front of it gets run whenever it's true, and it's true exactly once: when the program has finished loading but no lines have been processed yet. It's handy for setting things up. There's an equal and opposite END one, too, so you can have an awk script that does Z += $3 for a thousand lines, then prints the total in the END section.
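A minimal sketch of that BEGIN / per-line / END layout, with three made-up input lines standing in for the thousand:

```shell
printf 'a b 10\nc d 20\ne f 12\n' | awk '
BEGIN { print "starting" }      # runs once, before any input is read
      { Z += $3 }               # runs for every line
END   { print "total:", Z }     # runs once, after the last line
'
# starting
# total: 42
```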

The FS variable is what you're setting with the -F flag. You can set that in BEGIN as easily as anywhere else. Up to your preferences. By default it splits on runs of spaces and tabs.
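For example, these two commands should behave the same on a made-up one-line input:

```shell
# Field separator given on the command line.
printf 'x,y,z\n' | awk -F, '{ print $2 }'                   # prints y

# Same separator, set inside the program before any input is read.
printf 'x,y,z\n' | awk 'BEGIN { FS = "," } { print $2 }'    # prints y
```

One caveat, as far as I know: if FS is changed inside an ordinary block rather than BEGIN, the line currently being processed has already been split, so the change only affects lines read afterwards.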

You can use regular expressions, too. awk '/asdf/ { print $3 }' for instance would only print the third field in lines containing 'asdf'.

And if you leave off the code block completely, it becomes a control for when lines are printed. awk '/asdf/' is equivalent to grep 'asdf' for example. So if you're doing grep | awk, you can probably just put the entire thing in awk somehow...
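A quick sketch of both forms, using a hypothetical three-line file words.txt:

```shell
# words.txt is a made-up example file.
printf 'one asdf x\ntwo qwer y\nthree asdf z\n' > words.txt

# Regular expression as the condition: third field of matching lines only.
awk '/asdf/ { print $3 }' words.txt     # prints x, then z

# No code block: matching lines are printed whole.
awk '/asdf/' words.txt

# Should give the same output as the awk command just above.
grep 'asdf' words.txt
```

So something like grep 'asdf' words.txt | awk '{ print $3 }' can be folded into the single awk command at the top.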

Last edited by Corona688; 02-02-2012 at 05:38 PM..