Please help me format with AWK or SED


 
# 1  
Old 01-25-2011

INPUT FILE:
Code:
9780743565219    "GODS OF NEWPORT"    "JAKES, JOHN"  2006

OUTPUT FILE I NEED to CREATE FROM INPUT FILE:
Code:
cd /data/audiobooks/9780743565219
~/Desktop/mp3-to-m4b 9780743565219-GODS OF NEWPORT "GODS OF NEWPORT" "JAKES, JOHN" 2006 n

---------- Post updated at 04:19 PM ---------- Previous update was at 04:17 PM ----------

There will be multiple rows in the input file; I only provided one line for the example.

For the output, I need the newline after:
"cd /data/audiobooks/9780743565219"

I need all quotes to remain exactly as the example shows. I color-coded the variable data to make it easier to see what changes.
# 2  
Old 01-25-2011
Code:
nawk -F'"' '{gsub(" ","",$1);print "cd /data/audiobooks/" $1; print "~/Desktop/mp3-to-m4b" OFS $1 "-" $2 OFS FS $2 FS OFS FS $4 FS OFS $5 OFS "n"}' myFile

# 3  
Old 01-25-2011
VERY CLOSE, vgersh99! Only one problem: there is an extra space between the leading number and the hyphen. The number, the hyphen, and the word after the hyphen should all be touching.

Here is the output from your code:

cd /data/audiobooks/9780743566629
~/Desktop/mp3-to-m4b 9780743566629 -ME TO WE "ME TO WE" "KIELBURGER, CRAIG"

I need:
cd /data/audiobooks/9780743566629
~/Desktop/mp3-to-m4b 9780743566629-ME TO WE "ME TO WE" "KIELBURGER, CRAIG"
# 4  
Old 01-25-2011
given:
Code:
9780743565219    "GODS OF NEWPORT"    "JAKES, JOHN"  2006

using:
Code:
nawk -F'"' '{gsub(" ","",$1);print "cd /data/audiobooks/" $1; print "~/Desktop/mp3-to-m4b" OFS $1 "-" $2 OFS FS $2 FS OFS FS $4 FS OFS $5 OFS "n"}' myFile

I get:
Code:
cd /data/audiobooks/9780743565219
~/Desktop/mp3-to-m4b 9780743565219-GODS OF NEWPORT "GODS OF NEWPORT" "JAKES, JOHN"   2006 n

Is something wrong?
Please provide sample data that is 'broken', and use code tags when posting data or code samples.
# 5  
Old 01-25-2011
Quote:
Originally Posted by vgersh99
given:
Is something wrong?
Please provide sample data that is 'broken', and use code tags when posting data or code samples.
I just tested vgersh99's code and it works as expected for me. Check that, when the code was cut and pasted or retyped, "-" (no spaces inside the quotes) was entered rather than " -" (a leading space before the dash). The space might not be obvious in the code, and it would cause the problem you illustrated.

If that seems not to be the problem, then post the command you are running.
# 6  
Old 01-27-2011
I apologize; my text editor was displaying the output incorrectly, and I am not sure why. The code is good, but can you please explain it a little?
# 7  
Old 01-27-2011
Quote:
Originally Posted by glev2005
I apologize; my text editor was displaying the output incorrectly, and I am not sure why. The code is good, but can you please explain it a little?
Glad to hear that you've figured it out. Here is a bit of explanation:

First some comments inline with the code.
Code:
nawk -F '"' '
{
    gsub( " ", "", $1); # remove all blanks from first token
                        # literally for each blank (" ") substitute "" (nothing)

    print "cd /data/audiobooks/" $1;    # print the cd command (print automatically adds a new line)

    # print the mp3-to-m4b command with the necessary parameters
    print "~/Desktop/mp3-to-m4b" OFS $1 "-" $2 OFS FS $2 FS OFS FS $4 FS OFS $5 OFS "n"
}' myFile

The -F '"' parameter tells awk to use the double quote (") as the field separator rather than the default (whitespace). This allows the text within each pair of double quotes to be seen as a single token, but it does have some side effects. The obvious one is the trailing spaces left on $1, which are removed with the gsub() call.
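
To see exactly what the double-quote separator does to a line, it can help to print every field inside brackets. This is only a sketch using the sample row from the first post; it calls plain awk on the assumption that nawk is not installed everywhere, and the shell variable names (line, fields) are mine.

```shell
# Sample row from the first post.
line='9780743565219    "GODS OF NEWPORT"    "JAKES, JOHN"  2006'

# Split on the double quote and show each field in brackets,
# so leading and trailing blanks become visible.
fields=$(printf '%s\n' "$line" |
    awk -F'"' '{ for (i = 1; i <= NF; i++) printf "$%d=[%s]\n", i, $i }')

printf '%s\n' "$fields"
```

The row splits into five fields: $1 is the number followed by blanks, $2 and $4 are the quoted strings, $3 holds only the blanks between the two quoted strings, and $5 is the year with leading blanks. Those leading blanks on $5 are why the sample output earlier in the thread shows extra space in front of 2006.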

There are other ways to pick up tokens within double quotes; this is probably the easiest and given the input data, the side effects aren't impossible to work with.

In awk, the contents of two variables, and/or strings, can be concatenated simply by placing them next to each other. Hence, print a "-" b will print the values contained in a and b with a dash character between them; no spaces.
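
A quick way to check the difference between concatenation and comma-separated print arguments is a BEGIN-only program. In this sketch, a and b are hypothetical variables passed in with -v just for the demonstration, and plain awk is assumed in place of nawk.

```shell
# Adjacent values are concatenated with nothing in between.
joined=$(awk -v a="9780743565219" -v b="GODS OF NEWPORT" 'BEGIN { print a "-" b }')

# Commas make print insert OFS (a single space by default) between the values.
spaced=$(awk -v a="9780743565219" -v b="GODS OF NEWPORT" 'BEGIN { print a, "-", b }')

printf '%s\n%s\n' "$joined" "$spaced"
```

The first line comes out as 9780743565219-GODS OF NEWPORT, the second as 9780743565219 - GODS OF NEWPORT. An unwanted space around the dash usually means a comma, or a blank inside the quotes, crept in.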

The OFS variable holds awk's output field separator, which is a space by default. FS holds the input field separator (set to a double quote on the command line). Placing FS on either side of a value in the print statement is what puts the double quotes back around that value in the output.
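
As a small illustration of the FS and OFS roles described above, the sketch below re-wraps the two quoted fields and then repeats the command with OFS overridden. The '|' value is purely an assumption chosen to make the separator visible; plain awk is assumed in place of nawk.

```shell
line='9780743565219    "GODS OF NEWPORT"    "JAKES, JOHN"  2006'

# FS is the double quote, so FS $2 FS puts the quotes back around the title.
quoted=$(printf '%s\n' "$line" | awk -F'"' '{ print FS $2 FS OFS FS $4 FS }')

# Same print statement, but with OFS changed from a space to a pipe.
piped=$(printf '%s\n' "$line" | awk -F'"' -v OFS='|' '{ print FS $2 FS OFS FS $4 FS }')

printf '%s\n%s\n' "$quoted" "$piped"
```

The first result is "GODS OF NEWPORT" "JAKES, JOHN" and the second has a pipe where the space was, which shows that the gap between the quoted strings comes from OFS and not from the input line.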

I prefer to use commas on a print statement, which has the same effect of separating the parameters with the OFS value. In this case I'd also use escaped quote marks as I believe it makes the code easier to read -- both are a matter of personal preference, or may be specified in coding standards depending on where you are working:

Code:
print "~/Desktop/mp3-to-m4b", $1 "-" $2, "\""$2"\"", "\""$4"\"", $5, "n"

Some will argue that the escaped quotes are confusing, but to me they are more obvious than using FS. An alternative is to use printf(), which is my ultimate preference because of the formatting control:

Code:
printf( "~/Desktop/mp3-to-m4b %s-%s \"%s\" \"%s\" %s n\n", $1, $2, $2, $4, $5 );
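
Putting the printf() form into a complete command might look like the sketch below. It is not the code posted earlier in the thread: it uses plain awk on the assumption that nawk may not be available, feeds the sample row through a pipe instead of myFile, and adds a second gsub() of my own to strip the blanks around $5 so the year is preceded by a single space as in the requested output.

```shell
# Sample row from the first post, fed through a pipe instead of a file.
commands=$(printf '%s\n' '9780743565219    "GODS OF NEWPORT"    "JAKES, JOHN"  2006' |
    awk -F'"' '
{
    gsub(/ /, "", $1)         # strip the blanks that trail the number
    gsub(/^ +| +$/, "", $5)   # strip the blanks around the year (my addition)

    printf("cd /data/audiobooks/%s\n", $1)
    printf("~/Desktop/mp3-to-m4b %s-%s \"%s\" \"%s\" %s n\n", $1, $2, $2, $4, $5)
}')

printf '%s\n' "$commands"
```

For the sample row this prints the cd line followed by the mp3-to-m4b line in the format asked for in the first post. For a real run, drop the printf pipe and pass the input file name after the closing quote of the awk program.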

An HTML-based primer for awk is available here:
Awk - A Tutorial and Introduction - by Bruce Barnett

I think it's a great reference.