Problems with Sed/awk/grep and line endings


 
# 1  
Old 05-30-2010

Hello
I have created the following script, which is designed to manipulate a text document:
Code:
#!/bin/sh
# Get 3 lines, (last of which is "Quantity"); adjust order; put all three on one line with tabs.
FILENAME=~/Desktop/email.txt
LIST=$(grep -B2 "Quantity" ${FILENAME} |awk 'BEGIN { FS = "\n"; RS = "--"; } 
{ if ($4 != "")
 printf ( $4 "\t" $2 "\t" $3 "\n") 
 else { printf ($3 "\t" $1 "\t" $2 "\n") }
 } 
 END { }');

# Remove "Quantity :" and "Price : ".
LIST=$(echo -e $LIST |sed s/"Quantity: "//g);
LIST=$(echo -e $LIST |sed s/"Price: "//g);
echo -e $LIST > email2.txt;

#Remove asterisks; get range of text between strings; remove up to colon on each line.
ADDRESS=$(cat $FILENAME |sed s/\*//g);
ADDRESS=$(echo -e $ADDRESS |sed -n '/BILLING DETAILS/,/DELIVERY DETAILS/p');
ADDRESS=$(echo -e $ADDRESS |awk -F: '{ printf $2 "\n" }');
echo -e $ADDRESS >> email2.txt;

I have a number of problems with it.
1. The \t tabs and \n newlines in the printf section of awk don't get written to the file, only spaces.
2. The last three commands don't seem to work in the script, though they seem to work individually on the command line. Again, echoing to the Terminal displays linefeeds, but echoing to the file in the script does not produce linefeeds, just spaces.
3. Some of the text being processed is a few lines of "***********". When these lines are present, I end up with a directory listing in my final output file.

Can anyone explain why these problems are happening and how to stop them? Thanks.

The script processes two parts of a text file in two different ways. The second half (between two strings) should just be written to a new file, with everything up to and including the colon removed from each line.
The first half takes three lines (the 3rd of which starts "Quantity: "), rearranges their order and then removes some text. I seem to have an off-by-one error for the first item, which is why there's an if..then.

Hope this makes sense! (Oh yes, I'm running OS X 10.6.3)

Last edited by benwiggy; 05-30-2010 at 02:16 PM..
# 2  
Old 05-30-2010
Quote:
Originally Posted by benwiggy
Hello
I have created the following script, which is designed to manipulate a text document:
Code:
#!/bin/sh
# # # # # # #
# Remove "Quantity :" and "Price : ".
LIST=$(echo -e $LIST |sed s/"Quantity: "//g);
LIST=$(echo -e $LIST |sed s/"Price: "//g);
echo -e $LIST > email2.txt;


Always quote variable references, and you only need one call to sed:
Code:
echo -e "$LIST" | sed -e 's/Quantity: //g' -e 's/Price: //g' > email2.txt
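
To see why the quotes matter for the tabs and newlines, here is a minimal sketch. The sample records are made up for illustration; the point is only the difference between the unquoted and quoted expansion of the same variable.

```shell
#!/bin/sh
# Hypothetical sample data: two records with tab-separated fields.
LIST=$(printf 'widget\t2\t9.99\ngadget\t1\t4.50\n')

# Unquoted: the shell splits the value on whitespace and echo rejoins
# the resulting words with single spaces, so tabs and newlines are lost.
unquoted=$(echo $LIST)

# Quoted: the value is passed as a single word, whitespace intact.
quoted=$(printf '%s\n' "$LIST")

printf 'unquoted: %s\n' "$unquoted"
printf 'quoted:\n%s\n' "$quoted"

# Count the lines in each result (tr strips the padding some wc versions add).
unquoted_lines=$(printf '%s\n' "$unquoted" | wc -l | tr -d ' ')
quoted_lines=$(printf '%s\n' "$quoted" | wc -l | tr -d ' ')
printf 'lines: unquoted=%s quoted=%s\n' "$unquoted_lines" "$quoted_lines"
```

The awk output already contains real tabs and newlines; they only disappear when the variable is expanded without quotes. printf '%s\n' is used here instead of echo -e because the behaviour of echo -e varies between shells.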

Quote:
Code:
#Remove asterisks; get range of text between strings; remove up to colon on each line.
ADDRESS=$(cat $FILENAME |sed s/\*//g);


UUOC.
Code:
ADDRESS=$(sed 's/*//g' "$FILENAME" );

Quote:
Code:
ADDRESS=$(echo -e $ADDRESS |sed -n '/BILLING DETAILS/,/DELIVERY DETAILS/p');
ADDRESS=$(echo -e $ADDRESS |awk -F: '{ printf $2 "\n" }');
echo -e $ADDRESS >> email2.txt;

Code:
echo -e "$ADDRESS" | awk -F: '/BILLING DETAILS/,/DELIVERY DETAILS/ { printf $2 "\n" }' >> email2.txt

# 3  
Old 05-30-2010
Many thanks. That's made the code simpler for starters.
It's also sorted out the problems of the linebreaks in the second bit; but the lack of tabs and linebreaks in the awk command is still there.

Any thoughts on why?
Code:
#!/bin/sh
# Get Items from first half of email and sort the groups of lines
FILENAME=~/Desktop/email.txt

LIST=$(grep -B2 "Quantity" ${FILENAME} |awk 'BEGIN { FS = "\n"; RS = "--"; } 
{ if ($4 != "")
 printf ( $4 "\t" $2 "\t" $3 "\n") 
 else { printf ($3 "\t" $1 "\t" $2 "\n") }
 } 
 END { }');
 
echo -e "$LIST" | sed -e 's/Quantity: //g' -e 's/Price: //g' > email2.txt
echo -e $LIST > email2.txt;
 
#Get Addresses from second half
ADDRESS=$(sed 's/*//g' "$FILENAME" );
echo -e "$ADDRESS" |awk -F: '/BILLING DETAILS/,/DELIVERY DETAILS/ { printf $2 "\n" }' >> email2.txt

Can you also explain the significance of asterisks in the text file being turned into a directory listing?

UUOC? Ah. Useless use of cat.
# 4  
Old 05-30-2010
Quote:
Originally Posted by benwiggy
Many thanks. That's made the code simpler for starters.
It's also sorted out the problems of the linebreaks in the second bit; but the lack of tabs and linebreaks in the awk command is still there.

Any thoughts on why?
Code:
#!/bin/sh
# Get Items from first half of email and sort the groups of lines
FILENAME=~/Desktop/email.txt

LIST=$(grep -B2 "Quantity" ${FILENAME} |awk 'BEGIN { FS = "\n"; RS = "--"; }


The braces around FILENAME don't do anything; the variable should be quoted: "$FILENAME"
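
A small sketch of the difference between braces and quotes. The file name and the count_args helper are hypothetical, chosen only to make the word splitting visible.

```shell
#!/bin/sh
# Hypothetical file name containing a space.
FILENAME='my email.txt'

# Helper (made up for this demo): prints how many arguments it received.
count_args() { echo $#; }

# Braces only mark where the variable name ends; the value is still
# split at the space, so two arguments are passed.
with_braces=$(count_args ${FILENAME})

# Double quotes keep the value together as one argument.
with_quotes=$(count_args "$FILENAME")

# Braces are needed when text follows the name directly: without them,
# $FILENAME_old would refer to a different variable called FILENAME_old.
suffixed="${FILENAME}_old"

printf 'braces: %s argument(s)\n' "$with_braces"
printf 'quotes: %s argument(s)\n' "$with_quotes"
printf 'suffixed: %s\n' "$suffixed"
```

So ${FILENAME} and $FILENAME behave the same in the grep command; it is the double quotes that change what the command receives.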
Quote:
Code:
{ if ($4 != "")
 printf ( $4 "\t" $2 "\t" $3 "\n") 
 else { printf ($3 "\t" $1 "\t" $2 "\n") }
 } 
 END { }');
 
echo -e "$LIST" | sed -e 's/Quantity: //g' -e 's/Price: //g' > email2.txt
echo -e $LIST > email2.txt;


Why is that second echo there? Why is $LIST unquoted?
Quote:
Code:
 
#Get Addresses from second half
ADDRESS=$(sed 's/*//g' "$FILENAME" );
echo -e "$ADDRESS" |awk -F: '/BILLING DETAILS/,/DELIVERY DETAILS/ { printf $2 "\n" }' >> email2.txt

Can you also explain the significance of asterisks in the text file being turned into a directory listing?

An unquoted asterisk is expanded to all files in the current directory.
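
A minimal sketch of that expansion, run in a scratch directory so the result is predictable. The file names are invented, and mktemp -d is assumed to be available (it is on most current systems but is not part of POSIX).

```shell
#!/bin/sh
# Work in a scratch directory containing two known files.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1
touch a.txt b.txt

TEXT='***********'

# Unquoted: the asterisks are treated as a filename pattern and are
# replaced by the matching names in the current directory.
unquoted=$(echo $TEXT)

# Quoted: no pathname expansion takes place.
quoted=$(printf '%s\n' "$TEXT")

# set -f switches pathname expansion off altogether, so even the
# unquoted form is left alone until set +f turns it back on.
set -f
noglob=$(echo $TEXT)
set +f

printf 'unquoted: %s\n' "$unquoted"
printf 'quoted:   %s\n' "$quoted"
printf 'set -f:   %s\n' "$noglob"

# Clean up the scratch directory.
cd / && rm -rf "$dir"
```

Quoting every expansion is the usual fix; set -f is a fallback for the rare case where an unquoted expansion is really wanted for word splitting but not for filename matching.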
# 5  
Old 05-31-2010
Quote:
Originally Posted by cfajohnson

The braces around FILENAME don't do anything; the variable should be quoted: "$FILENAME"
Ah. OK. I have seen this in other scripts, so thought it was necessary. Interestingly, removing the braces has fixed the lack of tabs and linefeeds, so perhaps braces in this form DO do something.

Quote:
Originally Posted by cfajohnson

Why is that second echo there? Why is $LIST unquoted?
Sorry, that's a vestigial command from days gone by. I've removed it.

The script now seems to be working as it should. Many thanks.

Quote:
Originally Posted by cfajohnson

An unquoted asterisk is expanded to all files in the current directory.
So, short of removing asterisks in the target file as I have done, how do you escape this behaviour, to deal with text files/strings that contain asterisks?

One last thing: Do I need to make any changes if the input $FILENAME isn't the name of a file, but is simply a string containing all the data?

Last edited by benwiggy; 05-31-2010 at 06:39 AM..
# 6  
Old 05-31-2010
Quote:
Originally Posted by benwiggy
So, short of removing asterisks in the target file as I have done, how do you escape this behaviour, to deal with text files/strings that contain asterisks?
Quote them.
Quote:
One last thing: Do I need to make any changes if the input $FILENAME isn't the name of a file, but is simply a string containing all the data?
The same way as you used "$LIST".
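
As a sketch of what that could look like: the DATA variable and its contents below are hypothetical stand-ins for the email text, and the ': ' field separator is a small change from the -F: used earlier in the thread, made so that the extracted values carry no leading space.

```shell
#!/bin/sh
# DATA is a made-up stand-in for the email text held in a variable.
DATA='*** BILLING DETAILS ***
Name: A. Customer
Town: Exampleville
*** DELIVERY DETAILS ***
Name: Someone Else'

# Feed the quoted variable to the pipeline on standard input instead
# of naming a file. sed strips the asterisks; awk keeps the text after
# the colon for lines between the two markers (marker lines have no
# colon, so NF > 1 skips them).
ADDRESS=$(printf '%s\n' "$DATA" |
  sed 's/\*//g' |
  awk -F': ' '/BILLING DETAILS/,/DELIVERY DETAILS/ { if (NF > 1) print $2 }')

printf '%s\n' "$ADDRESS"
```

print $2 is used here rather than printf $2 "\n" so that a % character in the data is not interpreted as a format specifier.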
# 7  
Old 06-02-2010
It all seems to be working now, although I'm still having trouble caused by asterisks in the text. How can I stop the shell from interpreting "*********" as an instruction to display a folder listing?

Last edited by benwiggy; 06-03-2010 at 05:02 AM..