Sed and Awk Help


 
# 8  
Old 06-19-2008
This is a slice of what I am starting with.

Code:
<tr>
<td><a target="testFrame" href="somepath.html">InitialEnvironment Setup<HR></HR></a></td>
</tr>

<tr>
<td><a target="testFrame" href="someotherpath.html">Create Entity<HR></HR></a></td>
</tr>
<tr>
<td><a target="testFrame" href="anotherpath.html">Create Company<HR></HR></a></td>
</tr>
<tr>
<td><a target="testFrame" href="yetanotherpath.html" >1.[1]0001_Creation_of_Custom_Lists_to.html<HR></HR></a></td>
</tr>

In the end, I need an output like

Code:
InitialEnvironment Setup
Create Entity
Create Company
Creation of Custom Lists to

No HTML, no numbers, no underscores.

My long-term goal is to put all of this into a nice table with some other information. I am pretty sure I can figure out the rest; I just need help getting the above extracted from the HTML.
# 9  
Old 06-19-2008
Try:
sed 's/<[^>]*>//g' file.html

Edit: this should work:
grep href webfile.html | sed -e 's/<[^>]*>//g' -e 's/[0-9]*_/ /g' -e 's/[0-9]..*\] //' -e 's/\(.*\)\..*/\1/'

Last edited by NYankz; 06-19-2008 at 04:05 PM..
# 10  
Old 06-19-2008
I get this error when using either of those:

Code:
The filename, directory name, or volume label syntax is incorrect.

I checked and the filename is fine. Do you know why I would get this error?
# 11  
Old 06-19-2008
Not sure why. You should also be able to cat the file first to make sure it displays:

cat file.html | grep href | sed -e 's/<[^>]*>//g' -e 's/[0-9]*_/ /g' -e 's/[0-9]..*\] //' -e 's/\(.*\)\..*/\1/'

I was bored at work and put your example into a temp file, and it worked when I ran the above statement.
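
For anyone following along, here is a rough sketch of what each of the four sed expressions in that pipeline does, run against two lines from the original post. The file name sample.html and the function name extract_titles are made up for illustration; the sed expressions themselves are unchanged.

```shell
# sample.html is a hypothetical file holding two rows from the original post.
cat > sample.html <<'EOF'
<tr>
<td><a target="testFrame" href="somepath.html">InitialEnvironment Setup<HR></HR></a></td>
</tr>
<tr>
<td><a target="testFrame" href="yetanotherpath.html" >1.[1]0001_Creation_of_Custom_Lists_to.html<HR></HR></a></td>
</tr>
EOF

# extract_titles is a hypothetical wrapper around the pipeline from the post.
#   1st -e: delete every HTML tag (anything between < and >)
#   2nd -e: replace each underscore, plus any digits right before it, with a space
#   3rd -e: delete a leading prefix such as "1.[1] " (digit ... closing bracket, space)
#   4th -e: cut everything from the last dot onward (the ".html" suffix)
extract_titles() {
  grep href "$1" | sed \
    -e 's/<[^>]*>//g' \
    -e 's/[0-9]*_/ /g' \
    -e 's/[0-9]..*\] //' \
    -e 's/\(.*\)\..*/\1/'
}

extract_titles sample.html
```

On this sample it prints "InitialEnvironment Setup" and "Creation of Custom Lists to". Note that the last expression cuts at the last dot on the line, so a link text that contains a period of its own would be truncated there.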
# 12  
Old 06-19-2008
OK, then try this:

Code:
awk  'NF!=0{gsub(/\.|[0-9]+|html/,"");gsub(/\[\]/,"");gsub(/_/," ");$1=$1;print}'  RS="<[^<>]+>" filename > final.txt

Output:

Code:
InitialEnvironment Setup
Create Entity
Create Company
Creation of Custom Lists to

There are also dots and square brackets in your input that need to be taken care of. Modify the command if other characters need to be removed.

I'd suggest first redirecting the output of the first command, run plain with no sub functions or other statements in it, into a temporary file. Then do all further processing on that temp file with the tool you're most comfortable with (sed, grep, awk, ...), and when finished, redirect the output to a final .txt file.

Last edited by rubin; 06-19-2008 at 08:24 PM.. Reason: typo
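
The two-step approach suggested above could look roughly like the sketch below. It is only one way to split the work: step 1 here uses grep and sed to strip the tags into a temp file, and step 2 reuses the gsub calls from the awk command on that file. The names two_step, sample.html and final.txt are made up for illustration.

```shell
# sample.html is a hypothetical input file with two rows from the original post.
cat > sample.html <<'EOF'
<tr>
<td><a target="testFrame" href="anotherpath.html">Create Company<HR></HR></a></td>
</tr>
<tr>
<td><a target="testFrame" href="yetanotherpath.html" >1.[1]0001_Creation_of_Custom_Lists_to.html<HR></HR></a></td>
</tr>
EOF

# two_step is a hypothetical helper: two_step INPUT OUTPUT
two_step() {
  tmp=$(mktemp) || return 1

  # Step 1: keep only the link lines and strip the tags, nothing else.
  # The temp file can be inspected at this point to see what is left to clean.
  grep href "$1" | sed 's/<[^>]*>//g' > "$tmp"

  # Step 2: clean the plain text: drop dots, digits and "html", drop the
  # emptied "[]", turn underscores into spaces, then squeeze whitespace.
  awk '{ gsub(/\.|[0-9]+|html/, ""); gsub(/\[\]/, ""); gsub(/_/, " "); $1 = $1; print }' \
    "$tmp" > "$2"

  rm -f "$tmp"
}

two_step sample.html final.txt
cat final.txt
```

Because step 2 reads ordinary lines, it does not rely on a regular expression in RS, which not every awk supports. As in the original command, the first gsub removes "html" and digits wherever they occur, so adjust it if real titles contain them.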
# 13  
Old 06-23-2008
Thanks everyone for all of the help.

Rubin, when I try to run yours, I get this:

Code:
'[0-9]+' is not recognized as an internal or external command,
operable program or batch file.

When I try to run NYankz's, I get this:

Code:
The filename, directory name, or volume label syntax is incorrect.

I am doing all of this in Windows, using gawk, sed, and grep.
# 14  
Old 06-23-2008
Did you try putting in the full path for the file?
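
A likely cause of both errors, given that this is running on Windows: cmd.exe does not treat single quotes as quoting characters, so the |, < and > inside the sed and awk programs are read as pipes and redirections by the command prompt itself. One way around that, sketched below, is to keep the sed commands in a script file and run it with sed -f, so nothing needs quoting on the command line. The file name clean.sed is made up for illustration; the heredoc is just a Unix-shell way to create the file, and on Windows it could be typed into any text editor instead.

```shell
# clean.sed is a hypothetical script file holding the same four substitutions
# as the one-liner, one per line, with no shell quoting involved.
cat > clean.sed <<'EOF'
# remove every HTML tag
s/<[^>]*>//g
# replace underscores (and any digits right before them) with a space
s/[0-9]*_/ /g
# drop a leading prefix such as "1.[1] "
s/[0-9]..*\] //
# cut everything from the last dot onward (the .html suffix)
s/\(.*\)\..*/\1/
EOF

# One sample line from the original post, fed through the script file.
printf '%s\n' '<td><a target="testFrame" href="yetanotherpath.html" >1.[1]0001_Creation_of_Custom_Lists_to.html<HR></HR></a></td>' |
  grep href | sed -f clean.sed
```

With the script file in place, the command line shrinks to grep href file.html | sed -f clean.sed, which contains no quotes at all. The same idea should apply to the awk version by saving the program to a file and running gawk -f, though that was not tried here.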
 