sed/tr/grep help

04-17-2012

So I have an HTML file with a bunch of words inside tags, and I need to extract just the words. I'm not sure what the best way to do this is. The format is as follows:

Code:

<tr>
    <td>word 1</td>
    <td>word 2</td>
</tr>

All I want to extract is 'word 2'. First I tried eliminating all the other HTML garbage with

Code:

egrep '<tr>|<td>' filename

but after that I really had no clue. I tried using sed to find all the <tr> tags and delete them, plus the following line, but there has to be a better way to do this.

My other question is: what command do you use to find a phrase and delete only that phrase? For example:

Code:

wordswordswo<b>rdswords</b>words...

How would one go about deleting just the bold tags? It's pretty simple to delete a line, but what about JUST the matched pattern?

One last request: instead of just giving me some code/commands, could you explain what is going on in the code? Regular expressions are new to me, as is shell scripting, and it's all really confusing and frustrating. Pointers to helpful websites describing similar kinds of operations would be great, because frankly there are a lot of crappy ones out there on the web. Trust me, I've read about half of them. Thanks so much in advance.

flightskoo

04-17-2012

Hey, try these:

For Q1:

Code:

user1@linuxbox:/home/user1> cat data
<tr>
    <td>word 1</td>
    <td>word 2</td>
</tr>

user1@linuxbox:/home/user1> sed -n '/\/tr/{g;1!p;};h' data
    <td>word 2</td>
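
Since the question asked for an explanation, here is a commented walk-through of the command above. Treat it as a sketch, not the one right way: it assumes the wanted cell is always on the line directly before `</tr>`, and the scratch file and the `data` and `result` variable names are made up for illustration. A second sed is added to strip the tags so only the text is left.

```shell
# Scratch copy of the sample from the question (hypothetical temp file).
data=$(mktemp)
cat > "$data" <<'EOF'
<tr>
    <td>word 1</td>
    <td>word 2</td>
</tr>
EOF

# -n          : print nothing unless told to with p
# /\/tr/{...} : run the commands in braces only on lines containing "/tr"
#   g         : replace the current line with the hold space (the previous line)
#   1!p       : print it, except on line 1 where nothing has been held yet
# h           : on every line, copy the current line into the hold space
#
# The second sed deletes anything that looks like <...>, then leading blanks.
result=$(sed -n '/\/tr/{g;1!p;};h' "$data" |
         sed -e 's/<[^>]*>//g' -e 's/^[[:space:]]*//')

printf '%s\n' "$result"
rm -f "$data"
```

The hold space is just a one-line memory: by the time sed reaches `</tr>`, it still holds the line that came before it.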

And for Q2:

Code:

echo "part1<b>part2</b>part3" | sed -n 's/\(.*\)<b>\(.*\)<\/b>\(.*\)/\1\2\3/p'
part1part2part3
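
For the general "delete JUST the matched pattern" case, sed's `s/pattern/replacement/` with an empty replacement is usually enough. A minimal sketch, using the sample line from the question; the `line` and `stripped` variable names are invented here:

```shell
line='wordswordswo<b>rdswords</b>words'

# s/<b>//g   : replace every "<b>" with nothing
# s/<\/b>//g : same for "</b>"; the slash is escaped because / is the delimiter
# The g flag repeats the substitution for every match on the line.
stripped=$(printf '%s\n' "$line" | sed -e 's/<b>//g' -e 's/<\/b>//g')

printf '%s\n' "$stripped"
```

Because of the `g` flag, this also copes with several bold pairs on one line, which the capture-group version handles only one pair at a time.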

Hope this helps!!

asterisk-ix_use

04-17-2012

Try:

Code:

awk -F'<|>' 'NF>3{print $3}' infile
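
A short walk-through of the one-liner above, assuming every cell sits on its own line as in the sample. `-F'<|>'` splits each line at every `<` or `>`, so for `    <td>word 2</td>` the fields are the leading blanks, `td`, `word 2`, `/td` and an empty string. `NF>3` skips the bare `<tr>` and `</tr>` lines, which only produce three fields. As written it prints the text of every cell, so the variant below is one possible way to keep only the second `<td>` of each row; the counter `n`, the `second` variable and the temp file are names invented for this sketch.

```shell
# Scratch copy of the sample from the question (hypothetical temp file).
data=$(mktemp)
cat > "$data" <<'EOF'
<tr>
    <td>word 1</td>
    <td>word 2</td>
</tr>
EOF

# After splitting on < and >, $2 is the tag name and $3 is the cell text.
# Reset the cell counter at each <tr>, count <td> lines, print the 2nd one.
second=$(awk -F'<|>' '
    $2 == "tr"             { n = 0 }
    $2 == "td" && ++n == 2 { print $3 }
' "$data")

printf '%s\n' "$second"
rm -f "$data"
```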

Scrutinizer

04-17-2012

Hi,

try this:

Code:

sed 's/>\(.*\)</--S\1--E/g;s/.*--S//g;s/--E.*//g;s/<.*>//g;/^$/d' YOURFILE

pokerino
