sed/tr/grep help


 
# 1  
Old 04-17-2012

I have an HTML file with a bunch of words inside tags, and I need to extract just the words. I'm not sure what the best way to do this is. The format is as follows:

Code:
<tr>
    <td>word 1</td>
    <td>word 2</td>
</tr>

All I want to extract is 'word 2'. First I tried filtering out the rest of the HTML with

Code:
egrep '<tr>|<td>' filename

but after that I had no clue how to continue. I tried using sed to find each <tr> tag and delete it along with the following line, but there has to be a better way to do this.

My other question: what command do you use to find a phrase and delete only that phrase? For example:

Code:
wordswordswo<b>rdswords</b>words...

How would one go about just deleting the bold tags? It's pretty simple to delete a line, but what about JUST the matched pattern?

One last request: instead of just giving me code or commands, could you explain what the code is doing? Regular expressions and shell scripting are both new to me, and it's really confusing and frustrating. Links to helpful websites describing similar operations would be great, because frankly there are a lot of poor ones out there. Trust me, I've read about half of them. Thanks so much in advance.
# 2  
Old 04-17-2012
Hey try these:

For Q1:


Code:
 
 
user1@linuxbox:/home/user1> cat data
<tr>
    <td>word 1</td>
    <td>word 2</td>
</tr>

user1@linuxbox:/home/user1> sed -n '/\/tr/{g;1!p;};h' data
    <td>word 2</td>
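
Since an explanation was requested, here is a commented sketch of what the Q1 command does. The sample() helper and the second sed pass that strips the tags are assumptions added for illustration; they are not part of the command above.

```shell
# Hypothetical helper that prints the sample rows from the question.
sample() {
    printf '%s\n' '<tr>' '    <td>word 1</td>' '    <td>word 2</td>' '</tr>'
}

# -n      : print nothing unless a p command asks for it
# h       : last command of every cycle; copies the current line into the
#           hold space, so the hold space always contains the previous line
# /\/tr/  : select lines containing "/tr", i.e. the closing </tr>
# g       : overwrite the current line with the hold space (the previous line)
# 1!p     : print it, except on line 1 where no previous line exists
last_cell=$(sample | sed -n '/\/tr/{g;1!p;};h')

# Assumed extra step: delete every <...> tag, then the leading whitespace,
# so only the text of the cell remains.
word=$(printf '%s\n' "$last_cell" | sed -e 's/<[^>]*>//g' -e 's/^[[:space:]]*//')
printf '%s\n' "$word"    # prints: word 2
```

In other words, the command prints whichever line comes immediately before each </tr>, which is the last cell of the row only as long as every cell sits on its own line.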


And for Q2:

Code:
 
 
echo "part1<b>part2</b>part3" | sed -n 's/\(.*\)<b>\(.*\)<\/b>\(.*\)/\1\2\3/p'
part1part2part3
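
A note on how this works: the command above captures the text before, between, and after the tags in three \( \) groups and prints them back as \1\2\3. Because .* is greedy, it removes only one <b>...</b> pair per line, and with -n plus p it prints only the lines that contained a pair. A simpler sketch, assuming the goal is to delete every bold tag and keep all other lines untouched (strip_bold is a hypothetical name):

```shell
# s/PATTERN/REPLACEMENT/FLAGS replaces only the matched text, not the line.
# An empty replacement deletes the match; the g flag repeats the
# substitution for every match on the line instead of only the first.
strip_bold() {
    sed -e 's/<b>//g' -e 's/<\/b>//g'
}

one=$(printf '%s\n' 'wordswordswo<b>rdswords</b>words' | strip_bold)
two=$(printf '%s\n' 'a<b>b</b>c<b>d</b>e' | strip_bold)
printf '%s\n' "$one" "$two"
# prints:
# wordswordswordswordswords
# abcde
```

The backslash in <\/b> is needed only because / is also the delimiter of the s command.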

Hope this helps!!
# 3  
Old 04-17-2012
Try:
Code:
awk -F'<|>' 'NF>3{print $3}' infile
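
To explain the awk one-liner: -F'<|>' makes either < or > a field separator, so a cell line splits into leading spaces, "td", the text, "/td", and an empty field (NF is 5), while a bare <tr> line has only 3 fields. NF>3 therefore selects the cell lines, and $3 is the text. Note that this prints every cell, not only 'word 2'. The second command below is a sketch, under the assumption that each row starts with <tr> on its own line, that counts cells per row and prints only the second one; sample() and the cell variable are hypothetical names.

```shell
# Hypothetical helper that prints the sample rows from the question.
sample() {
    printf '%s\n' '<tr>' '    <td>word 1</td>' '    <td>word 2</td>' '</tr>'
}

# The command from the reply: prints the text of every cell.
all_cells=$(sample | awk -F'<|>' 'NF>3{print $3}')

# Variant: reset a counter at each <tr>, print only the second <td>.
second_cell=$(sample | awk -F'<|>' '
    $2 == "tr" { cell = 0 }                          # new row starts
    $2 == "td" { cell++; if (cell == 2) print $3 }   # second cell only
')

printf '%s\n' "$all_cells"
printf '%s\n' "$second_cell"
```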

# 4  
Old 04-17-2012
Hi,

try this:
Code:
sed 's/>\(.*\)</--S \1 --E/g;s/.*--S//g;s/--E.*//g;s/<.*>//g;/^$/d' YOURFILE
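
Step by step, the sed script works as follows. The shorter command at the end is only a sketch of an alternative, assuming each cell sits on its own line as in the question; sample() is a hypothetical helper.

```shell
# Hypothetical helper that prints the sample rows from the question.
sample() {
    printf '%s\n' '<tr>' '    <td>word 1</td>' '    <td>word 2</td>' '</tr>'
}

# 1. s/>\(.*\)</--S \1 --E/g  wraps the text between > and < in the
#                             markers "--S " and " --E"
# 2. s/.*--S//g               deletes everything up to the start marker
# 3. s/--E.*//g               deletes the end marker and everything after it
# 4. s/<.*>//g                empties lines that hold only a tag (<tr>, </tr>)
# 5. /^$/d                    deletes the lines that are now empty
marked=$(sample | sed 's/>\(.*\)</--S \1 --E/g;s/.*--S//g;s/--E.*//g;s/<.*>//g;/^$/d')

# Shorter sketch: capture the text between <td> and </td> and print only
# the lines where the substitution succeeded.
direct=$(sample | sed -n 's/.*<td>\(.*\)<\/td>.*/\1/p')

printf '%s\n' "$marked"
printf '%s\n' "$direct"
```

The marker version leaves one space on each side of every word (from "--S " and " --E"), while the shorter version prints the bare text. Both print every cell, so a further filter is still needed to keep only 'word 2'.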