So I have an HTML file with a bunch of words inside tags, and I need to extract just the words. I'm not sure what the best way to do this is. The format is as follows:
And all I want to extract is the 'word 2'. First I tried eliminating all other html garbage with
but after that I really had no clue. I tried using sed to find each <tr> tag and delete that line plus the one after it, but there has to be a better way to do this.
The other question I have is: what command do you use to find a phrase and delete only that phrase? For example:
How would one go about deleting just the bold tags? It's pretty simple to delete a whole line, but what about JUST the matched pattern?
One last request... instead of just giving me some code/commands, could you kind of explain what is going on with the code? Regular expressions are new to me, as well as shell scripting and it's really really confusing and frustrating. Any helpful websites describing how to do similar types of operations would be great, because frankly there are a lot of crappy ones out there on the web. Trust me, I've read about half of them. Thanks so much in advance.
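On the question of deleting only the matched pattern, here is a minimal sketch using sed's substitute command. It assumes the bold tags are the literal strings <b> and </b> and that every tag opens and closes on the same line. The function names and the sample input are made up for illustration, since the original format was not shown.

```shell
#!/bin/sh
# s|PATTERN|REPLACEMENT|FLAGS replaces only the text that matches PATTERN.
# An empty REPLACEMENT deletes the match and leaves the rest of the line alone.
# The trailing g repeats the substitution for every match on the line.
strip_bold() {
    # '|' is used as the delimiter so the '/' in </b> needs no escaping.
    sed -e 's|<b>||g' -e 's|</b>||g'
}

# Hypothetical generalisation: delete anything that looks like a tag.
# <[^>]*> means: a '<', then any run of characters that are not '>', then a '>'.
strip_tags() {
    sed 's|<[^>]*>||g'
}

printf '%s\n' 'some <b>bold</b> text' | strip_bold      # bold tags removed
printf '%s\n' '<td><b>word 2</b></td>' | strip_tags     # all tags removed
```

The [^>]* part matters: a plain <.*> would be greedy and swallow everything from the first '<' to the last '>' on the line, words included. This approach only holds for simple, one-tag-per-line-segment markup; a real HTML parser is the safer choice for anything irregular.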
I am really lost; I don't know what this line does. Please help. Thanks in advance.
cat CPROGRAMS.c
|sed 's// /g'|tr ' ' '\012'
|grep ''
|sed 's/^*/ /'
|grep '($'|sort -u|tr -d "("` (4 Replies)
OK, I am trying to become more familiar with grep and sed.
I have a file that is storing some records. I am allowing a user to
search for a keyword in the file with this:
grep -i "$keyword" testFile|sed -n -e 's/^/\
/' -e 's/:/\
/gp'
... (15 Replies)
I have a file that contains many instances of double dollar signs. I want to use sed to get everything up to the first occurrence. For example, given the following data:
#Beginning of file
AB
34
$$
AB
$$
AB
98
$$
I only want to pull out:
AB
34
$$ (1 Reply)
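One way this could be done, as a sketch: let sed print lines until it meets the first line that is exactly $$, then quit. The function name first_record is hypothetical, and the sample leaves out the "#Beginning of file" line on the assumption that it was only a comment in the post.

```shell
#!/bin/sh
# Print every line up to and including the first line that is exactly "$$",
# then stop reading, so later records are never printed.
first_record() {
    # \$ matches a literal dollar sign; the last, unescaped $ anchors the end
    # of the line. The q command prints the current line and then quits.
    sed '/^\$\$$/q' "$@"
}

printf '%s\n' AB 34 '$$' AB '$$' AB 98 '$$' | first_record
```

With a file name as argument (for example first_record data.txt, where data.txt is a placeholder) the function reads the file instead of standard input.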
Hi All,
I have created a Bourne shell script that splits a file up into different parts. It works if the file has each piece of information on a separate line, but not otherwise.
i.e.
If this is the file
hello
12345
good bye
6789
I could grep all the... (5 Replies)
hello everybody!
I have an HTML file which is not properly formatted, meaning that the whole content is on one line.
I want to cut out certain parts of that file. Those parts are between ' #" ' and ' " ' and always start with ' sec_ ' and after the ' sec_ ' any number of characters and ' _... (2 Replies)
Hi all,
I have a line in a file that contains
Code:
one;two_1_10;two_2_10;two_3_10;three~
Now I need to get the output as
Code:
one;two_1_abc_10;two_2_abc_10;two_3_abc_10;three~
(1 should be replaced with 1_abc for two__abc_10, and one more thing: the number of occurrences of... (6 Replies)
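A possible sed approach for the sample line above, sketched under the assumption that every field to change has the shape two_<number>_<rest> and that the number of such fields may vary. The function name insert_abc is made up.

```shell
#!/bin/sh
# Insert "_abc" after the first number of every two_<n>_... field.
insert_abc() {
    # \(two_[0-9][0-9]*\) captures "two_" plus one or more digits.
    # \1 in the replacement puts the captured text back, then adds "_abc".
    # The g flag repeats this for every matching field on the line.
    sed 's/\(two_[0-9][0-9]*\)_/\1_abc_/g' "$@"
}

printf '%s\n' 'one;two_1_10;two_2_10;two_3_10;three~' | insert_abc
```

Because of the g flag, the same command works whether the line holds one such field or many.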
I am stranded with a problem. Please solve.
How do you remove blank lines from a file using sed and grep? (A blank line contains nothing or only whitespace.)
I ran the sed and grep commands below, but grep isn't giving the desired output. Why?
sed '/^*$/d' blank
grep -v "^*$" blank... (3 Replies)
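As the patterns are printed here, ^*$ has no character before the *, and in a basic regular expression a * directly after ^ is treated as a literal asterisk, so the pattern only matches a line consisting of a single "*". A sketch of what was probably intended, using the POSIX character class [[:space:]] for spaces and tabs; the function names are hypothetical.

```shell
#!/bin/sh
# A line counts as blank when, between start (^) and end ($), it holds
# zero or more whitespace characters and nothing else.

# sed: delete (d) every line that matches the pattern.
drop_blank_sed() {
    sed '/^[[:space:]]*$/d' "$@"
}

# grep: -v inverts the match, so only the non-blank lines are printed.
drop_blank_grep() {
    grep -v '^[[:space:]]*$' "$@"
}

# Sample input: an empty line, a line of spaces, and a line with a tab.
printf 'first\n\n   \n\t\nlast\n' | drop_blank_sed
printf 'first\n\n   \n\t\nlast\n' | drop_blank_grep
```

Both commands write the result to standard output and leave the input file unchanged.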
Hi,
I have a file with recurring patterns, and I want to extract the 3rd line after the match, then delete another pattern from that third line.
For example the file is in the following format:
Hello
Name: Abc
Number: 123
Hello
Name: FQE
Number: 543
This occurs more than 100... (4 Replies)
Hello Everyone!
I'm kind of new to parsing and would like to extract part of my nmap scan output so I can convert it to csv/excel:
My current file has two types of lines like this:
Nmap scan report for dns1 (1.1.1.1)
Nmap scan report for dns2 (2.2.2.2)
Nmap scan report for 3.3.3.3
... (3 Replies)
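The desired output is cut off above, so the following is only a sketch under an assumed target format: one CSV row per host as hostname,ip, with the hostname left empty when nmap printed only an address. The function name nmap_to_csv is made up.

```shell
#!/bin/sh
# Turn "Nmap scan report for ..." lines into hostname,ip rows.
# Assumed output layout (not from the original post): name,address
nmap_to_csv() {
    # -n suppresses automatic printing; the p flag prints only lines
    # on which a substitution succeeded.
    # 1st expression: "name (address)" becomes "name,address".
    # 2nd expression: a bare address becomes ",address" (empty name).
    sed -n \
        -e 's/^Nmap scan report for \([^ ]*\) (\([^)]*\))$/\1,\2/p' \
        -e 's/^Nmap scan report for \([^ ]*\)$/,\1/p' "$@"
}

printf '%s\n' \
    'Nmap scan report for dns1 (1.1.1.1)' \
    'Nmap scan report for dns2 (2.2.2.2)' \
    'Nmap scan report for 3.3.3.3' | nmap_to_csv
```

Once the first expression has rewritten a line, that line no longer starts with "Nmap scan report for", so the second expression cannot match it again and each host is printed once.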
Hi,
I have a file in which I have modified certain things compared to the original file. The difference between the original file and the modified file is as follows.
# diff mir_lex.c.modified mir_lex.c.orig
3209c3209
< if(yy_current_buffer -> yy_is_our_buffer == 0) {
---
>... (5 Replies)
Discussion started by: breezevinay
LEARN ABOUT OSF1
look
look(1) General Commands Manual look(1)
NAME
look - Finds lines in a sorted list
SYNOPSIS
look [-df] [-tcharacter] string [file]
The look command prints all lines in a sorted file that begin with string.
OPTIONS
-d  Uses dictionary order; only letters, digits, tabs, and spaces are used in comparisons.
-f  Searches without regard to case; treats uppercase and lowercase as equivalent.
-tcharacter  Ignores character and characters following it in the search string. If you specify look -tC ABCDE, the string ABCDE would become (in effect) AB, with CDE being ignored. This option is primarily for shell scripts, in which more than one string is being processed.
DESCRIPTION
If no file is specified, look searches in the system word list /usr/share/dict/words, with the options -df assumed by default.
The look command uses binary search.
The -d and -f options affect comparisons as in sort.
NOTES
In order to use the -f option, you must first sort file with the sort -f command; otherwise, look displays only lowercase items.
If you do not specify -f, but specify a file (such as /usr/share/dict/words) that has been sorted with sort -f, look may not produce any
output.
EXAMPLES
To search a sorted file called sortfile for all lines that begin with the string as, enter:
    look as sortfile
To search the system word list for all words beginning with smi, enter:
    look smi
This might result in:
    smile
    smirk
    smith
    smithereens
    Smithfield
    Smithson
    smithy
    smitten
FILES
/usr/share/dict/words  System word list.
SEE ALSO
Commands: grep(1), sort(1), spell(1)
look(1)