Extracting specific characters from a text file


 
# 1  
Old 04-10-2009
I'm extremely new to scripting and Linux in general, so please bear with me. The class I'm taking gives virtually no instruction, so I'm trying to learn everything off the web.
Anyway, I'm trying to extract the characters that come after a specific pattern ( '<B><FONT FACE="Arial">' ) and before the next '<' in a text file. I'm having trouble because there are no spaces, so I can't use $. I'm not even sure what kind of commands I should be using. I tried working with awk, but that didn't get me exactly what I want. Now I'm trying to figure out other ways to do this, but I really have no idea where to start. Any help is greatly appreciated.
# 2  
Old 04-10-2009
This may throw you in at the deep end with sed, but for scripting, sed and awk are indispensable IMHO.

Code:
sed 's/.*">\(.*\)<.*/\1/'

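It's an ordinary sed substitute command of the form: sed 's/pattern to match/replacement/'

With sed you can save parts of the match with \(...\) and recall them in the replacement with \1, \2, \3, etc.

Here .*"> matches everything up to and including ">, \(.*\) saves whatever follows up to the last < on the line, and \1 puts that saved piece back in place of the whole line. Because .* is greedy, any closing tags before that last < stay in the result.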
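To see what the greedy match does in practice, here is a small sketch run against a made-up sample line (the `line` variable and its contents are assumptions for illustration, not the original poster's file). It compares the command above with a variant that uses [^<]* so the capture stops at the first < instead of the last one.

```shell
# Hypothetical sample line, modelled on the pattern from the question.
line='<B><FONT FACE="Arial">hello</FONT></B>'

# Command from the post: \(.*\) is greedy and runs up to the LAST < on the line.
greedy=$(printf '%s\n' "$line" | sed 's/.*">\(.*\)<.*/\1/')

# Variant: [^<]* cannot cross a <, so the capture stops at the FIRST <.
strict=$(printf '%s\n' "$line" | sed 's/.*<B><FONT FACE="Arial">\([^<]*\)<.*/\1/')

printf '%s\n' "$greedy"   # hello</FONT>
printf '%s\n' "$strict"   # hello
```

If the text you want is followed by more than one tag on the same line, the second form is probably closer to what the question asks for.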


Regards
# 3  
Old 04-10-2009
Here is an example of another way of doing it with sed:
Code:
TMP=file.$$

# build a small sample file to test against
cat <<EOT >"$TMP"
<header>
<description>This is description</description>
<content><B><FONT FACE="Arial">hello livos23</FONT></B></content>
</header>
EOT

# sed's * is greedy: [[:print:]]* runs on to the last closing tag on the line
var=$(sed -n 's/\(<description>\)\([[:print:]]*\)<\/[^>]*>/\2/p' "$TMP")
printf '%s\n' "$var"

# more general case: capture everything after the pattern up to the next <
var=$(sed -n 's/^.*<B><FONT FACE="Arial">\([[:print:]][^<]*\).*$/\1/p' "$TMP")
printf '%s\n' "$var"

rm "$TMP"

exit 0

The output is:
Code:
This is description
hello livos23
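Since the original question mentions trying awk, the same extraction can also be sketched there by using the whole tag pattern as the field separator. This is only one possible approach, shown on an assumed sample line; it handles the first occurrence of the pattern on each line and skips lines that do not contain it.

```shell
# Hypothetical sample line, same shape as the content line in the script above.
line='<content><B><FONT FACE="Arial">hello livos23</FONT></B></content>'

# Split each line on the tag pattern; $2 is then everything after it.
# sub() cuts $2 off at the first <, leaving only the wanted text.
result=$(printf '%s\n' "$line" |
  awk -F '<B><FONT FACE="Arial">' 'NF > 1 { sub(/<.*/, "", $2); print $2 }')

printf '%s\n' "$result"   # hello livos23
```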

# 4  
Old 04-10-2009
Code:
sed -e 's/<[^>]*>//g' e.html

This removes every tag: each < followed by any run of characters other than > and then the closing >. Only the text between the tags is left.
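As a quick check, here is the same substitution applied to a single sample line instead of e.html (the `html` variable is an assumption for illustration). Note that it strips all tags, so on a real file it prints the text of every line, not only the text after the Arial font tag.

```shell
# Hypothetical sample line standing in for one line of e.html.
html='<content><B><FONT FACE="Arial">hello livos23</FONT></B></content>'

# Delete every <...> sequence; the g flag repeats the substitution along the line.
text=$(printf '%s\n' "$html" | sed -e 's/<[^>]*>//g')

printf '%s\n' "$text"   # hello livos23
```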