Sponsored Content
Top Forums Shell Programming and Scripting extract fields from a downloaded html file Post 302624167 by kalpeer on Monday 16th of April 2012 01:55:14 AM
Old 04-16-2012
Please check this thread.. It might help you...

https://www.unix.com/shell-programmin...text-file.html
 

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

How do I extract text only from html file without HTML tag

I have a html file called myfile. If I simply put "cat myfile.html" in UNIX, it shows all the html tags like <a href=r/26><img src="http://www>. But I want to extract only text part. Same problem happens in "type" command in MS-DOS. I know you can do it by opening it in Internet Explorer,... (4 Replies)
Discussion started by: los111
4 Replies

2. UNIX for Dummies Questions & Answers

Extract some common fields from 1 file that are presnt in another file

I have 2 files FILEA 720646363*PHILIPPINES 117183970*USA 116274291*USA 107940983*USA 107395824*USA 106632425*USA 105861926*USA 105208607*USA 053077046*USA 065428026*ENGLAND FILEB 001125236 001408905 002316511 002521094 020050725 035018308 052288735 (1 Reply)
Discussion started by: unxusr123
1 Replies

3. UNIX for Dummies Questions & Answers

extract fields from text file using delimiter!!

Hi All, I am new to unix scripting, please help me in solving this assignment.. I have a scenario, as follows: 1. i have a text file(read1.txt) with the following data sairam,123 kamal,122 etc.. 2. I have to write a unix... (6 Replies)
Discussion started by: G.K.K
6 Replies

4. Shell Programming and Scripting

Extract urls from index.html downloaded using wget

Hi, I need to basically get a list of all the tarballs located at uri I am currently doing a wget on urito get the index.html page Now this index page contains the list of uris that I want to use in my bash script. can someone please guide me ,. I am new to Linux and shell scripting. ... (5 Replies)
Discussion started by: mnanavati
5 Replies

5. UNIX for Dummies Questions & Answers

How to extract fields from etc/passwd file?

Hi! i want to extract from /etc/passwd file,the user and user info fileds, to a another file.I've tried this: cut -d ':' -f1 ':' -f6 < file but cut can be used to extract olny one field and not two. maybe with awk is this possible? (4 Replies)
Discussion started by: strawhatluffy
4 Replies

6. Shell Programming and Scripting

Extract expressions between two strings in html file

Hello guys, I'm trying to extract all the expressions between the following tags: <b></b> from a HTML file. This is how it looks: big lines containing several dozens expressions (made of 1,2,3,4,6 or even 7 words) I would like to extract: <b>bla ble</b>bla ble</td><tr valign="top"><td... (3 Replies)
Discussion started by: bobylapointe
3 Replies

7. UNIX for Dummies Questions & Answers

Extract table from an HTML file

I want to extract a table from an HTML file. the table starts with <table class="tableinfo" and ends with next closing table tag </table> how can I do this with awk/sed... ---------- Post updated at 04:34 PM ---------- Previous update was at 04:28 PM ---------- also I want to... (4 Replies)
Discussion started by: koutroul
4 Replies

8. Shell Programming and Scripting

Extract specific line in an html file starting and ending with specific pattern to a text file

Hi This is my first post and I'm just a beginner. So please be nice to me. I have a couple of html files where a pattern beginning with "http://www.site.com" and ending with "/resource.dat" is present on every 241st line. How do I extract this to a new text file? I have tried sed -n 241,241p... (13 Replies)
Discussion started by: dejavo
13 Replies

9. Shell Programming and Scripting

Extract both contents from a html file and do printing

Hi there, Print IP Address: grep 'HostID :' 10.244.9.124\ nessus.html | awk -F '<br>' '{print $12}' | tr -s ' ' | awk -F ':' '{print "<tr><td>" $2 "</td><td>"}' Print Respective Ports: grep 'classsubsection\|./tcp\|./udp' 10.244.9.124\ nessus.html | grep -v 'h2.classsubsection... (3 Replies)
Discussion started by: alvinoo
3 Replies

10. Shell Programming and Scripting

awk to extract multiple values from file and add two additional fields

In the attached file I am trying to use awk to extract multiple values and create the tab-delimited desired output. In the output R_Index is a the sequential # and Pre_Enrichment is defaulted to .. I can extract from the values to the side of the keywords, but most are above and I can not... (2 Replies)
Discussion started by: cmccabe
2 Replies
HXCOPY(1)							  HTML-XML-utils							 HXCOPY(1)

NAME
hxcopy - copy an HTML file and update its relative links SYNOPSIS
hxcopy [ -i old-URL ] [ -o new-URL ] [ file-or-URL [ file-or-URL ] ] DESCRIPTION
The hxcopy command copies its first argument to its second argument, while updating relative links. The input is assumed to be HTML or XHTML and may be slightly reformatted in the process. If the second argument is omitted, hxcopy writes to standard output. In this case the option -o is required. If the first argument is also omitted, hxcopy reads from standard input. In this case the option -i is required. OPTIONS
The following options are supported: -i old-URL For the purposes of updating relative links, act as if old-URL is the location from which the input is copied. If this option is omitted, the actual location of the first argument is used for calculating relative links. -o new-URL For the purposed of updating relative links, act as if new-URL is the location to which the input is copied. If this option is omitted, the actual location of the second argument is used for calculating relative links. ENVIRONMENT
To use a proxy to retrieve remote files, set the environment variables http_proxy and ftp_proxy. E.g., http_proxy="http://localhost:8080/" BUGS
Unlike the last argument of cp(1), the last argument of hxcopy must be a file, not a directory. The second argument must be a local file. Writing to a URL is not yet implemented. To work around this, replace hxcopy file.html http://example.org/file.html by hxcopy -o http://example.org/file.html file.html tmp.html and then upload tmp.html to the given URL with some other command, such as curl(1). The first argument, however, may be a URL. hxcopy will download the given file. (Currently only HTTP is supported.) EXAMPLE
Assume the HTML file foo.html contains a relative link to "../bar.html". Here are some examples of commands: hxcopy foo.html bar/foo.html The file foo.html is copied to ../bar/foo.html and the relative link to "../bar.html" becomes "../../bar.html". hxcopy foo.html ../foo.html The file foo.html is copied to ../foo.html and the relative link to "../bar.html" is rewritten as "bar.html". hxcopy -i http://my.org/dir1/foo.html -o http://my.org/foo.html file1.html file2.html The file file1.html is copied to file2.html and the relative link to "../bar.html" is rewritten as "bar.html". A command like this may be useful to update files that are later uploaded to a server. SEE ALSO
cp(1), curl(1), hxwls(1) 6.x 9 Dec 2008 HXCOPY(1)
All times are GMT -4. The time now is 12:49 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy