Extract text from html using perl or awk
Post 302981611 by RudiC on Thursday 15th of September 2016 05:23:04 PM
By no stretch of the imagination will your awk script run flawlessly. If the "patterns" were connected with OR operators and any of them turned out TRUE, the current line/record would be printed (the default action, which is what you selected). As your file is just ONE line/record, the entire file is printed.
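To illustrate the point, here is a minimal sketch. The script under discussion is not shown in this post, so the sample HTML, the patterns, and the variable names below are all made up. It shows the default action printing the whole single-line file, and one possible way around it: splitting the input into records at "<".

```shell
#!/bin/sh
# Hypothetical sample: an entire HTML document on ONE line (content is invented).
tmp=$(mktemp)
printf '%s\n' '<html><body><h1>Title</h1><p>Hello world</p></body></html>' > "$tmp"

# Patterns joined with || and no action block: awk falls back to the
# default action, { print $0 }, so the whole (single) record is printed.
whole=$(awk '/<h1>/ || /<p>/' "$tmp")

# Splitting records at "<" turns each tag plus the text after it into its
# own record; with FS set to ">", $1 is the tag name and $2 the text.
parts=$(awk 'BEGIN { RS = "<"; FS = ">" } $1 == "h1" || $1 == "p" { print $2 }' "$tmp")

printf 'default action:\n%s\n' "$whole"
printf 'split records:\n%s\n' "$parts"
rm -f "$tmp"
```

The first awk call returns the complete line because either pattern matching is enough to select the only record there is. The second returns just the text following the h1 and p tags. This only works for simple, well-behaved markup; it is not a general HTML parser.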