Top Forums Programming coverting html data to text in 'c' Post 302141245 by fpmurphy on Thursday 18th of October 2007 08:36:55 AM
HTML is essentially plain text with embedded tags. You can roll your own parser to pull out the text you are looking for, or, if your HTML is XHTML-conformant, you can use an XML library such as TinyXML.
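If you go the roll-your-own route, a minimal sketch in C might look like the following. This is only an illustration: strip_tags() and the small entity table are hypothetical names made up for this example, not part of any library. It assumes reasonably well-formed input and does not handle comments, script or style blocks, or a '>' character inside an attribute value.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A handful of common character entities (illustrative, far from complete). */
static const struct { const char *name; char ch; } entities[] = {
    { "&amp;", '&' }, { "&lt;", '<' }, { "&gt;", '>' },
    { "&quot;", '"' }, { "&nbsp;", ' ' },
};

/* Hypothetical helper: copy the text of `html` into `out`, dropping
 * everything between '<' and '>' and decoding the entities listed above.
 * At most out_size - 1 characters are written and the result is always
 * NUL-terminated. Returns the number of characters written. */
size_t strip_tags(const char *html, char *out, size_t out_size)
{
    assert(html != NULL && out != NULL && out_size > 0);

    const size_t count = sizeof entities / sizeof entities[0];
    size_t n = 0;
    int in_tag = 0;                     /* nonzero while between '<' and '>' */

    for (const char *p = html; *p != '\0' && n + 1 < out_size; p++) {
        if (in_tag) {
            if (*p == '>')
                in_tag = 0;             /* tag ends, resume copying text */
        } else if (*p == '<') {
            in_tag = 1;                 /* tag starts, stop copying */
        } else if (*p == '&') {
            size_t i;
            for (i = 0; i < count; i++) {
                size_t len = strlen(entities[i].name);
                if (strncmp(p, entities[i].name, len) == 0) {
                    out[n++] = entities[i].ch;
                    p += len - 1;       /* the loop's p++ steps past the ';' */
                    break;
                }
            }
            if (i == count)
                out[n++] = *p;          /* unknown entity: keep the '&' as-is */
        } else {
            out[n++] = *p;              /* ordinary text */
        }
    }
    out[n] = '\0';
    return n;
}
```

To use it, pass the HTML string, a destination buffer, and the buffer's size, for example strip_tags(page, buf, sizeof buf). For anything beyond simple pages a real parser is the safer choice.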
 

jp2a(1)                          USER COMMANDS                          jp2a(1)

NAME
       jp2a - convert JPEG images to ASCII

SYNOPSIS
       jp2a [ options ] [ filename(s) | URL(s) ]

DESCRIPTION
       jp2a will convert JPEG images to ASCII characters. You can specify a
       mixture of files and URLs.

OPTIONS
       -      Read JPEG image from standard input

       --background=light
       --background=dark
              If you don't want to mess with --invert all the time, just use
              these instead. If you are using white characters on a black
              display, then use --background=dark, and vice versa.

       -b
       --border
              Frame output image in a border

       --chars=...
              Use the given characters when producing the output ASCII image.
              Default is " ...',;:clodxkO0KXNWM".

       --colors
              Use ANSI color for text output and CSS color for HTML output.

       -d
       --debug
              Print debugging information when using libcurl to download
              images from the net.

       -f
       --term-fit
              Use the largest dimension that makes the image fit in your
              terminal display.

       --term-height
              Use terminal display height and calculate width based on image
              aspect ratio.

       --term-width
              Use terminal display width and calculate height based on image
              aspect ratio.

       -z
       --term-zoom
              Use terminal display width and height.

       --fill
              When used with --html and --color, then color each output
              character's background color. For instance, if you want to use
              fill-output on a light background, do

              jp2a --color --html --html-fill --background=light somefile.jpg --output=dark.html

              To do the same on a dark background:

              jp2a --color --html --html-fill --background=dark somefile.jpg --output=light.html

              The default is to have fill disabled.

       -x
       --flipx
              Flip output image horizontally

       -y
       --flipy
              Flip output image vertically

       --height=N
              Set output height. If only --height is specified, then output
              width will be calculated according to the source image's aspect
              ratio.

       -h
       --help
              Display a short help text

       --grayscale
              Converts image to grayscale when using --html or --colors.

       --html
              Make ASCII output in strict XHTML 1.0, suitable for viewing
              with web browsers. This is useful when you have big output
              dimensions and want to check the result in a browser with a
              small font.

       --html-fill
              Same as --fill. You should use that option instead.

       --html-no-bold
              Do not use bold text for HTML output.

       --html-raw
              Output only the image in HTML codes, leaving out the rest of
              the webpage, so you can construct your own.

       --html-fontsize=N
              Set fontsize when using --html output. Default is 4.

       --html-title=...
              Set HTML output title.

       --output=...
              Write ASCII output to given filename. To explicitly specify
              standard output, use --output=-.

       -i
       --invert
              Invert output image. If you view a picture with white
              background, but you are using a display with light characters
              on a dark background, you should invert the image.

       --red=...
       --green=...
       --blue=...
              When converting from RGB to grayscale, use the given weights to
              calculate luminance. These three floating point values must add
              up to exactly 1.0. The default is red=0.2989, green=0.5866 and
              blue=0.1145.

       --size=WIDTHxHEIGHT
              Set output dimension.

       -v
       --verbose
              Print some verbose information to standard error when reading
              each JPEG image.

       --width=N
              Set output width. If you only specify the width, the height
              will be calculated automatically.

       -V
       --version
              Print program version.

       --zoom
              Sets output dimensions to your entire terminal window,
              disregarding source image aspect ratio.

RETURN VALUES
       jp2a returns 1 when errors are encountered, zero for no errors.

EXAMPLES
       Convert and print imagefile.jpg using ASCII characters in 40 columns
       and 20 rows:

       jp2a --size=40x20 imagefile.jpg

       Download an image off the net, convert and print:

       jp2a http://www.google.com/intl/en/logos/easter_logo.jpg

       Output picture.jpg and picture2.jpg, each 80x25 characters, using the
       characters " ...ooxx@@" for output:

       jp2a --size=80x25 --chars=" ...ooxx@@" picture.jpg picture2.jpg

       Output image.jpg using 76 columns; the height is automatically
       calculated from the aspect ratio of image.jpg:

       cat image.jpg | jp2a --width=76 -

       If you use jp2a together with ImageMagick's convert(1) then you can
       make good use of pipes, and have ImageMagick do all sorts of image
       conversions and effects on the source image. For example:

       convert somefile.png jpg:- | jp2a - --width=80

       Check out convert(1) options to see what you can do. Convert can
       handle almost any image format, so with this combination you can
       convert images in e.g. PDF or AVI files to ASCII.

       Although the default build of jp2a includes automatic downloading of
       files specified by URLs, you can explicitly download them by using
       curl(1) or wget(1), for example:

       curl -s http://foo.bar/image.jpg | convert - jpg:- | jp2a -

DOWNLOADING IMAGES FROM THE NET
       If you have compiled jp2a with libcurl(3), you can download images by
       specifying URLs:

       jp2a https://user:pass@foo.com/bar.jpg

       The protocols recognized are ftp, ftps, file, http, https and tftp.

       If you need more control of the downloading, you should use curl(1)
       or wget(1) and have jp2a read the image from standard input.

       jp2a uses pipe and fork to download images using libcurl (i.e., no
       exec or system calls) and therefore does not worry about malevolently
       formatted URLs.

GRAYSCALE CONVERSION
       You can extract the red channel by doing this:

       jp2a somefile.jpg --red=1.0 --green=0.0 --blue=0.0

       This will calculate luminance based on Y = R*1.0 + G*0.0 + B*0.0.
       The default is to use Y = R*0.2989 + G*0.5866 + B*0.1145.

PROJECT HOMEPAGE
       The latest version of jp2a and news is always available from
       http://jp2a.sourceforge.net

SEE ALSO
       cjpeg(1), djpeg(1), jpegtran(1), convert(1)

BUGS
       jp2a does not interpolate when resizing. If you want better quality,
       try using convert(1) and convert the source image to the exact output
       dimensions before using jp2a.

       Another issue is that jp2a skips some X-pixels along each scanline.
       This gives a less precise output image, and will probably be
       corrected in future versions.

AUTHOR
       Christian Stigen Larsen <csl@sublevel3.org> -- http://csl.sublevel3.org

       jp2a uses jpeglib to read JPEG files. jpeglib is made by The
       Independent JPEG Group (IJG), who have a page at http://www.ijg.org

LICENSE
       jp2a is distributed under the GNU General Public License v2.

version 1.0                     September 4, 2006                       jp2a(1)
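
The weighted sum described under GRAYSCALE CONVERSION in the man page above can be sketched in C roughly as follows. This is not jp2a's actual source. The function names are hypothetical, and the way a luminance value picks a character from the --chars palette (first character for the darkest pixel, last for the brightest, rounded to the nearest index) is an assumption made for illustration. The weights and the palette string are the defaults quoted in the man page.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Weighted luminance Y = R*wr + G*wg + B*wb, scaled to the range 0..1.
 * The weights are expected to add up to 1.0, as the man page requires. */
double luminance(unsigned char r, unsigned char g, unsigned char b,
                 double wr, double wg, double wb)
{
    return (r * wr + g * wg + b * wb) / 255.0;
}

/* Hypothetical mapping (an assumption, not jp2a's code): choose a palette
 * character in proportion to y, so y = 0 gives the first character and
 * y = 1 gives the last. */
char luminance_to_char(double y, const char *palette)
{
    assert(palette != NULL);
    size_t len = strlen(palette);
    assert(len > 0);

    /* Clamp so rounding errors cannot index outside the palette. */
    if (y < 0.0) y = 0.0;
    if (y > 1.0) y = 1.0;

    /* Round to the nearest palette index. */
    size_t idx = (size_t)(y * (double)(len - 1) + 0.5);
    return palette[idx];
}
```

Calling luminance(r, g, b, 0.2989, 0.5866, 0.1145) uses the default weights, while passing 1.0, 0.0, 0.0 reproduces the red-channel example. Whether a bright pixel should map to a dense or a sparse character depends on the display, which is what --invert and --background control.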