HTML parsing with UNIX shell script
Post 302946376 by pilnet101 on Monday 8th of June 2015 09:26:32 PM
You should really use a proper HTML parser for parsing HTML. If all of your data is in the same format as your sample data, you can use this simple awk command, which splits each line on < and > and prints the fifth and ninth fields:

Code:
awk -F"[<>]" '{print $5":",$9}' file
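
To see why fields 5 and 9 are the ones that matter, here is a rough sketch run against a made-up table row. The sample line and the variable names are assumptions for illustration only; the original poster's data is not shown here, so the field numbers may differ for other markup.

```shell
# Hypothetical sample row (an assumption; not the original poster's data).
sample='<tr><td>Name</td><td>Value</td></tr>'

# Every < or > starts a new field, so the fields are:
#   $1=""  $2="tr"  $3=""  $4="td"  $5="Name"  $6="/td"
#   $7=""  $8="td"  $9="Value"  $10="/td"  $11=""  $12="/tr"
result=$(printf '%s\n' "$sample" | awk -F"[<>]" '{print $5":",$9}')

printf '%s\n' "$result"   # prints: Name: Value
```

If a line carries extra tags or attributes, the text shifts to different field numbers, which is why this only works when every line has the same layout.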


HTML::FormatPS(3)          User Contributed Perl Documentation          HTML::FormatPS(3)

NAME
    HTML::FormatPS - Format HTML as PostScript

SYNOPSIS
    use HTML::TreeBuilder;
    $tree = HTML::TreeBuilder->new->parse_file("test.html");

    use HTML::FormatPS;
    $formatter = HTML::FormatPS->new(
        FontFamily => 'Helvetica',
        PaperSize  => 'Letter',
    );
    print $formatter->format($tree);

    Or, for short:

    use HTML::FormatPS;
    print HTML::FormatPS->format_file(
        "test.html",
        'FontFamily' => 'Helvetica',
        'PaperSize'  => 'Letter',
    );

DESCRIPTION
    HTML::FormatPS is a formatter that outputs PostScript code. Formatting of
    HTML tables and forms is not implemented.

    You can specify the following parameters when constructing the formatter
    object (or when calling format_file or format_string):

    PaperSize
        The kind of paper to format for. The value can be one of these: A3,
        A4, A5, B4, B5, Letter, Legal, Executive, Tabloid, Statement, Folio,
        10x14, Quarto. The default is "A4".

    PaperWidth
        The width of the paper, in points. Setting PaperSize also defines
        this value.

    PaperHeight
        The height of the paper, in points. Setting PaperSize also defines
        this value.

    LeftMargin
        The left margin, in points.

    RightMargin
        The right margin, in points.

    HorizontalMargin
        Both left and right margin at the same time. The default value is
        4 cm.

    TopMargin
        The top margin, in points.

    BottomMargin
        The bottom margin, in points.

    VerticalMargin
        Both top and bottom margin at the same time. The default value is
        2 cm.

    PageNo
        This parameter determines whether page numbers are put on the pages.
        The default value is true, so you have to set this value to 0 in
        order to suppress page numbers. (The "No" in "PageNo" means
        number/numero!)

    FontFamily
        This parameter specifies which family of fonts to use for the
        formatting. Legal values are "Courier", "Helvetica" and "Times". The
        default is "Times".

    FontScale
        This is a scaling factor for all the font sizes. The default value
        is 1. For example, if you want everything to be almost three times
        as large, you could set this to 2.7. If you wanted things just a bit
        smaller than normal, you could set it to .92.

    Leading
        This option (pronounced "ledding", not "leeding") controls how much
        space there is between lines. This is a factor of the font size used
        for that line. Default is 0.1 -- so between two 12-point lines,
        there will be 1.2 points of space.

    StartPage
        Assuming you have PageNo on, StartPage controls what the page number
        of the first page will be. By default, it is 1. So if you set this
        to 87, the first page would say "87" on it, the next "88", and so on.

    NoProlog
        If this option is set to a true value, HTML::FormatPS will make a
        point of not emitting the PostScript prolog before the document. By
        default, this is off, meaning that HTML::FormatPS will emit the
        prolog. This option is of interest only to advanced users.

    NoTrailer
        If this option is set to a true value, HTML::FormatPS will make a
        point of not emitting the PostScript trailer at the end of the
        document. By default, this is off, meaning that HTML::FormatPS will
        emit the bit of PostScript that ends the document. This option is of
        interest only to advanced users.

SEE ALSO
    HTML::Formatter

TO DO
    o   Support for some more character styles, notably including:
        strike-through, underlining, superscript, and subscript.

    o   Support for Unicode.

    o   Support for Win-1252 encoding, since that's what most people mean
        when they use characters in the range 0x80-0x9F in HTML.

    o   And, if it's ever even reasonably possible, support for tables.

    I would welcome email from people who can help me out or advise me on
    the above.

COPYRIGHT
    Copyright (c) 1995-2002 Gisle Aas, and 2002- Sean M. Burke. All rights
    reserved.

    This library is free software; you can redistribute it and/or modify it
    under the same terms as Perl itself.

    This program is distributed in the hope that it will be useful, but
    without any warranty; without even the implied warranty of
    merchantability or fitness for a particular purpose.

AUTHOR
    Current maintainer: Sean M. Burke <sburke@cpan.org>

    Original author: Gisle Aas <gisle@aas.no>

perl v5.12.1                         2004-06-02                    HTML::FormatPS(3)