Extract expressions between two strings in html file
Post 302657017 by bobylapointe on Friday 15th of June 2012 10:11:50 PM

Hello guys,

I'm trying to extract all the expressions between <b> and </b> tags in an HTML file.
The file consists of long lines, each containing several dozen expressions (of one to seven words) that I would like to extract:



Code:
<b>bla ble</b>bla ble</td><tr valign="top"><td width="50%"><p align="left" style="margin-top:0;margin-bottom:0"><b>ble bla ble</b>bla bla ble</td><td width="50%"><p align="left" style="margin-top:0;margin-bottom:0"><b>ble ble ble bla ble</b>ble ble ble bla ble</td><td width="50%"><p align="left" style="margin-top:0;margin-bottom:0"> and so on.

I would like to print them to a new file, in the form:

bla ble
ble bla ble
ble ble ble bla ble

etc.

I know several posts in the forum address this question, namely how to extract expressions between two strings using sed, perl or awk, but none of the commands I found work in this situation (the same pair of tags repeated several times on one line, over a lot of lines).

How could I make any one of these programs go through the WHOLE file in search of every expression that appears between <b> and </b>?


Thank you very much!
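
For illustration, here is a minimal sketch of one way the extraction described above might be done. It is not an answer taken from this thread. It assumes a grep that supports the -o option (GNU and BSD grep do, but it is not required by POSIX), that every <b>...</b> pair opens and closes on the same line, and that no other tag is nested inside a pair. The function name extract_bold and the file names in the usage note are hypothetical.

```shell
#!/bin/sh
# Sketch: print the text of every <b>...</b> pair read from standard input,
# one expression per output line.
# Assumptions (not from the original post): grep supports -o, each pair sits
# on a single line, and no other tag is nested inside <b>...</b>.
extract_bold() {
    # grep -o prints each match on its own line, so several pairs on the
    # same input line become several output lines. [^<]* cannot cross a '<',
    # so every match stops at the first closing tag that follows it.
    grep -o '<b>[^<]*</b>' |
        # Strip the opening and closing tag from each match.
        sed -e 's/^<b>//' -e 's/<\/b>$//'
}
```

It could then be called as `extract_bold < input.html > output.txt`. Because grep reads the whole input and reports every match, the number of lines or of tags per line should not matter. An empty pair such as <b></b> would produce an empty output line.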
 
