If none of your tags contain quoted > characters (I didn't see any in your samples, though I didn't search your attachment exhaustively), the following much simpler script might work:
With the sample data you posted in the first message in this thread, it produces the output:
I didn't see any problems processing your attached sample either, but because the output is long (this preserves all input lines and just removes the tags), I won't post the results here. It would also be easy to drop the lines left empty after removing tags, if that is what you want.
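The script and its output did not survive in this copy of the post, so the following is only a minimal sketch of the same idea, not the script from the post. It assumes, as stated above, that no tag contains a quoted > character, and additionally that every tag opens and closes on the same line. The function names strip_tags and strip_tags_no_blank are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers; neither name comes from the original post.

# Remove every <...> run that starts and ends on the same line.
# Assumes no tag contains a quoted '>' and no tag spans lines.
strip_tags() {
    sed 's/<[^>]*>//g'
}

# Same as above, then drop lines that are empty or whitespace-only
# once the tags are gone.
strip_tags_no_blank() {
    sed 's/<[^>]*>//g' | grep -v '^[[:space:]]*$'
}

# Usage: read HTML on standard input, write text on standard output.
printf '<td class="data">added</td>\n' | strip_tags
```

Both functions are plain filters, so a file would be processed with something like strip_tags_no_blank < input.html > output.txt. Input that does contain quoted > characters or multi-line tags would need a real HTML parser or a more careful script.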
This User Gave Thanks to Don Cragun For This Post:
Hi all,
I have an HTML file similar to this:
<tr class="evenrow">
<td class="data">added</td><td class="data">xyz@abc.com</td>
<td class="data">filename.sql</td><td class="modifications-data">08/25/2009 07:58:40</td><td class="data">Added TK prof script</td>
</tr>
<tr... (1 Reply)
Hi!
I have a bunch of HTML files, which I want to parse to CSV files. Every page has a table in it, and I need to parse each row into a csv record.
With awk and sed, I managed to put every table row in separate lines. So my file looks like this:
<TR> .... </TR>
<TR> .... </TR>
...One... (1 Reply)
Hi guys,
I want to parse a file using a public function. The file contains raw data in the format below, and I want output like this so that I can load it into an Oracle DB:
MARWA1,BSS:26,1,3,0,0,0,0,0.00,22,22,22.00
MARWA2,BSS:26,1,3,0,0,0,0,0.00,22,22,22.00
This is the raw file format:
Number of... (6 Replies)
Hello,
I have a html file like this :
<html>
...
...
...
<table>
.......
......
</table>
<table name = "hi">
......
.....
...
</table>
<h1> Welcome </h1>
.......
......
</html> (11 Replies)
Hello,
I want to extract some information from an HTML page (http://www.energiecontracting.de/7-mitglieder/von-A-Z.php?a_z=B&seite=2) and save it in a predefined format (.csv). However, the code on that website seems rather messy and I can't find a way to handle it... (5 Replies)
Hi all, I have a file that contains a good hundred of these job definitions below:
Job Name Last Start Last End ST Run Pri/Xit
________________________________________________________________ ____________________... (7 Replies)
<DIV><P>Pré-condição aceder ao ecrã Home do MRS.</P></DIV><DIV><P>OK.</P></DIV><DIV><P>Seleccionar Pesquisa de Recepção Directa.</P></DIV><DIV><P>Confirmar que abriu ecrã de Recepção Directa.</P></DIV><DIV> (6 Replies)
I have downloaded source code for 97 files using:
wget -x -i link.txt
then ran a rename loop:
for file in *
do
    mv "$file" "$file.txt"
done
to keep the HTML tags but make each file a text file that can be parsed.
In each of the 97 txt files the gene # is variable, but the gene is associated... (15 Replies)
I downloaded source code using:
wget -qO- http://fulgentdiagnostics.com/test/clinical-exome/ | cat > flugentsource.txt
Now I am trying to use sed to parse it to confirm a gene count. Basically, output (flugent.txt) all the gene names with a total count after them
I'm not all that... (5 Replies)
Hi,
I'm trying to read a temperature value from HTML code.
So far I have managed to reduce the whole HTML page down to this single line with the following sed command: sed -n '/Temperature/p' $temp_temperature | tee temp_string
<TD width='350'>Temperature :</td><td>25... (2 Replies)
Discussion started by: naittis
2 Replies