RudiC, I'm not dropping it, because I need to get other texts out of the HTML, but for the example's sake, yes, that would make it more optimized.
I have 5 more texts that I'm matching, and I'm writing the output to a CSV file.
The HTML I'm parsing is built very poorly.
Actual code snippet:
So as you can see, it has a lot of spaces and newlines at the beginning and end.
Your solution doesn't deal with those, so I came up with this:
Since I need this all on one line, or else the CSV file will break (just realized this), I had to get rid of the newlines: tr -d "\n\r"
I'm removing the extra whitespace at the beginning and end: awk '{$1=$1};1'
Also, for CSV proofing I'm replacing the commas with semicolons, because CSV will interpret commas as the end of a column: tr ',' ';'
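Chained together, the three steps could look roughly like this. It's only a sketch: clean_field is a made-up name and the sample input is invented, not taken from the real HTML. Note that awk '{$1=$1};1' also squeezes runs of whitespace inside the line to single spaces, not just at the ends.

```shell
# Hypothetical helper: flatten one messy text fragment into a CSV-safe field.
clean_field() {
  tr -d '\n\r' |       # delete newlines and carriage returns: everything on one line
  awk '{$1=$1};1' |    # rebuild the record: trims both ends, squeezes inner whitespace
  tr ',' ';'           # commas would otherwise be read as column separators
}

# Invented sample input with leading/trailing blanks and a Windows line ending.
printf '   \n  Hello,   world  \r\n  ' | clean_field
```

Because tr -d deletes the line breaks instead of replacing them with a space, words that sit on separate lines with no surrounding blanks would get glued together.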
So this makes me wonder if one sed could do all of this on its own.
But I'm happy now because this works.
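On the one-sed question: it looks doable, though arguably harder to read than the pipeline. A sketch, assuming a reasonably POSIX-like sed; one_sed and the sample input are made up for illustration. The N loop pulls the whole input into the pattern space, after which the line breaks are deleted, whitespace is squeezed and trimmed, and the commas are swapped.

```shell
# Hypothetical single-sed version of the tr | awk | tr pipeline.
one_sed() {
  cr=$(printf '\r')                     # literal carriage return for the script
  sed -e ':a' -e '$!{' -e 'N' -e 'ba' -e '}' \
      -e 's/\n//g' -e "s/$cr//g" \
      -e 's/[[:space:]]\{1,\}/ /g' \
      -e 's/^ //' -e 's/ $//' \
      -e 's/,/;/g'
}

# Same invented sample as a quick check.
printf '   \n  Hello,   world  \r\n  ' | one_sed
```

This reads the entire input into memory, so it only makes sense for small fragments like a single table cell.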
I have a html file called myfile. If I simply put "cat myfile.html" in UNIX, it shows all the html tags like <a href=r/26><img src="http://www>. But I want to extract only text part.
Same problem happens in "type" command in MS-DOS.
I know you can do it by opening it in Internet Explorer,... (4 Replies)
Hai friends
I have a small doubt..
how can we use html tag in shell scripting
code :
echo "<html>"
echo "<body>"
echo " welcome to peace world "
echo "</body>"
echo "</html>"
output displayed like this:
<html>
<body>
welcome to peace world
</body>
</html> (5 Replies)
hi all,
i have a html file something similar to this.
<tr class="evenrow">
<td class="data">added</td><td class="data">xyz@abc.com</td>
<td class="data">filename.sql</td><td class="modifications-data">08/25/2009 07:58:40</td><td class="data">Added TK prof script</td>
</tr>
<tr... (1 Reply)
Hi!
I have a bunch of HTML files, which I want to parse to CSV files. Every page has a table in it, and I need to parse each row into a csv record.
With awk and sed, I managed to put every table row in separate lines. So my file looks like this:
<TR> .... </TR>
<TR> .... </TR>
...One... (1 Reply)
Guys,
I have a little script that I got off the internet and that I use in Squid to block ads.
I used that script with Linux, but now I have moved my servers to FreeBSD. I have a steep learning curve there, but it is fun. Back to the script issue.
The script used to work with Linux but... (15 Replies)
I have an XML tag like this:
<property name="agent" value="/var/tmp/root/eclipse" />
Is there way using awk that i can get the value from the above tag. So the output should be:
/var/tmp/root/eclipse
Help will be appreciated.
Regards,
Adi (6 Replies)
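One possible awk approach for the tag above, as a sketch only: it splits the line on double quotes and assumes the attributes always come in the order name, value, as in the example. get_value is a made-up name.

```shell
# Hypothetical helper: print the value attribute of <property ... /> lines.
get_value() {
  # With '"' as field separator: $2 is the name, $4 is the value.
  awk -F'"' '/<property / { print $4 }' "$@"
}

printf '%s\n' '<property name="agent" value="/var/tmp/root/eclipse" />' | get_value
```

If the attribute order can vary, a real XML parser would be the safer choice.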
I want to print from <fruits> to </fruits> tag which have <fruit> as mango. Also i want both <fruits> and </fruits> in output. Please help
eg.
<fruits>
<fruit id="111">mango</fruit>
.
another 20 lines
.
</fruits> (3 Replies)
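A rough awk sketch for this kind of request: buffer each block from <fruits> to </fruits> and print it only if a <fruit> line with mango was seen. It assumes each tag sits on its own line and that blocks are not nested; print_mango_blocks is a made-up name.

```shell
# Hypothetical helper: print whole <fruits>...</fruits> blocks that contain mango.
print_mango_blocks() {
  awk '
    /<fruits>/   { inblock = 1; buf = ""; hit = 0 }       # start a fresh buffer
    inblock      { buf = buf $0 "\n"                       # collect every line of the block
                   if ($0 ~ /<fruit( [^>]*)?>mango/) hit = 1 }
    /<\/fruits>/ { if (inblock && hit) printf "%s", buf    # emit only matching blocks
                   inblock = 0 }
  ' "$@"
}

# Usage (file name is an assumption): print_mango_blocks fruits.xml
```

Both the opening and closing tags end up in the output because they are part of the buffered block.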
Hi Guys
Here is my Input :
<?xml version="1.0" encoding="UTF-8"?>
<xn:MeContext id="01736">
<xn:VsDataContainer id="01736">
<xn:attributes>
<xn:vsDataType>vsDataMeContext</xn:vsDataType>
... (12 Replies)
I want to clean a html file.
I try to remove the script part in the html and remove the rest of tags and empty lines.
The code I try to use is the following:
sed '/<script/,/<\/script>/d' webpage.html | sed -e 's/<*>//g' | sed '/^\s*$/d' > output.txt
However, in this method, I can not... (10 Replies)
Discussion started by: YuhuiFeng
LEARN ABOUT DEBIAN
text::csv::encoded::coder::encodeguess
Text::CSV::Encoded::Coder::EncodeGuess(3pm) User Contributed Perl Documentation Text::CSV::Encoded::Coder::EncodeGuess(3pm)
NAME
Text::CSV::Encoded::Coder::EncodeGuess - Text::CSV::Encoded coder class using Encode::Guess
SYNOPSIS
use Text::CSV::Encoded coder_class => 'Text::CSV::Encoded::Coder::EncodeGuess';
use Spreadsheet::ParseExcel;
my $csv = Text::CSV::Encoded->new();
$csv->encoding( ['ucs2', 'ascii'] ); # guessing ucs2 or ascii?
$csv->encoding_to_combine('shiftjis');
my $excel = Spreadsheet::ParseExcel::Workbook->Parse( $file );
my $sheet = $excel->{Worksheet}->[0];
for my $row ( $sheet->{MinRow} .. $sheet->{MaxRow} ) {
    my @fields;
    for my $col ( $sheet->{MinCol} .. $sheet->{MaxCol} ) {
        my $cell = $sheet->{Cells}[$row][$col];
        push @fields, $cell->{Val};
    }
    $csv->combine( @fields ) or die;
    print $csv->string, "\n";
}
DESCRIPTION
This module is inherited from Text::CSV::Encoded::Coder::Encode.
USE
Except for 2 attributes, same as Text::CSV::Encoded::Coder::Encode.
encoding_in
$csv = $csv->encoding_in( $encoding_list_ref );
The accessor to an encoding for pre-parsing CSV strings. If no encoding is given, returns current $encoding, otherwise the object itself.
$encoding_list_ref = $csv->encoding_in()
When you pass a list reference, it might guess the encoding from the given list.
$csv->encoding_in( ['shiftjis', 'euc-jp', 'iso-2022-jp'] );
If it cannot guess the encoding, the first encoding of the list is used.
encoding
$csv = $csv->encoding( $encoding_list_ref );
$encoding_list_ref = $csv->encoding();
You can pass a list reference to this attribute only:
* For list data consumed by combine().
* For list reference returned by getline().
In other words, in "combine" and "print", it might guess an encoding for the list data passed in. If it cannot guess the encoding, the first encoding of the list is used.
SEE ALSO
Encode, Encode::Guess
AUTHOR
Makamaka Hannyaharamitu, <makamaka[at]cpan.org>
COPYRIGHT AND LICENSE
Copyright 2008-2010 by Makamaka Hannyaharamitu
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
perl v5.14.2 2010-04-26 Text::CSV::Encoded::Coder::EncodeGuess(3pm)