HTML to CSV


 
# 1  
Old 06-13-2014
Hi, I have a webpage with tables and I want to save it as CSV.
If I open it in Calc and export it to CSV, the file is separated correctly. How can I do the same with awk?
I'm attaching the webpage to be converted to CSV.
# 2  
Old 06-13-2014
This needs some work, but might be an OK starting point. It's something I wrote for something else a couple of years ago, in PHP:

Code:
<?php
/** \action-csv.php
 * \copyright (c) 2012, Christopher Jay Cox, Licensed under GPLv2
 * \author Christopher Jay Cox
 * \version 20120318a
*/

function Html2CSV($page_html) {
        # This also preserves hyperlinks, we'll parse the targets and names later.
        $page_html = strip_tags($page_html, '<a><table><tr><th><td>');

        preg_match_all('/<tr[^>]*>(.*)<\/tr>/isU', $page_html, $trs);

        $ahrefexp = '/<a \s*[^>]*href=["\'](?P<href>[^"\']*)["\']\s*[^>]*>(?P<name>.*)<\s*\/a>/si';
        $csvout='';
        foreach ($trs[1] as $tr) {
                preg_match_all('/<t[hd][^>]*>(.*)<\/t[hd]>/isU', $tr, $tds);
                $first = true;
                foreach ($tds[1] as $td) {
                        # If the cell contains a hyperlink, keep only the link text
                        if (preg_match($ahrefexp, $td, $matches)) {
                                $td = $matches['name'];
                        }
                        # For CSV output, prefer blank to regular decode for nbsp
                        $td = str_replace('&nbsp;', ' ', $td);
                        # Decode entities (including &quot;) before escaping quotes
                        $td = html_entity_decode($td, ENT_COMPAT, 'UTF-8');
                        # Double quotes must be escaped by another double quote
                        $td = str_replace('"', '""', $td);
                        if (!$first) {
                                $csvout .= ',';
                        }
                        $csvout .= '"' . $td . '"';
                        $first = false;
                }
                $csvout .= "\n";
        }
        return $csvout;
}

$page_html = file_get_contents('/tmp/webpage.html');
$csvout=Html2CSV($page_html);
echo $csvout;
?>
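
Since the question asked about awk, here is a rough awk-only sketch of the same idea, not a tested replacement for the PHP above. It assumes the tables have explicit closing </td>, </th> and </tr> tags, that there are no nested tables, and that only a handful of entities need decoding. The function name html2csv is made up for illustration.

```shell
# html2csv: read HTML on stdin, print one CSV line per table row.
# Sketch only: assumes explicit </td>, </th>, </tr> tags and no nested tables.
html2csv() {
    awk '
    # Join all input lines so rows that span several lines are handled.
    { html = html " " $0 }
    END {
        nrows = split(html, rows, /<\/[Tt][Rr][ \t]*>/)
        for (r = 1; r <= nrows; r++) {
            ncells = split(rows[r], cells, /<\/[Tt][DdHh][ \t]*>/)
            if (ncells < 2) continue        # chunk without any closed cell
            line = ""
            for (c = 1; c < ncells; c++) {
                cell = cells[c]
                # Drop everything up to and including the opening <td>/<th>.
                sub(/^.*<[Tt][DdHh][^>]*>/, "", cell)
                # Remove remaining tags such as <a href=...>.
                gsub(/<[^>]*>/, "", cell)
                # Decode a few common entities, &amp; last.
                gsub(/&nbsp;/, " ", cell)
                gsub(/&quot;/, "\"", cell)
                gsub(/&lt;/, "<", cell)
                gsub(/&gt;/, ">", cell)
                gsub(/&amp;/, "\\&", cell)
                # Trim surrounding blanks, then double any double quotes.
                gsub(/^[ \t]+|[ \t]+$/, "", cell)
                gsub(/"/, "\"\"", cell)
                line = line (c > 1 ? "," : "") "\"" cell "\""
            }
            print line
        }
    }'
}
```

Usage would be something like `html2csv < /tmp/webpage.html > webpage.csv`. Like the PHP version, it quotes every field and keeps only the text of hyperlinks. For anything messier than simple tables, a real HTML parser is the safer route.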

 