Merge cells in all rows of a HTML table dynamically.


 
# 8  
Old 01-18-2018
Quote:
Originally Posted by Mounika
But the thing here is, when I convert the output to HTML format it will show me blank cells instead of merged cells.
HTML and CSV are not equivalent. In HTML you can use the rowspan and colspan attributes (on <td>) to expand cells across row and column boundaries. There is no such thing in CSV.

In CSV you just have a succession of "fields" (delimited by commas, hence the name) forming a line (= row). Each field is exactly one field, not something that spans lines or occupies the place of several fields. Successive lines do not even have to share a common structure: if one line has fewer (or more) fields than another, that is fine, but the fields will not be realigned to form columns.

Bottom line: it is in fact possible to write a script (well, more like a program, because you need to work back and forth) whose output is HTML rather than CSV. In HTML you can indeed have the rowspan and colspan attributes you want, but this output would never translate back into proper CSV. (That is, not by just stripping the HTML tags. It would take another, equally complex program.)

I hope this helps.

bakunin
# 9  
Old 01-18-2018
Because one does not know in advance what the rowspan value is, one needs to store at least a bunch of lines in memory.
The following stores all the lines in memory, and prints everything in the END section. (Still it certainly uses less memory than Excel. But there is room for further optimization...)
Code:
awk '
BEGIN {
  FS=","
  print "MIME-Version: 1.0"
  print "Content-Type: text/html"
  print "Content-Disposition: inline"
  print "<HTML><BODY><TABLE border=1>"
}
NR==1 {
  nf=NF
  for (i=1; i<=nf; i++)
    printf "<TH>Header %s</TH>", i
  print ""
}
{
  for (i=1; i<=nf; i++)
    if ($i!=lastval[i]) {
      saveval[NR,i]=$i
      lastspan[i]=NR
    } else {
      rowspan[lastspan[i],i]++
    }
  split($0,lastval)
}
END {
  split("",lastspan)
  for (r=1; r<=NR; r++) {
    printf "<TR>" 
    for (i=1; i<=nf; i++)
      if ((r,i) in rowspan) {
        span[i]=rowspan[r,i]
        printf "<TD rowspan=%s>%s</TD>", span[i]+1, saveval[r,i] 
      } else if (!((i in span) && span[i]--)) {
        printf "<TD>%s</TD>", saveval[r,i] 
      }
    print "</TR>"
  }
  print "</TABLE></BODY></HTML>"
}
' inputfile.csv > inputfile.html

# 10  
Old 01-18-2018
Thanks a ton!!!

It's working perfectly!!
# 11  
Old 02-01-2018
Hello All,

I still need further help on this one. Sometimes the HTML produced by the code provided comes out a bit weird.

Let me post the example input below, along with the output I am getting.
Code:
Header1,Header2,Header3,Header4,Header5,Header6,Header6
AAAA,TTTTT,AA-MMM-YYYY,XYZ,1,AA & BB,Reason1
AAAA,TTTTT,BB-MMM-YYYY,UVW,782,AB & BB,Reason1
AAAA,TTTTT,CC-MMM-YYYY,UVW,908,AC & BB,Reason1
AAAA,TTTTT,DD-MMM-YYYY,XYZ,497,AD & BB,Reason1
AAAA,TTTTT,EE-MMM-YYYY,UVW,37,AD & BD,Reason1
AAAA,TTTTT,FF-MMM-YYYY,XYZ,536,AE & BD,Reason1
AAAA,TTTTT,GG-MMM-YYYY,UVW,43,AE & BE,Reason1
AAAA,TTTTT,HH-MMM-YYYY,UVW,1099,AC & BE,Reason1
AAAA,TTTTT,II-MMM-YYYY,UVW,62,AC & DE,Reason1
AAAA,TTTTT,JJ-MMM-YYYY,UVW,54,AC & EE,Reason1
BBBB,TTTTT,AA-MMM-YYYY,UVW,603,AE & EE,Reason1
BBBB,TTTTT,FF-MMM-YYYY,UVW,603,CE & EE,Reason1
BBBB,TTTTT,GG-MMM-YYYY,UVW,553,CE & ED,Reason1
BBBB,TTTTT,JJ-MMM-YYYY,UVW,603,CC & ED,Reason1
CCCC,TTTTT,BB-MMM-YYYY,UVW,164,CC & EB,Reason1
CCCC,TTTTT,KK-MMM-YYYY,UVW,262,CC & ED,Reason1
CCCC,TTTTT,DD-MMM-YYYY,UVW,262,CC & ED,Reason1
CCCC,TTTTT,LL-MMM-YYYY,UVW,262,CC & ED,Reason1
CCCC,TTTTT,FF-MMM-YYYY,UVW,262,CC & ED,Reason1
CCCC,TTTTT,MM-MMM-YYYY,UVW,262,CC & ED,Reason1
CCCC,TTTTT,HH-MMM-YYYY,UVW,352,CA & ED,Reason1
CCCC,TTTTT,NN-MMM-YYYY,UVW,262,CC & ED,Reason1
CCCC,TTTTT,JJ-MMM-YYYY,UVW,440,CA & EG,Reason1
DDDD,TTTTT,AA-MMM-YYYY,UVW,1490,DA & EG,Reason1
DDDD,TTTTT,CC-MMM-YYYY,UVW,1490,DA & EC,Reason1
DDDD,TTTTT,EE-MMM-YYYY,UVW,1490,DA & EC,Reason1
DDDD,TTTTT,GG-MMM-YYYY,UVW,1490,DA & EC,Reason1
EEEE,TTTTT,AA-MMM-YYYY,UVW,930,DA & ET,Reason1
EEEE,TTTTT,CC-MMM-YYYY,UVW,930,DA & EG,Reason1
EEEE,TTTTT,EE-MMM-YYYY,UVW,930,DA & EG,Reason1
EEEE,TTTTT,GG-MMM-YYYY,UVW,930,DA & EG,Reason1

The HTML output obtained from the above data is:
Code:
MIME-Version: 1.0
Content-Type: text/html
Content-Disposition: inline
<HTML><BODY><TABLE border=1>
<TH>Header 1</TH><TH>Header  2</TH><TH>Header 3</TH><TH>Header  4</TH><TH>Header 5</TH><TH>Header  6</TH><TH>Header 7</TH>
<TR><TD rowspan=10>AAAA</TD><TD  rowspan=31>TTTTT</TD><TD>AA-MMM-YYYY</TD><TD>XYZ</TD><TD>1</TD><TD>AA & BB</TD><TD rowspan=31>Reason1</TD></TR>
<TR><TD>BB-MMM-YYYY</TD><TD  rowspan=2>UVW</TD><TD>782</TD><TD>AB & BB</TD></TR>
<TR><TD>CC-MMM-YYYY</TD><TD>908</TD><TD>AC & BB</TD></TR>
<TR><TD>DD-MMM-YYYY</TD><TD>XYZ</TD><TD>497</TD><TD>AD & BB</TD></TR>
<TR><TD>EE-MMM-YYYY</TD><TD>37</TD><TD>AD & BD</TD></TR>
<TR><TD>FF-MMM-YYYY</TD><TD>536</TD><TD>AE & BD</TD></TR>
<TR><TD>GG-MMM-YYYY</TD><TD  rowspan=25>UVW</TD><TD>43</TD><TD>AE & BE</TD></TR>
<TR><TD>HH-MMM-YYYY</TD><TD>1099</TD><TD>AC & BE</TD></TR>
<TR><TD>II-MMM-YYYY</TD><TD>62</TD><TD>AC & DE</TD></TR>
<TR><TD>JJ-MMM-YYYY</TD><TD>54</TD><TD>AC & EE</TD></TR>
<TR><TD rowspan=4>BBBB</TD><TD>AA-MMM-YYYY</TD><TD  rowspan=2>603</TD><TD>AE & EE</TD></TR>
<TR><TD>FF-MMM-YYYY</TD><TD>AE & BD</TD></TR>
<TR><TD>GG-MMM-YYYY</TD><TD>553</TD><TD>CE & ED</TD></TR>
<TR><TD>JJ-MMM-YYYY</TD><TD>CC & ED</TD></TR>
<TR><TD rowspan=9>CCCC</TD><TD>BB-MMM-YYYY</TD><TD>CC & EB</TD></TR>
<TR><TD>KK-MMM-YYYY</TD><TD  rowspan=5>262</TD><TD rowspan=5>CC & ED</TD></TR>
<TR><TD>DD-MMM-YYYY</TD></TR>
<TR><TD>LL-MMM-YYYY</TD></TR>
<TR><TD>FF-MMM-YYYY</TD></TR>
<TR><TD>MM-MMM-YYYY</TD></TR>
<TR><TD>HH-MMM-YYYY</TD><TD>352</TD><TD>CA & ED</TD></TR>
<TR><TD>NN-MMM-YYYY</TD></TR>
<TR><TD>JJ-MMM-YYYY</TD></TR>
<TR><TD rowspan=4>DDDD</TD><TD>AA-MMM-YYYY</TD><TD  rowspan=4>1490</TD></TR>
<TR><TD>CC-MMM-YYYY</TD><TD rowspan=3>DA & EC</TD></TR>
<TR><TD>EE-MMM-YYYY</TD></TR>
<TR><TD>GG-MMM-YYYY</TD></TR>
<TR><TD rowspan=4>EEEE</TD><TD>AA-MMM-YYYY</TD><TD  rowspan=4>930</TD><TD>DA & ET</TD></TR>
<TR><TD>CC-MMM-YYYY</TD><TD rowspan=3>DA & EG</TD></TR>
<TR><TD>EE-MMM-YYYY</TD></TR>
<TR><TD>GG-MMM-YYYY</TD></TR>
</TABLE></BODY></HTML>

As you can see, the fourth-column values ("UVW" and "XYZ") of the two input lines below are somehow missing from the output rows, so the table shows wrong data altogether.

Code:
AAAA,TTTTT,EE-MMM-YYYY,UVW,37,AD & BD,Reason1
AAAA,TTTTT,FF-MMM-YYYY,XYZ,536,AE & BD,Reason1

Please help me!!

---------- Post updated at 12:08 PM ---------- Previous update was at 11:17 AM ----------

Also, after checking the data, I observe the following:

Code:
END {
  split("",lastspan)
  for (r=1; r<=NR; r++) {
    printf "<TR>" 
    for (i=1; i<=nf; i++)
      if ((r,i) in rowspan) {
        span[i]=rowspan[r,i]
        printf "<TD rowspan=%s>%s</TD>", span[i]+1, saveval[r,i] 
      } else if (!((i in span) && span[i]--)) {
        printf "<TD>%s</TD>", saveval[r,i] 
      }
    print "</TR>"
  }
  print "</TABLE></BODY></HTML>"
}

In this part of the code, once a rowspan value has been assigned to an HTML cell, the first row after the spanned rows is processed correctly, but the row after that one is not. In other words, the second row after a completed rowspan goes wrong.
Example:

Code:
<TR><TD>BB-MMM-YYYY</TD><TD  rowspan=2>UVW</TD><TD>782</TD><TD>AB & BB</TD></TR>
<TR><TD>CC-MMM-YYYY</TD><TD>908</TD><TD>AC & BB</TD></TR>

In this part rowspan is set to 2. The row after the span, shown below, is displayed correctly.

Code:
<TR><TD>DD-MMM-YYYY</TD><TD>XYZ</TD><TD>497</TD><TD>AD & BB</TD></TR>

But the two rows after this row have a problem.

Code:
<TR><TD>EE-MMM-YYYY</TD><TD>37</TD><TD>AD & BD</TD></TR>
<TR><TD>FF-MMM-YYYY</TD><TD>536</TD><TD>AE & BD</TD></TR>

I hope you understand.

---------- Post updated 02-01-18 at 11:56 AM ---------- Previous update was 01-31-18 at 12:08 PM ----------

Please help me!

I tried whatever I could with the knowledge I have, but couldn't identify the root cause of where the code goes wrong.

What I did understand is that once a rowspan is set, the first row after the span is complete comes out fine, but the row after that is not in the correct format.
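
A likely root cause, judging only from the END block quoted above: the test `span[i]--` is used both as the "still inside a span" check and as the countdown. When `span[i]` reaches 0 the test is false and the cell is printed, but the post-decrement still runs and leaves `span[i]` at -1. In awk any non-zero number is true, so from then on the cells of that column are skipped until a new rowspan is assigned. The following is a minimal sketch of one possible fix, not the original author's code: the main block is kept as posted, the MIME and header lines are left out for brevity, and the counter is decremented only while it is positive. The function name `csv2html` and the six-line sample are made up for illustration.

```shell
# csv2html: hypothetical wrapper around the awk script from post #9,
# with the END block changed so span[i] never drops below zero.
csv2html() {
  awk '
  BEGIN { FS = ","; print "<TABLE border=1>" }
  NR == 1 { nf = NF }
  {
    # Same bookkeeping as the posted script: count repeats per column.
    for (i = 1; i <= nf; i++)
      if ($i != lastval[i]) {
        saveval[NR, i] = $i
        lastspan[i] = NR
      } else {
        rowspan[lastspan[i], i]++
      }
    split($0, lastval)
  }
  END {
    for (r = 1; r <= NR; r++) {
      printf "<TR>"
      for (i = 1; i <= nf; i++)
        if ((r, i) in rowspan) {
          span[i] = rowspan[r, i]
          printf "<TD rowspan=%s>%s</TD>", span[i] + 1, saveval[r, i]
        } else if (span[i] > 0) {
          span[i]--          # still covered by the cell above: emit nothing
        } else {
          printf "<TD>%s</TD>", saveval[r, i]
        }
      print "</TR>"
    }
    print "</TABLE>"
  }' "$@"
}

# Sample with the same x / y,y / x / y / x pattern as column 4 in the report.
printf 'A,x,1\nA,y,2\nA,y,3\nA,x,4\nA,y,5\nA,x,6\n' | csv2html
```

On this sample the original END block would drop the `y` cell of the fifth row; with the change that row comes out as `<TR><TD>y</TD><TD>5</TD></TR>`.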

Last edited by Mounika; 02-05-2018 at 09:52 AM.. Reason: Changing the data portion of the example
# 12  
Old 02-02-2018
Different approach, hopefully simpler:
Code:
tac file | awk '
BEGIN   {
         FS=","
#        print "MIME-Version: 1.0"
#        print "Content-Type: text/html"
#        print "Content-Disposition: inline"
         print "</TABLE></BODY></HTML>"
        }

NR > 1  {printf "<TR>" 
         for (i=1; i<=NF; i++)  {if ($i != LAST[i])     {printf "<TD "
                                                         if (ROWSPAN[i] > 1) printf "rowspan=%s", ROWSPAN[i]
                                                         printf ">%s</TD>", LAST[i]
                                                         ROWSPAN[i] = 0
                                                        }
                                 ROWSPAN[i]++
                                }
         print "</TR>"
        }

        {split ($0, LAST)
        }

NR == 1 {for (i=1; i<=NF; i++) ROWSPAN[i]++
        }

END     {for (i=1; i<=NF; i++)
         printf "<TH>%s</TH>", LAST[i]
         print ""
         print "<HTML><BODY><TABLE border=1>"
        }

' | tac

Please test and report back.


EDIT: Or, still a tad simpler,
Code:
tac file | awk '
BEGIN   {
         FS=","
#        print "MIME-Version: 1.0"
#        print "Content-Type: text/html"
#        print "Content-Disposition: inline"
         print "</TABLE></BODY></HTML>"
        }

NR > 1  {printf "<TR>" 
         for (i=1; i<=NF; i++)  {if ($i != LAST[i])     {printf "<TD "
                                                         if (ROWSPAN[i] > 1) printf "rowspan=%s", ROWSPAN[i]
                                                         printf ">%s</TD>", LAST[i]
                                                         ROWSPAN[i] = 0
                                                        }
                                }
         print "</TR>"
        }

        {for (i=split ($0, LAST); i; i--) ROWSPAN[i]++
        }

END     {for (i=1; i<=NF; i++)
         printf "<TH>%s</TH>", LAST[i]
         print ""
         print "<HTML><BODY><TABLE border=1>"
        }

' | tac
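
Why reading the file backwards helps: when the rows are scanned bottom-up, a run of equal values ends exactly at the row where the merged cell has to be written, so the run length is already known at that moment and no rows need to be kept in memory. The sketch below restates the second variant on a three-row sample. It is an illustration under a few assumptions: the function name `rowspan_html` is hypothetical, `sed '1!G;h;$!d'` stands in for `tac` so it also runs where `tac` is missing, the field count is taken from `split` rather than from `NF` in the END block, and the input must start with a header line, which also acts as the sentinel that flushes the last pending cells.

```shell
# rowspan_html: hypothetical wrapper; reverse, build rows bottom-up, reverse back.
rowspan_html() {
  sed '1!G;h;$!d' "$@" | awk '
  BEGIN { FS = ","; print "</TABLE>" }
  NR > 1 {
    printf "<TR>"
    for (i = 1; i <= n; i++)
      if ($i != LAST[i]) {
        # The run in column i ends here: emit the cell with its final length.
        printf "<TD"
        if (ROWSPAN[i] > 1) printf " rowspan=%s", ROWSPAN[i]
        printf ">%s</TD>", LAST[i]
        ROWSPAN[i] = 0
      }
    print "</TR>"
  }
  {
    # Remember the current line and extend every running span by one.
    n = split($0, LAST)
    for (i = n; i; i--) ROWSPAN[i]++
  }
  END {
    # The last line read is the original header line.
    for (i = 1; i <= n; i++) printf "<TH>%s</TH>", LAST[i]
    print ""
    print "<TABLE border=1>"
  }' | sed '1!G;h;$!d'
}

printf 'H1,H2\nA,x\nA,y\nB,y\n' | rowspan_html
```

For this sample the first data row carries `<TD rowspan=2>A</TD>`, the second row carries only `<TD rowspan=2>y</TD>`, and the third row carries only `<TD>B</TD>`.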


Last edited by RudiC; 02-02-2018 at 09:25 AM..
# 13  
Old 02-02-2018
Hello RudiC,

Thank you for your response. But when I tried it, it threw the message
"ksh: tac: not found."

---------- Post updated at 12:06 PM ---------- Previous update was at 11:17 AM ----------

I managed to use the command below instead of tac, and the code you posted is giving me the expected results.

Code:
sed '1!G;h;$!d'
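
How this one-liner reverses its input: for every line except the first, `G` appends the hold space (the lines seen so far, already reversed) to the current line; `h` copies the result back to the hold space; `$!d` deletes the pattern space on every line but the last, so only the final, fully reversed buffer is printed. Note that the whole input is held in memory. A small sketch of wrapping it for reuse follows; the function name `reverse` is an assumption, not something from the thread, and it only reads standard input.

```shell
# reverse: print the lines of standard input in reverse order.
# Uses tac where it exists, otherwise falls back to the sed one-liner.
reverse() {
  if command -v tac >/dev/null 2>&1; then
    tac
  else
    sed '1!G;h;$!d'
  fi
}

printf 'one\ntwo\nthree\n' | reverse
```

With such a function defined, both `tac` calls in the pipeline can be written as `reverse < file | awk '...' | reverse`.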

The code I tested is the following:

Code:
tac file | awk '
BEGIN   {
         FS=","
#        print "MIME-Version: 1.0"
#        print "Content-Type: text/html"
#        print "Content-Disposition: inline"
         print "</TABLE></BODY></HTML>"
        }

NR > 1  {printf "<TR>" 
         for (i=1; i<=NF; i++)  {if ($i != LAST[i])     {printf "<TD "
                                                         if (ROWSPAN[i] > 1) printf "rowspan=%s", ROWSPAN[i]
                                                         printf ">%s</TD>", LAST[i]
                                                         ROWSPAN[i] = 0
                                                        }
                                 ROWSPAN[i]++
                                }
         print "</TR>"
        }

        {split ($0, LAST)
        }

NR == 1 {for (i=1; i<=NF; i++) ROWSPAN[i]++
        }

END     {for (i=1; i<=NF; i++)
         printf "<TH>%s</TH>", LAST[i]
         print ""
         print "<HTML><BODY><TABLE border=1>"
        }

' | tac

I shall test for multiple combinations of input and shall update you on the complete test results by tomorrow.

Thank you once again.
# 14  
Old 02-02-2018
Well, tac is the "reverse cat", and I thought it was generally available. If your sed snippet provides the same functionality, all should work well...