Transpose Daily Data from Column to Row.

# 1  
Old 02-04-2011

Hi, I'm looking to transpose data from a daily Linux report that logs an entry every 10 minutes, as shown below.
I need the daily totals for Col2 and Col3 (fields after the first comma), transposed as shown further down.

The transposed output will then be exported to Microsoft Excel for reporting.

Any help would be appreciated!

Raw Data
HTML Code:
                                   Col2  Col3
10-06-2006 17:09:28,1880,1862,1865,1866,1879,1881,1879,1879
10-06-2006 17:19:28,1885,1883,1884,1884,1648,1648,1648,1648
10-06-2006 17:29:28,1853,1844,1845,1845,1879,1878,1879,1879
10-06-2006 17:39:28,1912,1910,1914,1914,1847,1847,1847,1847
10-06-2006 17:49:28,1871,1855,1859,1858,1842,1843,1842,1842
10-06-2006 17:59:28,1907,1895,1898,1898,1789,1789,1789,1789
10-07-2006 18:09:28,2235,2233,2235,2236,2078,2078,2078,2078
10-07-2006 18:19:28,2094,2090,2094,2094,1932,1931,1932,1932
10-07-2006 18:29:28,1999,1977,1981,1981,1928,1929,1928,1928
10-08-2006 18:39:28,2061,2041,2062,2061,2005,2005,2005,2005
10-08-2006 18:49:28,2204,2207,2195,2193,2034,2034,2034,2034
10-08-2006 18:59:28,2354,2343,2347,2349,2143,2143,2141,2141
Desired Transposed Result
HTML Code:
10/06                    10/07                  10/08            10/31
Total for Col2    Total for Col2    Total for Col2..........
Total for Col3    Total for Col3    Total for Col3...........
# 2  
Old 02-04-2011
Your data has 8 numeric columns. Which of them are Col2 and Col3?

Also, please give at least one line of real sample output, not just "Total for Col2".
# 3  
Old 02-04-2011
Thanks, RDC!

I apologize; my column labels weren't lined up correctly. Please see below.

 
Col1 Col2 Col3 Col4 Col5 Col6 Col7 Col8 Col9 Col10
10-06-2006 17:09:28,1880,1862,1865,1866,1879,1881,1879,1879
10-06-2006 17:19:28,1885,1883,1884,1884,1648,1648,1648,1648
10-06-2006 17:29:28,1853,1844,1845,1845,1879,1878,1879,1879
The totals for a day are usually around 100,000 to 300,000.

I would like the dates in the first row, in the format shown below, and the daily totals for Col2 and Col3 in the next two rows.

Below are the actual totals for those days.
HTML Code:
10/06       10/07       10/08
92,121    100,091   107,111
93,836    101,237   109,080
# 4  
Old 02-04-2011
Here's a short Perl script for the job -

Code:
$
$ # show the contents of the data file "f6"
$
$ cat f6
10-06-2006 17:09:28,1880,1862,1865,1866,1879,1881,1879,1879
10-06-2006 17:19:28,1885,1883,1884,1884,1648,1648,1648,1648
10-06-2006 17:29:28,1853,1844,1845,1845,1879,1878,1879,1879
10-06-2006 17:39:28,1912,1910,1914,1914,1847,1847,1847,1847
10-06-2006 17:49:28,1871,1855,1859,1858,1842,1843,1842,1842
10-06-2006 17:59:28,1907,1895,1898,1898,1789,1789,1789,1789
10-07-2006 18:09:28,2235,2233,2235,2236,2078,2078,2078,2078
10-07-2006 18:19:28,2094,2090,2094,2094,1932,1931,1932,1932
10-07-2006 18:29:28,1999,1977,1981,1981,1928,1929,1928,1928
10-08-2006 18:39:28,2061,2041,2062,2061,2005,2005,2005,2005
10-08-2006 18:49:28,2204,2207,2195,2193,2034,2034,2034,2034
10-08-2006 18:59:28,2354,2343,2347,2349,2143,2143,2141,2141
$
$
$ # run the Perl script that processes the file "f6"
$
$ perl -ne '/^(\d+)-(\d+)-(\d+) .*?,(.*)$/ or next;
           $k = "$3$1$2";
           @x = split(/,/, $4);
           $len = $#x;
           if (defined $counts{$k}) {
             $counts{$k} = [ map{$x[$_] + ${$counts{$k}}[$_]} 0..$#x ];
           } else {
             $counts{$k} = [ @x ];
           }
           END {
             print join "\t", sort keys %counts;
             print "\n";
             for $i (0..$len) {
               for $k (sort keys %counts) {
                 print ${$counts{$k}}[$i],"\t";
               }
               print "\n";
             }
           }
         ' f6
20061006        20061007        20061008
11308   6328    6619
11249   6300    6591
11265   6310    6604
11265   6311    6603
10884   5938    6182
10886   5938    6182
10884   5938    6180
10884   5938    6180
$
$

I've delimited the processed data with tab characters, so the columns may not line up on the console.
They will line up in MS Excel once the output is saved as a TSV (tab-separated values) file.
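
The output above totals all eight numeric fields, while the original request was for two of them. A minimal sketch of narrowing it down to two fields and saving the result as a TSV file could look like the following. It uses awk rather than Perl, `daily_totals` is a hypothetical helper name, and it assumes that "Col2" and "Col3" mean the first two numbers after the first comma, which the thread does not confirm.

```shell
#!/bin/sh
# Sketch: daily totals for two chosen fields, printed as tab-separated rows.
# Assumption: fields are counted on commas, so field 1 is the timestamp and
# field 2 is the first number after the first comma.
# daily_totals is a hypothetical helper name.
daily_totals() {
  awk -F, -v c1="${2:-2}" -v c2="${3:-3}" '
    {
      day = substr($1, 1, 5)          # "10-06-2006 17:09:28" -> "10-06"
      sub(/-/, "/", day)              # "10-06" -> "10/06"
      # remember each day once, in order of first appearance
      if (!(day in seen)) { seen[day] = 1; order[++n] = day }
      t1[day] += $c1
      t2[day] += $c2
    }
    END {
      for (j = 1; j <= n; j++) printf "%s%s", order[j], (j < n ? "\t" : "\n")
      for (j = 1; j <= n; j++) printf "%d%s", t1[order[j]], (j < n ? "\t" : "\n")
      for (j = 1; j <= n; j++) printf "%d%s", t2[order[j]], (j < n ? "\t" : "\n")
    }
  ' "$1"
}

# Usage (totals.tsv is a placeholder name):
#   daily_totals f6 > totals.tsv         # fields 2 and 3
#   daily_totals f6 6 7 > totals.tsv     # any other pair of fields
```

Days come out in the order they first appear in the file, so this relies on the log being chronological.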

HTH,
tyler_durden

Last edited by durden_tyler; 02-04-2011 at 03:09 PM..
# 5  
Old 02-05-2011
Works perfectly! Since I'm a noob, I was going to ask how to retrieve a summary for a specific month, day, or year. But I can always grep those lines out to a file and run your Perl script against the output.
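
The grep idea could be sketched roughly as follows. `filter_period` is a hypothetical helper name, and the sketch assumes every record starts with an MM-DD-YYYY date followed by a space, as in the sample data.

```shell
#!/bin/sh
# Sketch: print only the records for one year, month, or day.
# Assumption: each line starts with "MM-DD-YYYY " as in the sample data.
# filter_period is a hypothetical helper name.
filter_period() {
  file=$1
  year=$2
  month=${3:-'[0-9][0-9]'}   # no month given: match any two digits
  day=${4:-'[0-9][0-9]'}     # no day given: match any two digits
  grep "^${month}-${day}-${year} " "$file"
}

# Usage (output file names are placeholders):
#   filter_period f6 2006        > year.txt
#   filter_period f6 2006 10     > month.txt
#   filter_period f6 2006 10 07  > day.txt
```

The filtered file can then be passed to the Perl one-liner in place of f6. Note that grep exits with a non-zero status when nothing matches, which matters if the calling script uses `set -e`.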

I like that you figured out the column issue with Microsoft Excel.
Now I just have to copy and paste into Excel. Nicely done!

I appreciate the help, Durden!

Thanks again, Durden and RDC!
# 6  
Old 02-05-2011
Quote:
Originally Posted by ravzter
...I was going to ask how to retrieve a summary for a specific month, day, or year. ...
Given this data file -

Code:
$
$
$ cat input.txt
10-06-2006 17:09:28,1880,1862,1865,1866,1879,1881,1879,1879
10-06-2006 17:19:28,1885,1883,1884,1884,1648,1648,1648,1648
10-06-2006 17:29:28,1853,1844,1845,1845,1879,1878,1879,1879
10-06-2006 17:39:28,1912,1910,1914,1914,1847,1847,1847,1847
10-06-2006 17:49:28,1871,1855,1859,1858,1842,1843,1842,1842
10-06-2006 17:59:28,1907,1895,1898,1898,1789,1789,1789,1789
10-07-2006 18:09:28,2235,2233,2235,2236,2078,2078,2078,2078
10-07-2006 18:19:28,2094,2090,2094,2094,1932,1931,1932,1932
10-07-2006 18:29:28,1999,1977,1981,1981,1928,1929,1928,1928
10-08-2006 18:39:28,2061,2041,2062,2061,2005,2005,2005,2005
10-08-2006 18:49:28,2204,2207,2195,2193,2034,2034,2034,2034
10-08-2006 18:59:28,2354,2343,2347,2349,2143,2143,2141,2141
$
$

Here's a Perl program that does the number crunching for individual days/months/years -

Code:
$
$
$ cat process_input.pl
#!perl -w
use strict;

# declare variables, assign where necessary
my $key;
my @x;
my $len = -1;
my %counts;

# capture user input
print "Enter day (YYYYMMDD), month (YYYYMM) or year (YYYY).\n";
print "Press RETURN to process entire file.                 => ";
chomp(my $datevalue = <STDIN>);

# start processing the input file
my $file = "input.txt";
open(my $fh, '<', $file) or die "Can't open $file: $!";
while (<$fh>) {
  # match the regex with the current line; skip lines that do not match
  # so that capture variables from an earlier line are never reused
  next unless /^(\d+)-(\d+)-(\d+) .*?,(.*)$/;
  my ($mm, $dd, $yyyy, $numbers) = ($1, $2, $3, $4);
  # set up the key for the hash "%counts"
  if ($datevalue eq "" or $datevalue eq "$yyyy$mm$dd") {
    $key = "$yyyy$mm$dd";
  } elsif ($datevalue eq "$yyyy$mm") {
    $key = "$yyyy$mm";
  } elsif ($datevalue eq $yyyy) {
    $key = $yyyy;
  } else {
    next;
  }
  # load the numbers to the right of the timestamp into @x array
  @x = split(/,/, $numbers);
  $len = $#x;
  # the hash %counts has a key as determined above, and a value
  # that is a reference to the number array. As each line is processed,
  # the respective elements of the number array are added up.
  if (defined $counts{$key}) {
    $counts{$key} = [ map { $x[$_] + $counts{$key}[$_] } 0..$#x ];
  } else {
    $counts{$key} = [ @x ];
  }
}
close($fh) or die "Can't close $file: $!";

# now simply iterate through the %counts hash and print the array
print join "\t", sort keys %counts;
print "\n";
for my $i (0..$len) {
  for my $k (sort keys %counts) {
    print $counts{$k}[$i], "\t";
  }
  print "\n";
}
$
$

And the tests follow -

Code:
$
$
$ # (1) Test for contents of the entire file
$ perl process_input.pl
Enter day (YYYYMMDD), month (YYYYMM) or year (YYYY).
Press RETURN to process entire file.                 =>
20061006        20061007        20061008
11308   6328    6619
11249   6300    6591
11265   6310    6604
11265   6311    6603
10884   5938    6182
10886   5938    6182
10884   5938    6180
10884   5938    6180
$
$ # (2) Test for individual days
$ perl process_input.pl
Enter day (YYYYMMDD), month (YYYYMM) or year (YYYY).
Press RETURN to process entire file.                 => 20061006
20061006
11308
11249
11265
11265
10884
10886
10884
10884
$
$ perl process_input.pl
Enter day (YYYYMMDD), month (YYYYMM) or year (YYYY).
Press RETURN to process entire file.                 => 20061007
20061007
6328
6300
6310
6311
5938
5938
5938
5938
$
$ perl process_input.pl
Enter day (YYYYMMDD), month (YYYYMM) or year (YYYY).
Press RETURN to process entire file.                 => 20061008
20061008
6619
6591
6604
6603
6182
6182
6180
6180
$
$ # (3) Test for the month that exists in data file
$ perl process_input.pl
Enter day (YYYYMMDD), month (YYYYMM) or year (YYYY).
Press RETURN to process entire file.                 => 200610
200610
24255
24140
24179
24179
23004
23006
23002
23002
$
$ # (4) Test for the year that exists in data file
$ perl process_input.pl
Enter day (YYYYMMDD), month (YYYYMM) or year (YYYY).
Press RETURN to process entire file.                 => 2006
2006
24255
24140
24179
24179
23004
23006
23002
23002
$
$ # (5) Test for non-existent day, month and year
$ perl process_input.pl
Enter day (YYYYMMDD), month (YYYYMM) or year (YYYY).
Press RETURN to process entire file.                 => 20061022

$
$ perl process_input.pl
Enter day (YYYYMMDD), month (YYYYMM) or year (YYYY).
Press RETURN to process entire file.                 => 200611

$
$ perl process_input.pl
Enter day (YYYYMMDD), month (YYYYMM) or year (YYYY).
Press RETURN to process entire file.                 => 2007

$
$

Hope that helps,
tyler_durden

Last edited by durden_tyler; 02-05-2011 at 10:48 PM..
# 7  
Old 02-06-2011
Code:
$ awk --version|head -1
GNU Awk 3.1.8

$ awk -F "[ ,]" '
{ split($1,a,"-"); date=a[1]"/"a[2]; d[date]; nf=NF
  for (i=3;i<=NF;i++) s[date OFS i]+=$i }
END{ n=asorti(d,t)
for (j=1;j<=n;j++) printf "%s%s", t[j], OFS; printf ORS
for (i=3;i<=nf;i++)
        {for (j=1;j<=n;j++) printf "%s%s", s[t[j] OFS i], OFS
         printf ORS}
}' infile

10/06 10/07 10/08
11308 6328 6619
11249 6300 6591
11265 6310 6604
11265 6311 6603
10884 5938 6182
10886 5938 6182
10884 5938 6180
10884 5938 6180
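
The command above relies on asorti(), which is a GNU awk extension. A rough sketch of the same transposition for other awk implementations, with tab-separated output for Excel, might look like this. `transpose_totals` is a hypothetical helper name, and keeping days in order of first appearance assumes the log is already chronological.

```shell
#!/bin/sh
# Sketch: transpose daily totals without asorti(), a GNU awk extension.
# Assumption: the input is chronological, so first-appearance order
# of the days is also date order.
# transpose_totals is a hypothetical helper name.
transpose_totals() {
  awk -F'[ ,]' '
    {
      split($1, a, "-")               # "10-06-2006" -> a[1]="10", a[2]="06"
      day = a[1] "/" a[2]
      if (!(day in seen)) { seen[day] = 1; order[++n] = day }
      for (i = 3; i <= NF; i++) sum[day, i] += $i
      if (NF > maxnf) maxnf = NF      # remember the widest record for END
    }
    END {
      # header row: one column per day
      for (j = 1; j <= n; j++) printf "%s%s", order[j], (j < n ? "\t" : "\n")
      # one row per numeric field, one column per day
      for (i = 3; i <= maxnf; i++)
        for (j = 1; j <= n; j++)
          printf "%d%s", sum[order[j], i], (j < n ? "\t" : "\n")
    }
  ' "$1"
}

# Usage (totals.tsv is a placeholder name):
#   transpose_totals infile > totals.tsv
```

Tracking the field count in `maxnf` avoids reading NF inside END, which not every awk preserves after the last record.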
