Transpose data from columns to lines for each event


 
# 1  
Old 01-13-2009

Hi everyone,

Maybe somebody could help me with this.

I have a text file with 2 columns that records the services used by customers at a commercial site.

The record for each use of a service begins with "EVENT" in column 1.
I would like to transpose the information in each block into one line. That is, the different
labels in column 1 should appear only once, as a header, and the data in column 2
should appear below its respective header, on a single line per block.

Source file (not all blocks have the same labels in column 1; some have more than others):
Code:
EVENT                                
INTERNET CONNECTION                  
Date                       11/01/2009
Initial hour               07:30     
Number of users            27        
Average of use             32 min    
Final hour                 19:00     
 
EVENT                                
LOCAL CALL                           
Date                       11/01/2009
Initial hour               07:42     
Number of users            15        
Average of use             7 min     
Final hour                 16:11     
 
EVENT                                
INTERNATIONAL CALL                   
Date                       11/01/2009
Initial hour               09:14     
Number of users            21        
Average of use             5 min     
Final hour                 16:17     
 
EVENT                                
PRINTER USE                          
Date                       12/01/2009
Initial hour               07:30     
Number of users            23        
Average of pages printed   17        
Final hour                 19:00

I would like to tabulate it as follows:
Code:
 
EVENT                   Date    Initial hour Number of users Average of use Average of pages printed  Final hour
INTERNET CONNECTION  11/01/2009    07:30             27          32 min                                   19:00   
LOCAL CALL           11/01/2009    07:42             15           7 min                                   16:11   
INTERNATIONAL CALL   11/01/2009    09:14             21           5 min                                   16:17   
PRINTER USE          12/01/2009    07:30             23                                  17               19:00

So far I know that if I use:
Code:
 
awk '/INTERNET CONNECTION/ { getline; print $2}' Input_1.txt
awk '/LOCAL CALL/ { getline; print $2}' Input_1.txt
awk '/INTERNATIONAL CALL/ { getline; print $2}' Input_1.txt
awk '/PRINTER USE/ { getline; print $2}' Input_1.txt

the result is the date (column 2) for each record:
11/01/2009
11/01/2009
11/01/2009
12/01/2009

But how can I go on from here to get what I want (the other columns below their respective headers)?
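
One possible next step, sketched here under the assumption that label and value are always separated by at least two spaces (true in the sample above): tell awk to split on runs of two or more spaces, so that labels such as "Initial hour" stay in one field. The file name sample.txt is only a placeholder for this illustration.

```shell
# Hypothetical sample file holding a single block from the source data
cat > sample.txt <<'EOF'
EVENT
INTERNET CONNECTION
Date                       11/01/2009
Initial hour               07:30
Number of users            27
Average of use             32 min
Final hour                 19:00
EOF

# -F '  +' sets the field separator to "two or more spaces", so $1 is the
# whole label and $2 the whole value; lines with a single field are skipped
pairs=$(awk -F '  +' 'NF >= 2 { print $1 "=" $2 }' sample.txt)
printf '%s\n' "$pairs"
```

This prints one label=value pair per line, which could then be collected per block instead of running one awk command per service name.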

Thanks in advance for any help.

Best regards.
# 2  
Old 01-13-2009
Hi cgkmal,

Do you expect the file to contain a dynamic list of headers, or will the headers be fixed, meaning the file always has the same headers every time?

If they are fixed, we could just hard-code the header row as well as any other fixed parts of the table (i.e. the values under the EVENT column).

It would be helpful if you could identify these fixed parts of the table (if there are any).
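
If the headers do turn out to be fixed, the header row can simply be written out by hand. A minimal sketch, assuming the six labels from the sample file; the column widths are picked only for illustration:

```shell
# Column widths are illustrative guesses, not taken from the thread
fmt='%-22s%-12s%-14s%-17s%-16s%-26s%s\n'

# Print the fixed header row once; data rows could reuse the same format
header=$(printf "$fmt" "EVENT" "Date" "Initial hour" "Number of users" \
        "Average of use" "Average of pages printed" "Final hour")
printf '%s\n' "$header"
```

Keeping the format string in one variable means the header and the data rows cannot drift apart in width.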
# 3  
Old 01-13-2009
angheloko,

Thanks for your answer. Yes, the words in column 1 that will become headers are always the same. The only thing is that some EVENT blocks have fewer of them than others. I mean, not all blocks will have a value
below every header.

Many thanks for the help you can give me.
# 4  
Old 01-14-2009
Hi,

Sorry for the delay. I was pretty busy today...doing documentation (oh the pain! every programmer's nightmare). Anyway, here's a very crude implementation:

Code:
# Use the C locale so that [A-Z] matches uppercase letters only
LC_ALL=C; export LC_ALL
TAB=`printf '\t'`

# Extract data only and separate into files
sed 's/^EVENT//;s/^[A-Z][A-Z][A-Z]*/EVENT  &/;/^ *$/d;s/ *$//;s/   */|/g' input.txt > input2.txt
grep -n '^EVENT' input2.txt | cut -d: -f1 | while read X; do
        START=$X
        ((END=X+5))     # assumes every record has exactly 6 lines
        sed -n "${START},${END}p" input2.txt > input2.txt.$X
done

# Compose the headers (patterns are quoted so the shell cannot expand them as globs)
sed '/^ *$/d;s/   */|/g' input.txt | awk -F"|" '{ print $1 }' | sort -u > headers.txt
grep -v '[A-Z][A-Z][A-Z]*' headers.txt > colheaders.txt
grep '[A-Z][A-Z][A-Z]*' headers.txt | sed '/^EVENT/d' > rowheaders.txt

# Create the unformatted (tab-delimited) output
LINE="EVENT$TAB"`tr "\n" "\t" < colheaders.txt`
echo "$LINE" > output.txt
while read X; do
        echo "---"
        echo "Row: $X"
        FILE=`grep -l "^EVENT|$X" input2.txt.*`
        echo "File: $FILE"
        LINE="$X"
        # Redirect into the loop instead of piping, so that LINE is not
        # changed in a subshell and lost (as happens in bash)
        while read Y; do
                echo "$Y"
                LINE="$LINE$TAB"`awk -F"|" '$1==key { print $2 }' key="$Y" $FILE`
        done < colheaders.txt
        echo ">> $LINE"
        echo "$LINE" >> output.txt
done < rowheaders.txt

# Make it pretty
awk '
BEGIN { FS="\t" }

{ printf ("%-22s%-11s%-13s%-16s%-15s%-26s%-10s\n", $1, $4, $6, $7, $3, $2, $5) }

' output.txt > output2.txt

Basically:
the input file is input.txt
the output file is output2.txt

plus some temporary files: input2.txt*, headers.txt, colheaders.txt, rowheaders.txt (just rm them at the end of the script).

Anyway, I had to rush it, so I know there could be simpler solutions, but here you go... Try it yourself...
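
For the clean-up, one option is a trap, so the temporary files go away even if the script stops halfway. This is only a sketch: the scratch directory and the stand-in files are assumptions made so the example can run on its own, and mktemp -d is assumed to be available.

```shell
# Work in a scratch directory so the glob below cannot touch unrelated files
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cleanup() {
        # Remove the intermediate files named above
        rm -f input2.txt input2.txt.* headers.txt colheaders.txt rowheaders.txt
}
trap cleanup EXIT

# Stand-in for the real processing: create a few of the temporary files
: > input2.txt
: > input2.txt.1
: > headers.txt

cleanup
remaining=$(ls | wc -l | tr -d ' ')
echo "files left: $remaining"
```

In the real script the explicit cleanup call would not be needed; the trap runs it when the script exits.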

My output:
Code:
EVENT                 Date       Initial hour Number of users Average of use Average of pages printed  Final hour
INTERNATIONAL CALL    11/01/2009 09:14        21              5 min                                    16:17
INTERNET CONNECTION   11/01/2009 07:30        27              32 min                                   19:00
LOCAL CALL            11/01/2009 07:42        15              7 min                                    16:11
PRINTER USE           12/01/2009 07:30        23                             17                        19:00
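
As one possibly simpler route, the whole job can be sketched as a single awk pass with no temporary files. The assumptions: the column labels are fixed (as confirmed in post #3), label and value are separated by at least two spaces, and the service name is always the line right after EVENT. The file name events.txt and the function name emit_row are made up for this example.

```shell
# Hypothetical sample: two blocks from the thread, each lacking one label
cat > events.txt <<'EOF'
EVENT
LOCAL CALL
Date                       11/01/2009
Initial hour               07:42
Number of users            15
Average of use             7 min
Final hour                 16:11

EVENT
PRINTER USE
Date                       12/01/2009
Initial hour               07:30
Number of users            23
Average of pages printed   17
Final hour                 19:00
EOF

table=$(awk -F '  +' '
BEGIN {
    # Fixed column order, hard-coded
    hdr = "Date|Initial hour|Number of users|Average of use|Average of pages printed|Final hour"
    n = split(hdr, col, "|")
    line = "EVENT"
    for (i = 1; i <= n; i++) line = line "\t" col[i]
    print line
}
function emit_row(   i, line) {     # print one tab-delimited row per block
    if (name == "") return
    line = name
    for (i = 1; i <= n; i++) line = line "\t" val[col[i]]
    print line
    split("", val)                  # clear the values for the next block
    name = ""
}
{ sub(/ +$/, "") }                  # strip trailing spaces
$0 == "EVENT" { emit_row(); getline; sub(/ +$/, ""); name = $0; next }
NF >= 2       { val[$1] = $2 }
END           { emit_row() }
' events.txt)
printf '%s\n' "$table"
```

A label that is missing from a block simply yields an empty cell. The tab-delimited result can be fed to a printf-based formatter like the "Make it pretty" step, with the fields taken in order from $1 to $7.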

# 5  
Old 01-14-2009
Hi, this is a little difficult, but I hope this can help you.

1> Convert your file into a strict two-column file:

Code:
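# cvt.sh
# Join each EVENT line with the service name on the next line,
# and strip trailing spaces so that blank lines are truly empty
sed '/^EVENT/{
N
s/ *\n/  /
}
s/ *$//' yourfile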

2> Use the Perl script below to process it:
Code:
#!/usr/bin/perl
# Return 1 if $value is already in the array referenced by $ref, else 0
sub _exist{
	my($ref,$value)=(@_);
	my @arr=@{$ref};
	for(my $i=0;$i<=$#arr;$i++){
		return 1 if $arr[$i] eq $value;
	}
	return 0;
}
$/="\n\n";	# read one blank-line-separated block at a time
open FH,"sh cvt.sh|" or die "cannot run cvt.sh: $!";
my (%res,$n,@seq);
while(<FH>){
	my @arr=split("\n",$_);
	foreach(@arr){
		my @tmp=split(/  +/,$_);
		$res{$.}->{$tmp[0]}=$tmp[1];
		# remember each label the first time it is seen
		push @seq,$tmp[0] if (_exist(\@seq,$tmp[0])==0);
	}
	$n++;
}
close FH;
# header row, then one row per block
print ((join "        ",@seq),"\n");
for(my $i=1;$i<=$n;$i++){
	my %hash=%{$res{$i}};
	map {printf("%20s",$hash{$_})} @seq;
	print "\n";
}

output:
Code:
EVENT        Date        Initial hour        Number of users        Average of use        Final hour        Average of pages printed
 INTERNET CONNECTION          11/01/2009               07:30                  27              32 min               19:00     
          LOCAL CALL          11/01/2009               07:42                  15               7 min               16:11     
  INTERNATIONAL CALL          11/01/2009               09:14                  21               5 min               16:17     
         PRINTER USE          12/01/2009               07:30                  23                                   19:00                  17
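
The column order in this output comes from recording each label the first time it is seen, which is why "Average of pages printed" ends up last. The same first-seen bookkeeping can be sketched in awk; the file name events.txt and the array names seen and order are just illustrative choices, and the two-or-more-spaces separator is assumed as before.

```shell
# Hypothetical sample: the second block introduces a label the first one lacks
cat > events.txt <<'EOF'
EVENT
LOCAL CALL
Date                       11/01/2009
Initial hour               07:42
Number of users            15
Average of use             7 min
Final hour                 16:11

EVENT
PRINTER USE
Date                       12/01/2009
Initial hour               07:30
Number of users            23
Average of pages printed   17
Final hour                 19:00
EOF

cols=$(awk -F '  +' '
{ sub(/ +$/, "") }                  # strip trailing spaces
NF >= 2 && !($1 in seen) {          # first time this label appears
    seen[$1] = 1
    order[++n] = $1
}
END {
    for (i = 1; i <= n; i++) printf "%s%s", order[i], (i < n ? "|" : "\n")
}
' events.txt)
printf '%s\n' "$cols"
```

If a particular column order is wanted instead, the list has to be hard-coded or sorted afterwards.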

# 6  
Old 01-14-2009
Hello angheloko and summer_cherry,

Many thanks for taking the time to help me. I got some errors while testing your code. Explanations below.

angheloko,

Could you tell me what I'm doing wrong, or how you ran the scripts? I can see it works for you, but for me it doesn't show the complete answer.

If I leave only input.txt and your script in a folder and run it step by step, the behaviour is as follows:

(1)
If I run the first part (# Extract data only and separate into files), it looks good so far and generates:


input2.txt (adding some pipes)
input2.txt.10
input2.txt.18
input2.txt.2
input2.txt.26

(2)
If I run the 2nd part (# Compose the headers), it looks good so far and generates:


colheaders.txt
headers.txt
rowheaders.txt

(3)
If I run the 3rd part (# Create the unformatted output), it looks good so far and generates:


output.txt --> (it looks like the columns are transposed into lines, but some squares appear in the output)

example:


Code:
	input.txt:INTERNET CONNECTION

	input.txt:Date

(4)
If I run the 4th part (# Make it pretty), it generates:


output2.txt

and it only shows:

Code:
EVENT     input.txt:EVENT                                
input.txt:Initial hour               07:30     
input.txt:Average of use             32 min    
input.txt:Final hour                 19:00     
input.txt:Date                       11/01/2009
input.txt:INTERNET CONNECTION                  
input.txt:Number of users            27

(5)
If I run the complete script in a folder that contains other files, I get an error.
(This is not too relevant, since I simply isolated the files and ran it again; it is for information only.)

Code:
sed: "input.txt", line 30: warning: newline appended
sed: input.txt: cannot open [No such file or directory]
---
Ö►╚¤¦S┌Dºhrar:}' input.txt@t ¿É.n☺D☻☻j¿{D─].:↔3       CDRs_1.pl*ÒÞ↨
          I┤▬↕_Y∟GòWâ╔►ë►!È@#a-¿»╗ÈѬÆû]ÍH¼Wâ▲M¯~ao▼═▄7▄¦Z/▒/¾
ßzÑ%ÝÖÛ"L©/¬¡           Í°C-QÕ~♥ffr?
File:        é/D*¨K█¢J┌b
CDRs_1.sh:cvt.sh
Ö►╚¤¦S┌Dºhr:}' input.txt@t ¿É.n☺D☻☻j¿{D─].:↔3 CDRs_1.pl*ÒÞ↨
          I┤▬↕_Y∟GòWâ╔►ë►!È@#a-¿»╗ÈѬÆû]ÍH¼Wâ▲M¯~ao▼═▄7▄¦Z/▒/¾
ßzÑ%ÝÖÛ"L©/¬¡           Í°C-QÕ~♥ffr?
$            é/D*¨K█¢J┌b


summer_cherry,

Thanks for your help, really. When I tried to test it, the first part seems to work for me without errors,
but when I try to run the second script I get:


Code:
[root@trm72 cc]# ./script.pl
./script.pl: line 2: sub: command not found
./script.pl: line 3: syntax error near unexpected token `$ref,$value'
'/script.pl: line 3: `  my($ref,$value)=(@_);
[root@trm72 cc]#

Could you tell me what I'm doing wrong, or how you ran the scripts? I can see it works for you.


Thanks for your help again.
# 7  
Old 01-14-2009
Hi cgkmal,

Could you post the flat file (source file) and the outputs of the script (input2.txt, etc.) so we can isolate which part caused the error?

In my case, when I run the script:

input.txt (source file):
Code:
EVENT
INTERNET CONNECTION
Date                       11/01/2009
Initial hour               07:30
Number of users            27
Average of use             32 min
Final hour                 19:00

EVENT
LOCAL CALL
Date                       11/01/2009
Initial hour               07:42
Number of users            15
Average of use             7 min
Final hour                 16:11

EVENT
INTERNATIONAL CALL
Date                       11/01/2009
Initial hour               09:14
Number of users            21
Average of use             5 min
Final hour                 16:17

EVENT
PRINTER USE
Date                       12/01/2009
Initial hour               07:30
Number of users            23
Average of pages printed   17
Final hour                 19:00

input2.txt (processed input.txt):
Code:
EVENT|INTERNET CONNECTION
Date|11/01/2009
Initial hour|07:30
Number of users|27
Average of use|32 min
Final hour|19:00
EVENT|LOCAL CALL
Date|11/01/2009
Initial hour|07:42
Number of users|15
Average of use|7 min
Final hour|16:11
EVENT|INTERNATIONAL CALL
Date|11/01/2009
Initial hour|09:14
Number of users|21
Average of use|5 min
Final hour|16:17
EVENT|PRINTER USE
Date|12/01/2009
Initial hour|07:30
Number of users|23
Average of pages printed|17
Final hour|19:00

input2.txt.(n) (separated records):
Code:
EVENT|INTERNET CONNECTION
Date|11/01/2009
Initial hour|07:30
Number of users|27
Average of use|32 min
Final hour|19:00

headers.txt (row and column headers):
Code:
Average of pages printed
Average of use
Date
EVENT
Final hour
INTERNATIONAL CALL
INTERNET CONNECTION
Initial hour
LOCAL CALL
Number of users
PRINTER USE

colheaders.txt (column headers):
Code:
Average of pages printed
Average of use
Date
Final hour
Initial hour
Number of users

rowheaders.txt (row headers):
Code:
INTERNATIONAL CALL
INTERNET CONNECTION
LOCAL CALL
PRINTER USE

output.txt (awk friendly table - tab-delimited):
Code:
EVENT   Average of pages printed        Average of use  Date    Final hour      Initial hour    Number of users
INTERNATIONAL CALL              5 min   11/01/2009      16:17   09:14   21
INTERNET CONNECTION             32 min  11/01/2009      19:00   07:30   27
LOCAL CALL              7 min   11/01/2009      16:11   07:42   15
PRINTER USE     17              12/01/2009      19:00   07:30   23

output2.txt (formatted output):
Code:
EVENT                 Date       Initial hour Number of users Average of use Average of pages printed  Final hour
INTERNATIONAL CALL    11/01/2009 09:14        21              5 min                                    16:17
INTERNET CONNECTION   11/01/2009 07:30        27              32 min                                   19:00
LOCAL CALL            11/01/2009 07:42        15              7 min                                    16:11
PRINTER USE           12/01/2009 07:30        23                             17                        19:00

Screen looks like this while running the script:
Code:
---
Row: INTERNATIONAL CALL
File: input2.txt.13
Average of pages printed
Average of use
Date
Final hour
Initial hour
Number of users
>> INTERNATIONAL CALL           5 min   11/01/2009      16:17   09:14   21
---
Row: INTERNET CONNECTION
File: input2.txt.1
Average of pages printed
Average of use
Date
Final hour
Initial hour
Number of users
>> INTERNET CONNECTION          32 min  11/01/2009      19:00   07:30   27
---
Row: LOCAL CALL
File: input2.txt.7
Average of pages printed
Average of use
Date
Final hour
Initial hour
Number of users
>> LOCAL CALL           7 min   11/01/2009      16:11   07:42   15
---
Row: PRINTER USE
File: input2.txt.19
Average of pages printed
Average of use
Date
Final hour
Initial hour
Number of users
>> PRINTER USE  17              12/01/2009      19:00   07:30   23
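
If the intermediate files on your side differ from these, one thing worth ruling out is DOS-style line endings. This is purely a guess, since the original file has not been posted: a carriage return at the end of each line would stop `/^ *$/d` and `s/ *$//` from matching, and it tends to show up as odd characters in the output. A sketch for checking and stripping them; dos_input.txt and unix_input.txt are made-up file names:

```shell
# Build a tiny test file with CRLF line endings (hypothetical example data)
printf 'EVENT\r\nLOCAL CALL\r\n\r\n' > dos_input.txt

# Count the lines that end in a carriage return
cr=$(printf '\r')
crlf_lines=$(grep -c "${cr}\$" dos_input.txt || true)
echo "lines ending in CR: $crlf_lines"

# Strip every carriage return and count again
tr -d '\r' < dos_input.txt > unix_input.txt
left=$(grep -c "${cr}\$" unix_input.txt || true)
echo "after tr: $left"
```

If the first count is non-zero for input.txt, running the script on the stripped copy would be the next thing to try.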
