Transpose data from columns to lines for each event


 
# 15  
Old 01-16-2009
Good! Now try this one and see if it works:

Code:
awk '
BEGIN { RS="EVENT"; FS="\n"; cols[0]="EVENT"; totalcols=0; rowno=0 } 
$2 != "" {
    vals[0] = $2
    for (i = 3; i <= NF; i++) {
    
        # Extract column name
        col = substr($i, 1, index($i, "  "))
        sub("^ *", "", col); sub(" *$", "", col)
    
        # See if column already existing
        found = 0
        for (colno = 0; colno <= totalcols; colno++) 
            if ( cols[colno] == col ) found = 1
        
        # If not, set position
        if ( found == 0 ) {
            totalcols++
            colno = totalcols
            cols[colno] = col
        }
        
        # Extract the value only
        val = substr($i, length(col) + 1)
        sub("^ *", "", val); sub(" *$", "", val)
        vals[colno] = val
    }
    
    for (i = 0; i <= totalcols; i++) {
        line[rowno] = line[rowno]","vals[i]
    }
    rowno++;
}
END { 
    for (i = 0; i <= totalcols; i++)
        header=header","cols[i]
    print header
    for (i = 0; i < rowno; i++)
        print line[i]
}
' input.txt | sed 's/^,//g'

# 16  
Old 01-16-2009
angheloko,

Running your latest code on the original input.txt gives:

Code:
EVENT,Initial hour,Number of users,Average of use,Final hour,,Date,Average of pages printed
INT
Date                       11/01/2009,07:30,27,32 min,19:00,
LOCAL CALL,07:30,27,32 min,19:00,,11/01/2009
INT,07:30,27,32 min,19:00,,11/01/2009
Date                       11/01/2009,07:30,27,32 min,19:00,,11/01/2009
PRINT,07:30,27,32 min,19:00,,11/01/2009
Date                       12/01/2009,07:30,27,32 min,19:00,,11/01/2009,17

# 17  
Old 01-16-2009
Wasn't expecting that. Anyway, I changed it a little. Try it again.

Code:
awk '
BEGIN { RS="EVENT"; FS="\n"; cols[0]="EVENT"; totalcols=0; rowno=0 } 
$2 != "" {
	vals[0] = $2

	for (i = 3; i <= NF; i++) {
	
		# Extract column name
		col = substr($i, 1, index($i, "  "))
		sub("^ *", "", col); sub(" *$", "", col)
	
		# See if column already existing
		found = 0
		for (colno = 0; colno <= totalcols; colno++) 
			if ( cols[colno] == col ) found = 1
		
		# If not, set position
		if ( found == 0 && col != "" ) {
			totalcols++
			colno = totalcols
			cols[colno] = col
			print colno": "col
		}
		
		# Extract the value only
		val = substr($i, length(col) + 1)
		sub("^ *", "", val); sub(" *$", "", val)
		vals[colno] = val
	}
	
	for (i = 0; i <= totalcols; i++) {
		line[rowno] = line[rowno]","vals[i]
	}
	rowno++;
}
END { 
	for (i = 0; i <= totalcols; i++)
		header=header","cols[i]
	print header
	for (i = 0; i < rowno; i++)
		print line[i]
}
' input.txt | sed 's/^,//g'

input.txt is:

Code:
EVENT
INTERNET CONNECTION
Date                       11/01/2009
Initial hour               07:30
Number of users            27
Average of use             32 min
Final hour                 19:00
EVENT
LOCAL CALL
Date                       11/01/2009
Initial hour               07:42
Number of users            15
Average of use             7 min
Final hour                 16:11
EVENT
INTERNATIONAL CALL
Date                       11/01/2009
Initial hour               09:14
Number of users            21
Average of use             5 min
Final hour                 16:17
EVENT
PRINTER USE
Date                       12/01/2009
Initial hour               07:30
Number of users            23
Average of pages printed   17
Final hour                 19:00

Output is:
Code:
EVENT,Date,Initial hour,Number of users,Average of use,Final hour,Average of pages printed
INTERNET CONNECTION,11/01/2009,07:30,27,32 min,19:00
LOCAL CALL,11/01/2009,07:30,27,32 min,19:00
INTERNATIONAL CALL,11/01/2009,07:30,27,32 min,19:00
PRINTER USE,11/01/2009,07:30,27,32 min,19:00,17

# 18  
Old 01-16-2009
Well, now I get this:

Code:
 
1: Initial hour
2: Number of users
3: Average of use
4: Final hour
5: Date
6: Average of pages printed
EVENT,Initial hour,Number of users,Average of use,Final hour,Date,Average of pages printed
07:30,27,32 min,19:00
11/01/2009
 
17

I'm confused about why you get different results with the same code.

Could it be something in the format of the two columns?

Well, thanks for your help.

# 19  
Old 01-16-2009
It's the machine. If we had the same OS this would have been solved earlier.
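
A likely cause of the difference, judging by the names cut down to "INT" and "PRINT" in the output above: a multi-character RS is not portable. POSIX leaves its behavior unspecified. gawk and mawk treat it as a regular expression, while some older awks use only its first character, so RS="EVENT" acts like RS="E" and every record is cut at each capital E. A quick way to check the local awk, as a sketch (check_rs is a made-up helper name, not part of any tool):

```shell
# Report how the local awk handles a multi-character RS.
# check_rs is a hypothetical helper name used only for this illustration.
check_rs() {
    printf 'aXYbXc\n' | awk '
        BEGIN { RS = "XY" }
        END {
            # RS taken as the whole string "XY": the input splits into 2 records.
            # RS taken as "X" only: the input splits into 3 records.
            if (NR == 2) print "multi-character RS supported"
            else         print "only the first character of RS is used"
        }'
}

check_rs
```

If this reports that only the first character is used, any script in this thread that sets RS="EVENT" will split the data in the wrong places on that machine.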

Anyway, I made some changes again and got the desired results with this one. Give it a try and post the results.

Code:
awk '
BEGIN { RS="EVENT"; FS="\n"; cols[0]="EVENT"; totalcols=0; rowno=0 }

$2 != "" {
    line=line"EVENT  "$2
    for (i=3; i<NF; i++) {
        line=line","$i
        
        # Extract column name
        col = substr($i, 1, index($i, "  "))
        sub("^ *", "", col); sub(" *$", "", col)
    
        # See if column already existing
        found = 0
        for (j = 0; j <= totalcols; j++) 
            if ( cols[j] == col ) found = 1
        
        # If not, set position
        if ( found == 0 && col != "" ) {
            totalcols++
            j = totalcols
            cols[j] = col
        }
    }
    line=line"|"
}

END {
    for (i = 0; i <= totalcols; i++)
        header=header","cols[i]
    print header
    
    # Split into records
    top=split(line, records, "|")
    for (i=1; i<top; i++) {
    
        # Split into fields
        top2=split(records[i], fields, ",")
        for (j=0; j<=totalcols; j++) {
            # Look for column cols[j] among the fields of this record
            found=0
            for (k=0; k<=top2; k++) {
                
                # Extract column name
                col = substr(fields[k], 1, index(fields[k], "  "))
                sub("^ *", "", col); sub(" *$", "", col)
            
                #print ">"cols[j]": "fields[k]": "col
                if (cols[j] == col) {
                    # Extract the value only
                    val = substr(fields[k], length(col) + 1)
                    sub("^ *", "", val); sub(" *$", "", val)
                    row=row","val
                    found=1
                    #print "found: "val
                }
            }
            if (found==0) row=row","
        }
        print row
        row=""
    }
}
' input.txt | sed 's/^,//g' > input.txt.tmp

# Make it pretty - This is where you'll make adjustments for the output
awk '
BEGIN { FS="," }

{ printf ("%-22s%-11s%-13s%-16s%-15s%-11s%s\n", $1, $2, $3, $4, $5, $6, $7) }

' input.txt.tmp

input.txt:

Code:
EVENT
INTERNET CONNECTION
Date                       11/01/2009
Initial hour               07:30
Number of users            27
Average of use             32 min
Final hour                 19:00
EVENT
LOCAL CALL
Date                       11/01/2009
Initial hour               07:42
Number of users            15
Average of use             7 min
Final hour                 16:11
EVENT
INTERNATIONAL CALL
Date                       11/01/2009
Initial hour               09:14
Number of users            21
Average of use             5 min
Final hour                 16:17
EVENT
PRINTER USE
Date                       12/01/2009
Initial hour               07:30
Number of users            23
Average of pages printed   17
Final hour                 19:00

# 20  
Old 01-16-2009
Hi angheloko,

I've tried it, and it looks better each time. This time I can see 3 things:

1- It's putting "Date" as a row header, and it has to be a column header.

2- The "words" of column 1 that should become row headers are being split. For example, "INTERNET CONNECTION"
appears in output.txt only as "INT", and the same happens for the others.

3- The column headers are being joined with the next column header in some cases,
for example: "Initial hourNumber of usersAverage of use"

This is my new output this time:

Code:
 
EVENT                 Initial hourNumber of usersAverage of use  Final hour     Date       Average of pages printed
INT                                                                                     
Date                       11/01/200907:30      27           32 min          19:00                     
LOCAL CALL            07:42      15           7 min           16:11          11/01/2009 
INT                                                                                     
Date                       11/01/200909:14      21           5 min           16:17                     
PRINT                                                                                   
Date                       12/01/200907:30      23                           19:00                     17

A question:

From the code it looks variable: if the blocks have more than four or five lines, will the code still process them,
or does it take a fixed number of column headers and row headers?

I really appreciate your help, angheloko; I'm learning a little more each time.

Best regards.

# 21  
Old 01-16-2009
Hi cg,

Maybe our input files have different formats, because when I test it, it renders perfectly.

As for your question, the code will take a variable number of row and column headers, as long as:

Column headers and their values are separated by 2 or more spaces (not tabs)

- and -

Row headers are located on the line right after the EVENT keyword.
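
The two-or-more-spaces rule can be illustrated in isolation. This is only a sketch and not the code from the posts above: split_pair is a hypothetical helper, and it uses match() with RSTART and RLENGTH rather than the index()/substr() approach used earlier in the thread.

```shell
# split_pair is a hypothetical helper: for each input line it prints
# "name|value", splitting at the first run of two or more spaces.
split_pair() {
    awk '{
        if (match($0, /  +/)) {
            name  = substr($0, 1, RSTART - 1)     # text before the gap
            value = substr($0, RSTART + RLENGTH)  # text after the gap
            print name "|" value
        } else {
            print $0 "|"    # no gap of 2+ spaces (e.g. a tab): name only
        }
    }'
}

printf 'Average of use             32 min\n' | split_pair
# prints: Average of use|32 min
```

Single spaces inside the name or the value ("Average of use", "32 min") are left alone, which is why a tab-separated line would not be split at all.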

I tested it with this:

Code:
EVENT
INTERNET CONNECTION
Date                       11/01/2009
Initial hour               07:30
Number of users            27
Final hour                 19:00
EVENT
LOCAL CALL
Date                       11/01/2009
Initial hour               07:42
Number of users            15
Average of use             7 min
Final hour                 16:11
EVENT
INTERNATIONAL CALL
Date                       11/01/2009
Initial hour               09:14
Number of users            21
Average of use             5 min
Final hour                 16:17
EVENT
PRINTER USE
Date                       12/01/2009
Initial hour               07:30
Number of users            23
Average of pages printed   17
Final hour                 19:00

and the output is:

Code:
EVENT                 Date       Initial hour Number of users Final hour     Average of use Average of pages printed
INTERNET CONNECTION   11/01/2009 07:30        27              19:00                     
LOCAL CALL            11/01/2009 07:42        15              16:11          7 min      
INTERNATIONAL CALL    11/01/2009 09:14        21              16:17          5 min      
PRINTER USE           12/01/2009 07:30        23              19:00                         17

As you can see, I removed the 'Average of use' line from the INTERNET CONNECTION event and the output was still correct.

I think it's the input file. Make sure it is in the following format:

Code:
EVENT
<row_header_1>
<column_1><2 or more spaces(not tabs)><value_1>
<column_2><2 or more spaces(not tabs)><value_2>
<column_3><2 or more spaces(not tabs)><value_3>
...
<column_n><2 or more spaces(not tabs)><value_n>
EVENT
<row_header_2>
<column_1><2 or more spaces(not tabs)><value_1>
<column_2><2 or more spaces(not tabs)><value_2>
<column_3><2 or more spaces(not tabs)><value_3>
...
<column_n><2 or more spaces(not tabs)><value_n>

Believe me, man. I'm learning more and more about awk each time I look at your problem.
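
For an awk that does not accept a multi-character RS (POSIX leaves that case unspecified, and some implementations use only the first character, which would explain names cut down to "INT" and "PRINT"), the same format can be processed line by line without touching RS. A minimal sketch, assuming every block starts with a line that is exactly EVENT, values contain no commas, and the file has no carriage returns; events_to_csv is a made-up name:

```shell
# events_to_csv is a hypothetical helper: it reads the EVENT block format on
# stdin and prints a CSV header followed by one CSV line per event.
# Usage: events_to_csv < input.txt
events_to_csv() {
    awk '
    $0 == "EVENT" { n++; expect_name = 1; next }          # new block starts
    expect_name   { name[n] = $0; expect_name = 0; next } # row header line
    n && match($0, /  +/) {                               # column + value line
        col = substr($0, 1, RSTART - 1)
        val = substr($0, RSTART + RLENGTH)
        # Remember each column name once, in order of first appearance
        if (!(col in seen)) { seen[col] = 1; cols[++ncols] = col }
        cell[n, col] = val
    }
    END {
        header = "EVENT"
        for (c = 1; c <= ncols; c++) header = header "," cols[c]
        print header
        for (r = 1; r <= n; r++) {
            row = name[r]
            # A column missing from this event yields an empty cell
            for (c = 1; c <= ncols; c++) row = row "," cell[r, cols[c]]
            print row
        }
    }'
}
```

The output is comma-separated, so it can be fed to the same printf formatting step shown earlier in the thread.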