Rows into columns?


 
# 1  
Old 06-06-2010

I have a space-delimited file that looks something like this:

Code:
Joe Smith jsmith 43234 bill1;bill2;read;read2;schedule
Andy Summers asummers 11232 bill1;read
Beth McConnel bmconnel 43443 read;read2;schedule;bill
Susan Fowler sfowler 09332 bill1;read;schedule

I need to transpose/manipulate the rows to look like:

Code:
Joe Smith jsmith 43234 bill1
Joe Smith jsmith 43234 bill2
Joe Smith jsmith 43234 read
Joe Smith jsmith 43234 read2
Joe Smith jsmith 43234 schedule
Andy Summers asummers 11232 bill1
Andy Summers asummers 11232 read
Beth McConnel bmconnel 43443 read
Beth McConnel bmconnel 43443 read2
Beth McConnel bmconnel 43443 schedule
Beth McConnel bmconnel 43443 bill
Susan Fowler sfowler 09332 bill1
Susan Fowler sfowler 09332 read
Susan Fowler sfowler 09332 schedule

I guess the good thing is that the data I need to transpose is always in field 5 (the user functions) and that its items are semicolon-delimited, but the number of items varies: sometimes there is only 1 entry, other times there could be as many as 7.

I am able to do elementary transpositions like making 1;2;3;4 into
Code:
1
2
3
4

But I am having a hard time capturing the preceding info and repeating it for each user function. Can anyone provide any hints or suggestions?

Last edited by Scott; 06-06-2010 at 05:28 PM.. Reason: Code tags
# 2  
Old 06-06-2010
Code:
awk '{n=split($5,a,";");for(i=1;i<=n;i++){print $1,$2,$3,$4,a[i]}}' infile > outfile
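
For readers less used to awk, the one-liner above can be unpacked as follows. This is only a commented sketch of the same logic: the wrapper function name split_functions and the array name parts are illustrative and do not come from the post.

```shell
# Hypothetical wrapper around the same awk logic, spread over several lines.
split_functions() {
    awk '{
        # Break field 5 on ";" into the array parts; n is the number of items.
        n = split($5, parts, ";")
        # Print the first four fields once per item.
        for (i = 1; i <= n; i++)
            print $1, $2, $3, $4, parts[i]
    }' "$@"
}
```

Called as split_functions infile > outfile, it should behave like the one-liner; with no file argument it reads standard input.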

# 3  
Old 06-06-2010
I'm sorry to keep clogging people's posts. I've been practicing finding pure shell (or close to it) solutions to problems, so while this may not be the most practical solution for your needs, I'd appreciate it if anyone could tell me whether there's a better way to solve this guy's problem than what I came up with:

Code:
(23:43:55\[D@DeCoBox15)
[~]$ cat input
Joe Smith jsmith 43234 bill1;bill2;read;read2;schedule
Andy Summers asummers 11232 bill1;read
Beth McConnel bmconnel 43443 read;read2;schedule;bill
Susan Fowler sfowler 09332 bill1;read;schedule

(01:16:30\[D@DeCoBox15)
[~]$ cat mysolution
#!/bin/bash
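    # Put each ';'-separated item on its own line, then rebuild the records.
    tr ";" "\n" < "$1" | while read -r line; do
        # More than one word: the start of a new record, so remember its fields.
        if [[ $(echo "$line" | wc -w) -gt 1 ]]; then
            array=($line)
            echo "${array[*]}"
        else
            # A lone word is a further item for the record remembered above.
            echo "${array[0]} ${array[1]} ${array[2]} ${array[3]} $line"
        fi
    done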

(01:16:54\[D@DeCoBox15)
[~]$ ./mysolution input
Joe Smith jsmith 43234 bill1
Joe Smith jsmith 43234 bill2
Joe Smith jsmith 43234 read
Joe Smith jsmith 43234 read2
Joe Smith jsmith 43234 schedule
Andy Summers asummers 11232 bill1
Andy Summers asummers 11232 read
Beth McConnel bmconnel 43443 read
Beth McConnel bmconnel 43443 read2
Beth McConnel bmconnel 43443 schedule
Beth McConnel bmconnel 43443 bill
Susan Fowler sfowler 09332 bill1
Susan Fowler sfowler 09332 read
Susan Fowler sfowler 09332 schedule

I think I could have eliminated the tr if I changed the IFS and tweaked the way I set the names, but this is the best I could come up with.
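
A minimal sketch of that IFS idea, written in POSIX sh so it also runs under bash. It assumes every record has exactly four leading fields followed by the semicolon-separated list; the function name expand_roles and the variable names first, last, login, id and roles are made up for illustration.

```shell
# Hypothetical pure-shell variant: no tr, wc, sed or awk inside the loop.
expand_roles() {
    # read splits on whitespace; the fifth variable gets the rest of the line.
    while read -r first last login id roles; do
        old_ifs=$IFS
        IFS=';'     # split the unquoted $roles on semicolons
        set -f      # turn off globbing while $roles is expanded unquoted
        for role in $roles; do
            printf '%s %s %s %s %s\n' "$first" "$last" "$login" "$id" "$role"
        done
        set +f
        IFS=$old_ifs    # restore before the next read
    done
}
```

It reads standard input, so it would be run as expand_roles < input. A last line without a trailing newline is skipped by this sketch.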

Damn, after trying out Bartus's solution, I just feel silly now...

Code:
[~]$ time awk '{n=split($5,a,";");for(i=1;i<=n;i++){print $1,$2,$3,$4,a[i]}}' pxe > /dev/null

real    0m0.088s
user    0m0.000s
sys     0m0.031s

(23:49:50\[D@DeCoBox15)
[~]$ time ./tst > /dev/null

real    0m2.338s
user    0m0.391s
sys     0m0.516s


Last edited by DeCoTwc; 06-06-2010 at 07:17 PM.. Reason: completely boinked it the first time
# 4  
Old 06-06-2010
This one probably won't win any performance benchmark, but it should produce the desired output.
Code:
while read line; do
    firstpart=$(echo "$line" | sed 's/^\(.*[0-9]\{5\}\).*/\1/')
    records=$(echo "$line" | awk '{print $NF }' | sed 's/;/ /g')
    for i in $(echo $records); do
        echo $firstpart $i
    done
done < file

# 5  
Old 06-06-2010
Quote:
Originally Posted by pseudocoder
This one probably won't win any performance benchmark, but it should produce the desired output.
Code:
while read line; do
    firstpart=$(echo "$line" | sed 's/^\(.*[0-9]\{5\}\).*/\1/')
    records=$(echo "$line" | awk '{print $NF }' | sed 's/;/ /g')
    for i in $(echo $records); do
        echo $firstpart $i
    done
done < file

Ok...I just don't get it. Your loop involves creating multiple variables with multiple calls to sed and awk, and my loop only has wc -w in it.

Is 6 calls of wc -w really slower than 4 calls of awk and 8 calls of sed?
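
One thing worth noting when counting calls: the wc -w test runs once for every line that comes out of tr, so the number of wc calls follows the number of output lines rather than the number of input records. A sketch of the same tr-based approach with the word count done by the shell itself is below. It is untimed, the function name expand_with_tr and the variable prefix are hypothetical, and it assumes four leading fields per record.

```shell
# Hypothetical variant: keep tr, but count words with the shell's own
# positional parameters instead of starting wc for each line.
expand_with_tr() {
    tr ';' '\n' | while read -r line; do
        set -f          # no globbing while $line is split unquoted
        set -- $line    # the words of the line become $1, $2, ...
        set +f
        if [ "$#" -gt 1 ]; then
            # Start of a new record: remember its first four fields.
            prefix="$1 $2 $3 $4"
            echo "$*"
        else
            # A lone word is a further item for the remembered record.
            echo "$prefix $line"
        fi
    done
}
```

It reads standard input, for example expand_with_tr < input.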

Code:
(01:21:43\[D@DeCoBox15)
[~]$ time ./mysolution input
Joe Smith jsmith 43234 bill1
Joe Smith jsmith 43234 bill2
Joe Smith jsmith 43234 read
Joe Smith jsmith 43234 read2
Joe Smith jsmith 43234 schedule
Andy Summers asummers 11232 bill1
Andy Summers asummers 11232 read
Beth McConnel bmconnel 43443 read
Beth McConnel bmconnel 43443 read2
Beth McConnel bmconnel 43443 schedule
Beth McConnel bmconnel 43443 bill
Susan Fowler sfowler 09332 bill1
Susan Fowler sfowler 09332 read
Susan Fowler sfowler 09332 schedule

real    0m2.092s
user    0m0.272s
sys     0m0.468s


(00:50:43\[D@DeCoBox15)
[~]$ time ./yoursolution input
Joe Smith jsmith 43234 bill1
Joe Smith jsmith 43234 bill2
Joe Smith jsmith 43234 read
Joe Smith jsmith 43234 read2
Joe Smith jsmith 43234 schedule
Andy Summers asummers 11232 bill1
Andy Summers asummers 11232 read
Beth McConnel bmconnel 43443 read
Beth McConnel bmconnel 43443 read2
Beth McConnel bmconnel 43443 schedule
Beth McConnel bmconnel 43443 bill
Susan Fowler sfowler 09332 bill1
Susan Fowler sfowler 09332 read
Susan Fowler sfowler 09332 schedule

real    0m1.622s
user    0m0.151s
sys     0m0.394s


Last edited by DeCoTwc; 06-06-2010 at 07:22 PM..
# 6  
Old 06-06-2010
And on top of that your solution doesn't really give correct results :P At least from the output you are showing here.
Quote:
Originally Posted by DeCoTwc
Ok...I just don't get it. Your loop involves creating multiple variables with multiple calls to sed and awk, and my loop only has wc -w in it.

Is 6 calls of wc -w really slower than 4 calls of awk and 8 calls of sed?

Code:
(00:50:33\[D@DeCoBox15)
[~]$ time ./mysolution input
Joe Smith jsmith 43234 bill1
Joe Smith 43234 bill1 bill2
Joe Smith 43234 bill1 read
Joe Smith 43234 bill1 read2
Joe Smith 43234 bill1 schedule
Andy Summers asummers 11232 bill1
Andy Summers 11232 bill1 read
Beth McConnel bmconnel 43443 read
Beth McConnel 43443 read read2
Beth McConnel 43443 read schedule
Beth McConnel 43443 read bill
Susan Fowler sfowler 09332 bill1
Susan Fowler 09332 bill1 read
Susan Fowler 09332 bill1 schedule

real    0m2.273s
user    0m0.286s
sys     0m0.457s

(00:50:43\[D@DeCoBox15)
[~]$ time ./yoursolution input
Joe Smith jsmith 43234 bill1
Joe Smith jsmith 43234 bill2
Joe Smith jsmith 43234 read
Joe Smith jsmith 43234 read2
Joe Smith jsmith 43234 schedule
Andy Summers asummers 11232 bill1
Andy Summers asummers 11232 read
Beth McConnel bmconnel 43443 read
Beth McConnel bmconnel 43443 read2
Beth McConnel bmconnel 43443 schedule
Beth McConnel bmconnel 43443 bill
Susan Fowler sfowler 09332 bill1
Susan Fowler sfowler 09332 read
Susan Fowler sfowler 09332 schedule

real    0m1.622s
user    0m0.151s
sys     0m0.394s

# 7  
Old 06-06-2010
Quote:
Originally Posted by bartus11
And on top of that your solution doesn't really give correct results :P At least from the output you are showing here.
Well I'll be a son of a something else...fixed it.

Code:
(01:18:49\[D@DeCoBox15)
[~]$ ./mysolution input > my_results

(01:19:05\[D@DeCoBox15)
[~]$ sdiff expected_results my_results
Joe Smith jsmith 43234 bill1                                    Joe Smith jsmith 43234 bill1
Joe Smith jsmith 43234 bill2                                    Joe Smith jsmith 43234 bill2
Joe Smith jsmith 43234 read                                     Joe Smith jsmith 43234 read
Joe Smith jsmith 43234 read2                                    Joe Smith jsmith 43234 read2
Joe Smith jsmith 43234 schedule                                 Joe Smith jsmith 43234 schedule
Andy Summers asummers 11232 bill1                               Andy Summers asummers 11232 bill1
Andy Summers asummers 11232 read                                Andy Summers asummers 11232 read
Beth McConnel bmconnel 43443 read                               Beth McConnel bmconnel 43443 read
Beth McConnel bmconnel 43443 read2                              Beth McConnel bmconnel 43443 read2
Beth McConnel bmconnel 43443 schedule                           Beth McConnel bmconnel 43443 schedule
Beth McConnel bmconnel 43443 bill                               Beth McConnel bmconnel 43443 bill
Susan Fowler sfowler 09332 bill1                                Susan Fowler sfowler 09332 bill1
Susan Fowler sfowler 09332 read                                 Susan Fowler sfowler 09332 read
Susan Fowler sfowler 09332 schedule                             Susan Fowler sfowler 09332 schedule

(01:19:15\[D@DeCoBox15)
[~]$
