Convert rows to column and add header
Post 302920595 by redse171 on Friday 10th of October 2014 11:08:04 AM
10-10-2014
Quote:
Originally Posted by pravin27
How about this ?

Code:
 awk 'NR==FNR{a[$(NF-1)]=$(NF-1);next}
{if (FNR==1) { asort(a) ; printf "Fam\t\tNo\t\tName\t" ; for ( j in a ) { printf a[j] FS }} if ( !b[$1,$2] ) { if ( FNR>1) { for(j in a) {if ( p[a[j]] ) { printf OFS p[a[j]] } else {printf OFS "NULL"} } delete p;} printf "\n"; b[$1,$2]++; printf $1 OFS $2 OFS ;for (i=3;i<=NF-2;i++) { printf $i FS  } ; } if (FNR==1) { asort(a) } ; for ( j in a ) { if ( a[j] == $(NF-1) ) { p[a[j]]=$NF;} }} END {  for(j in a) {if ( p[a[j]] ) { printf OFS p[a[j]] } else {printf OFS "NULL" } } printf "\n";}' OFS="|" testFile testFile

Hi Pravin27,

Thanks a lot! It works perfectly on my real data. There are new things in the code that interest me, like the asort function. I am also wondering why the input file is given twice there. Anyway, I will study your code first and come back if I don't understand it. :)
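
On the input file being named twice: awk reads its file operands in order, so listing testFile twice gives the script two passes over the same data. In the quoted one-liner the first pass (the NR==FNR rule) collects the distinct names and the second pass prints the rows. Below is a minimal sketch of that idiom only, not the posted script; the sample data and the names seen, total and two_pass are made up for illustration.

```shell
# Hypothetical sample data (not from the thread): group, name, value.
tmp=$(mktemp)
printf '%s\n' 'x Zorro 10' 'x Ace 1' 'y Bora 2' 'y Ace 3' > "$tmp"

# The same file is named twice, so awk reads it twice.
# NR counts lines across all files; FNR restarts at 1 for each file,
# so NR == FNR is true only while the first copy is being read.
two_pass=$(awk '
    NR == FNR { seen[$2] = 1; total++; next }    # pass 1: collect only
    FNR == 1  { n = 0; for (k in seen) n++       # start of pass 2: summary
                print total " lines, " n " distinct names" }
    { print FNR ": " $0 }                        # pass 2: print each row
' "$tmp" "$tmp")

printf '%s\n' "$two_pass"
rm -f "$tmp"
```

The summary line is printed before any row, which is only possible because the whole file has already been read once.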

---------- Post updated at 11:08 AM ---------- Previous update was at 11:02 AM ----------

Quote:
Originally Posted by Akshay Hegde
Try

Input
Code:
[akshay@nio tmp]$ cat file
192.98.1   192.98.192.98.17    VVC family                            Zorro    10
192.98.1   192.98.192.98.17    VVC family                            Ace      1
192.98.1   192.98.192.98.17    VVC family                            Bora     1
192.98.1   192.98.192.98.17    VVC family                            Sakura   5
12.A.4     12.A.4.10.30        channel2 family                       Usopun   1
7.A3.14    7.A3.14.3.1         DuanXon channel family                T-Law    1
7.A3.14    7.A3.14.3.1         DuanXon channel family                Robyn    1
7.A3.14    7.A3.14.3.1         DuanXon channel family                Zorro    1
7.A3.14    7.A3.14.3.1         DuanXon channel family                Ace      1
7.A3.14    7.A3.14.3.1         DuanXon channel family                Bora     1
7.A3.14    7.A3.14.3.1         DuanXon channel family                Sakura   1
7.A3.14    7.A3.14.3.1         DuanXon channel family                Hashir   1
8.M.14     8.M.14.1            potential receptor, channel family    Robyn    1
8.M.14     8.M.14.1.2          potential receptor, channel family    Usopun   1
8.M.14     8.M.14.1.3          potential receptor, channel family    T-Law    1
8.M.14     8.M.14.1.3          potential receptor, channel family    Zorro    2
8.M.14     8.M.14.1.3          potential receptor, channel family    Ace      4
8.M.14     8.M.14.1.3          potential receptor, channel family    Bora     1
8.M.14     8.M.14.1.3          potential receptor, channel family    Sakura   2
1.P.5      1.P.5.18.1          major intrinsic family                Ace      8
1.P.5      1.P.5.18.3          major intrinsic family                Sakura   1
1.P.5      1.P.5.6.4           major intrinsic family                T-Law    1
1.P.5      1.P.5.6.4           major intrinsic family                Robyn    6
1.P.5      1.P.5.6.4           major intrinsic family                Sakura   1

Excel version

Code:
awk '
    {
        # Field 1,2,4 and 5
        f1 = $1  
        f2 = $2 
        f4 = $(NF -1)
        f5 = $NF
                
        # This is for f3
        # We could also do $1 = $2 = $(NF-1) = $NF = ""; f3 = $0
        # but assigning to a field rebuilds $0 with OFS (a comma here),
        # so the original spacing inside f3 would be lost

        split($0,d,/[^[:space:]]*/)
        for(i=3; i<=NF-2; i++)
        {
            f3 = sprintf("%s%s%s",f3,d[i],$i)
        }
        
        # Remove Leading and trailing space in f3
        gsub(/^[ \t]+|[ \t]+$/,"",f3)    

        # Index
        key = f1 SUBSEP f2 SUBSEP f3        

        # Uniq index
        D[key]
        
        # Hash with value
        N[key,f4] = f5

        # Uniq User
        U[f4] 
        
        # reset variable
        key = f3 = "" 
    }
    END{
        # Sort User array
        n = asorti(U,copy)        
    
        # Loop through uniq index
        for(key in D)
        {
            # Split key
            split(key,S,SUBSEP)

            # Loop through uniq user
            for(i=1;i<=n;i++)
            {
            
                # User
                u = copy[i]

                # If header not printed then
                if(!header)
                {
                    # Create header
                    hdr = sprintf("%s%s%s",hdr,OFS,u)
                }            
                
                # index we are looking for
                ind = S[1] SUBSEP S[2] SUBSEP S[3] SUBSEP u

                # If the key exists, use its value; otherwise use the string null
                val = (ind in N)? N[ind] : "null"
                
                # Save value in variable
                str = sprintf("%s%s%s",str,OFS,val)
            }
            
            # write header if header not set
            if(!header)
            {
                print "Fam","No","Name" hdr
                
                # set flag header
                header = 1
            }
            
            # Print values
            # Quotes are added in case you want to open the output in Excel,
            # as there are commas inside the field
            print S[1],S[2],"\""S[3]"\"" str
            str = ""
        }
    }
    '     OFS=","  file


Resulting

Code:
Fam,No,Name,Ace,Bora,Hashir,Robyn,Sakura,T-Law,Usopun,Zorro
12.A.4,12.A.4.10.30,"channel2 family",null,null,null,null,null,null,1,null
1.P.5,1.P.5.6.4,"major intrinsic family",null,null,null,6,1,1,null,null
1.P.5,1.P.5.18.3,"major intrinsic family",null,null,null,null,1,null,null,null
8.M.14,8.M.14.1.3,"potential receptor, channel family",4,1,null,null,2,1,null,2
8.M.14,8.M.14.1,"potential receptor, channel family",null,null,null,1,null,null,null,null
8.M.14,8.M.14.1.2,"potential receptor, channel family",null,null,null,null,null,null,1,null
7.A3.14,7.A3.14.3.1,"DuanXon channel family",1,1,1,1,1,1,null,1
1.P.5,1.P.5.18.1,"major intrinsic family",8,null,null,null,null,null,null,null
192.98.1,192.98.192.98.17,"VVC family",1,1,null,null,5,null,null,10
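
The script above always wraps the Name field in double quotes, which is enough for this data. If a field could itself contain a double quote, the usual CSV convention is to double it. A small sketch of such a helper follows; csvq and csv_line are hypothetical names that are not part of the posted script, and the third field is invented to show the escaping.

```shell
csv_line=$(awk '
    # Hypothetical helper: quote a field only when it contains a comma
    # or a double quote, doubling any embedded double quotes.
    function csvq(s) {
        if (s ~ /[",]/) {
            gsub(/"/, "\"\"", s)
            return "\"" s "\""
        }
        return s
    }
    BEGIN {
        print csvq("8.M.14") "," csvq("potential receptor, channel family") "," csvq("say \"hi\"")
    }
')
printf '%s\n' "$csv_line"
```

Fields without commas or quotes pass through unchanged, so plain values such as 8.M.14 stay unquoted.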

Fixed-width version (don't blame me if it doesn't open properly in Excel)
Code:
awk '
    {
        split($0,d,/[^[:space:]]*/)
        for(i=1; i<=NF-2; i++)
        {
            key = sprintf("%s%s%s",key,d[i],$i)
        }
        D[key]
        N[key,$(NF-1)] = $NF
        U[$(NF-1)]
        key = ""
    }
    END{
        n = asorti(U,copy)
        for(key in D)
        {
            # key was built without SUBSEP, so S[1] holds the whole key
            split(key,S,SUBSEP)
            for(i=1;i<=n;i++)
            {
            
                u = copy[i]

                if(!header)
                {
                    hdr = sprintf("%3s%s%3s",hdr,"\t",u)
                }
                val = ((S[1],u) in N)? N[S[1],u] : "null"
                str = sprintf("%3s%s%3s",str,"\t",val)
            }
            if(!header)
            {
                printf("%-100s%s\n",sprintf("%-15s%-25s%s","Fam","No","Name"),hdr);
                header = 1
            }
            printf("%-100s%s\n", S[1],str)
            str = ""
        }
    }
    '  file

Resulting
Code:
Fam            No                       Name                                                               Ace    Bora    Hashir    Robyn    Sakura    T-Law    Usopun    Zorro
1.P.5      1.P.5.18.3          major intrinsic family                                                      null    null    null    null      1    null    null    null
8.M.14     8.M.14.1.3          potential receptor, channel family                                            4      1    null    null      2      1    null      2
8.M.14     8.M.14.1            potential receptor, channel family                                          null    null    null      1    null    null    null    null
192.98.1   192.98.192.98.17    VVC family                                                                    1      1    null    null      5    null    null     10
1.P.5      1.P.5.18.1          major intrinsic family                                                        8    null    null    null    null    null    null    null
1.P.5      1.P.5.6.4           major intrinsic family                                                      null    null    null      6      1      1    null    null
12.A.4     12.A.4.10.30        channel2 family                                                             null    null    null    null    null    null      1    null
8.M.14     8.M.14.1.2          potential receptor, channel family                                          null    null    null    null    null    null      1    null
7.A3.14    7.A3.14.3.1         DuanXon channel family                                                        1      1      1      1      1      1    null      1
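
In both result listings the rows come out in a different order from the input. That is expected: awk does not specify the traversal order of for (key in D). If input order matters, one option is to record each key's first-seen position in a numerically indexed array and walk that in END. A minimal sketch, using simplified three-column data and hypothetical array names (order, vals):

```shell
ordered=$(printf '%s\n' \
    '192.98.1 Zorro 10' \
    '12.A.4 Usopun 1' \
    '192.98.1 Ace 1' \
    '7.A3.14 Robyn 1' |
awk -v OFS="," '
    {
        key = $1
        if (!(key in vals)) order[++n] = key   # remember first-seen position
        vals[key] = vals[key] OFS $3           # append this value to the row
    }
    END {
        # Walk the numeric index rather than for (key in vals),
        # whose traversal order is unspecified.
        for (i = 1; i <= n; i++)
            print order[i] vals[order[i]]
    }
')
printf '%s\n' "$ordered"
```

With gawk specifically, setting PROCINFO["sorted_in"] (for example to "@ind_str_asc") before the for-in loop is another way to get a predictable order.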

Hi Akshay Hegde,

Your code worked awesomely! There are many new things in it, and I really appreciate your clear explanation of each step. Thanks a million!
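
One note on the new functions: asort and asorti are gawk extensions, so the scripts in this thread need gawk. Where only a POSIX awk is available, the column names can be sorted by hand. The following is a rough sketch under that assumption, with a plain insertion sort; names and sorted_names are hypothetical, and the input is a handful of names from the sample file.

```shell
sorted_names=$(printf '%s\n' Zorro Ace Bora Sakura Ace | awk '
    { U[$1] }                          # collect unique names as array keys
    END {
        n = 0
        for (k in U) names[++n] = k    # copy keys into an indexed array
        # Insertion sort on the indexed copy (string comparison).
        for (i = 2; i <= n; i++) {
            v = names[i]
            for (j = i - 1; j >= 1 && names[j] > v; j--)
                names[j + 1] = names[j]
            names[j + 1] = v
        }
        for (i = 1; i <= n; i++)
            printf "%s%s", names[i], (i < n ? "," : "\n")
    }
')
printf '%s\n' "$sorted_names"
```

Insertion sort is quadratic, which is fine for a handful of column names but would be slow for very large key sets.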
 
