Reading columns, making a new file using another as template


 
# 1  
Old 03-09-2012

Hi fellas,
I have two files such as:
File 1
interacao,AspAsp,AspCys,CysAsp,CysCys,classe

File 2
interacao,AspAsp,CysAsp,AspCys,CysCys,classe
beta_alfa, DA, CA, DD, CD,ppi

I want to create a File 3 that holds the data from File 2 but with its columns in the order given by File 1, e.g.:
File 3
interacao,AspAsp,AspCys,CysAsp,CysCys,classe
beta_alfa, DA, DD, CA, CD,ppi

NOTE: I inserted the spaces in the File 2 and File 3 examples only to make them easier to read.

The example has only 6 columns, but my real file has 402 columns.
So a hard-coded awk -F "," '{print $1","$2","$4","$3","$5","$6}'
will not work, because I don't know where the items of File 1 are positioned in File 2 (for example, AspCys could be the sixth, the second, or the last column).

I hope you can help me, and I would like a short explanation of the code, because I'm a newbie and don't know many of the commands.

Thanks in advance.
# 2  
Old 03-09-2012
I had an almost identical problem a while back; I'm looking for what I wrote.

Meanwhile, you don't have to write "," between every field. The output field separator, OFS, controls that. Assigning $1=$1 makes awk rebuild the line using OFS:

Code:
$ echo a b c d | awk -v OFS="," '{ $1=$1 } 1'
a,b,c,d

$
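
Applied to the fixed-order command from the question, the same idea looks roughly like this. It is only a sketch: the sample row is File 2's data line with the display spaces removed, and the shell variable names `row` and `swapped` are made up for illustration. Separating the `print` arguments with commas tells awk to join them with OFS.

```shell
# Sample row from File 2 in the question (display spaces removed).
row='beta_alfa,DA,CA,DD,CD,ppi'

# -F, splits the input on commas; -v OFS=, joins the printed fields with commas.
# The commas between the print arguments are what insert OFS.
swapped=$(printf '%s\n' "$row" | awk -F, -v OFS=, '{ print $1, $2, $4, $3, $5, $6 }')

printf '%s\n' "$swapped"
```

This still hard-codes the column positions, so it only tidies the original command; it does not solve the 402-column case.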

# 3  
Old 03-09-2012
This is probably overkill, but I had 500 megabytes of extremely messy flat files to merge and sort. This ought to be reliable, if not fast, and tolerant of things like missing columns.

Code:
$ cat col.awk

# Set up input and output separators
BEGIN { FS=","  ;       OFS="," }

# First line in a file?  Figure out what our columns are.
FNR == 1 {
        if(NR==1) # Very first line in very first file
        {
                COLMAX=NF
                # Mark down the contents of all the columns
                for(N=1; N<=NF; N++)
                {
                        ORDER[N]=$N
                        ORDER[$N]=N
                        REORDER[N]=N
                }
                print # Print columns
                next # Go to next line
        }

        # First line in the second/third/fourth file?  Find out how we need to reorder.

        # Delete old columns
        for(X in REORDER)       delete REORDER[X];

        # Match field M against column N.
        for(N=1; N<=COLMAX; N++)
        {
                for(M=1; M<=NF; M++)
                if($M == ORDER[N])
                {
                        REORDER[N]=M;
                        break;
                }

                if(!REORDER[N])
                {
                        # Report the template column name (index N); M has run past NF here
                        print "Couldn't find " ORDER[N] " in " FILENAME >"/dev/stderr";
                        REORDER[N]=NF+10;
                }
        }

        # Only print the first line of columns
        if(NR != 1)     next;
}

# Reorder all input
{
        split($0, ZZT, FS);

        PFIX=""
        STR=""
        for(N=1; N<=COLMAX; N++)
        {
                STR=STR PFIX ZZT[REORDER[N]];
                PFIX=","
        }

        $0=STR
}

1 # Print all other lines

$ awk -f col.awk columnfile data

interacao,AspAsp,AspCys,CysAsp,CysCys,classe
beta_alfa,DA,DD,CA,CD,ppi

$
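
For the exact two-file case in the question, the same idea can be written more compactly. The following is a minimal sketch, not the script above: `reorder_by_header` is a hypothetical helper name, and it assumes a single template file containing the header line and a single data file whose first line is its own header. It records the wanted column order from the template, maps each column name in the data header to its position, and then prints every data row in template order.

```shell
# Sketch: reorder the columns of a data file to match a template header.
# reorder_by_header is a hypothetical helper name, not part of the thread.
# Usage: reorder_by_header TEMPLATE_FILE DATA_FILE
reorder_by_header() {
    awk -F, -v OFS=, '
        # First file (template): remember the wanted column order, print the header.
        NR == FNR {
            if (FNR == 1) { ncols = split($0, want, FS); print }
            next
        }
        # Header of the data file: map each column name to its position.
        FNR == 1 {
            for (i = 1; i <= NF; i++) pos[$i] = i
            next
        }
        # Data rows: emit fields in template order; a missing column becomes empty.
        {
            line = ""
            sep = ""
            for (i = 1; i <= ncols; i++) {
                val = (want[i] in pos) ? $(pos[want[i]]) : ""
                line = line sep val
                sep = OFS
            }
            print line
        }
    ' "$1" "$2"
}
```

It could be called as `reorder_by_header file1 file2 > file3`, where the three file names are placeholders. `NR == FNR` is true only while awk is reading the first file, which is how the template is told apart from the data. Because columns are looked up by name, the position of AspCys or any other column in the data file does not matter.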
