combining columns from different files


 
# 8  
Old 09-09-2005
Quote:
Originally Posted by iomaire
Thanks. I really appreciate all your help.

Here are the commands that get me the result I need:

gawk -v c=2 -f io.awk ./result*.dat >unsorted.dat
gawk ' { print | "sort" }' unsorted.dat >final.dat
Code:
gawk -v c=2 -f io.awk ./result*.dat | sort -n > final.dat

# 9  
Old 03-26-2006
some documentation added

I had a similar data-sorting task and found the above program extremely useful as a starting point and learning tool. Here is the same program with the comments I added while working out how it does its job, in case they are useful to anyone else.

# This regex quietly ignores all lines in the data file that start with #.
# If for some reason you don't want to do that, just start with the {

!/^[#]/ {

    # This just checks that the requested column is between 1 and the
    # number of columns on the current line (NF):

    if ( c > 0 && c <= NF ) {

        # idx is a string made of $1, which is the data index, and c, which
        # is the column number of the data we want to extract. They are
        # separated by the separator SUBSEP, which can be set if you want
        # in a BEGIN{} statement. See for example this page on arrays and SUBSEP.

        idx = $1 SUBSEP c

        # a is a 1-dimensional array whose index is the string idx. While
        # scanning through the first file, the (idx in a) test returns false,
        # so a[idx] = $c. In subsequent files, (idx in a) passes, so
        # a[idx] becomes a[idx] OFS $c. OFS is the output field
        # separator, which I set to " "; $c is the data column. So a[idx] is
        # a string holding the row of data, which grows by an OFS and a
        # data value for each file scanned.

        a[idx] = ( idx in a ) ? a[idx] OFS $c : $c

    }
}
END {

    # idx is as above, except that it is now being recalled as the index
    # of a. It is still in the form of a string. I found it clearer
    # to call it idx again instead of rec.

    for ( idx in a ) {

        # This creates the array idxA by splitting idx at every
        # separator SUBSEP.

        split(idx, idxA, SUBSEP)

        # idxA[1] is the row index; idxA[2] would be the column number.
        # a[idx] is the string of data values for the same row collected from each data file.
        # "%d%s%s\n" says to format the printed line as a decimal integer followed by
        # two strings and then a newline. See for example the printf section of the gawk manual.

        printf("%d%s%s\n", idxA[1], OFS, a[idx])

    }
}
# 10  
Old 03-26-2007
Question modifying the above script

Hi, I have some questions about modifying this script. I should add that I'm an awk newbie.

Currently I have many files with 3 columns.

The awk script is similar to the one above, but I am not interested in printing the index, so I modified it slightly.

Code:
!/^[#]/ {
  if ( c > 0 && c <= NF ) {
    idx = $1 SUBSEP c
    a[idx] = ( idx in a ) ? a[idx] OFS $c : $c
  }
}
END {
  for( idx in a ) {
    split(idx, idxA, SUBSEP)
    # modified -->
    printf("%s%s\n", OFS, a[idx])
  }
}

I'd like to combine the 2nd column of every file into one file and the 3rd column into another, so I use commands like this:

gawk -v c=2 -f io.awk ?E+0.final | sort -n > file2.dat
gawk -v c=3 -f io.awk ?E+0.final | sort -n > file3.dat

The script works fine for column 2 but not for column 3.

The files look like this:
Quote:
5.00000007E-11 0.0810279995 2.52286541E+09
1.00000001E-10 0.254880995 4.57596416E+09
1.49999999E-10 0.519167006 6.65693082E+09
2.00000003E-10 0.864251971 7.69276518E+09
2.49999993E-10 1.28940797 7.19983002E+09
2.99999997E-10 1.75356197 5.53754522E+09
3.50000001E-10 1.96022499 3.2696681E+09
4.00000005E-10 1.94632304 1.06016013E+09
4.50000009E-10 1.91220903 -492845248.
4.99999986E-10 1.91022301 -897060992.
The problem, as I see it, could be either the scientific notation or the fact that the above script was written for column 2.

Any suggestions?
Thanks again for this helpful script.

phil

Last edited by psny18; 03-26-2007 at 03:50 PM..
# 11  
Old 02-14-2009
Printing specified columns from all files to a new file side by side...

Hi All,

I am also looking for this kind of script, but my application is a little different. I also don't need any index column. However, I always need to print $1 and $11 from all files, plus one of the other columns at a time.
In a nutshell, I need to print $1, $11 and one other column, out of 34 columns in total, to a new file.
E.g. $1, $11 and $2 (or $3, and so on up to $34, excluding $1 and $11, as they are already printed once) of all files should be printed to a new text file. Can anybody modify the script below to match my requirement?

!/^[#]/ {
  if ( c > 0 && c <= NF ) {
    idx = $1 SUBSEP c
    a[idx] = ( idx in a ) ? a[idx] OFS $c : $c
  }
}
END {
  for( idx in a ) {
    split(idx, idxA, SUBSEP)
    # modified -->
    printf("%s%s\n", OFS, a[idx])
  }
}

This script works very well for the single column specified in the variable 'c' at run time.

Thanks ....