|
|||||||
| Forums | Search Forums | Register | Forum Rules | Man Pages | Albums | FAQ | Members | Calendar | Search | Today's Posts | Mark Forums Read |
| Shell Programming and Scripting Post questions about KSH, CSH, SH, BASH, PERL, PHP, SED, AWK and OTHER shell scripts and shell scripting languages here. |
|
|
|
Thread Tools | Search this Thread | Display Modes |
|
#1
|
|||
|
|||
|
Combine columns from many files but keep them aligned in columns-shorter left column issue
Hello everyone, I searched the forum looking for answers to this but I could not pinpoint exactly what I need as I keep having trouble. I have many files each having two columns and hundreds of rows. first column is a string (can have many words) and the second column is a number.The files are tab separated. Example of some rows of three files: file1.txt Code:
Column1 Column2 S1 31.37 S1 31.49 S1 31.74 S10 31.13 S10 31.10 S10 31.29 S11 29.49 S11 29.54 S11 29.25 S12 31.24 S12 31.05 S12 31.50 S13 32.48 S13 31.07 ... file2.txt Code:
column 1 Column 2 1 CVS IN 23.14 1 CVS IN 23.24 1 CVS IN 23.24 1 CVS IP 21.53 1 CVS IP 21.40 1 CVS IP 21.36 1 WB IN 22.89 1 WB IN 23.02 1 WB IN 23.23 1 WB IP 26.70 1 WB IP 26.50 1 WB IP 26.67 2 CVS IN 23.34 2 CVS IN 23.34 2 CVS IN 23.14 2 CVS IP 21.38 2 CVS IP 21.49 2 CVS IP 21.54 file3.txt Code:
Column 1 Column2 1 WB IN 23.45 1 WB IN 23.53 1 WB IP 24.55 1 CVS IN 23.62 1 CVS IN 23.46 I CVS IP 21.74 1 WB IN 23.33 1 WB IN 23.42 1 WB IP 26.24 1 CVS IN 23.71 1 CVS IN 23.44 I CVS IP 22.35 1 WB IN 24.75 1 WB IN 24.88 1 WB IP 25.50 I want to to combine the files into one. I use the following code for the time being: Code:
paste $folder/*.txt | column -s $'\t' -t > folder/output_biorad.gff The problem is that the first file in the file is smaller than the other ones and the output, when the rows of the first file end gets misaligned. The columns of the second file shift to the left, and all the others follow etc. I have many files and this is getting very confusing for the users, who are biologists and do not want to manipulate datasets. When there is a shorter left column the right columns become missaligned. This is an example output that shows the problem: the problem starts at the line that i show as <----problem. the column I CVS IP should be the third column but it shifts to second. ....... Code:
8 WB IP 25.53 8 WB IN 22.98 7 WB IN 25.7 3 CVS IN 8 CVS IN 23.44 8 WB IP 28.75 7 WB IP 26.59 3 CVS IP 8 CVS IP 22.25 8 WB IP 28.83 7 WB IP 26.16 3 CVS IP 8 CVS IP 22.37 8 WB IP 29.05 7 WB IP 26.28 3 CVS IP 8 WB IN 24.47 I CVS IP 8 CVS IN 25.7 3 WB IN 8 WB IP 26.05 I CVS IP 8 CVS IN 25.31 3 WB IN 8 WB IP 26 I CVS IP 8 CVS IN 26.11 3 WB IN 8 CVS IN 25.11 I CVS IP 8 CVS IP 25.95 3 WB IN 8 CVS IP 23.19 I CVS IP 8 CVS IP 25.48 3 WB IN 8 CVS IP 23.19 I CVS IP 8 WB IN 25.83 3 WB IN 8 WB IN 24.18 I CVS IP 8 WB IN 26.1 3 WB IP 8 WB IP 33.73 I CVS IP 8 WB IN 25.81 3 WB IP 8 WB IP 33.75 I CVS IP 8 WB IP 26.74 3 WB IP 8 CVS IN 24.24 I CVS IP 8 WB IP 26.73 3 CVS IN 8 CVS IP 23.22 I CVS IP 8 WB IP 26.75 3 CVS IN 8 CVS IP 23.83 I CVS IP I CVS IP 3 CVS IN I CVS IP I CVS IP 3 CVS IN <--problem I CVS IP I CVS IP 3 CVS IN I CVS IP I CVS IP 3 CVS IN I CVS IP I CVS IP 3 CVS IP I CVS IP I CVS IP 3 CVS IP I CVS IP I CVS IP 3 CVS IP I CVS IP I CVS IP 3 WB IN I CVS IP I CVS IP 3 WB IN I CVS IP I CVS IP 3 WB IN I CVS IP I CVS IP 3 WB IN I CVS IP I CVS IP 3 WB IN I CVS IP I CVS IP 3 WB IN I CVS IP I CVS IP 3 WB IP I CVS IP I CVS IP 3 WB IP I CVS IP I CVS IP 3 WB IP I CVS IP I CVS IP 3 CVS IN I CVS IP I CVS IP 3 CVS IN I CVS IP I CVS IP 3 CVS IN I CVS IP I CVS IP 3 CVS IN I CVS IP I CVS IP 3 CVS IN I CVS IP I CVS IP 3 CVS IN I CVS IP I CVS IP 3 CVS IP I CVS IP I CVS IP 3 CVS IP I CVS IP I CVS IP 3 CVS IP 4V CVS IN I CVS IP 3 WB IN 4V CVS IN I CVS IP 3 WB IN 4V CVS IN I CVS IP 3 WB IN 4V CVS IN I CVS IP 3 WB IP When the lines from the firs file end all utput is shifted to the left. How can I keep them aligned? i looked everywhere and I have not been able to find out how to solve it. I am not entirely familiar with awk, i use perl for my other scripting and i am a beginner in shell scripting. Many thanks for your help in advance Last edited by vbe; 11-30-2012 at 08:49 AM.. Reason: use also code tags for your data, it keeps the format... |
| Sponsored Links | ||
|
|
#2
|
|||
|
|||
|
try: Code:
awk '
{
(FNR>m)? m=FNR:0;
f[FILENAME]++ ? 0:fc++;
a[fc-1,FNR]=$0;
}
END {
for (j=1;j<=m;j++) {
for (i=0;i<fc;i++) {
printf (a[i,j])? (a[i,j] "\t"):("\t\t");
}
print "";
}
}
' file*.txt > out.xls |
| The Following User Says Thank You to rdrtx1 For This Useful Post: | ||
isildur1234 (12-03-2012) | ||
| Sponsored Links | ||
|
|
#3
|
|||
|
|||
|
Thank you so much. This is working and aligning the columns correctly!
If you have time, can youvery very briefly explain how it works, why your for statements manage to align the columns, especially what happens at the printf? for (j=1;j<=m;j++) { for (i=0;i<fc;i++) { printf (a[i,j])? (a[i,j] "\t"):("\t\t"); awk seems to be a very powerful language for file manipulation. How did you learn it? Textbook or website? Again, thank you so much. |
|
#4
|
|||
|
|||
|
Code:
awk '{
(FNR>m)? m=FNR:0; # use file record number as max record counter
f[FILENAME]++ ? 0:fc++; # set new file counter (++ fail indicates new file)
a[fc-1,FNR]=$0; # store data in two column,row array (file, record)}
END {
for (j=1;j<=m;j++) { # record counter loop
for (i=0;i<fc;i++) { # file counter loop
printf (a[i,j])? (a[i,j] "\t"):("\t\t"); # print tab separated data or just tabs if empty
}
print ""; # print new line character
}
}' file*.txt > out.xlsLearned by reading awk book and samples on the web. This site has most excellent solutions all over. Search forums for examples. Last edited by rdrtx1; 12-03-2012 at 11:52 AM.. |
| Sponsored Links | |
|
|
#5
|
|||
|
|||
|
this code not works properly...
Code:
awk: cmd. line:15: fatal: cannot open file `file*.txt' for reading (No such file or directory) if you try a.txt b.txt > out.xls problem is you need to align again manually... |
| Sponsored Links | |
|
|
#6
|
|||
|
|||
|
Right now my input to this code file*.txt > out.xls
outputs the files pasted together but the files are pasted randomly. I want to read the files from my folder alphabetical lexicographical order and paste them together. is there a way to sort my input files alphabetically before I run this code? I tried sort file*.txt but it doesnt work, nor does ls-1|folder. |
| Sponsored Links | ||
|
![]() |
| Tags |
| awk, columns combine, perl |
| Thread Tools | Search this Thread |
| Display Modes | |
More UNIX and Linux Forum Topics You Might Find Helpful
|
||||
| Thread | Thread Starter | Forum | Replies | Last Post |
| Combine columns from 100 files with same structure | genehunter | UNIX for Dummies Questions & Answers | 6 | 09-04-2012 07:47 AM |
| Combine columns from multiple files | sdf | Shell Programming and Scripting | 2 | 05-03-2012 02:16 PM |
| How to combine 2 files with 6 columns? | tgoldstone | UNIX for Dummies Questions & Answers | 2 | 03-14-2012 02:13 AM |
| How to combine 2 files into 1 file with 2 columns | pinpe | Shell Programming and Scripting | 11 | 06-20-2011 11:46 AM |
| Output columns needs to be aligned | nkamalkishore | Shell Programming and Scripting | 4 | 11-06-2009 10:24 AM |
|
|