Concatenate column values when header is Matching from multiple files
Post 302982110 by Don Cragun on Saturday 24th of September 2016 02:06:47 AM
You could also try the following...

Note that you say that your files are <tab> delimited, but all of the sample data you have shown us uses one or two <space> characters to separate fields, not a <tab> character.
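
If you are not sure which separator a file really uses, a quick check along the following lines may help. This is only a sketch: has_tab is a hypothetical helper name, the two files are made-up throwaway data (not your files), and mktemp -d is assumed to be available on your system.

```shell
#!/bin/sh
# has_tab FILE: hypothetical helper; succeeds if FILE contains at least one <tab>.
has_tab() {
	tab=$(printf '\t')
	grep -q "$tab" "$1"
}

# Two throwaway files with made-up data: one <tab> delimited, one <space> delimited.
tmpdir=$(mktemp -d)
printf 'Name\t9/1\t9/2\n' > "$tmpdir/tabs.txt"
printf 'Name 9/1  9/2\n' > "$tmpdir/spaces.txt"

# Report what was found in each file.
for f in "$tmpdir/tabs.txt" "$tmpdir/spaces.txt"
do
	if has_tab "$f"
	then	echo "${f##*/}: contains <tab>"
	else	echo "${f##*/}: no <tab> found"
	fi
done
```

Running od -c on a file is another way to see exactly which characters separate the fields.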

The following will work with input files whose fields are separated by one or more blanks (where a blank is a <space> or a <tab>). It will not work if your input files really do use a <tab> as the field separator and some of your field data contains a <space>. It works with any number of input files and with any number of fields per line (as long as all files have the same number of fields). The output header is taken from the 1st line of the 1st input file; the first line of every other input file is ignored. If the name on a data line is the same as the name on the header line in the 1st input file, that data will be merged into the output header line (i.e., it is assumed that the name used in the header in the 1st input file is not used as a name on any non-header line in any of the input files). Output fields are separated by a <tab> character. The names of the files to be processed are not built into this script; they must be supplied as command-line arguments to the following script:
Code:
#!/bin/ksh
awk '	# Use the awk utility to interpret the following script...
BEGIN {	# Set output field separator.
	OFS = "\t"
}
NR == 1 || FNR > 1 {
	# Gather data from the 1st line in the 1st file (the header is supposed
	# to be the same in all input files) and from the 2nd line on in every
	# input file...
	# If we have not seen the name found in the first field before...
	if(!($1 in name)) {
		# Add the 1st field to the list of known names, increment the
		# number of names we have seen, and note the output line number
		# where this name should appear in the output...
		name[order[++nc] = $1]
		# and initialize the data for each output field for this name
		# from the corresponding input fields on this line.
		for(i = 2; i <= NF; i++)
			d[$1, i] = $i
	} else	# And if we have seen this name before, add data to be output
		# for this name to the accumulated data we have seen before for
		# this name.
		for(i = 2; i <= NF; i++)
			d[$1, i] = d[$1, i] "/" $i
}
END {	# Now that we have hit EOF on the last input file, print the accumulated
	# output.  For each name seen...
	for(i = 1; i <= nc; i++) {
		# Print the name...
		printf("%s", order[i])
		# and for the remaining fields...
		for(j = 2; j <= NF; j++)
			# print the output field separator followed by the
			# accumulated data for this name and field number.
			printf("%s%s", OFS, d[order[i], j])
		# and after the last field has been printed, add an output
		# record separator.
		print ""
	}
}' "$@"	# Terminate the awk script and use the command line arguments as the
	# list of files to be processed.

This was written and tested using a Korn shell, but will work with any shell that uses Bourne shell syntax. If you save this script in a file named merger and make it executable:
Code:
chmod +x merger

and execute it with the pathnames of your sample input files:
Code:
./merger a.txt b.txt c.txt

it produces the output:
Code:
Name	9/1	9/2
X	1/13/25	7/19/31
y	2/14/26	8/20/32
z	3/15/27	9/21/33
a	4/16/28	10/22/34
b	5/17/29	11/23/35
c	6/18/30	12/24/36
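
The input files themselves are not reproduced in this post. As a sketch for anyone who wants to repeat the run, the following recreates three files whose contents are inferred from the output above (the single <space> separators and the temporary directory are assumptions) and feeds them to the same awk logic with the comments stripped:

```shell
#!/bin/sh
# Sample inputs inferred from the output shown above; the exact
# spacing in the original files is an assumption.
tmpdir=$(mktemp -d)
printf '%s\n' 'Name 9/1 9/2' 'X 1 7' 'y 2 8' 'z 3 9' \
	'a 4 10' 'b 5 11' 'c 6 12' > "$tmpdir/a.txt"
printf '%s\n' 'Name 9/1 9/2' 'X 13 19' 'y 14 20' 'z 15 21' \
	'a 16 22' 'b 17 23' 'c 18 24' > "$tmpdir/b.txt"
printf '%s\n' 'Name 9/1 9/2' 'X 25 31' 'y 26 32' 'z 27 33' \
	'a 28 34' 'b 29 35' 'c 30 36' > "$tmpdir/c.txt"

# Same logic as the merger script, without the comments.
merged=$(awk '
BEGIN {	OFS = "\t"
}
NR == 1 || FNR > 1 {
	if(!($1 in name)) {
		name[order[++nc] = $1]
		for(i = 2; i <= NF; i++)
			d[$1, i] = $i
	} else
		for(i = 2; i <= NF; i++)
			d[$1, i] = d[$1, i] "/" $i
}
END {	for(i = 1; i <= nc; i++) {
		printf("%s", order[i])
		for(j = 2; j <= NF; j++)
			printf("%s%s", OFS, d[order[i], j])
		print ""
	}
}' "$tmpdir/a.txt" "$tmpdir/b.txt" "$tmpdir/c.txt")

# Show the merged result.
printf '%s\n' "$merged"
```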

Note that the last line of the output you said you wanted was:
Code:
c 6/16/30 12/24/36

which, in addition to using <space> as a field separator instead of <tab>, also has 16 as the data from the 2nd column of the last line in b.txt instead of the value 18 that was contained in that field in your sample input file.
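
If your real files do turn out to be <tab> delimited with <space> characters inside some fields (the case the script above does not handle), one possible adaptation is to set the input field separator to a <tab> as well. The following is only a sketch of that idea; the file names and the data in them are made up for illustration:

```shell
#!/bin/sh
# Hypothetical <tab>-delimited inputs whose names contain a <space>.
tmpdir=$(mktemp -d)
printf 'Name\t9/1\t9/2\nred apple\t1\t7\ngreen pear\t2\t8\n' > "$tmpdir/t1.txt"
printf 'Name\t9/1\t9/2\nred apple\t13\t19\ngreen pear\t14\t20\n' > "$tmpdir/t2.txt"

# Only the BEGIN clause differs from the merger script: FS is set to a
# <tab> too, so a <space> inside a field no longer splits the field.
merged_tab=$(awk '
BEGIN {	FS = OFS = "\t"
}
NR == 1 || FNR > 1 {
	if(!($1 in name)) {
		name[order[++nc] = $1]
		for(i = 2; i <= NF; i++)
			d[$1, i] = $i
	} else
		for(i = 2; i <= NF; i++)
			d[$1, i] = d[$1, i] "/" $i
}
END {	for(i = 1; i <= nc; i++) {
		printf("%s", order[i])
		for(j = 2; j <= NF; j++)
			printf("%s%s", OFS, d[order[i], j])
		print ""
	}
}' "$tmpdir/t1.txt" "$tmpdir/t2.txt")

# Show the merged result.
printf '%s\n' "$merged_tab"
```

With FS set to a single <tab>, every <tab> delimits a field, so this variant will not accept the <space> separated sample files shown earlier in this thread.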
 
