Concatenate column values when header is Matching from multiple files


 
Thread Tools Search this Thread
Top Forums UNIX for Beginners Questions & Answers Concatenate column values when header is Matching from multiple files
# 8  
Old 09-24-2016
@Ravinder...Thank you so much I added below to get it tab delimited however I was not able get the header file. As header will be constant in all three files can get the header. Otherwise it worked perfectly fine. Thank you again you have really helped

Code:
 
 paste Allen_Free.txt Allen_Current.txt Allen_Allocated.txt | awk 'function get(field){q=NF/3;for(i=field;i<=NF;i+=q){$field=i>field?$field "/" $i:$field;}} BEGIN{FS=OFS="\t"}  NR>1 {for(j=2;j<=NF/3;j++){get(j)};for(j=NF/3+1;j<=NF;j++){$j=""};sub(/[[:space:]]+$/,X,$0);print

please help

Last edited by Nina2910; 09-24-2016 at 11:20 PM..
# 9  
Old 09-25-2016
Did you try the code I suggested in post #5 in this thread?
# 10  
Old 09-25-2016
Quote:
Originally Posted by Nina2910
@Ravinder...Thank you so much I added below to get it tab delimited however I was not able get the header file. As header will be constant in all three files can get the header. Otherwise it worked perfectly fine. Thank you again you have really helped
Code:
 
 paste Allen_Free.txt Allen_Current.txt Allen_Allocated.txt | awk 'function get(field){q=NF/3;for(i=field;i<=NF;i+=q){$field=i>field?$field "/" $i:$field;}} BEGIN{FS=OFS="\t"}  NR>1 {for(j=2;j<=NF/3;j++){get(j)};for(j=NF/3+1;j<=NF;j++){$j=""};sub(/[[:space:]]+$/,X,$0);print

please help
Hello Nina2910,

Request you to please answer Don's question. Glad that I could help you, you could run following script, where I have made a minor change into it to get the headers of your Input_file.
Code:
cat script.ksh
COUNT=$(ls *.txt | wc -l)
paste *.txt | awk -vcount="$COUNT" 'function get(field){
							q=NF/count;
							for(i=field;i<=NF;i+=q){
										$field=i>field?$field "/" $i:$field;
							       		       }
                                                       } 
                   NR>1               {
					for(j=2;j<=NF/count;j++)   {
								get(j)
                                                               };
					for(j=NF/count+1;j<=NF;j++){
								$j=""
							       };
                                        sub(/[[:space:]]+$/,X,$0);
					print
 				      }
                   NR==1              {
					for(j=1;j<=NF/count;j++)   {
								printf("%s\t",$j);
                                                                   }
					print X;
				      }
		                   '  FS=OFS="\t"

Thanks,
R. Singh

Last edited by RavinderSingh13; 09-25-2016 at 02:14 AM..
# 11  
Old 09-25-2016
@Don ...it worked perfectly and thank you so for sparing time for me and explaining it for me. I am so sorry could not replied on it earlier. I was looking for a function or one liner so that I can use it in my script.

---------- Post updated at 08:30 PM ---------- Previous update was at 08:06 PM ----------

@Rudi ...Thank you so much but it changes the header columns order

---------- Post updated at 08:30 PM ---------- Previous update was at 08:30 PM ----------

@Ravinder thank you however the latest code didn't work Smilie
# 12  
Old 09-26-2016
Quote:
Originally Posted by Nina2910
@Don ...it worked perfectly and thank you so for sparing time for me and explaining it for me. I am so sorry could not replied on it earlier. I was looking for a function or one liner so that I can use it in my script.

---------- Post updated at 08:30 PM ---------- Previous update was at 08:06 PM ----------

@Rudi ...Thank you so much but it changes the header columns order

---------- Post updated at 08:30 PM ---------- Previous update was at 08:30 PM ----------

@Ravinder thank you however the latest code didn't work Smilie
You asked Ravinder for an explanation of his code, so I assumed you would want comments on how my code worked as well.

Sorry, but I don't do one-liners; I try to write code that can be read and understood. You can convert my code to an unreadable 1-liner if you want to; but if you ever need to modify it in the future and can't figure out how to do it, don't expect me to try to help you modify my code after you have made it unreadable!

I'm sorry that my code did not meet your needs either. If you needed a function instead of a complete script, you should have explained what inputs your function would be given and what the function is supposed to return to the invoking script. I guess I don't see what a function would do for you that isn't done by the script I suggested in post #5 in this thread.
This User Gave Thanks to Don Cragun For This Post:
# 13  
Old 09-26-2016
@Don ...I think I spoke too soon actually I used your code as function and it worked fine ...Thank you so much and I understand what you saying and I understood your code as well ....Thanks one again Smilie ...do you recommend any video or any book I am new to shell scripting and want to learn awk because my new profile demands me to do lots of shell scripting. Thanks once again
# 14  
Old 09-26-2016
Quote:
Originally Posted by Nina2910
@Don ...it worked perfectly and thank you so for sparing time for me and explaining it for me. I am so sorry could not replied on it earlier. I was looking for a function or one liner so that I can use it in my script.
---------- Post updated at 08:30 PM ---------- Previous update was at 08:06 PM ----------
@Rudi ...Thank you so much but it changes the header columns order
---------- Post updated at 08:30 PM ---------- Previous update was at 08:30 PM ----------
@Ravinder thank you however the latest code didn't work Smilie
Hello Nina2910,

Above script provided by me worked fine for me, you could try to run it in as follows too.
Code:
cat script.ksh
COUNT=$(ls *.txt | wc -l)
paste *.txt | awk -vcount=$COUNT 'function get(field){q=NF/count;for(i=field;i<=NF;i+=q){$field=i>field?$field "/" $i:$field;}} NR>1{for(j=2;j<=NF/count;j++){get(j)};for(j=NF/count+1;j<=NF;j++){$j=""};sub(/[[:space:]]+$/,X,$0);print} NR==1{for(j=1;j<=NF/count;j++){printf("%s\t",$j)}print X;}'

Following is the explanation of above code, please do not run the following code it is only for explanation purposes I have split it.
Code:
COUNT=$(ls *.txt | wc -l) #### Creating a variable named COUNT(in shell) and it's value will be number of .txt files. If you want to hardcode files which you showed in my code previously then you could ignore this variable and could hardcode file names with paste command along with subsituting the count variable with number of files in awk code too.
paste *.txt               #### Using paste command with all files whose extension is .txt so that it will concatenate their contents as per line numbers.
|                         #### Using pipe here to send the standard output of paste command to awk command as standard input.
awk                       #### Starting awk 
-vcount=$COUNT            #### -v option is used to define ann awk's variable. So here I am making count variable which will have values of SHELL variable named COUNT's value init. This is the way where we could set a shell's variable's value to an awk's variable too.
'function get(field)      #### starting a function here, as we all know in function we could write a logic which we need to perform several times and could save our time and could make code neat and clear, so creating a function for same. Function name is get. passing a value to it called field as by name itself it is clear we are going to pass field into it.
{q=NF/count;              #### creating a variable named q whose value is NF/count where NF is number of fields and it is an awk's in-built variable which gives number of fields into any line/record. so q will have actually number of fields for a single file. Here you have a;ready mentioned that number of fields will be equal in each Input_file so I am dividing TOTAL number of fields with TOTAL number of Input_files so that could get 1 Input_file's number of fields.
for(i=field;i<=NF;i+=q)   #### starting a for loop here whose syntax will be always the usual one for(variable initilization,condition,variable decrement/increment). Similarly we are starting a variabled named i whose value will be equal to variable field(which we are passing to function) and till the value of NF(number of fields in current line/record) it will execute this for loop.
{$field=                  #### Making values of $field where $field defines, let's say we have field variable's value as 2 then $2 defines 2nd field of current line etc.
i>field                   #### checking here condition if i's value is greater than variable field.
?                         #### ? is a well known ternary operator which defines the next statements will be executing if above condition is TRUE.
$field "/" $i             #### setting value of $field into $field "/" $i now.
:                         #### : is a well known ternary operator which defines that statements which are coming next will be executed because condition showed 2 steps above is NOT TRUE.
$field;}}                 #### keeping the value of $field as same $field.
NR>1                      #### Now coming into main section where we are checking if value of NR>1 where NR is awk's built in variable which defines the number of records in a line/record, so I am making sure we are not executing further statements while line number is one which is your header line.
{for(j=2;j<=NF/count;j++) #### starting a for loop here which wil run till variable j's value if less than and equal to value of NF/count.
{get(j)};                 #### Calling function get which we created and explained above for each field of each record/line.
for(j=NF/count+1;j<=NF;j++) #### Starig a for loop whic will run till the value of variable j is less than and equal to value of NF/count+1.
{$j=""};                  #### So I am nullifying the fields, so basically what I am doing is since we need to only fields depending on 1 Input_file so I am Nullifying extra fields now, whose value already been concatinated to needed fields already above.
sub(/[[:space:]]+$/,X,$0);#### substituting the space at last to NULL, when we wil Nullify the fields then at last space will be there so removing it completly with sub which is awk's in-built functionality whose syntax is sub(/pattern/variable/,"new value",line/record/variable).
print}                    #### Finally printing the value of newly modified fields which is requirement.
NR==1{                    #### checking here condition when NR==1 means when 1st line is there then only execute further statements.
for(j=1;j<=NF/count;j++){ #### starting a for loop till variable j's value is less than or equal to value of NF/count+1.
printf("%s\t",$j)}        #### printing the value of fields here.
print X;}'                #### printing value of a NULL value variable, basically to get a new line after headers are printed by above for loop.

Thanks,
R. Singh

Last edited by RavinderSingh13; 09-26-2016 at 04:35 AM.. Reason: fixed some typos
This User Gave Thanks to RavinderSingh13 For This Post:
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Comparing same column from two files, printing whole row with matching values

First I'd like to apologize if I opened a thread which is already open somewhere. I did a bit of searching but could quite find what I was looking for, so I will try to explaing what I need. I'm writing a script on our server, got to a point where I have two files with results. Example: File1... (6 Replies)
Discussion started by: mitabrev83
6 Replies

2. Shell Programming and Scripting

Concatenate values in the first column based on the second column.

I have a file (myfile.txt) with contents like this: 1.txt apple is 3.txt apple is 5.txt apple is 2.txt apple is a 7.txt apple is a 8.txt apple is a fruit 4.txt orange not a fruit 6.txt zero isThe above file is already sorted using this command: sort -k2 myfile.txtMy objective is to get... (3 Replies)
Discussion started by: shoaibjameel123
3 Replies

3. Shell Programming and Scripting

Extracting values based on line-column numbers from multiple text files

Dear All, I have to solve the following problems with multiple tab-separated text file but I don't know how. Any help would be greatly appreciated. I have access to Linux mint (but not as a professional). I have multiple tab-delimited files with the following structure: file1: 1 44 2 ... (5 Replies)
Discussion started by: Bastami
5 Replies

4. Shell Programming and Scripting

Sum column values matching other field

this is part of a KT i am going thru. i am writing a script in bash shell, linux where i have 2 columns where 1st signifies the nth hour like 00, 01, 02...23 and 2nd the file size. sample data attached. Desired output is 3 columns which will give the nth hour, number of entries in nth hour and... (3 Replies)
Discussion started by: alpha_1
3 Replies

5. Shell Programming and Scripting

Sum values of specific column in multiple files, considering ranges defined in another file

I have a file (let say file B) like this: File B: A1 3 5 A1 7 9 A2 2 5 A3 1 3 The first column defines a filename and the other two define a range in that specific file. In the same directory, I have also three more files (File A1, A2 and A3). Here is 10 sample lines... (3 Replies)
Discussion started by: Bastami
3 Replies

6. Shell Programming and Scripting

Compare values in two files. For matching rows print corresponding values from File 1 in File2.

- I have two files (File 1 and File 2) and the contents of the files are mentioned below. - I am trying to compare the values of Column1 of File1 with Column1 of File2. If a match is found, print the corresponding value from Column2 of File1 in Column5 of File2. - I tried to modify and use... (10 Replies)
Discussion started by: Santoshbn
10 Replies

7. UNIX for Dummies Questions & Answers

shift values in one column as header for values in another column

Hi Gurus, I have a tab separated text file with two columns. I would like to make the first column values as headings for the second column values. Ex. >value1 subjects >value2 priorities >value3 requirements ...etc and I want to have a file >value1 subjects >value2 priorities... (4 Replies)
Discussion started by: Unilearn
4 Replies

8. UNIX for Dummies Questions & Answers

Rename a header column by adding another column entry to the header column name

Hi All, I have a file example.csv which looks like this GrpID,TargetID,Signal,Avg_Num CSCH74_1_1,2007,61,256 CSCH74_1_1,212007,647,679 CSCH74_1_1,12007,3,32 CSCH74_1_1,207,299,777 I want the output as GrpID,TragetID,Signal-CSCH74_1_1,Avg_Num CSCH74_1_1,2007,61,256... (1 Reply)
Discussion started by: Vavad
1 Replies

9. Shell Programming and Scripting

Rename a header column by adding another column entry to the header column name URGENT!!

Hi All, I have a file example.csv which looks like this GrpID,TargetID,Signal,Avg_Num CSCH74_1_1,2007,61,256 CSCH74_1_1,212007,647,679 CSCH74_1_1,12007,3,32 CSCH74_1_1,207,299,777 I want the output as GrpID,TragetID,Signal-CSCH74_1_1,Avg_Num CSCH74_1_1,2007,61,256... (4 Replies)
Discussion started by: Vavad
4 Replies

10. Shell Programming and Scripting

Joining multiple files based on one column with different and similar values (shell or perl)

Hi, I have nine files looking similar to file1 & file2 below. File1: 1 ABCA1 1 ABCC8 1 ABR:N 1 ACACB 1 ACAP2 1 ACOT1 1 ACSBG 1 ACTR1 1 ACTRT 1 ADAMT 1 AEN:N 1 AKAP1File2: 1 A4GAL 1 ACTBL 1 ACTL7 (4 Replies)
Discussion started by: seqbiologist
4 Replies
Login or Register to Ask a Question