Concatenate column values when header is Matching from multiple files

09-24-2016

Registered User

21, 0

Join Date: Sep 2016

Last Activity: 6 October 2016, 10:52 AM EDT

Posts: 21

Thanks Given: 23

Thanked 0 Times in 0 Posts

@Ravinder... Thank you so much. I added the code below to get tab-delimited output; however, I was not able to get the header line. As the header is the same in all three files, can I get it in the output as well? Otherwise it worked perfectly fine. Thank you again, you have really helped.

Code:

 
 paste Allen_Free.txt Allen_Current.txt Allen_Allocated.txt | awk 'function get(field){q=NF/3;for(i=field;i<=NF;i+=q){$field=i>field?$field "/" $i:$field;}} BEGIN{FS=OFS="\t"}  NR>1 {for(j=2;j<=NF/3;j++){get(j)};for(j=NF/3+1;j<=NF;j++){$j=""};sub(/[[:space:]]+$/,X,$0);print}'

please help

Last edited by Nina2910; 09-24-2016 at 11:20 PM..

Nina2910

09-25-2016

Registered User

12,315, 4,560

Join Date: Jul 2012

Last Activity: 22 November 2019, 4:29 PM EST

Location: San Jose, CA, USA

Posts: 12,315

Thanks Given: 952

Thanked 4,560 Times in 3,818 Posts

Did you try the code I suggested in post #5 in this thread?

Don Cragun

09-25-2016

Moderator

3,105, 1,603

Join Date: May 2013

Last Activity: 31 August 2020, 1:46 AM EDT

Location: Chennai

Posts: 3,105

Thanks Given: 1,269

Thanked 1,603 Times in 1,369 Posts

Quote:

Originally Posted by Nina2910

Code:

 
 paste Allen_Free.txt Allen_Current.txt Allen_Allocated.txt | awk 'function get(field){q=NF/3;for(i=field;i<=NF;i+=q){$field=i>field?$field "/" $i:$field;}} BEGIN{FS=OFS="\t"}  NR>1 {for(j=2;j<=NF/3;j++){get(j)};for(j=NF/3+1;j<=NF;j++){$j=""};sub(/[[:space:]]+$/,X,$0);print}'

please help

Hello Nina2910,

Please answer Don's question. I am glad that I could help you. You could run the following script, in which I have made a minor change so that it also prints the header line of your Input_file.

Code:

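cat script.ksh
COUNT=$(ls *.txt | wc -l)
paste *.txt | awk -v count="$COUNT" '
# Set both separators inside the program, before any input is read.
BEGIN { FS = OFS = "\t" }

# Append to $field the same column from every later file, separated by "/".
function get(field,    i, q) {
    q = NF / count                       # columns per input file
    for (i = field + q; i <= NF; i += q)
        $field = $field "/" $i
}

# Header line: print the columns of the first file only.
NR == 1 {
    for (j = 1; j < NF / count; j++)
        printf("%s\t", $j)
    print $j                             # last header column, no trailing tab
}

# Data lines: merge columns 2..q, empty the rest, then trim the trailing tabs.
NR > 1 {
    for (j = 2; j <= NF / count; j++)
        get(j)
    for (j = NF / count + 1; j <= NF; j++)
        $j = ""
    sub(/[[:space:]]+$/, "", $0)
    print
}'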

Thanks,
R. Singh
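
A note on setting the separators, offered as a minimal sketch rather than a statement about what anyone in this thread ran: awk treats a name=value operand placed after the program text as one single assignment. Written in that position, FS=OFS="\t" assigns the string OFS=<tab> to FS and never touches OFS. The one-record input below is made up for illustration.

```shell
# Trailing operand: awk sees ONE assignment, FS = "OFS=<TAB>".
# OFS is never set and the record is not split on tabs.
nf_operand=$(printf 'a\tb\tc\n' | awk '{ print NF }' FS=OFS="\t")

# BEGIN block: a chained assignment inside the program sets both variables.
nf_begin=$(printf 'a\tb\tc\n' | awk 'BEGIN { FS = OFS = "\t" } { print NF }')

# -v options: each variable is assigned on its own before the program runs.
nf_opts=$(printf 'a\tb\tc\n' | awk -v FS='\t' -v OFS='\t' '{ print NF }')

echo "operand: $nf_operand  BEGIN: $nf_begin  -v: $nf_opts"
```

The operand form counts one field for this record, while the BEGIN and -v forms count three, so either of the latter two is the safer way to get tab-delimited input and output.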

Last edited by RavinderSingh13; 09-25-2016 at 02:14 AM..

RavinderSingh13

09-25-2016

@Don... It worked perfectly. Thank you so much for sparing time for me and explaining it. I am so sorry that I could not reply earlier. I was looking for a function or one-liner so that I can use it in my script.

---------- Post updated at 08:30 PM ---------- Previous update was at 08:06 PM ----------

@Rudi... Thank you so much, but it changes the order of the header columns.

---------- Post updated at 08:30 PM ---------- Previous update was at 08:30 PM ----------

@Ravinder Thank you; however, the latest code didn't work.

Nina2910

09-26-2016

Quote:

Originally Posted by Nina2910

You asked Ravinder for an explanation of his code, so I assumed you would want comments on how my code worked as well.

Sorry, but I don't do one-liners; I try to write code that can be read and understood. You can convert my code to an unreadable 1-liner if you want to; but if you ever need to modify it in the future and can't figure out how to do it, don't expect me to try to help you modify my code after you have made it unreadable!

I'm sorry that my code did not meet your needs either. If you needed a function instead of a complete script, you should have explained what inputs your function would be given and what the function is supposed to return to the invoking script. I guess I don't see what a function would do for you that isn't done by the script I suggested in post #5 in this thread.
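
As an illustration of the kind of contract described above, here is a minimal sketch of a shell function. The name merge_columns, the sample file names, and the sample data are all hypothetical, and the body wraps the paste/awk approach shown earlier in this thread, not the script from post #5. It assumes tab-delimited files with identical headers, the same row order, and a key in the first column.

```shell
# merge_columns FILE...   (hypothetical helper name)
# Input:  two or more tab-delimited files with identical headers and row order.
# Output: the header once, then each non-key column joined across files with "/".
merge_columns() {
    paste "$@" | awk -v count="$#" '
        BEGIN { FS = OFS = "\t" }
        {
            per_file = NF / count            # columns contributed by each file
            out = $1                         # key column, taken from the first file
            for (col = 2; col <= per_file; col++) {
                cell = $col
                # On data rows, append the same column from every other file.
                if (NR > 1)
                    for (i = col + per_file; i <= NF; i += per_file)
                        cell = cell "/" $i
                out = out OFS cell
            }
            print out
        }'
}

# Usage with made-up sample data in a scratch directory.
tmp=$(mktemp -d)
printf 'Name\tA\tB\nx\t1\t2\ny\t3\t4\n'    > "$tmp/free.txt"
printf 'Name\tA\tB\nx\t5\t6\ny\t7\t8\n'    > "$tmp/current.txt"
printf 'Name\tA\tB\nx\t9\t10\ny\t11\t12\n' > "$tmp/allocated.txt"
result=$(merge_columns "$tmp/free.txt" "$tmp/current.txt" "$tmp/allocated.txt")
printf '%s\n' "$result"
rm -rf "$tmp"
```

Passing the files as arguments means the file count comes from "$#", so the function does not depend on which other .txt files happen to be in the current directory.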

Don Cragun

09-26-2016

@Don... I think I spoke too soon. I actually used your code as a function and it worked fine. Thank you so much. I understand what you are saying, and I understood your code as well. Thanks once again.

Do you recommend any video or book? I am new to shell scripting and want to learn awk, because my new job profile requires a lot of shell scripting. Thanks once again.

Nina2910

09-26-2016

Quote:

Originally Posted by Nina2910

@Don... It worked perfectly. Thank you so much for sparing time for me and explaining it. I am so sorry that I could not reply earlier. I was looking for a function or one-liner so that I can use it in my script.
---------- Post updated at 08:30 PM ---------- Previous update was at 08:06 PM ----------
@Rudi... Thank you so much, but it changes the order of the header columns.
---------- Post updated at 08:30 PM ---------- Previous update was at 08:30 PM ----------
@Ravinder Thank you; however, the latest code didn't work.

Hello Nina2910,

The script I provided above worked fine for me. You could also try running it as follows.

Code:

cat script.ksh
COUNT=$(ls *.txt | wc -l)
paste *.txt | awk -vcount=$COUNT 'function get(field){q=NF/count;for(i=field;i<=NF;i+=q){$field=i>field?$field "/" $i:$field;}} NR>1{for(j=2;j<=NF/count;j++){get(j)};for(j=NF/count+1;j<=NF;j++){$j=""};sub(/[[:space:]]+$/,X,$0);print} NR==1{for(j=1;j<=NF/count;j++){printf("%s\t",$j)}print X;}'

The following is an explanation of the code above. Please do not run it as written below; I have split it up only for explanation purposes.

Code:

COUNT=$(ls *.txt | wc -l) #### Create a shell variable named COUNT whose value is the number of .txt files. If you prefer to hardcode the file names you showed earlier, you can drop this variable, list the files explicitly in the paste command, and replace the count variable in the awk code with the number of files.
paste *.txt               #### Run paste on every file with the .txt extension, so that line N of each file is joined into one output line N.
|                         #### Pipe the standard output of paste into awk as its standard input.
awk                       #### Start awk.
-vcount=$COUNT            #### The -v option defines an awk variable. Here it creates the awk variable count and gives it the value of the shell variable COUNT; this is how a shell variable's value is passed into awk.
'function get(field)      #### Start a function named get. A function holds logic that is needed several times, which keeps the code short and clear. It takes one argument, field, which is the number of the field to work on.
{q=NF/count;              #### Create a variable q with the value NF/count. NF is an awk built-in variable holding the number of fields in the current line/record. You already mentioned that every Input_file has the same number of fields, so dividing the TOTAL number of fields by the TOTAL number of Input_files gives the number of fields in one Input_file.
for(i=field;i<=NF;i+=q)   #### Start a for loop with the usual syntax for(initialization;condition;increment/decrement). The variable i starts at the value of field (the argument passed to the function), increases by q on each pass, and the loop runs as long as i is less than or equal to NF, so i visits the same column in each file.
{$field=                  #### Assign to $field. If the variable field has the value 2, for example, then $field is $2, the 2nd field of the current line.
i>field                   #### Check whether i is greater than field, which is true on every pass except the first.
?                         #### ? is the first half of the ternary (conditional) operator; the expression after it is used when the condition above is TRUE.
$field "/" $i             #### The new value of $field: the current $field, then "/", then $i.
:                         #### : is the second half of the ternary operator; the expression after it is used when the condition is NOT TRUE.
$field;}}                 #### Leave $field unchanged (this is the first pass, where i equals field).
NR>1                      #### Now the main section. NR is an awk built-in variable holding the number of the current record (the line number here), so this condition skips the following statements for line 1, which is your header line.
{for(j=2;j<=NF/count;j++) #### Start a for loop that runs from j=2 for as long as j is less than or equal to NF/count.
{get(j)};                 #### Call the function get, explained above, for each of those fields.
for(j=NF/count+1;j<=NF;j++) #### Start a for loop that runs from j=NF/count+1 for as long as j is less than or equal to NF.
{$j=""};                  #### Empty these fields. We only need as many fields as 1 Input_file has, so the extra fields, whose values have already been concatenated onto the needed fields above, are set to empty strings.
sub(/[[:space:]]+$/,X,$0);#### Remove the trailing whitespace that emptying the fields leaves at the end of the line. sub is an awk built-in function with the syntax sub(/pattern/,"new value",target); X is an unset variable, so it acts as an empty string.
print}                    #### Finally, print the modified line, which is the required output.
NR==1{                    #### This condition is true only for the 1st line (the header), so the following statements run only then.
for(j=1;j<=NF/count;j++){ #### Start a for loop that runs from j=1 for as long as j is less than or equal to NF/count.
printf("%s\t",$j)}        #### Print each of these header fields followed by a tab.
print X;}'                #### Print an empty (unset) variable, which just outputs a newline after the header fields printed by the loop above.
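
To make the stride in get() concrete, here is a minimal sketch that runs the same idea on one pasted data row. The row is made-up sample data for three files with three columns each, and the explicit BEGIN block and the local variables i and q are assumptions added for this illustration.

```shell
merged=$(printf 'x\t1\t2\tx\t5\t6\tx\t9\t10\n' | awk -v count=3 '
    BEGIN { FS = OFS = "\t" }
    # i and q are listed as extra parameters so they stay local to get().
    function get(field,    i, q) {
        q = NF / count                      # columns per input file (3 here)
        for (i = field + q; i <= NF; i += q)
            $field = $field "/" $i          # same column, next file
    }
    {
        n = NF / count
        line = $1                           # key column from the first file
        for (j = 2; j <= n; j++) {
            get(j)                          # j=2 visits fields 5 and 8; j=3 visits 6 and 9
            line = line OFS $j
        }
        print line
    }')

printf '%s\n' "$merged"
```

For field 2 the loop index takes the values 5 and 8, so $2 grows from 1 to 1/5 to 1/5/9; field 3 is built the same way from fields 6 and 9.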

Thanks,
R. Singh

Last edited by RavinderSingh13; 09-26-2016 at 04:35 AM.. Reason: fixed some typos

RavinderSingh13
