Find columns in a file based on header and print to new file
Post 302986451 by LMHmedchem on Friday 25th of November 2016 01:18:03 PM
Quote:
Originally Posted by Scrutinizer
Note: The order in for (i in a) is arbitrary, so it cannot be used reliably to preserve order. An alternative would be to use a for (i=min; i<=max; i++) loop, for example:
Code:
awk 'NR==FNR{A[$1]=++c; next} {s=""; for (i=1; i<=c; i++) {if(FNR==1) P[i]=A[$i]; s=s $(P[i]) OFS} print s}' headers input_file

Thank you all for the replies.

I can't seem to get the above working.

Here is some data. Sorry that it is hard to read, but I thought it best to leave it in its original single-space-delimited format.
Code:
name col_1 col_2 col_3 col_4 col_5 col_6 col_7 col_8
name,1 2 1 1 0 1 0 11.75 9.6154
name,2 7 0 0 0 1 0 12.7917 8.6310
name,3 4 1 1 0.6 1 0 18.2769 4.6420
name,4 6 1 1 0 1 0 16.1389 7.7778
name,5 2 2 3 0.833333 1 0 21.5342 4.2924

headers_file:
Code:
col_1
col_6
col_3
col_4
col_8

Desired output (in most cases, some columns in the original input will not be in the output):
Code:
name	col_1	col_6	col_3	col_4	col_8
name,1	2	0	1	0	9.6154
name,2	7	0	0	0	8.6310
name,3	4	0	1	0.6	4.6420
name,4	6	0	1	0	7.7778
name,5	2	0	3	0.8333	4.2924

When I run the script above, I get:
Code:
col_1 col_1 col_1 col_1 col_1 col_1 
col_6 col_6 col_6 col_6 col_6 col_6 
col_3 col_3 col_3 col_3 col_3 col_3 
col_4 col_4 col_4 col_4 col_4 col_4 
col_8 col_8 col_8 col_8 col_8 col_8

The code suggestion posted by ripat has a similar issue, but I haven't posted the results here because of Scrutinizer's comment about the order of output.
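
One possible way to get the desired output from the sample data above is sketched here. This is only an illustration and not the code from any reply in the thread: the array names want and pos and the file name output_file are made up for this example, and it assumes that the first column is always kept, that the files have Unix line endings, and that headers_file is not empty. It walks the wanted names with a numeric index, so the output columns follow the order in headers_file and not the arbitrary order of for (i in a).

```shell
#!/bin/sh
# Recreate the sample files from the post in the current directory.
# (This overwrites any existing input_file, headers_file, or output_file.)
cat > input_file <<'EOF'
name col_1 col_2 col_3 col_4 col_5 col_6 col_7 col_8
name,1 2 1 1 0 1 0 11.75 9.6154
name,2 7 0 0 0 1 0 12.7917 8.6310
name,3 4 1 1 0.6 1 0 18.2769 4.6420
name,4 6 1 1 0 1 0 16.1389 7.7778
name,5 2 2 3 0.833333 1 0 21.5342 4.2924
EOF

cat > headers_file <<'EOF'
col_1
col_6
col_3
col_4
col_8
EOF

awk -v OFS='\t' '
    # First file: store the wanted header names in the order they are listed.
    NR == FNR { want[++n] = $1; next }

    # Header row of the data file: map each header name to its field number.
    FNR == 1 { for (i = 1; i <= NF; i++) pos[$i] = i }

    # Every row (header included): print field 1, then the wanted fields.
    # Names that do not occur in the header row are skipped.
    {
        line = $1
        for (j = 1; j <= n; j++)
            if (want[j] in pos)
                line = line OFS $(pos[want[j]])
        print line
    }
' headers_file input_file > output_file

cat output_file
```

Values are copied verbatim, so the col_4 value for name,5 comes out as 0.833333 and not as the 0.8333 shown in the desired output.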

Quote:
Originally Posted by drl
@LMHmedchem: with 300 posts, you should know that posting data samples, expected output, and your computing environment will help make replies easier and more likely to be applicable to your situation. Please do that in your future posts.
I certainly should have included an example with my post; sorry about that. I am currently running this under Cygwin 2.3.1, but it will also run on openSUSE 13.2 x86_64.

I know that the term CSV is sometimes used to refer to generic delimited text data and not just comma-separated data. I stay away from comma separation because many of the fields I use (chemical names) contain commas ( 1,1,4,4-tetrabutylpiperazine ). The values in the name column could also contain unmatched single quotes ( N,N,N',N'-tetramethylguanidine ) or parentheses ( 1-(2-aminoethyl)piperazine ). I think that code that replaces spaces with commas would be problematic in my particular case. That is yet another reason why it would have been useful for me to post an example of real data.
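
To illustrate the point about delimiters, here is a small sketch. The numeric values and the file names names_file and names_tab are hypothetical, invented for this example; only the three chemical names come from the paragraph above. Because awk's default field splitting breaks only on whitespace, the commas, single quotes, and parentheses inside a name are left alone, and writing the fields back out with a tab as OFS avoids introducing any new commas.

```shell
#!/bin/sh
# Hypothetical sample: the names are from the post, the numbers are invented.
# (This overwrites any existing names_file or names_tab.)
cat > names_file <<'EOF'
name col_1 col_2
1,1,4,4-tetrabutylpiperazine 3 7
N,N,N',N'-tetramethylguanidine 5 9
1-(2-aminoethyl)piperazine 8 2
EOF

# Assigning a field to itself makes awk rebuild the record using OFS,
# which turns the single-space delimiters into tabs.
awk -v OFS='\t' '{ $1 = $1; print }' names_file > names_tab

cat names_tab
```

This only works as long as no name contains a space, which holds for the single-space-delimited input described above.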

LMHmedchem

Last edited by LMHmedchem; 11-25-2016 at 02:56 PM..
 
