Hi,
I'd like to ask for some help with the following task, please.
There is a big file with a header line (this is file.in):
NAME A_1.X A_1.Y A_1.Z B_1.X B_1.Y B_1.Z
name1 AB 0.11 0.12 BB 0.45 0.67
name2 BB 0.34 0.56 AA 0.89 0.68
What I need is to recognize a pattern in the header of this file (the patterns are in another file) and delete every column whose header matches.
For example, the file with the patterns looks like this (this is file.with.patterns):
A_1
A_2
C_4
D_7
So it would recognize A_1 and delete all the columns whose header contains A_1; the output would then look like this (this is file.out):
NAME B_1.X B_1.Y B_1.Z
name1 BB 0.45 0.67
name2 AA 0.89 0.68
I am not sure I've got the best approach. What I was thinking of doing is to put all the columns whose header does not contain the specified pattern into one output file (so the columns whose header does match the pattern are left out, i.e. deleted):
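while read -r i
do
  # for pattern "$i": remember the columns whose header does not match it,
  # then print only those columns for every line of file.in
  awk -v pat="$i" 'NR==1{for(a=1;a<=NF;a++) if ($a!~pat) f[++n]=a}
  {for(a=1;a<=n;a++) printf "%s%s", (a>1?" ":""), $f[a]; print ""}' file.in >> file.out
done < file.with.patterns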
One problem is that I would like file.out to contain all the columns whose header does not match any of the patterns in file.with.patterns, and I am not sure the append operator (>>) would do that, since the loop writes one filtered copy of the file per pattern... it hasn't really worked so far...
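For reference, a single-pass variant might avoid the append problem: read file.with.patterns first, then filter file.in once. This is only a sketch under some assumptions: every header looks like PREFIX.SUFFIX, a column is dropped when its PREFIX equals a pattern exactly (so A_1 would not remove A_10.X), fields are whitespace-separated, and the pattern file is not empty. The name drop_columns is a made-up helper, not an existing tool.

```shell
# drop_columns PATTERN_FILE DATA_FILE
# Hypothetical helper: prints DATA_FILE without the columns whose header
# prefix (the text before the first ".") is listed in PATTERN_FILE.
drop_columns() {
  awk '
    # First file: remember every pattern, one per line.
    NR == FNR { if ($1 != "") drop[$1] = 1; next }

    # Header line of the data file: record which column numbers to keep.
    FNR == 1 {
      n = 0
      for (c = 1; c <= NF; c++) {
        prefix = $c
        sub(/\..*$/, "", prefix)          # A_1.X -> A_1
        if (!(prefix in drop)) keep[++n] = c
      }
    }

    # Every data-file line (header included): print only the kept columns.
    {
      for (k = 1; k <= n; k++) printf "%s%s", (k > 1 ? " " : ""), $keep[k]
      print ""
    }
  ' "$1" "$2"
}
```

It would be called as: drop_columns file.with.patterns file.in > file.out. If "contains" should mean a plain substring match instead, the exact-prefix test could be swapped for a loop over the patterns using index().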
Another option I was thinking about is to work out the numbers of the columns whose header contains the pattern and then delete them with cut -f, but I don't know how to do that.
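The cut idea could be sketched roughly as follows: awk reads only the header and prints the numbers of the columns to keep, and cut then selects those fields. Keeping columns rather than deleting them avoids the GNU-only --complement option. The assumptions are the same as before (headers of the form PREFIX.SUFFIX, exact prefix match, non-empty pattern file), plus one more: fields are separated by exactly one space, because cut does not merge repeated delimiters. keep_list and cut_columns are hypothetical helper names.

```shell
# keep_list PATTERN_FILE DATA_FILE
# Hypothetical helper: prints a comma-separated list of the column numbers
# whose header prefix is NOT listed in PATTERN_FILE (1,5,6,7 for the sample).
keep_list() {
  awk '
    # First file: remember every pattern, one per line.
    NR == FNR { if ($1 != "") drop[$1] = 1; next }

    # First line of the data file: build the list, then stop reading.
    {
      list = ""; sep = ""
      for (c = 1; c <= NF; c++) {
        prefix = $c
        sub(/\..*$/, "", prefix)          # A_1.X -> A_1
        if (!(prefix in drop)) { list = list sep c; sep = "," }
      }
      print list
      exit
    }
  ' "$1" "$2"
}

# cut_columns PATTERN_FILE DATA_FILE
# Hypothetical helper: prints DATA_FILE restricted to the kept columns.
cut_columns() {
  cut -d ' ' -f "$(keep_list "$1" "$2")" "$2"
}
```

It would be called as: cut_columns file.with.patterns file.in > file.out. If every column matched a pattern, the list would be empty and cut would report an error, so that case would need separate handling.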
Any ideas will be greatly appreciated!
Many thanks for your time!