I have files with columns like this. The sample input below is partial; the link to the full file is given further down. Each file has only two rows.
Here is what I need done:
a. Start from the second column, which is 0.4% here.
b. Continue until you hit "10" in the header name. If the header is exactly 10.0%, include that column too; if not, include only up to the column before it. In this example, since the 29th column is 10.1%, we include the columns from 0.4% (second) through 9.8% (28th). Had the 29th column been 10.0%, it would have been included too.
c. Average the second-row values of these columns (the data is not shown here; the full dataset is at https://goo.gl/W8jND7). In this example, that means the second column (0.4%) through the 28th column (9.8%).
d. In the output, print the first column, which is "Gene", and this average value, with the column header being
e. Then start from 10.1% (29th column) and continue until you hit "20" in the header name. Repeat steps b through d, and print the output as
Repeat this until you have
f. Once you hit 100%, one dataset is done.
g. If you look carefully at my column headers, another set of 0.4%-100% columns follows the first 100%. The input file at the link above has 13 of these 0.4%-100% sets.
i. I have multiple files, the headers can be
The headers vary from file to file, but the averaging logic (cutting at "10", "20", etc.) is always the same. The number of samples, 13, is also the same, which means each file reaches 100% 13 times.
P.S.: A bonus of 1000 bits will be awarded for a working solution.
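For illustration, steps a through g above could be sketched in awk roughly as follows. This is only a sketch under some assumptions, not a tested solution for the linked dataset: the input is taken to be whitespace- or tab-delimited, a new sample is assumed to start whenever the percentage in the header drops (e.g. from 100% back to 0.4%), and the output header names (S1_10%, S1_20%, ...) are made up here because the intended names are not shown in the post. The function name bin_average is also hypothetical.

```shell
# bin_average: hypothetical helper; reads a header row plus data row(s)
# from stdin or from the files given as arguments.
bin_average() {
  awk '
    NR == 1 {
      ds = 1; prev = -1
      for (i = 2; i <= NF; i++) {
        p = $i + 0                        # "9.8%" converts to the number 9.8
        if (p < prev) ds++                # assumption: a drop means the next sample starts
        prev = p
        top = int(p / 10) * 10            # upper edge of the 10% bin
        if (top < p || top == 0) top += 10  # 10.0% stays in bin 10, 10.1% goes to bin 20
        key[i] = ds SUBSEP top
        if (!(key[i] in seen)) { seen[key[i]] = 1; order[++n] = key[i] }
      }
      line = $1
      for (k = 1; k <= n; k++) {
        split(order[k], part, SUBSEP)
        line = line "\tS" part[1] "_" part[2] "%"   # made-up header names
      }
      print line
      next
    }
    {
      # reset the accumulators, then sum each column into its bin
      for (k = 1; k <= n; k++) { sum[order[k]] = 0; cnt[order[k]] = 0 }
      for (i = 2; i <= NF; i++) { sum[key[i]] += $i; cnt[key[i]]++ }
      line = $1
      for (k = 1; k <= n; k++)
        line = line "\t" (cnt[order[k]] ? sum[order[k]] / cnt[order[k]] : "NA")
      print line
    }
  ' "$@"
}
```

It would be called as `bin_average input.txt > output.txt` (file names are placeholders). Because the bin is computed from the header value itself rather than from column positions, the same function should cope with files whose headers differ, as long as the percentages increase within each sample.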
Hi All,
I need a modification of the code below (found in another post, https://www.unix.com/shell-programming-scripting/27161-script-generate-average-values.html) to find the average values for all the columns (but only for specific rows) and print the averages side by side.
I have... (4 Replies)
Hi,
I have a file which looks like this:
FID IID MISS_PHENO N_MISS N_GENO F_MISS
12AB43131 12AB43131 N 17774 906341 0.01961
65HJ87451 65HJ87451 N 10149 906341 0.0112
43JJ21345 43JJ21345 N 2826 906341 0.003118
I would... (11 Replies)
Hi to all,
I have two files. File1 has no header, two columns:
sample1 A
sample2 B
sample3 B
sample4 C
sample5 A
sample6 D
sample7 D
File2 has a header, except for the first 3 columns (chr,start,end). "sample1" is the header for the 4th ,5th ,6th columns, "sample2" is the header... (4 Replies)
Hello,
I have some tab-delimited text files with three header rows. The headers look like this (sorry the tabs look so messy):
index group Name input input input input input input input input input input input... (9 Replies)
Hi experts,
I want to compute grouped averages for multiple columns, starting at column $7 and going through NF,
grouped by $1-$5. Please help.
For just 7th column, I can do
awk '
NR>1{
arr += $7
count += 1
}
END{
for (a in arr) {
print a, arr/count
... (10 Replies)
I have this code below that only prints out certain columns from the first two rows (doesn't affect rows 3 and beyond). How can I do the same on a partial header pattern “G_TP” instead of having to know specific column numbers (e.g. 374-479)? I've tried many other commands within this pipe with no... (4 Replies)
Hello,
I have to fish out some specific columns from a file based on the header value. I have the list of columns I need in a different file. I thought I could read in the list of headers I need,
# file with header names of required columns in required order
headers_file=$2
# read contents... (11 Replies)
hello, I have three files in the following order
==> File1 <==
1 20977000 20977000 A C 1.00 0,15 15 45
1 115829313 115829313 G A 0.500 6,7 13 99
==> File2 <==
1 20977000 20977000 A C 1.00 0,13 13 39
1 115829313 ... (5 Replies)
I have files that have the following columns
chr pos ref alt sample 1 sample 2 sample 3
chr2 179644035 G A 1,107 0,1 58,67
chr7 151945167 G T 142,101 100,200 500,700
chr13 31789169 CTT CT,C 6,37,8 0,0,0 15,46,89
chr22 ... (3 Replies)