Using the attached file, the awk command below produces the output shown below:
I cannot seem to produce the desired results and need some expert help. Thank you.
"Average Depth" contains 13 characters and "Average GC" contains 10, but you are allowing only 8 columns for each, so the headers cannot line up with the data. If you want the columns to align, give those fields more width. Something like:
Code:
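awk -F'[ |=]' '                 # split fields on space, pipe, or equals sign
FNR>1{                          # skip the header line
id[$2] += $4                    # running sum of field 4 per gene (printed as Average GC)
value[$2] += $5                 # running sum of field 5 per gene (printed as Average Depth)
occur[$2]++                     # number of targets seen for this gene
}
END{
printf "%-8s%8s%15s%15s\n", "Gene", "Targets", "Average Depth", "Average GC"
for (i in id)                   # iteration order of "for (i in id)" is unspecified
printf "%-8s%8d%15.1f%15.1f\n", i, occur[i],value[i]/occur[i],id[i]/occur[i]
}' file2.txt | head -30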
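If you would rather not count header characters by hand, the widths can be derived from the header text itself. This is only a sketch: the gene name and the numbers in it are made-up sample values, and the 2 columns of padding are an arbitrary choice, not something taken from your file.

```shell
# Sketch only: GENE1, 12, 45.5 and 51.2 are hypothetical sample values.
out=$(awk 'BEGIN {
    h1 = "Average Depth"; h2 = "Average GC"
    w1 = length(h1) + 2          # 13 characters plus 2 columns of padding
    w2 = length(h2) + 2          # 10 characters plus 2 columns of padding
    # Build the format strings once so header and data share the same widths.
    hdr_fmt = "%-8s%8s%" w1 "s%" w2 "s\n"
    row_fmt = "%-8s%8d%" w1 ".1f%" w2 ".1f\n"
    printf hdr_fmt, "Gene", "Targets", h1, h2
    printf row_fmt, "GENE1", 12, 45.5, 51.2
}')
printf '%s\n' "$out"
```

Because both format strings are built from the same w1 and w2, changing a header label automatically widens the matching data column.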
As a general rule, when a format string is meant to produce aligned columns and the field widths assume a maximum length for each value, it is a good idea to put a literal separator character between the fields in the format string. Simply increasing the widths only works as long as the assumed maximum is never exceeded.
For example, the printf statement in the first post in this thread:
Code:
printf "%-8s%8d%8.1f%8.1f\n", i, occur[i],value[i]/occur[i],id[i]/occur[i]
would more safely be written:
Code:
printf "%-8s %7d %7.1f %7.1f\n", i, occur[i],value[i]/occur[i],id[i]/occur[i]
If all values being printed are in the expected ranges, the output from both of the above statements will be identical. But, if one or more of the values overflow the expected range, the separation between fields can disappear in the 1st form while field separation is maintained by the 2nd form.
A human may be able to figure out the intended fields either way. But if the output is parsed by an awk script using the default field separator, a shell script using read with the default IFS value, etc., adjacent fields can run together and be read as a single field with the first form.
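A quick way to see the difference is to feed one overflowing value through both format strings and count the fields that a default awk reader finds. The values here are hypothetical and chosen only so that the depth column (1234567.5) needs 9 characters, one more than the 8 it was given.

```shell
# Hypothetical row: the depth value 1234567.5 is too wide for an %8.1f field.
tight=$(awk 'BEGIN { printf "%-8s%8d%8.1f\n", "GENE1", 12, 1234567.5 }')
spaced=$(awk 'BEGIN { printf "%-8s %7d %7.1f\n", "GENE1", 12, 1234567.5 }')

# Count the fields seen by awk with its default field separator.
tight_nf=$(printf '%s\n' "$tight" | awk '{ print NF }')
spaced_nf=$(printf '%s\n' "$spaced" | awk '{ print NF }')

printf '%s -> %s fields\n' "$tight" "$tight_nf"
printf '%s -> %s fields\n' "$spaced" "$spaced_nf"
```

With the first form the count 12 and the depth are printed back to back as 121234567.5, so a reader sees two fields instead of three. The second form loses its column alignment on that line but still yields three fields.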