You can avoid the second sort if you use a simple regular array instead of a hash. The idea to sort the lengths and then print the leader is kind of neat, but only keeping the maximum saves memory.
If you want to keep the sort idea, use an array of arrays.
Actually I got a list of file end with *.txt
I want to use the same command apply to all the *.txt
Thus I try to find out the fastest way to write those same command in a script and then want to let them run automatics.
For example:
I got the file below:
file1.txt
file2.txt
file3.txt... (4 Replies)
Hi Experts,
I am adding a column of numbers with awk , however not getting correct output:
# awk '{sum+=$1} END {print sum}' datafile
2.15291e+06
How can I getthe output like : 2152910
Thank you..
# awk '{sum+=$1} END {print sum}' datafile
2.15079e+06 (3 Replies)
Hi
I have many problems with a script. I have a script that formats a text file but always prints the same error when i try to execute it
The code is that:
{
if (NF==17){
print $0
}else{
fields=NF;
all=$0;
while... (2 Replies)
I have two files which I would like to compare and then manipulate in a way.
File1:
pictures.txt 1.1 1.3
dance.txt 1.2 1.4
treehouse.txt 1.3 1.5
File2:
pictures.txt 1.5 ref2313 1.4 ref2345 1.3 ref5432 1.2 ref4244
dance.txt 1.6 ref2342 1.5 ref2352 1.4 ref0695 1.3 ref5738 1.2... (1 Reply)
Hi,
I have a situation to compare one file, say file1.txt with a set of files in directory.The directory contains more than 100 files.
To be more precise, the requirement is to compare the first field of file1.txt with the first field in all the files in the directory.The files in the... (10 Replies)
Hello experts,
I'm stuck with this script for three days now. Here's what i need.
I need to split a large delimited (,) file into 2 files based on the value present in the last field.
Samp: Something.csv
bca,adc,asdf,123,12C
bca,adc,asdf,123,13C
def,adc,asdf,123,12A
I need this split... (6 Replies)
Hi,
I am trying to pass awk field to a command line executed within awk (need to convert a timestamp into formatted date).
All my attempts failed this far.
Here's an example.
It works fine with timestamp hard-codded into the command
echo "1381653229 something" |awk 'BEGIN{cmd="date -d... (4 Replies)
Good evening, Im newbie at unix specially with awk
From an scheduler program called Autosys i want to extract some data reading an inputfile that comprises jobs names, then formating the output to columns for example
1.
This is the inputfile:
$ more MapaRep.txt
ds_extra_nikira_usuarios... (18 Replies)
Discussion started by: alexcol
18 Replies
LEARN ABOUT DEBIAN
fastx_quality_stats
FASTX_QUALITY_STATS(1) User Commands FASTX_QUALITY_STATS(1)NAME
fastx_quality_stats - FASTX Statistics
DESCRIPTION
usage: fastx_quality_stats [-h] [-N] [-i INFILE] [-o OUTFILE] Part of FASTX Toolkit 0.0.13.2 by A. Gordon (gordon@cshl.edu)
[-h] = This helpful help screen. [-i INFILE] = FASTQ input file. default is STDIN. [-o OUTFILE] = TEXT output file. default is
STDOUT. [-N] = New output format (with more information per nucleotide/cycle).
The *OLD* output TEXT file will have the following fields (one row per column):
column = column number (1 to 36 for a 36-cycles read solexa file)
count = number of bases found in this column.
min = Lowest quality score value found in this column.
max = Highest quality score value found in this column.
sum = Sum of quality score values for this column.
mean = Mean quality score value for this column.
Q1 = 1st quartile quality score.
med = Median quality score.
Q3 = 3rd quartile quality score.
IQR = Inter-Quartile range (Q3-Q1).
lW = 'Left-Whisker' value (for boxplotting).
rW = 'Right-Whisker' value (for boxplotting).
A_Count = Count of 'A' nucleotides found in this column. C_Count = Count of 'C' nucleotides found in this column. G_Count = Count
of 'G' nucleotides found in this column. T_Count = Count of 'T' nucleotides found in this column. N_Count = Count of 'N' nucleo-
tides found in this column. max-count = max. number of bases (in all cycles)
The *NEW* output format:
cycle (previously called 'column') = cycle number max-count For each nucleotide in the cycle (ALL/A/C/G/T/N):
count = number of bases found in this column.
min = Lowest quality score value found in this column.
max = Highest quality score value found in this column.
sum = Sum of quality score values for this column.
mean = Mean quality score value for this column.
Q1 = 1st quartile quality score.
med = Median quality score.
Q3 = 3rd quartile quality score.
IQR = Inter-Quartile range (Q3-Q1).
lW = 'Left-Whisker' value (for boxplotting).
rW = 'Right-Whisker' value (for boxplotting).
SEE ALSO
The quality of this automatically generated manpage might be insufficient. It is suggested to visit
http://hannonlab.cshl.edu/fastx_toolkit/commandline.html
to get a better layout as well as an overview about connected FASTX tools.
fastx_quality_stats 0.0.13.2 May 2012 FASTX_QUALITY_STATS(1)