12-12-2007
printing only wanted rows in awk
Hi!
The following awk script counts the words in an input file, sorts them by decreasing number of occurrences (and alphabetically among words with the same count), and then prints each word together with its number of occurrences. For example:
and 7
for 4
make 4
you 4
awk 1
....
The problem is that if the text file contains thousands of words, the output is also very long. I'm only interested in the 10 most frequent words, so I'd like to print only the first 10 rows. I have tried changing the printf command to print only the first 10 sorted rows, but I have had no success.
Is it even possible to achieve this by changing only the printf command? Should I try something else?
script:
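{
    # Lowercase the line, then strip everything except letters, digits, underscores and blanks
    $0 = tolower($0)
    gsub(/[^[:alnum:]_[:blank:]]/, "", $0)
    # Count each remaining word
    for (i = 1; i <= NF; i++)
        freq[$i]++
}
END {
    # Pipe every "word<TAB>count" line through sort: field 2, numeric, descending
    sort = "sort -k 2nr"
    for (word in freq)
        printf "%s\t%d\n", word, freq[word] | sort
    close(sort)
}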
Thanks in advance!
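A possible approach, offered as a sketch rather than a definitive answer: changing only the printf statement is unlikely to work, because the order in which `for (word in freq)` visits the array is unspecified and the sorting happens afterwards in the external sort process. One option is to extend the command string that printf pipes into, so that head keeps only the first rows of the sorted output. The function name `top_words` and the variable `n` below are hypothetical names introduced for illustration; the explicit `-k2,2nr -k1,1` keys are an assumption about the intended ordering (count descending, then word ascending).

```shell
#!/bin/sh
# top_words: hypothetical helper, not part of the original post.
# Reads text on stdin and prints the N most frequent words (default 10).
top_words() {
    awk -v n="${1:-10}" '
    {
        # Same word counting as in the original script
        $0 = tolower($0)
        gsub(/[^[:alnum:]_[:blank:]]/, "", $0)
        for (i = 1; i <= NF; i++)
            freq[$i]++
    }
    END {
        # Sort by count (descending), break ties by word (ascending),
        # then let head keep only the first n lines of the sorted output.
        cmd = "sort -k2,2nr -k1,1 | head -n " int(n)
        for (word in freq)
            printf "%s\t%d\n", word, freq[word] | cmd
        close(cmd)
    }'
}
```

It could be called as `top_words < input.txt` for the first 10 rows, or `top_words 3 < input.txt` for the first 3. A smaller change with the same idea is to edit only the assignment in the original script to `sort = "sort -k 2nr | head -n 10"`.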