Sponsored Content
Full Discussion: Word Frequency Sort
Top Forums Shell Programming and Scripting Word Frequency Sort Post 302505918 by pravin27 on Friday 18th of March 2011 02:44:48 AM
Old 03-18-2011
Try this,
Code:
awk '{gsub(/[^[:alnum:]_[:blank:]]/, "", $0);for (i = 1; i <= NF; i++) {freq[$i]++}} END {for (word in freq){printf "%d\t%s\n", freq[word],word}}' inputfile | sort -nr

 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

how to sort a word

Can you Tell me how to sort a word (alphabetically using shell scripts only not by using perl script) For example : input word is damodar Output : aaddmor (1 Reply)
Discussion started by: gyana_cboy
1 Replies

2. Shell Programming and Scripting

Determining Word Frequency of Specific Terms

Hello, I require a perl script that will read a .txt file that contains words like 224.199.207.IN-ADDR.ARPA. IN NS NS1.internet.com. 4.200.162.207.in-addr.arpa. IN PTR beeriftw.internet.com. arroyoeinternet.com. IN A 200.199.227.49 I want to focus on words: IN... (23 Replies)
Discussion started by: richsark
23 Replies

3. Shell Programming and Scripting

Word frequency with additional information

Hello everyone, I am using a chunk of code to display the frequency of a file name in a list of directories. The code looks like this: find . -name "*.log" | cut -d/ -f4 | cut -d. -f1 | awk '{print $1}' | sort | uniq -c | sort -nr The file paths would look something like this:... (1 Reply)
Discussion started by: ToeLint
1 Replies

4. Shell Programming and Scripting

word frequency counter - awk solution?

Dear all, i need your help on this. There is a text file, i need to count word frequency for each word with frequency >40 in each line of file and output it into another file with columns like this: word1,word2,word3, ...wordn 0,0,1 1,2,0 3,2,0 etc -- each raw represents... (13 Replies)
Discussion started by: irrevocabile
13 Replies

5. Shell Programming and Scripting

Help with calculating frequency of specific word in a string

Input file: #read_1 AWEAWQQRZZZQWQQWZ #read_2 ZZAQWRQTWQQQWADSADZZZ #read_3 POGZZZZZZADWRR . . Desired output file: #read_1 3 #read_1 1 #read_2 2 #read_2 3 #read_3 6 . . (3 Replies)
Discussion started by: perl_beginner
3 Replies

6. Shell Programming and Scripting

Script to sort large file with frequency

Hello, I have a very large file of around 2 million records which has the following structure: I have used the standard awk program to sort: # wordfreq.awk --- print list of word frequencies { # remove punctuation #gsub(/_]/, "", $0) for (i = 1; i <= NF; i++) freq++ } END { for (word... (3 Replies)
Discussion started by: gimley
3 Replies

7. Shell Programming and Scripting

Help with sort word and general numeric sort at the same time

Input file: 100%ABC2 3.44E-12 USA A2M%H02579 0E0 UK 100%ABC2 5.34E-8 UK 100%ABC2 3.25E-12 USA A2M%H02579 5E-45 UK Output file: 100%ABC2 3.44E-12 USA 100%ABC2 3.25E-12 USA 100%ABC2 5.34E-8 UK A2M%H02579 0E0 UK A2M%H02579 5E-45 UK Code try: sort -k1,1 -g -k2 -r input.txt... (2 Replies)
Discussion started by: perl_beginner
2 Replies

8. Shell Programming and Scripting

Shell scripting: frequency of specific word in a string and statistics

Hello friends, I need a BIG help from UNIX collective intelligence: I have a CSV file like this: VALUE,TIMESTAMP,TEXT 1,Sun May 05 16:13:05 +0000 2013,"RT @gracecheree: Praying God sends me a really great man one day. Gotta trust in his timing. 0,Sun May 05 16:13:05 +0000 2013,@sendi__... (19 Replies)
Discussion started by: kraterions
19 Replies

9. UNIX for Advanced & Expert Users

Sort words based on word count on each line

Hi Folks :) I have a .txt file with thousands of words. I'm trying to sort the lines in order based on number of words per line. Example from: word word word word word word word word word word word word word word word word to desired output: word (2 Replies)
Discussion started by: martinsmith
2 Replies

10. UNIX for Beginners Questions & Answers

How to align/sort the column pairs of an csv file, based on keyword word specified in another file?

I have a csv file as shown below, xop_thy 80 avr_njk 50 str_nyu 60 avr_irt 70 str_nhj 60 avr_ngt 50 str_tgt 80 xop_nmg 50 xop_nth 40 cyv_gty 40 cop_thl 40 vir_tyk 80 vir_plo 20 vir_thk 40 ijk_yuc 70 cop_thy 70 ijk_yuc 80 irt_hgt 80 I need to align/sort the csv file based... (7 Replies)
Discussion started by: dineshkumarsrk
7 Replies
WORD-LIST-COMPRESS(1)					 Aspell Abbreviated User's Manual				     WORD-LIST-COMPRESS(1)

NAME
word-list-compress - word list compressor/decompressor for GNU Aspell SYNOPSIS
word-list-compress c[ompress] | d[ecompress] DESCRIPTION
word-list-compress compresses or decompresses sorted word lists for use with the GNU Aspell spell checker. COMMANDS
-c, c, compress compress the plain text word list read from standard input. -d, d, decompress decompress the compressed word list read from standard input. EXAMPLES
Here are a few examples of how you can use word-list-compress word-list-compress d <wordlist.cwl >wordlist.txt Decompress file wordlist.cwl to text file wordlist.txt word-list-compress c <wordlist.wl >wordlist.cwl 2>errors.txt Compress wordlist.wl to wordlist.cwl and send any error messages to a text file named errors.txt LC_COLLATE=C sort -u <wordlist.txt | word-list-compress c >wordlist.cwl Sort a word list, then pipe it to word-list-compress to create a compressed binary wordlist.cwl file. word-list-compress d <words.cwl | aspell create master ./words.rws Decompress a wordlist, then pipe it to aspell(1) to create a spelling list. Please check the aspell(1) info manual for proper usage and options. TIPS
Word-list-compress is best used with sorted word list type files. It is not a general purpose compression program since the resulting files may actually increase in size. Word-list-compress accepts up to 255 text characters in the range of {0x21...0xFF}. If your word list requires a larger character set for certain languages or longer length for multi-word, scientific, medical, technical or other use, then it is recommended that you compress your word list using prezip-bin(1) DIAGNOSTICS
Word-list-compress normally exits with a return code of 0. If it encounters an error, a message is sent to standard error output (stderr), and word-list-compress exits with a non-zero return value. Error messages are listed below: (display help/usage message) Unknown command given on the command line so word-list-compress displays a usage message to standard error output. Corrupt Input This is only for the decompression command d. The input file is of an unknown format or the input file/stream is corrupted. You may have some valid output, but word-list-compress could not complete the process. If the input file is a compressed wordlist but you have no output file, then it may be a newer prezip-bin(1) version of compressed file, if so, try decompressing the file with prezip-bin(1) instead. Output Data Error The output is full, write protected, or has an error and can no longer be written to. SEE ALSO
aspell(1), aspell-import(1), prezip-bin(1), run-with-aspell(1) Aspell is fully documented in its Texinfo manual. See the `aspell' entry in info for more complete documentation. REPORTING BUGS
For help, see the Aspell homepage at <http://aspell.net> and send bug reports/comments to the Aspell user list at the above address. AUTHOR
This manual page was written by Aaron Lehmann <aaronl@vitelus.com>, Brian Nelson <pyro@debian.org> and Jose Da Silva <digital@joescat.com>. GNU
2005-09-05 WORD-LIST-COMPRESS(1)
All times are GMT -4. The time now is 02:20 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy