Forum: Shell Programming and Scripting. Post 302505890 by gimley on Thursday 17th of March 2011 11:22:51 PM
Old 03-18-2011
Word Frequency Sort

Hello,
Here is a program for creating a word-frequency list:

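# wf.gk --- program to generate word frequencies from a file
{
    # Remove punctuation: delete every character that is not
    # alphanumeric, an underscore, or a blank
    gsub(/[^[:alnum:]_[:blank:]]/, "", $0)
    # Start frequency analysis: count each field (word) on the line
    for (i = 1; i <= NF; i++)
        freq[$i]++
}
# Print output: one "word<TAB>count" line per distinct word.
# The opening brace must be on the same line as END.
END {
    for (word in freq)
        printf "%s\t%d\n", word, freq[word]
}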
The program runs fine, but I cannot get the last part to print the frequency first, followed by the word, and then sort the output from highest to lowest frequency.
Please help. If it is not too much trouble, could the code be commented so that I and others like me can learn from it?
Many thanks in advance,

Gimley
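
One possible approach, offered only as a sketch: swap the two printf arguments so the count comes first, then pipe the awk output through sort. The function name wordfreq is a made-up helper for illustration, and the sort keys (count descending, then word ascending to break ties) are an assumption about the desired order. The character classes such as [:alnum:] need an awk that supports POSIX classes (gawk does; very old mawk builds may not).

```shell
#!/bin/sh
# wordfreq: hypothetical helper name; reads text on stdin and
# prints "count<TAB>word" lines, highest count first.
wordfreq() {
    awk '
    {
        # Strip punctuation, as in the original program
        gsub(/[^[:alnum:]_[:blank:]]/, "", $0)
        # Count every word (field) on the line
        for (i = 1; i <= NF; i++)
            freq[$i]++
    }
    END {
        # Print the frequency first, then the word
        for (word in freq)
            printf "%d\t%s\n", freq[word], word
    }' | sort -k1,1nr -k2,2   # numeric descending on count, then by word
}

# Usage example: feed some text to the function
printf 'the cat and the dog\nthe end.\n' | wordfreq
```

The sorting is left to sort rather than done inside awk because the order of "for (word in freq)" is unspecified in awk, so the output has to be ordered afterwards in any case.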
 

Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.