Top Forums Shell Programming and Scripting Word Occurrences script using awk Post 302923182 by Chubler_XL on Friday 31st of October 2014 12:20:13 AM
Perhaps you have some non-printing characters in the file.
Maybe it came from MS-DOS and has CR (carriage return) characters at the end of each line; you could try dos2unix filename first
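
If dos2unix is not installed, something along these lines may do the same job with standard tools. This is only a sketch: has_cr and strip_cr are helper names made up for illustration, and it assumes the only unwanted characters are carriage returns.

```shell
# Hypothetical helpers, not part of any standard tool set.

# Succeeds (exit status 0) if any line of the file contains a carriage return.
has_cr() {
    grep -q "$(printf '\r')" "$1"
}

# Writes the file to stdout with every carriage return deleted.
strip_cr() {
    tr -d '\r' < "$1"
}
```

For example, `has_cr filename && strip_cr filename > filename.unix` leaves the original untouched and writes a cleaned copy (filename.unix is just a placeholder name).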

or try
Code:
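BEGIN { printf "%-20s %-6s\n", "Word", "Count" }
{
    $0 = tolower($0)        # fold case so "The" and "the" count as one word
    gsub(/\r/, "", $0)      # strip DOS carriage returns; $0 is then re-split into fields
    for (i = 1; i <= NF; i++)
        freq[$i]++
}
END {
    sort = "sort -k 2nr"    # numeric sort on the count column, largest first
    for (word in freq)
        printf "%-20s %-6s\n", word, freq[word] | sort
    close(sort)             # wait for sort to finish writing its output
}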
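
To try the idea from the shell, a wrapper like the one below might be convenient. It is a sketch only: wordcount is a made-up function name, the header line is left out for brevity, and it assumes awk and sort are on the PATH.

```shell
# Hypothetical wrapper around the awk program above (header line omitted).
wordcount() {
    awk '
    {
        $0 = tolower($0)        # fold case
        gsub(/\r/, "", $0)      # drop carriage returns from DOS-format input
        for (i = 1; i <= NF; i++)
            freq[$i]++
    }
    END {
        sort = "sort -k 2nr"    # largest counts first
        for (word in freq)
            printf "%-20s %-6s\n", word, freq[word] | sort
        close(sort)
    }' "$@"
}
```

Run it as `wordcount filename`. Without the gsub, a word at the end of a DOS-format line keeps its trailing carriage return and is counted separately from the same word elsewhere in the line.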


Last edited by Chubler_XL; 10-31-2014 at 01:30 AM..

dos2unix(1)                      User Commands                      dos2unix(1)

NAME
    dos2unix - convert text file from DOS format to ISO format

SYNOPSIS
    dos2unix [-ascii] [-iso] [-7] [-437 | -850 | -860 | -863 | -865] originalfile convertedfile

DESCRIPTION
    The dos2unix utility converts characters in the DOS extended character set to the corresponding ISO standard characters.

    This command can be invoked from either DOS or SunOS. However, the filenames must conform to the conventions of the environment in which the command is invoked.

    If the original file and the converted file are the same, dos2unix will rewrite the original file after converting it.

OPTIONS
    The following options are supported:

    -ascii    Removes extra carriage returns and converts end of file characters in DOS format text files to conform to SunOS requirements.

    -iso      This is the default. It converts characters in the DOS extended character set to the corresponding ISO standard characters.

    -7        Converts 8 bit DOS graphics characters to 7 bit space characters so that SunOS can read the file.

    On non-i386 systems, dos2unix will attempt to obtain the keyboard type to determine which code page to use. Otherwise, the default is US. The user may override the code page with one of the following options:

    -437      Use US code page
    -850      Use multilingual code page
    -860      Use Portuguese code page
    -863      Use French Canadian code page
    -865      Use Danish code page

OPERANDS
    The following operands are required:

    originalfile     The original file in DOS format that is being converted to ISO format.

    convertedfile    The new file in ISO format that has been converted from the original DOS file format.

ATTRIBUTES
    See attributes(5) for descriptions of the following attributes:

    +-----------------------------+-----------------------------+
    |       ATTRIBUTE TYPE        |       ATTRIBUTE VALUE       |
    +-----------------------------+-----------------------------+
    |Availability                 |SUNWesu                      |
    +-----------------------------+-----------------------------+

SEE ALSO
    unix2dos(1), ls(1), attributes(5)

DIAGNOSTICS
    File filename not found, or no read permission
        The input file you specified does not exist, or you do not have read permission. Check with the SunOS command, ls -l (see ls(1)).

    Bad output filename filename, or no write permission
        The output file you specified is either invalid, or you do not have write permission for that file or the directory that contains it. Check also that the drive or diskette is not write-protected.

    Error while writing to temporary file
        An error occurred while converting your file, possibly because there is not enough space on the current drive. Check the amount of space on the current drive using the DIR command. Also be certain that the default diskette or drive is write-enabled (not write-protected). Notice that when this error occurs, the original file remains intact.

    Translated temporary file name = filename.
    Could not rename temporary file to filename.
        The program could not perform the final step in converting your file. Your converted file is stored under the name indicated on the second line of this message.

SunOS 5.10                        14 Sep 2000                        dos2unix(1)
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.