Finding the number of unique words in a file


 
Post #8, 09-01-2010
Assuming a Unix-format text file with multiple lines and one or more words per line, where words are separated by spaces and/or tabs:

Code:
cat filename | tr '\n' ' ' | tr '\t' ' ' | tr -s ' ' | tr ' ' '\n' | sort | uniq | wc -l

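Change each newline to a space.
Change each tab to a space.
Squeeze runs of spaces into a single space.
Change each space to a newline, leaving one word per line.
Sort the words so that duplicates become adjacent.
Remove adjacent duplicates with "uniq".
Count the remaining lines, which is the number of unique words.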
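
The same steps can probably be collapsed into fewer stages. The following is only a sketch, not the method from the post: "count_unique_words" is a made-up helper name, and it assumes a POSIX-style tr and sort. It also filters out the empty line that the pipeline above would count as a word when the file starts with a space or tab. Like the original, it is case-sensitive and treats punctuation as part of a word.

```shell
# count_unique_words is a hypothetical helper name, not part of the original post.
# Usage: count_unique_words FILE
count_unique_words() {
    # Turn every space, tab and newline into a newline and squeeze repeats,
    # so each word ends up on its own line.
    tr -s ' \t\n' '\n\n\n' < "$1" |
        # Drop the empty line left behind by leading whitespace, if any.
        grep -v '^$' |
        # sort -u sorts and removes duplicates in one step (same as sort | uniq).
        sort -u |
        # Count the remaining lines, one per unique word.
        wc -l
}
```

Call it as "count_unique_words filename". Some wc implementations pad the number with leading spaces, so compare the result numerically rather than as a string.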
LOOK(1)                     General Commands Manual                     LOOK(1)

NAME
       look - find lines in a sorted list

SYNOPSIS
       look [ -dfnixtc ] [ string ] [ file ]

DESCRIPTION
       Look consults a sorted file and prints all lines that begin with
       string. It uses binary search.

       The following options are recognized. Options dfnt affect
       comparisons as in sort(1).

       -i     Interactive. There is no string argument; instead look takes
              lines from the standard input as strings to be looked up.

       -x     Exact. Print only lines of the file whose key matches string
              exactly.

       -d     `Directory' order: only letters, digits, tabs and blanks
              participate in comparisons.

       -f     Fold. Upper case letters compare equal to lower case.

       -n     Numeric comparison with initial string of digits, optional
              minus sign, and optional decimal point.

       -t[c]  Character c terminates the sort key in the file. By default,
              tab terminates the key. If c is missing the entire line
              comprises the key.

       If no file is specified, /lib/words is assumed, with collating
       sequence df.

FILES
       /lib/words

SOURCE
       /sys/src/cmd/look.c

SEE ALSO
       sort(1), grep(1)

DIAGNOSTICS
       The exit status is "not found" if no match is found, and "no
       dictionary" if file or the default dictionary cannot be opened.