Full Discussion: Count unique words
Forum: UNIX for Beginners Questions & Answers
Post 302991682 by imranrasheedamu on Wednesday 15th of February 2017 10:06:48 AM

Dear all,

I would like to know how to list and count the unique words across thousands of text files.

Please help me out.
Thanks in advance.
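
One possible approach to the question above, as a minimal sketch rather than a definitive answer. It assumes that a "word" is a run of ASCII letters, that case should be ignored, and that all the files live under one directory. The function name count_unique_words is made up for illustration.

```shell
#!/bin/sh
# count_unique_words DIR   (hypothetical helper, not a standard command)
# Prints "count word" for every distinct word found in the regular files
# under DIR, most frequent first.
# Assumption: a word is a run of ASCII letters, compared case-insensitively.
count_unique_words() {
    # "-exec cat {} +" batches the file names, so thousands of files do not
    # overflow the command line.
    find "$1" -type f -exec cat {} + |
        tr -cs 'A-Za-z' '\n' |   # turn every run of non-letters into one newline
        tr 'A-Z' 'a-z' |         # fold to lower case
        sed '/^$/d' |            # drop the empty line left by leading non-letters
        sort |
        uniq -c |                # uniq only merges adjacent lines, hence the sort
        sort -k1,1nr -k2,2       # highest count first, then alphabetical
}
```

Calling count_unique_words /path/to/files lists the words with their counts, and piping that into wc -l gives the number of distinct words. Words containing digits, apostrophes, or non-ASCII letters would need a different tr set.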
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

how to read all the unique words in a text file

How can I read all the unique words in a file? I used cat comment_file.txt | /usr/xpg6/bin/tr -sc 'A-Za-z' '/012' and cat comment_file.txt | /usr/xpg6/bin/tr -sdc 'A-Za-z' '/012', but they didn't work... (5 Replies)
Discussion started by: aditya.ece1985
5 Replies

2. Shell Programming and Scripting

Finding the number of unique words in a file

Find the number of unique words in a file using the sort command. (7 Replies)
Discussion started by: abhikamune
7 Replies

3. Shell Programming and Scripting

Shell script to find out words, replace them and count words

Hello, I'd like your help with a bash script which: 1. finds inside the HTML file (it is attached to my post) the code number of the Latest Stable Kernel, 2. finds the link which leads to the download location of the Latest Stable Kernel version, (the right link should lead to the file... (3 Replies)
Discussion started by: alex83
3 Replies

4. Homework & Coursework Questions

unique words in files of folder and its subfolders

Hello, I tried to count all unique words of all files in one folder and its subfolders. Can anybody tell me why this doesn't work: ls| find -d | cat | tr "\ " "\n"| uniq -u | wc -l ? cat writes only the names of those files, but not the words that should be in them. Thanks for any advice. ... (9 Replies)
Discussion started by: Dworza
9 Replies

5. Shell Programming and Scripting

display unique words.

I have a file with duplicate words. How can I eliminate them? ant,bat bat,cat cat a.txt | grep -bat | awk '{print $1}' Expected output: ant,bat,cat. How can I display the output as ant,bat,cat on a single line with no duplicates? (2 Replies)
Discussion started by: shikshavarma
2 Replies

6. Shell Programming and Scripting

Unique words in each line

In each row there could be repetition of a word. I want to delete all repetitions and keep unique occurrences. Example: a+b+c ab+c ab+c abbb+c ab+bbc a+bbbc aaa aaa aaa Output: a+b+c ab+c abbb+c ab+bbc a+bbbc aaa (6 Replies)
Discussion started by: Viernes
6 Replies

7. Shell Programming and Scripting

awk to count using each unique value

I'm looking for an awk script that will take the unique values in column 5, then print and count the unique values in column 6. CA001011500 11111 11111 -9999 201301 AAA CA001012040 11111 11111 -9999 201301 AAA CA001012573 11111 11111 -9999 201301 BBB CA001012710 11111 11111 -9999 201301... (4 Replies)
Discussion started by: ncwxpanther
4 Replies

8. Shell Programming and Scripting

How to count the number of two words associated with the two words occurring in the file?

Hi, I need to count the number of errors associated with two words occurring in the file. It's about counting the occurrences of the word "error" where the word "index.js" also occurs. What should the command look like? Please kindly help. I was trying: grep "error" log.txt | wc -l (1 Reply)
Discussion started by: jmarx
1 Reply

9. Shell Programming and Scripting

Count occurrence of column one unique value having unique second column value

Hello Team, I need your help on the following: My input file a.txt is as below: 3330690|373846|108471 3330690|373846|108471 0640829|459725|100001 0640829|459725|100001 3330690|373847|108471 Here row 1 and row 2 of column 1 are identical but corresponding column 2 value are... (4 Replies)
Discussion started by: angshuman
4 Replies

10. Shell Programming and Scripting

Regex to identify unique words in a dictionary database

Hello, I have a dictionary which I am building for the Open Source Community. The data structure is as under HEADWORD=PARTOFSPEECH=ENGLISH MEANING as shown in the example below अ=m=Prefix signifying negation. अँहँ=ind=Interjection expressing disapprobation. अं=int=An interjection... (2 Replies)
Discussion started by: gimley
2 Replies
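
Several of the excerpts above run into the same few pitfalls, so here is a hedged sketch of how they might be avoided. The newline escape for tr is '\012' (backslash), not '/012'. uniq only collapses adjacent duplicates, so its input has to be sorted first. uniq -u prints only the lines that occur exactly once, which is not the same as the set of distinct words. The function names distinct_words and words_seen_once are invented for this example, and a word is assumed to be a case-sensitive run of ASCII letters.

```shell
#!/bin/sh
# distinct_words FILE   (hypothetical helper)
# Prints every distinct word in FILE exactly once, sorted.
distinct_words() {
    tr -cs 'A-Za-z' '\012' < "$1" |  # '\012' is the octal escape for newline
        sed '/^$/d' |                # drop a possible empty first line
        sort -u                      # sort and remove duplicates in one step
}

# words_seen_once FILE   (hypothetical helper)
# Prints only the words that occur exactly once in FILE.
words_seen_once() {
    tr -cs 'A-Za-z' '\012' < "$1" |
        sed '/^$/d' |
        sort |                       # uniq needs duplicates to be adjacent
        uniq -u                      # keep lines that are not repeated
}
```

distinct_words file | wc -l then gives the number of unique words in the usual sense, while words_seen_once answers the narrower question of which words are never repeated.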
wc(1)                         General Commands Manual                         wc(1)

NAME
       wc - count words, lines, and bytes or characters in a file

SYNOPSIS
       wc [-c|-m] [-lw] [file]...

DESCRIPTION
       The wc command counts lines, words, and bytes or characters in the
       named files, or in the standard input if no file names are specified.
       It also keeps a total count for all named files.

       A word is a string of characters delimited by spaces, tabs, or
       newlines.

   Options
       wc recognizes the following options:

       -c   Report the number of bytes in each input file.
       -l   Report the number of newline characters in each input file.
       -m   Report the number of characters in each input file.
       -w   Report the number of words in each input file.

       The -c and -m options are mutually exclusive. Otherwise, the -l, -w,
       and -c or -m options can be used in any combination to specify that a
       subset of lines, words, and bytes or characters are to be reported.

       When any option is specified, wc reports only the information
       requested. If no option is specified, the default output is -lwc.

       When a file is specified on the command line, its name is printed
       along with the counts.

   Standard Output
       By default, the standard output contains an entry for each input file
       in the form:

              newlines words bytes file

       If the -m option is specified, the number of characters replaces the
       bytes field in this format. If any option is specified, the fields for
       the unspecified options are omitted. If no file operand is specified,
       neither the file name nor the preceding blank character is written.

       If more than one file operand is specified, an additional line is
       written at the end of the output, of the same format as the other
       lines, except that the word total (in the POSIX locale) is written
       instead of a file name and the total of each column is written as
       appropriate.

       Under UNIX Standard environment, a word is a string of characters
       delimited by spaces, tabs, newline, carriage-return, vertical tab, or
       form-feed.

RETURN VALUE
       wc exits with one of the following values:

       0    Successful completion.
       >0   An error occurred.

EXTERNAL INFLUENCES
       For information about the UNIX Standard environment, see standards(5).

   Environment Variables
       LC_CTYPE determines the range of graphics and space characters, and
       the interpretation of text as single- and/or multibyte characters.

       LC_MESSAGES determines the language in which messages are displayed.

       If LC_CTYPE or LC_MESSAGES is not specified in the environment or is
       null, they default to the value of LANG. If LANG is not specified or
       is null, it defaults to "C" (see lang(5)). If any
       internationalization variable contains an invalid setting, they all
       default to "C". See environ(5).

   International Code Set Support
       Single- and multibyte character code sets are supported.

WARNINGS
       The wc command counts the number of newlines to determine the line
       count. If a text file has a final line that is not terminated with a
       newline character, the count will be off by one.

EXAMPLES
       Print the number of words and characters in file:

              wc -wm file

       The following is printed when the above command is executed:

              words chars file

       where words is the number of words and chars is the number of
       characters in file.

SEE ALSO
       standards(5).

STANDARDS CONFORMANCE

wc(1)
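
The off-by-one warning in the man page above can be seen with a small sketch. The three function names are hypothetical helpers, not part of wc. Reading the file through a redirect means wc gets no file operand, so it prints the bare count without a file name. awk's NR is used as one possible way to count a final line that lacks its newline.

```shell
#!/bin/sh
# newline_count FILE   (hypothetical helper)
# Number of newline characters, which is what "wc -l" reports.
newline_count() {
    wc -l < "$1" | tr -d ' '    # some wc implementations pad the number
}

# line_count FILE   (hypothetical helper)
# Number of lines, including a final line without a trailing newline.
line_count() {
    awk 'END { print NR }' "$1"
}

# word_count FILE   (hypothetical helper)
# Number of whitespace-delimited words, as reported by "wc -w".
word_count() {
    wc -w < "$1" | tr -d ' '
}
```

For a file created with printf 'one\ntwo' > sample.txt, newline_count and line_count differ by one, because the second line has no terminating newline.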
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.