Full Discussion: Count unique words
Post 302991742 by imranrasheedamu on Thursday 16th of February 2017 02:05:47 AM
Dear Joeyg,

I have thousands of text files with names like:


Code:
3_March_2013_Front19.txt
10_May_2014_Page326.txt
5_October_2013_Sports36.txt
27_September_2010_Health314.txt
19_December_2012_Page316.txt
31_October_2012_Entertainment1094.txt
15_April_2013_Front14.txt
1_March_2013_Science&Technology33.txt
6_March_2012_MuslimWorld2.txt
19_October_2012_MuslimWorld4.txt
7_February_2012_International312.txt
23_August_2012_Front8.txt
24_July_2012_National22.txt
25_September_2012_Front20.txt
3_October_2014_Page35.txt

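So, I would like to count the total number of words and the number of unique words across all files, grouped by the fourth field of the filename (the category name, such as National or Health, without the trailing number).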

e.g.

Code:
if (category == "National")
    count total and unique words

if (category == "International")
    count total and unique words

if (category == "Health")
    count total and unique words

and so on...

Please help me
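
One possible way to do this grouping, as a minimal sketch in Python rather than a definitive solution. It assumes the names always follow the day_month_year_categoryNNN.txt pattern shown above, that a "word" is any whitespace-separated token, and that words are compared case-insensitively. The function names category_of and count_by_category are made up for this illustration.

```python
import os
import re
import sys


def category_of(filename):
    """Return the category from names like 24_July_2012_National22.txt."""
    stem = os.path.splitext(os.path.basename(filename))[0]
    fields = stem.split("_")
    if len(fields) < 4:
        return None  # name does not follow the day_month_year_category pattern
    # Drop the trailing digits: "National22" -> "National".
    return re.sub(r"\d+$", "", fields[3])


def count_by_category(paths):
    """Map each category to (total words, unique words) over all its files."""
    totals = {}
    vocab = {}
    for path in paths:
        cat = category_of(path)
        if cat is None:
            continue  # skip files that do not match the naming pattern
        totals.setdefault(cat, 0)
        vocab.setdefault(cat, set())
        with open(path, encoding="utf-8", errors="replace") as handle:
            for line in handle:
                # Assumption: a word is a whitespace-separated token,
                # lowercased so "The" and "the" count as the same word.
                words = line.lower().split()
                totals[cat] += len(words)
                vocab[cat].update(words)
    return {cat: (totals[cat], len(vocab[cat])) for cat in totals}


if __name__ == "__main__":
    results = count_by_category(sys.argv[1:])
    for cat, (total, unique) in sorted(results.items()):
        print(f"{cat}\ttotal={total}\tunique={unique}")
```

Saved under a hypothetical name such as count_words.py, it could be run as python3 count_words.py *.txt in the directory holding the files. Punctuation stays attached to words here (so "mat." and "mat" differ); if that matters, the tokenizing line is the place to change it.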


Moderator's Comments:
Please use CODE tags as required by forum rules!

Last edited by RudiC; 02-16-2017 at 04:14 AM.. Reason: Added CODE tags.