Finding the number of unique words in a file Post: 302449985

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

how to find capital letter names in a file without finding words at start of sentence

Hi, I want to be able to list all the names in a file which begin with a capital letter, but I don't want it to list words that begin a new sentence. Is there any way round this? Thanks for your help.

2. Shell Programming and Scripting

how to read all the unique words in a text file

How can I read all the unique words in a file? I used cat comment_file.txt | /usr/xpg6/bin/tr -sc 'A-Za-z' '/012' and cat comment_file.txt | /usr/xpg6/bin/tr -sdc 'A-Za-z' '/012', but neither worked.....

3. Shell Programming and Scripting

Need help with finding unique string in log file

Shell script help. Here are 3 sample lines from a log file:
<date> INFO <java.com.blah> abcd:ID= user login
<date> DEBUG <java.com.blah> <nlah bla> abcd:ID=123 user login
<date> INFO <java.com.blah> abcd:ID=3243 user login
I want to find unique "ID" from this log...

4. Shell Programming and Scripting

Split file by number of words

Dear all I am trying to divide a file using the number of words as a condition. Alternatively, I would at least like to be able to retrieve the first x words of a given file. Any tips? Thanks in advance.

5. UNIX for Advanced & Expert Users

Count number of unique patterns from a log file

Hello everyone, I need your help in fixing this issue. I have a log file which has data of users logging in to an application. I want to search for a particular pattern in the log: ISSessionValidated=N. If this keyword is found, the above 8 lines will contain the name of the user who's...

6. Shell Programming and Scripting

Finding consecutive same words in a file

Hi All, I tried this but I am having trouble formulating this: I have a file that looks like this (this is a sample file words can be different): network router frame network router computer card host computer card One can see that in this file "network" and "router" occur...

7. Shell Programming and Scripting

Finding my lost file by searching for words in it

Got a question for you guys... I am searching through a public directory (that has tons of files) trying to find a file that I was working on a long time ago. I can't remember what it is called, but I do remember the content. It should contain words like this: Joe Pulvo botnet zeus...

8. Shell Programming and Scripting

problem to count number of words from file

Hi everyone, I have written this simple shell script for counting the number of times a word the user wants to find occurs in a file, but I get several errors when I run it. Can someone tell me the problem?
echo "Enter the file name"
read file
echo "enter word"
read word
for i in `cat $file`
do
if then...

9. Shell Programming and Scripting

How count the number of two words associated with the two words occurring in the file?

Hi, I need to count the number of errors associated with the two words occurring in the file. It's about counting the occurrences of the word "error" where the word "index.js" appears. As such the command should look like. Please kindly help. I was trying: grep "error" log.txt | wc -l

10. Shell Programming and Scripting

Replace all string matches in file with unique random number

Hello Take this file... Test01 Ref test Version 01 Test02 Ref test Version 02 Test66 Ref test Version 66 Test99 Ref test Version 99 I want to substitute every occurrence of Test{2} with a unique random number, so for example, if I was using sed, substitution would be something...

LEARN ABOUT V7

sort

SORT(1) 						      General Commands Manual							   SORT(1)

NAME

       sort - sort or merge files

SYNOPSIS

       sort [ -mubdfinrtx ] [ +pos1  [ -pos2 ] ] ...  [ -o name ] [ -T directory ] [ name ] ...

DESCRIPTION

       Sort sorts lines of all the named files together and writes the result on the standard output.  The name `-' means the standard input.  If
       no input files are named, the standard input is sorted.

       The default sort key is an entire line.  Default ordering is lexicographic by bytes in machine collating sequence.  The ordering is
       affected globally by the following options, one or more of which may appear.

       b    Ignore leading blanks (spaces and tabs) in field comparisons.

       d    `Dictionary' order: only letters, digits and blanks are significant in comparisons.

       f    Fold upper case letters onto lower case.

       i    Ignore characters outside the ASCII range 040-0176 in nonnumeric comparisons.

       n    An initial numeric string, consisting of optional blanks, optional minus sign, and zero or more digits with optional decimal point, is
	    sorted by arithmetic value.  Option n implies option b.

       r    Reverse the sense of comparisons.

       tx   `Tab character' separating fields is x.

       The notation +pos1 -pos2 restricts a sort key to a field beginning at pos1 and ending just before pos2.  Pos1 and pos2 each have the form
       m.n, optionally followed by one or more of the flags bdfinr, where m tells a number of fields to skip from the beginning of the line and n
       tells a number of characters to skip further.  If any flags are present they override all the global ordering options for this key.  If the
       b option is in effect n is counted from the first nonblank in the field; b is attached independently to pos2.  A missing .n means .0; a
       missing -pos2 means the end of the line.  Under the -tx option, fields are strings separated by x; otherwise fields are nonempty nonblank
       strings separated by blanks.

       When  there  are multiple sort keys, later keys are compared only after all earlier keys compare equal.	Lines that otherwise compare equal
       are ordered with all bytes significant.
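
The key notation above can be tried with the later POSIX -k option. This is only a sketch under stated assumptions: the +pos1 -pos2 form was subsequently marked obsolescent and many current sort implementations no longer accept it, so the example uses -k, which counts fields from one instead of counting fields to skip. Under that mapping +1 -2 corresponds to -k2,2 and +2n -3 to -k3,3n. The function name and the sample records are hypothetical.

```shell
# Sort records by department (field 2), then by age as a number (field 3).
# The V7 spelling of the same keys would be: sort +1 -2 +2n -3
sort_by_dept_then_age() {
    # LC_ALL=C gives plain byte ordering, as in the V7 description.
    LC_ALL=C sort -k2,2 -k3,3n
}

# Hypothetical sample data: name, department, age.
printf '%s\n' 'carol sales 41' 'alice sales 29' 'bob eng 35' |
    sort_by_dept_then_age
```

The second key is consulted only for the two "sales" lines, because those are the only lines whose first key compares equal.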

       These option arguments are also understood:

       c    Check that the input file is sorted according to the ordering rules; give no output unless the file is out of sort.

       m    Merge only, the input files are already sorted.

       o    The next argument is the name of an output file to use instead of the standard output.  This file may  be  the  same  as  one  of  the
	    inputs.

       T    The next argument is the name of a directory in which temporary files should be made.

       u    Suppress all but one in each set of equal lines.  Ignored bytes and bytes outside keys do not participate in this comparison.
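
As an illustration of the c and o options, here is a minimal sketch of a check-then-sort helper. The function names are hypothetical, and it assumes a sort that exits with nonzero status when -c finds disorder, as the DIAGNOSTICS section describes.

```shell
# Succeeds if the file named by $1 is already in sorted order.
is_sorted() {
    # sort -c writes no sorted output; the diagnostic is discarded here.
    LC_ALL=C sort -c "$1" 2>/dev/null
}

# Sorts the file named by $1 in place; -o may name one of the inputs.
sort_in_place() {
    LC_ALL=C sort -o "$1" "$1"
}
```

A typical call would be: is_sorted words.txt || sort_in_place words.txt (words.txt being any file of lines).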

       Examples.  Print in alphabetical order all the unique spellings in a list of words.  Capitalized words differ from uncapitalized.

	       sort -u +0f +0 list
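
For the question in the thread title, counting the unique words in a file, one possible approach is to put each word on its own line and let sort -u remove the repeats. This is a sketch, not a definitive answer: it assumes a word is a run of ASCII letters and that case should be ignored, and the function name is hypothetical.

```shell
# Reads text on standard input and prints the number of distinct words.
count_unique_words() {
    # 1. tr -cs: turn every run of non-letters into a single newline.
    # 2. tr: fold upper case to lower so "The" and "the" count once.
    # 3. sed: drop the empty line left when the input starts with a non-letter.
    # 4. sort -u: keep one copy of each word; wc -l counts them.
    # 5. tr -d: strip the padding some wc implementations add.
    LC_ALL=C tr -cs 'A-Za-z' '\n' |
        LC_ALL=C tr 'A-Z' 'a-z' |
        sed '/^$/d' |
        LC_ALL=C sort -u |
        wc -l |
        tr -d ' '
}

# Distinct words here are: the, cat, and, hat, a.
printf 'The cat and the hat.\nA cat!\n' | count_unique_words   # prints 5
```

Removing the case-folding step counts capitalized and uncapitalized spellings separately, which matches the behaviour described for the example above.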

       Print the password file (passwd(5)) sorted by user id number (the 3rd colon-separated field).

	       sort -t: +2n /etc/passwd
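
On a current system the same sort can be written with -k, since +2n may not be accepted there. The sketch below assumes a POSIX sort and uses made-up passwd-style records so that the result does not depend on the local /etc/passwd; the function name is hypothetical.

```shell
# Sort colon-separated records numerically by their third field (the user id).
sort_by_uid() {
    # -t: sets the field separator; -k3,3n restricts the key to field 3
    # and compares it by arithmetic value rather than byte by byte.
    LC_ALL=C sort -t: -k3,3n "$@"
}

# Hypothetical records; with a plain byte comparison 1000 would sort before 99.
printf '%s\n' \
    'alice:x:1000:1000::/home/alice:/bin/sh' \
    'bob:x:99:99::/home/bob:/bin/sh' \
    'root:x:0:0::/root:/bin/sh' \
    'daemon:x:1:1::/:/bin/sh' |
    sort_by_uid
```

Called with a file name, as in sort_by_uid /etc/passwd, it reads that file instead of standard input.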

       Print the first instance of each month in an already sorted file of (month day) entries.  The options -um with just one input file make the
       choice of a unique representative from a set of equal lines predictable.

	       sort -um +0 -1 dates
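
A rough modern equivalent, assuming a POSIX sort, replaces +0 -1 with -k1,1. Which line of each group survives is up to the implementation; GNU sort documents that it keeps the first line of each equal run, which agrees with the description above. The function name and the dates are hypothetical.

```shell
# Reads (month day) lines already sorted by month and keeps one line per month.
first_per_month() {
    # -m: merge only, the input is already sorted
    # -u: suppress all but one line in each set with equal keys
    # -k1,1: the key is the first blank-separated field (the month)
    LC_ALL=C sort -um -k1,1
}

printf '%s\n' 'Apr 3' 'Apr 17' 'Jan 5' 'Jan 9' | first_per_month
```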

FILES

       /usr/tmp/stm*, /tmp/*: first and second tries for temporary files

SEE ALSO

       uniq(1), comm(1), rev(1), join(1)

DIAGNOSTICS

       Comments and exits with nonzero status for various trouble conditions and for disorder discovered under option -c.

BUGS

       Very long lines are silently truncated.

																	   SORT(1)
