Frequency of Words in a File, sed script from 1980 (Post 302977050 by Don Cragun on Monday 11th of July 2016 03:38:20 PM)
Quote:
Originally Posted by cfajohnson
Where do you think tr is getting its input?
Good point. Any one of the following three commands has a better chance of working, because each one gives the first tr its input from book7.txt:
Code:
{ tr -cs A-Za-z\' '\n' | tr A-Z a-z | sort | uniq -c | sort -k1,1nr -k2 | head -n "${1:-25}"
} < book7.txt

or:
Code:
(tr -cs A-Za-z\' '\n' | tr A-Z a-z | sort | uniq -c | sort -k1,1nr -k2 | head -n "${1:-25}") < book7.txt

or:
Code:
tr -cs A-Za-z\' '\n' < book7.txt | tr A-Z a-z | sort | uniq -c | sort -k1,1nr -k2 | head -n "${1:-25}"
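
For anyone tracing what each stage contributes, here is a minimal sketch of the third form wrapped in a function and run against a small throwaway sample instead of book7.txt. The function name top_words, its two arguments, the sample sentence, and the temporary file are all assumptions made for illustration; none of them come from the thread.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): $1 = input file, $2 = number of words to show (default 25).
top_words() {
    tr -cs "A-Za-z'" '\n' < "$1" |   # replace each run of characters other than letters and apostrophes with one newline
        tr A-Z a-z |                  # fold everything to lowercase
        sort |                        # bring identical words together
        uniq -c |                     # prefix each distinct word with its count
        sort -k1,1nr -k2 |            # highest count first, ties broken alphabetically
        head -n "${2:-25}"            # keep only the requested number of lines
}

# Illustrative sample input (an assumption, standing in for book7.txt).
sample=$(mktemp)
printf 'The cat saw the dog. The dog ran.\n' > "$sample"
top_words "$sample" 3
rm -f "$sample"
```

With that sample the three lines printed should be for the (3), dog (2), and cat (1); the width of the count column depends on the local uniq.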

spell(1)						      General Commands Manual							  spell(1)

Name
       spell, spellin, spellout - check text for spelling errors

Syntax
       spell [-v] [-b] [-x] [-d hlist] [+local-file] [-s hstop] [-h spellhist] [file...]

       spellin [list]

       spellout [-d] list

Description
       The spell command collects words from the named documents and looks them up in a spelling list.  Words that are not on the spelling
       list and are not derivable from words on the list (by applying certain inflections, prefixes, or suffixes) are printed on the standard
       output.  If no files are specified, words are collected from the standard input.
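
The lookup described above can be approximated with standard tools. This is only a rough sketch under stated assumptions, not how spell is implemented: it compares against a plain lowercase word list rather than the hashed lists, it applies no inflection, prefix, or suffix derivation, and the function name unknown_words and the sample files are hypothetical.

```shell
#!/bin/sh
# Hypothetical approximation (not spell itself): print the words in document $1
# that do not appear in the plain, lowercase, one-word-per-line list $2.
unknown_words() {
    uw_list=$(mktemp)
    LC_ALL=C sort -u "$2" > "$uw_list"   # comm requires both inputs sorted the same way
    tr -cs 'A-Za-z' '\n' < "$1" |        # one word per line
        tr A-Z a-z |                     # fold to lowercase
        sed '/^$/d' |                    # drop the empty line left by leading punctuation
        LC_ALL=C sort -u |               # unique words, sorted
        LC_ALL=C comm -23 - "$uw_list"   # keep words found only in the document
    rm -f "$uw_list"
}

# Illustrative sample files (assumptions).
doc=$(mktemp); words=$(mktemp)
printf 'Thier cat sat on the mat.\n' > "$doc"
printf 'cat\nmat\non\nsat\nthe\n' > "$words"
unknown_words "$doc" "$words"    # prints: thier
rm -f "$doc" "$words"
```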

       The spell command ignores most troff, tbl, and eqn constructions.

       Two routines help maintain the hash lists used by spell.  Both expect a set of words, one per line, from the standard input.  The
       spellin command combines the words from the standard input and the preexisting list file and places a new list on the standard
       output.  If no list file is specified, a new list is generated.  The spellout command looks up each word from the standard input and
       prints on the standard output those that are missing from (or present on, with option -d) the hashed list file.  For example, to
       verify that hookey is not on the default spelling list, add it to your own private list, and then use it with spell:
       echo hookey | spellout /usr/dict/hlista
       echo hookey | spellin /usr/dict/hlista > myhlist
       spell -d myhlist <filename>

Options
       -v             Displays words not found in spelling list with all plausible derivations from spelling list.

       -b             Checks data according to British spelling.  Besides preferring centre, colour, speciality, travelled, this option insists
                      upon -ise instead of -ize in words like standardise.

       -x             Precedes each word with an equal sign (=) and displays all plausible derivations.

       -d hlist       Specifies the file used for the spelling list.

       -h spellhist   Specifies the file used as the history file.

       -s hstop       Specifies the file used for the stop list.

       +local-file    Removes words found in local-file from the output of the command.  The argument local-file is the name of a file provided by
                      the user that contains a sorted list of words, one per line.  With this option, the user can specify a list of words for a
                      particular job that are spelled correctly.

       The auxiliary files used for the spelling list, stop list, and history file may be specified by arguments following the -d, -s, and -h
       options.  The default files are indicated below.  Copies of all output may be accumulated in the history file.  The stop list filters
       out misspellings (for example, thier=thy-y+ier) that would otherwise pass.

Restrictions
       The coverage of the spelling list is uneven; new installations will probably wish to monitor the output for several months to gather  local
       additions.

       The spell command works only with ASCII text files.

Files
       /usr/dict/hlist[ab]   hashed spelling lists, American & British, default for -d
       /usr/dict/hstop       hashed stop list, default for -s
       /dev/null             history file, default for -h
       /tmp/spell.$$*        temporary files
       /usr/lib/spell

See Also
       deroff(1), sed(1), sort(1), tee(1)
