Creating Frequency of words from a file by accessing a corpus Post: 302836275

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Creating String from words in a file

Hi i have a file called search.txt Which contains text like Car Bus Cat Dog Now i have to create a string from the file which should look like Car,Bus,Cat,Dog ( appending , is essential part) String must be stored in some variable so i can pass it as argument to some other...

2. Shell Programming and Scripting

Splitting Concatenated Words in Input File with Words from a Master File

Hello, I have a complex problem. I have a file in which words have been joined together: Theboy ranslowly I want to be able to correctly split the words using a lookup file in which all the words occur: the boy ran slowly slow put child ly The lookup file which is meant for look up...

3. Shell Programming and Scripting

count frequency of words in a file

I need to write a shell script "cmn" that, given an integer k, prints the k most common words in descending order of frequency. Example Usage: user@ubuntu:/$ cmn 4 < example.txt

4. Shell Programming and Scripting

Splitting concatenated words in input file with words from the same file

Dear all, I am working with names and I have a large file of names in which some words are written together (upto 4 or 5) and their corresponding single forms are also present in the word-list. An example would make this clear annamarie mariechristine johnsmith johnjoseph smith john smith...

5. Shell Programming and Scripting

Assigning the same frequency to more than one words in a file

I have a file of names with the following structure NAME FREQUENCY NAME NAME FREQUENCY NAME NAME NAME FREQUENCY i.e. more than one name is assigned the same frequency. An example will make this clear SANDHYA DAS 6901 ARATI DAS 6201 KALPANA DAS 4714 GITA DAS 4550 BISWANATH DAS 3949...

6. Shell Programming and Scripting

How count the number of two words associated with the two words occurring in the file?

Hi , I need to count the number of errors associated with the two words occurring in the file. It's about counting the occurrences of the word "error" for where is the word "index.js". As such the command should look like. Please kindly help. I was trying: grep "error" log.txt | wc -l

7. UNIX for Dummies Questions & Answers

Replace the words in the file to the words that user type?

Hello, I would like to change my setting in a file to the setting that user input. For example, by default it is ONBOOT=ON When user key in "YES", it would be ONBOOT=YES -------------- This code only adds in the entire user input, but didn't replace it. How do i go about...

8. Shell Programming and Scripting

Frequency of Words in a File, sed script from 1980

tr -cs A-Za-z\' '\n' | tr A-Z a-z | sort | uniq -c | sort -k1,1nr -k2 | sed ${1:-25} < book7.txt This is not my script, it can be found way back from 1980 but once it worked fine to give me the most used words in a text file. Now the shell is complaining about an error in sed sed: -e...

9. HP-UX

Problems creating and accessing with user

Hi, I have created the user 'mastersa' in several servers. I need to change the user ID to '0'. However, after doing this, I am not able to login (Access denied). Even after I change the password, I still get this error. Why is this? Also, when I attempt to delete the user account, I get...

10. Shell Programming and Scripting

Replace particular words in file based on if finds another words in that line

Hi All, I need one help to replace particular words in file based on if finds another words in that file . i.e. my self is peter@king. i am staying at north sydney. we all are peter@king. How to replace peter to sham if it finds @king in any line of that file. Please help me...

LEARN ABOUT HPUX

adjust

adjust(1)						      General Commands Manual							 adjust(1)

NAME

       adjust - simple text formatter

SYNOPSIS

       adjust [-b] [-c|-j|-r] [-m column] [-t tabsize] [files]...

DESCRIPTION

       The adjust command is a simple text formatter for filling, centering, left and right justifying, or only right justifying text paragraphs, and is
       designed for interactive use.  It reads the concatenation of input files (or standard input if none are given) and produces on standard
       output a formatted version of its input, with each paragraph formatted separately.  If - is given as an input filename, adjust reads standard
       input at that point (use -- as an argument to separate - from options).

       adjust reads text from input lines as a series of words separated by space characters, tabs, or newlines.  Text lines are grouped into paragraphs
       separated by blank lines.  By default, text is copied directly to the output, subject only to simple filling (see below) with a right
       margin of 72, and leading spaces are converted to tabs where possible.

   Options
       The command recognizes the following command-line options:

       -b             Do not convert leading space characters to tabs on output
                      (output contains no tabs, even if there were tabs in input).

       -c             Center text on each line.
                      Lines are pre- and post-processed, but no filling is performed.

       -j             Justify text.
                      After filling, insert spaces in each line as needed to right justify it (except in the last line of each paragraph) while
                      keeping the justified left margin.

       -r             After filling text, adjust the indentation of each line for a smooth right
                      margin (ragged left margin).

       -m column      Set the right fill margin to the given column number, instead of 72.
                      Text is filled, and optionally right justified, so that no output line extends beyond this column (if possible).  If -m0 is
                      given, the current right margin of the first line of each paragraph is used for that and all subsequent lines in the
                      paragraph.

                      By default, text is centered on column 40.  With -c, the -m option sets the middle column of the centering "window", but
                      -m0 auto-sets the right side as before (which then determines the center of the "window").

       -t tabsize     Set the tab size to other than the default (eight columns).

       Only one of the -c, -j, and -r options is allowed in a single command line.
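
       The centering behavior described above can be illustrated with a minimal Python sketch.  This is not the HP implementation: the
       function name center_line is hypothetical, the rounding of the midpoint is an assumption, and only the default center column of 40
       and the rule that lines too wide to center begin in column 1 come from this page.

```python
def center_line(line: str, middle: int = 40) -> str:
    """Center the text of `line` around column `middle` (1-based).

    Assumption: the midpoint is rounded down; the real command may round
    differently.
    """
    text = line.strip()
    # Number of leading spaces that puts the text's midpoint near `middle`.
    pad = middle - len(text) // 2
    # Lines too wide to center begin in column 1 (no leading spaces).
    return " " * max(pad, 0) + text


if __name__ == "__main__":
    print(center_line("a short title"))
```

       Passing a different value for middle mimics moving the center of the centering "window".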

   Details
       Before doing anything else to a line of input text, adjust first handles backspaces, rubbing out preceding characters in the usual way.  Next, it
       ignores all nonprintable characters except tab.  It then expands all tabs to spaces.
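
       The three preprocessing steps above (backspaces, nonprintable characters, tab expansion) can be sketched in Python as follows.  The
       function name preprocess is hypothetical, and Python's own notion of "printable" stands in for whatever classification the real
       command uses in the current locale.

```python
def preprocess(line: str, tabsize: int = 8) -> str:
    """Apply the three input-cleaning steps in the order the page lists them."""
    # Step 1: a backspace rubs out the preceding character, if there is one.
    kept = []
    for ch in line:
        if ch == "\b":
            if kept:
                kept.pop()
        else:
            kept.append(ch)
    # Step 2: drop nonprintable characters, but keep tabs for step 3.
    cleaned = "".join(c for c in kept if c == "\t" or c.isprintable())
    # Step 3: expand tabs to spaces using the given tab size.
    return cleaned.expandtabs(tabsize)


if __name__ == "__main__":
    print(repr(preprocess("abc\b\bxy\tz")))
```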

       For simple text filling, the first word of the first line of each paragraph is indented the same amount as in the input line.  Each word is
       then carried to the output followed by one space.  "Words" ending in terminal_character[quote][closing_character] are followed by two
       spaces, where terminal_character is any of ., :, ?, or !; quote is a single closing quote (') or double-quote character ("); and
       closing_character is any of ), ], or }.  Here are some examples:

            end.  of?  sentence.'  sorts!"  of.)  words?"]

       (adjust does not place two spaces after a pair of single closing quotes ('') following a terminal_character.)

       adjust starts a new output line whenever adding a word (other than the first one) to the current line would exceed the right margin.
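
       The filling rules above can be sketched in Python.  This is only an illustration under stated assumptions: the name fill is
       hypothetical, the terminal characters are assumed to be . : ? and !, the quote characters ' and ", and the closing characters ) ] and },
       and paragraph indentation and tagged paragraphs are left out entirely.

```python
import re

# Assumed pattern for a sentence-ending "word":
# terminal_character, optional closing quote, optional closing bracket.
SENTENCE_END = re.compile(r"""[.:?!]['"]?[)\]}]?$""")


def fill(text: str, margin: int = 72) -> list[str]:
    """Greedy fill: start a new line when the next word would pass `margin`."""
    lines: list[str] = []
    current = ""
    for word in text.split():
        if not current:
            # The first word always goes on the line, even if it is too long.
            current = word
            continue
        # Two spaces after a sentence-ending word, otherwise one.
        sep = "  " if SENTENCE_END.search(current) else " "
        if len(current) + len(sep) + len(word) > margin:
            lines.append(current)
            current = word
        else:
            current += sep + word
    if current:
        lines.append(current)
    return lines


if __name__ == "__main__":
    for out in fill("Is it? Yes.) Good words follow here.", margin=20):
        print(out)
```

       A word longer than the margin simply ends up on a line by itself in this sketch, which matches the behavior described later on this
       page for oversized words.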

       adjust understands indented first lines of paragraphs (such as this one) when filling.  The second and subsequent lines of each paragraph are
       indented the same amount as the second line of the input paragraph if there is a second line, else the same as the first line.

       adjust also has a rudimentary understanding of tagged paragraphs
                 (such as this one) when filling.  If the second line of a paragraph is indented more than the first, and the first line has a
                 word beginning at the same indentation as the second line, the input column position of the tag word or words (prior to the one
                 matching the second line indentation) is preserved.

       Tag  words  are	passed	through  without  change of column position, even if they extend beyond the right margin.  The rest of the line is
       filled or right justified from the position of the first nontag word.

       When -j is given, adjust uses an intelligent algorithm to insert spaces in output lines where they are most needed, until the lines extend to the
       right margin.  First, all one-space word separators are examined.  One space is added to each separator, starting with the one having the
       most letters between it and the preceding and following separators, until the modified line reaches the right margin.  If all one-space
       separators are increased to two spaces and more spaces must be inserted, the algorithm is repeated with two-space separators, and so on.
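
       One reading of the space-insertion algorithm above, as a Python sketch.  The name justify is hypothetical, every separator is assumed
       to start as a single space, and "most letters between it and the preceding and following separators" is interpreted here as the
       combined length of the two words on either side of a separator; the real command may break ties or count differently.

```python
def justify(words: list[str], margin: int) -> str:
    """Widen word separators until the line reaches `margin` columns."""
    if len(words) < 2:
        return " ".join(words)
    gaps = [1] * (len(words) - 1)  # assumed: all separators start at one space
    need = margin - (sum(len(w) for w in words) + sum(gaps))
    width = 1
    while need > 0:
        # Visit separators that are currently `width` spaces wide, starting
        # with the one surrounded by the most letters.
        order = sorted(
            (i for i, g in enumerate(gaps) if g == width),
            key=lambda i: len(words[i]) + len(words[i + 1]),
            reverse=True,
        )
        for i in order:
            if need == 0:
                break
            gaps[i] += 1
            need -= 1
        # All separators of this width are used up; repeat one width higher.
        width += 1
    out = words[0]
    for gap, word in zip(gaps, words[1:]):
        out += " " * gap + word
    return out


if __name__ == "__main__":
    print(repr(justify(["a", "bb", "cccc", "d"], 12)))
```

       A line that is already at or beyond the margin is returned unchanged apart from single-space joining.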

       Output line indentation is held to one less than the right margin.  If a single word is larger than the line size (right margin minus
       indentation), that word appears on a line by itself, properly indented, and extends beyond the right margin.  However, if -r is used, such
       words are still right justified, if possible.

       If the current locale defines class names ekinsoku and bkinsoku (see iswctype(3C)), adjust formats the text in accordance with the ekinsoku and
       bkinsoku character classification and margin settings (see the -r, -j, and -m options).

EXTERNAL INFLUENCES

   Environment Variables
       LANG provides a default value for the internationalization variables that are unset or null.  If LANG is unset or null, the default value of "C"
       (see lang(5)) is used.  If any of the internationalization variables contains an invalid setting, adjust will behave as if all
       internationalization variables are set to "C".  See environ(5).

       LC_ALL, if set to a nonempty string value, overrides the values of all the other internationalization variables.

       LC_CTYPE determines the interpretation of text as single and/or multi-byte characters, the classification of characters as printable, and the
       characters matched by character class expressions in regular expressions.

       LC_MESSAGES determines the locale that should be used to affect the format and contents of diagnostic messages written to standard error and
       informative messages written to standard output.

       NLSPATH determines the location of message catalogs for the processing of LC_MESSAGES.

   International Code Set Support
       Single- and multi-byte character code sets are supported.

DIAGNOSTICS

       adjust complains to standard error and later returns a nonzero value if any input file cannot be opened (it skips the file).  It does the same
       (but quits immediately) if the argument to -m or -t is out of range, or if the program is improperly invoked.

       Input lines longer than BUFSIZ are silently split (before tab expansion) or truncated (afterwards).  Lines that are too wide to center begin in
       column 1 (no leading spaces).

EXAMPLES

       This command is useful for filtering text while in vi(1).  For example,

            !}adjust

       reformats the rest of the current paragraph (from the current line down), evening the lines.

       The vi command:

            :map ^X {!}adjust -j^V^M

       (where ^ denotes control characters) sets up a useful "finger macro".  Typing ^X (Ctrl-X) reformats the entire current paragraph.

            adjust -m1 file1

       is a simple way to break text into separate words without whitespace, except for tagged-paragraph tags.

WARNINGS

       This program is designed to be simple and fast.	It does not recognize backslash to escape whitespace or other  characters.   It  does  not
       recognize  tagged  paragraphs  where the tag is on a line by itself.  It knows that lines end in newline or null, and how to deal with tabs
       and backspaces, but it does not do anything special with other characters such as form feed (they are simply ignored).  For complex  opera-
       tions, standard text processors are likely to be more appropriate.

       This program could be implemented instead as a set of independent programs, fill, center, and justify (with the -r option).  However, this
       would be much less efficient in actual use, especially given the program's special knowledge of tagged paragraphs and last lines of
       paragraphs.

AUTHOR

       adjust was developed by HP.

SEE ALSO

       nroff(1).

																	 adjust(1)
