Creating Frequency of words from a file by accessing a corpus
Hello,
I have a large file of syllables/strings in Urdu. Each string is on a separate line.
Example in English:
I need to find the frequency of each of these strings in a large corpus (which unfortunately I cannot attach because of size limitations).
Is there a Perl or awk script which can do the job?
Many thanks for your help
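One possible approach, as a minimal sketch: load the list of strings into an awk array, then walk the corpus token by token and count the tokens that are in the list. The function name count_freq and the file names are made up for illustration. It assumes the corpus is whitespace-separated and that a string only counts when it matches a whole token; counting a syllable inside longer words would need a different (substring) approach.

```shell
# count_freq WORDLIST CORPUS   (hypothetical helper name)
# Prints "count<TAB>string" for every line of WORDLIST, in list order.
# Assumption: a match means a whole whitespace-delimited token of CORPUS.
count_freq() {
    awk '
        # First file: remember every non-empty string, keeping list order.
        NR == FNR {
            if ($0 != "" && !($0 in want)) {
                want[$0] = 1
                order[++n] = $0
            }
            next
        }
        # Second file: count each token that appears in the list.
        {
            for (i = 1; i <= NF; i++)
                if ($i in want)
                    count[$i]++
        }
        # Strings never seen in the corpus are reported with a count of 0.
        END {
            for (j = 1; j <= n; j++)
                printf "%d\t%s\n", count[order[j]], order[j]
        }
    ' "$1" "$2"
}

# Example call (file names are placeholders):
#   count_freq strings.txt corpus.txt > frequencies.txt
```

Because only the list is held in memory and the corpus is read once, this should cope with a large corpus. With UTF-8 Urdu text the comparison is done on the raw bytes, so the list and the corpus need to use the same encoding.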
Hi, I have a file called search.txt
which contains text like
Car
Bus
Cat
Dog
Now I have to create a string from the file which should look like
Car,Bus,Cat,Dog
(Appending the comma is an essential part.) The string must be stored in some variable so I can pass it as an argument to some other... (5 Replies)
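A minimal sketch of one way to do this with paste, which can join all lines of a file with a chosen delimiter. The function name join_lines is made up for illustration.

```shell
# join_lines FILE   (hypothetical helper name)
# Prints all lines of FILE on one line, separated by commas.
join_lines() {
    paste -s -d , "$1"
}

# Example: store the result in a variable and pass it on.
#   list=$(join_lines search.txt)
#   some_command "$list"
```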
Hello,
I have a complex problem. I have a file in which words have been joined together:
Theboy ranslowly
I want to be able to correctly split the words using a lookup file in which all the words occur:
the
boy
ran
slowly
slow
put
child
ly
The lookup file which is meant for look up... (21 Replies)
I need to write a shell script "cmn" that, given an integer k, prints the k most common words in descending order of frequency.
Example Usage:
user@ubuntu:/$ cmn 4 < example.txt (3 Replies)
Dear all,
I am working with names and I have a large file of names in which some words are written together (up to 4 or 5) and their corresponding single forms are also present in the word-list.
An example would make this clear
annamarie
mariechristine
johnsmith
johnjoseph smith
john
smith... (8 Replies)
I have a file of names with the following structure
NAME FREQUENCY
NAME NAME FREQUENCY
NAME NAME NAME FREQUENCY
i.e. more than one name is assigned the same frequency. An example will make this clear
SANDHYA DAS 6901
ARATI DAS 6201
KALPANA DAS 4714
GITA DAS 4550
BISWANATH DAS 3949... (4 Replies)
Hi,
I need to count the errors associated with two words occurring together in the file: that is, count the occurrences of the word "error" where the word "index.js" also appears. What should the command look like? Please kindly help. I was trying: grep "error" log.txt | wc -l (1 Reply)
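A minimal sketch, assuming the goal is to count the lines of the log that contain both "error" and a second fixed string such as "index.js". The function name count_errors_for is made up for illustration. Note that grep -c counts matching lines, not individual occurrences within a line.

```shell
# count_errors_for LOGFILE STRING   (hypothetical helper name)
# Prints the number of lines in LOGFILE that contain both STRING
# (matched literally, so the dot in index.js is not a wildcard)
# and the word "error".
count_errors_for() {
    grep -F -- "$2" "$1" | grep -c "error"
}

# Example:
#   count_errors_for log.txt index.js
```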
Hello,
I would like to change a setting in a file to the value that the user inputs.
For example, by default it is
ONBOOT=ON
When the user keys in "YES", it would be
ONBOOT=YES
--------------
This code only adds the entire user input, but doesn't replace the existing value.
How do I go about... (5 Replies)
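A minimal sketch of replacing the value instead of appending it: rewrite any line that starts with ONBOOT= and copy every other line unchanged. The function name set_onboot is made up for illustration, and the new value is taken as an argument here rather than read from the keyboard. It assumes a simple value with no backslashes in it.

```shell
# set_onboot FILE VALUE   (hypothetical helper name)
# Rewrites every line of FILE that starts with "ONBOOT=" so that it
# reads "ONBOOT=VALUE"; all other lines are kept as they are.
set_onboot() {
    tmp=$(mktemp) || return 1
    awk -v val="$2" '
        /^ONBOOT=/ { print "ONBOOT=" val; next }
        { print }
    ' "$1" > "$tmp" && cat "$tmp" > "$1"
    rm -f "$tmp"
}

# Example, reading the value from the user first:
#   printf 'Enter ONBOOT value: '; read answer
#   set_onboot ifcfg-file "$answer"
```

Writing to a temporary file and copying it back avoids sed -i, whose syntax differs between GNU and BSD systems.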
tr -cs A-Za-z\' '\n' | tr A-Z a-z | sort | uniq -c | sort -k1,1nr -k2 | sed ${1:-25} < book7.txt
This is not my script; it can be found as far back as 1980, and it once worked fine to give me the most used words in a text file.
Now the shell is complaining about an error in sed
sed: -e... (5 Replies)
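A likely cause, offered as a guess since the error message is cut off: with the default, the last stage expands to "sed 25", and a bare line number is not a complete sed command. The classic form of this pipeline ends in sed with a "q" (quit) after the number, and the input redirection normally belongs to the first tr rather than to sed. A minimal sketch under those assumptions, with the made-up function name top_words:

```shell
# top_words [N]   (hypothetical helper name)
# Reads text on standard input and prints the N most frequent words
# (default 25), most frequent first, as produced by uniq -c.
top_words() {
    tr -cs "A-Za-z'" '\n' |      # one word per line
        tr 'A-Z' 'a-z' |         # fold to lower case
        sort |
        uniq -c |                # prefix each word with its count
        sort -k1,1nr -k2 |       # by count descending, then by word
        sed "${1:-25}q"          # print the first N lines, then quit
}

# Example:
#   top_words 10 < book7.txt
```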
Hi,
I have created the user 'mastersa' on several servers.
I need to change the user ID to '0'. However, after doing this, I am not able to log in (Access denied).
Even after I change the password, I still get this error.
Why is this?
Also, when I attempt to delete the user account, I get... (5 Replies)
Hi All,
I need help replacing particular words in a file when another word is found in that file.
i.e.
my self is peter@king.
i am staying at north sydney.
we all are peter@king.
How can I replace peter with sham if @king is found in any line of that file?
Please help me... (8 Replies)
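A minimal sketch, assuming the replacement should happen only on the lines that themselves contain @king. sed accepts an address pattern in front of the substitute command for exactly this. The function name replace_on_king_lines is made up for illustration.

```shell
# replace_on_king_lines FILE   (hypothetical helper name)
# Prints FILE with "peter" changed to "sham", but only on lines
# that contain "@king"; all other lines are printed unchanged.
replace_on_king_lines() {
    sed '/@king/s/peter/sham/g' "$1"
}

# Example:
#   replace_on_king_lines input.txt > output.txt
```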
Discussion started by: Rajib Podder
dpbindic
DPBINDIC(1) General Commands Manual DPBINDIC(1)
NAME
dpbindic - Convert a binary-form dictionary into a text-form dictionary
SYNOPSIS
dpbindic [ -xiu [ frequency ] ] binary-file [ text-file ]
DESCRIPTION
dpbindic outputs the file information of the binary-form dictionary file specified in binary-file . At this time, the word information of
the dictionary can be output in text form to the standard output. To do so, use text-file to specify the text-form dictionary used as the
source of binary-form dictionary file. If this specification is omitted, the text dictionary file information in the binary dictionary
file will be output. The standard grammar file name is /usr/local/canna/lib/dic/hyoujun.gram. It will be used if the grammar file name
specification is omitted. The output format of word information data is specified using an option.
OPTIONS
-x Outputs the data without using omission symbol @, which is used when the initial word represents the reading.
-i Replaces the reading and word for output.
-u Outputs the candidates used in conversion. Outputs all candidates having frequency or more. If frequency is omitted, all candidates
having frequency 1 will be output.
EXAMPLES
(1) If the text-form dictionary file name is omitted:
%dpbindic iroha.cbd
(Text dictionary file name = Directory size + Word size, packed)
iroha.swd = 2985 + 5306 pak a4
iroha.mwd = 36276 + 113139 pak a4
(2) If the text-form dictionary file name iroha.mwd is specified:
%dpbindic iroha.cbd iroha.mwd
(Text dictionary file name = Directory size + Word size, packed)
iroha.mwd = 36276 + 113139 pak a4
SEE ALSO
mkbindic(1), dicar(1)
DPBINDIC(1)