Wed, 14 May 2008 08:00:00 GMT
Soothsayer is a predictive text input system. Many folks reading that sentence will think of the word completion offered by mobile phones. Soothsayer is different from such mobile phone systems in that it tries to use context and other statistical information to offer predictions instead of just presenting a list of words that might match the first few letters you type.
Hi all,
I use the bash shell.
I have a file, INPUT.txt, like the one shown below.
The number of columns is not the same in every row:
some rows have 12 columns and some have 11
(see the last column).
INPUT.txt
Code Gender Age Grade Dialect Session Sentence Start End Length Phonemic Phonetic
63 M 27 BS/BA TEHRANI 3 4 298320 310050... (2 Replies)
I need to search a string for some specific text which is no big deal using grep. My problem is when the search fails to find the text. I need to add text like "na" when my search does not match.
I have tried this command but it does not work when I put the command in a loop in a bash script:
... (12 Replies)
We have been getting the following diagela error messages every half hour from our P6 P520 AIX server since accidentally configuring both HMC ports on the FSP with the same IP address a month ago:
B1A38B24: External environment Predictive Error, general. Refer to the
system... (0 Replies)
MMSEG(1) User Contributed Perl Documentation MMSEG(1)
NAME
mmseg - segment Chinese text using maximum matching
SYNOPSIS
mmseg -d dict_file [option]... [corpus_file]...
DESCRIPTION
mmseg is a tool for segmenting Chinese text into words using the maximum matching algorithm. mmseg segments corpus_file, or standard input if
no file name is specified, and writes the segmented result to standard output.
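As an illustration of the technique named above, here is a minimal sketch of greedy forward maximum matching in Python. It is not mmseg's implementation: the function name max_match, the in-memory set used as the lexicon, and the single-character fallback for unknown text are all assumptions made for the example.

```python
def max_match(text, lexicon):
    """Greedy forward maximum matching (illustrative sketch, not mmseg's code).

    At each position, take the longest lexicon word that starts there.
    Assumption: a character that starts no lexicon word is emitted on its own.
    """
    max_len = max((len(w) for w in lexicon), default=1)
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until a word matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in lexicon:
                break
        else:
            # No lexicon word starts here: fall back to one character.
            candidate = text[i]
        words.append(candidate)
        i += len(candidate)
    return words


if __name__ == "__main__":
    lexicon = {"研究", "研究生", "生命", "命", "的", "起源"}
    print(" ".join(max_match("研究生命的起源", lexicon)))
```

Because the match is greedy, the sample input is split as 研究生 / 命 / 的 / 起源 even though 研究 / 生命 / 的 / 起源 is also possible with the same lexicon.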
OPTIONS
-d dict_file
Use dict_file as lexicon. A default lexicon can be found at /usr/share/sunpinyin-slm/dict.utf8.
-f, --format (text|bin)
Output format, either 'text' or 'bin'. Default is 'bin'. In text mode, the word text is output; in binary mode, the word ids are
written to stdout as binary short integers.
-s, --stok STOK_ID
Sentence token id. Default 10. It will be written to output in binary mode after every sentence.
-i, --show-id
Show id information. In text mode, attach the id after each known word. In binary mode, print the id(s) as text.
-a, --ambiguious-id AMBI-ID
Ambiguous means that ABC can be segmented as either A BC or AB C. If specified (AMBI-ID != 0), the sequence ABC will not be segmented:
in binary mode, the AMBI-ID is written out; in text mode, "<ambi>ABC</ambi>" is output. Default is 0.
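The A BC versus AB C case can be sketched as a crossing-ambiguity check: after the longest match is chosen, look for a lexicon word that starts inside it and runs past its end. The Python below is a sketch of that idea only. Whether mmseg detects ambiguity this way is an assumption, and the names longest_match, crossing_end and segment_marking_ambiguity are hypothetical.

```python
def longest_match(text, i, lexicon, max_len):
    """Length of the longest lexicon word starting at i (1 if none matches)."""
    for size in range(min(max_len, len(text) - i), 0, -1):
        if text[i:i + size] in lexicon:
            return size
    return 1  # assumption: an unknown character is a one-character token


def crossing_end(text, i, size, lexicon, max_len):
    """End index of the ambiguous span starting at i, or None if unambiguous.

    A span is treated as ambiguous when a lexicon word of two or more
    characters starts inside the current match and extends beyond its end.
    """
    end = furthest = i + size
    j = i + 1
    while j < furthest:
        for n in range(min(max_len, len(text) - j), 1, -1):
            if j + n > furthest and text[j:j + n] in lexicon:
                furthest = j + n  # the span grows to cover the crossing word
                break
        j += 1
    return furthest if furthest > end else None


def segment_marking_ambiguity(text, lexicon):
    """Greedy segmentation that leaves ambiguous spans unsegmented and tagged."""
    max_len = max((len(w) for w in lexicon), default=1)
    out, i = [], 0
    while i < len(text):
        size = longest_match(text, i, lexicon, max_len)
        amb_end = crossing_end(text, i, size, lexicon, max_len)
        if amb_end is not None:
            out.append("<ambi>" + text[i:amb_end] + "</ambi>")
            i = amb_end
        else:
            out.append(text[i:i + size])
            i += size
    return out


if __name__ == "__main__":
    lexicon = {"研究", "研究生", "生命", "命", "的", "起源"}
    print(" ".join(segment_marking_ambiguity("研究生命的起源", lexicon)))
```

In this sketch 研究生 and 生命 overlap on 生, so the span 研究生命 is emitted unsegmented inside the ambi tags, in the style of the text-mode output described above.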
NOTES
Under binary mode, consecutive ids of 0 are merged into a single 0. Under text mode, no spaces are inserted between unknown words.
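The two rules in this note can be sketched in Python as follows. This is an illustration under assumptions, not mmseg's code: it treats id 0 as the id of unknown words, assumes known words are separated by single spaces in text mode, and uses the hypothetical helper names merge_unknown_ids and join_text.

```python
from itertools import groupby


def merge_unknown_ids(ids, unknown_id=0):
    """Collapse each run of consecutive unknown ids into a single id."""
    merged = []
    for value, run in groupby(ids):
        if value == unknown_id:
            merged.append(value)  # keep one 0 for the whole run
        else:
            merged.extend(run)    # other ids are kept as they are
    return merged


def join_text(tokens):
    """Join (text, is_known) pairs, with no space between adjacent unknown tokens.

    Assumption: every other pair of neighbours is separated by one space.
    """
    out = []
    prev_known = None
    for text, known in tokens:
        if out and (known or prev_known):
            out.append(" ")
        out.append(text)
        prev_known = known
    return "".join(out)


if __name__ == "__main__":
    print(merge_unknown_ids([5, 0, 0, 0, 7, 7, 0]))
    print(join_text([("你好", True), ("x", False), ("y", False), ("世界", True)]))
```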
AUTHOR
Originally written by Phill.Zhang <phill.zhang@sun.com>. Currently maintained by Kov.Chai <tchaikov@gmail.com>.
SEE ALSO
slmseg(1), ids2ngram(1).
perl v5.14.2 2012-06-09 MMSEG(1)