This is another technique:
Producing:
All the strings are placed into a single scalar, separated by newlines, and the scalar is then split into the array.
Because there is some processing, you could also place quotes around the strings later, saving some keystrokes on entry. However, then you would lose the flexibility of placing any characters of your choice around the strings -- I used both single and double quotes above, for example.
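As a rough illustration of the idea described here (not the code from this post), a minimal Python sketch could look like the following. The sample strings and the names `block`, `sentences`, `bare` and `quoted` are made up for the example.

```python
# Hypothetical data: one string per line, with the quotes kept exactly as typed.
block = """\
'single-quoted sentence'
"double-quoted sentence"
plain sentence
"""

# Split the one newline-separated string into a list, one item per line.
sentences = block.splitlines()

# Alternative: type the strings bare and add the quotes afterwards.
# This saves keystrokes, but every item then gets the same kind of quote.
bare = "first sentence\nsecond sentence".splitlines()
quoted = ['"' + s + '"' for s in bare]

for item in sentences + quoted:
    print(item)
```

Typing the quotes by hand keeps the freedom to mix quote styles per line; adding them in the processing step trades that freedom for less typing.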
This is convenient if your sentences are short. If you had long ones, you'd need to invent a continuation character so that lines could be joined.

cheers, drl
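One way the continuation-character idea might be sketched, again in Python and purely as an assumption about how it could work: a trailing backslash is chosen here as the marker, and `join_continued` is a hypothetical helper name, not something from the post.

```python
def join_continued(text, marker="\\"):
    """Split text into lines, joining any line that ends with marker to the next one."""
    items = []
    pending = ""
    for line in text.splitlines():
        if line.endswith(marker):
            # Drop the marker and keep accumulating this item.
            pending += line[:-len(marker)]
        else:
            items.append(pending + line)
            pending = ""
    if pending:
        # The last line ended with a marker: keep what was gathered so far.
        items.append(pending)
    return items


# Hypothetical input: the second item is spread over two physical lines.
block = "short one\na long sentence that \\\ncontinues here\nlast one\n"
for item in join_continued(block):
    print(item)
```

The marker is a parameter so that a different character can be used when the data itself may end in a backslash.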