kinosearch1::analysis::stemmer(3pm) [debian man page]
KinoSearch1::Analysis::Stemmer(3pm) User Contributed Perl Documentation KinoSearch1::Analysis::Stemmer(3pm)
NAME
KinoSearch1::Analysis::Stemmer - reduce related words to a shared root
SYNOPSIS
    my $stemmer = KinoSearch1::Analysis::Stemmer->new( language => 'es' );
    my $polyanalyzer = KinoSearch1::Analysis::PolyAnalyzer->new(
        analyzers => [ $lc_normalizer, $tokenizer, $stemmer ],
    );
DESCRIPTION
Stemming reduces words to a root form. For instance, "horse", "horses", and "horsing" all become "hors" -- so that a search for 'horse'
will also match documents containing 'horses' and 'horsing'.
This class is a wrapper around Lingua::Stem::Snowball, so it supports the same languages.
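As a rough illustration of the stemming described above, the sketch below calls Lingua::Stem::Snowball directly instead of going through KinoSearch1. The helper name stem_words is hypothetical, and the code assumes Lingua::Stem::Snowball has been installed from CPAN; when the module cannot be loaded, the helper returns undef rather than dying.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of KinoSearch1): stem a list of lowercase
# words with Lingua::Stem::Snowball. Returns an array ref of stems, or
# undef if the module is not installed.
sub stem_words {
    my ( $language, @words ) = @_;
    return unless eval { require Lingua::Stem::Snowball; 1 };
    my $snowball = Lingua::Stem::Snowball->new( lang => $language );
    $snowball->stem_in_place( \@words );    # modifies our local copy
    return \@words;
}

my $stems = stem_words( 'en', qw( horse horses horsing ) );
print defined $stems
    ? "@$stems\n"
    : "Lingua::Stem::Snowball is not installed\n";
```

With the English stemmer, all three words should come out as the shared root given in the description above.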
METHODS
new
Create a new stemmer. Takes a single named parameter, "language", which must be an ISO two-letter code that Lingua::Stem::Snowball
understands.
COPYRIGHT
Copyright 2005-2010 Marvin Humphrey
LICENSE, DISCLAIMER, BUGS, etc.
See KinoSearch1 version 1.00.
perl v5.14.2 2011-11-15 KinoSearch1::Analysis::Stemmer(3pm)
KinoSearch1::Analysis::Token(3pm) User Contributed Perl Documentation KinoSearch1::Analysis::Token(3pm)
NAME
KinoSearch1::Analysis::Token - unit of text
SYNOPSIS
# private class - no public API
PRIVATE CLASS
You can't actually instantiate a Token object at the Perl level -- however, you can affect individual Tokens within a TokenBatch by way of
TokenBatch's (experimental) API.
DESCRIPTION
Token is the fundamental unit used by KinoSearch1's Analyzer subclasses. Each Token has 4 attributes: text, start_offset, end_offset, and
pos_inc (for position increment).
The text of a token is a string.
A Token's start_offset and end_offset locate it within a larger text, even if the Token's text attribute gets modified -- by stemming, for
instance. The Token for "beating" in the text "beating a dead horse" begins life with a start_offset of 0 and an end_offset of 7; after
stemming, the text is "beat", but the end_offset is still 7.
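The offset behaviour can be mimicked with a plain hash. This is only a sketch: Tokens cannot be instantiated at the Perl level, so make_token and the hash layout are hypothetical stand-ins, and the stemming step is simulated by assigning the new text by hand.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a Token: a plain hash holding the four
# attributes named in the text. Real Tokens cannot be created from Perl.
sub make_token {
    my ( $text, $start_offset, $end_offset ) = @_;
    return {
        text         => $text,
        start_offset => $start_offset,
        end_offset   => $end_offset,
        pos_inc      => 1,
    };
}

my $source = 'beating a dead horse';
my $token  = make_token( 'beating', 0, 7 );

# Simulate stemming: only the text attribute changes.
$token->{text} = 'beat';

# The offsets still select the original word in the source text.
my $original = substr(
    $source,
    $token->{start_offset},
    $token->{end_offset} - $token->{start_offset},
);
print "$token->{text} <- $original\n";    # prints "beat <- beating"
```

Keeping the offsets untouched is what allows the original span to be located in the source text even though the indexed term is the stem.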
The position increment, which defaults to 1, is an advanced tool for manipulating phrase matching. Ordinarily, Tokens are assigned
consecutive position numbers: 0, 1, and 2 for "three blind mice". However, if you set the position increment for "blind" to, say, 1000,
then the three tokens will end up assigned to positions 0, 1, and 1001 -- and will no longer produce a phrase match for the query '"three
blind mice"'.
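The position arithmetic in that example can be reproduced with a few lines of Perl. This is a sketch under an assumption inferred from the numbers above: each token takes the current running position, and its own pos_inc then advances the position for the next token. The helper assign_positions and the hash layout are hypothetical and are not part of the KinoSearch1 API.

```perl
use strict;
use warnings;

# Hypothetical helper: each token is placed at the running position, which
# is then advanced by that token's pos_inc (assumed from the example above).
sub assign_positions {
    my @tokens   = @_;
    my $position = 0;
    my @positions;
    for my $token (@tokens) {
        push @positions, $position;
        $position += $token->{pos_inc};
    }
    return @positions;
}

my @default = assign_positions(
    { text => 'three', pos_inc => 1 },
    { text => 'blind', pos_inc => 1 },
    { text => 'mice',  pos_inc => 1 },
);
my @gapped = assign_positions(
    { text => 'three', pos_inc => 1 },
    { text => 'blind', pos_inc => 1000 },
    { text => 'mice',  pos_inc => 1 },
);

print "@default\n";    # prints "0 1 2"
print "@gapped\n";     # prints "0 1 1001"
```

Because "blind" and "mice" are no longer at adjacent positions in the second case, a phrase query that requires consecutive positions would not match.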
COPYRIGHT
Copyright 2006-2010 Marvin Humphrey
LICENSE, DISCLAIMER, BUGS, etc.
See KinoSearch1 version 1.00.
perl v5.14.2 2011-11-15 KinoSearch1::Analysis::Token(3pm)