I have a question for you guys. I am searching through a public directory (which has tons of files) for a file that I was working on a long time ago. I can't remember what it is called, but I do remember the content. It should contain words like these:
Joe
Pulvo
botnet
zeus
command
control
There are several hundred files in this directory, so I was trying to figure out a command that will locate this file, which contains all of these words. I wanted to put these 6 words into a file using VI, and then use that file to find what I am looking for.
I know I could do:
but I would rather use the file to make sure all of the words are in it. I think I need to write a short script so I can replace 'String to find' with a variable, which is my problem. Could I just use $n instead of 'String to find' and increment $n so that it moves down one line each time, thus searching for a new word?
Since I tend to have organizational problems, it would be nice to have this script work for any input, so my goal is to not hard code specific words into the algorithm.
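One way to approach this is sketched below. It is only a sketch under a few assumptions: the words sit one per line in a separate file, matching is case-insensitive and on substrings (so "command" also matches "commands"), and only regular files directly inside the directory are checked. The function name find_all_words is made up for illustration.

```shell
# find_all_words WORDS_FILE DIRECTORY
# Print every regular file in DIRECTORY that contains all of the
# words listed (one per line) in WORDS_FILE.
# The name find_all_words is hypothetical, chosen for this sketch.
find_all_words() {
    words_file=$1
    dir=$2
    for f in "$dir"/*; do
        [ -f "$f" ] || continue              # skip subdirectories etc.
        match=1
        # Read the words file line by line; also handle a last line
        # that has no trailing newline.
        while IFS= read -r word || [ -n "$word" ]; do
            [ -n "$word" ] || continue       # ignore blank lines
            # -q: quiet, -i: ignore case, -F: fixed string, not a regex
            if ! grep -qiF -- "$word" "$f"; then
                match=0                      # one word missing: give up on this file
                break
            fi
        done < "$words_file"
        if [ "$match" -eq 1 ]; then
            printf '%s\n' "$f"
        fi
    done
    return 0
}
```

It could be called as find_all_words ~/words.txt /path/to/public/dir (both paths are placeholders). Nothing is hard-coded, so the same function works for any list of words. Keep the words file outside the directory being searched, otherwise it will be listed too, since it contains every word by definition.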
Hi,
I want to be able to list all the names in a file which begin with a capital letter, but I don't want it to list words that begin a new sentence. Is there any way round this?
Thanks for your help. (1 Reply)
Hi guys,
I need to find the most commonly occurring words in a file of about 30000 words and display their counts, and the words should not be of the type specified in file 2, i.e. words like is, for, the, an, he, she, etc...
k.
file1:
ALICE was beginning to get very tired of sitting by... (2 Replies)
Hi all,
I would like to print the whitespace-separated words in a file that contain a specific pattern like "="
e.g. I have a file1 containing strings like
%cat file1
The= some= in
wish= born
<eof>
I want to display only those words containing =, i.e. The=, some=, wish=
... (5 Replies)
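A minimal sketch of one possible approach, assuming words are separated by spaces or tabs and that each matching word should be printed on its own line. The function name words_with_equals is hypothetical.

```shell
# words_with_equals FILE
# Print each whitespace-separated word in FILE that contains "=".
# The name words_with_equals is made up for this sketch.
words_with_equals() {
    # awk splits each line into fields on whitespace; index() returns
    # a non-zero position when the field contains "=".
    awk '{ for (i = 1; i <= NF; i++) if (index($i, "=")) print $i }' "$1"
}
```

Run against the file1 shown above, words_with_equals file1 would print The=, some= and wish=, one per line.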
Hi,
I need to pick up dates and times from file names which are of unequal length. The dates and times are delimited by dots. I am interested in getting the strings between the delimiters for fields -3, -4, -5 counting from the end, so that the output looks like:
071118.011300.556
I have... (2 Replies)
Hi, I am new to Perl. I want to make a script that finds rows in a file that start with special words. As an example, a line will start with:
.............................................
specialword aaa=2 bbb=5
.............................................
and to put this in a new file... (3 Replies)
Hi again
I have figured out how to be able to sort through lines in a file with multiple words in any order and display them using this command:
cat file | grep -i $OPTION1 | grep -i $OPTION2 | grep -i $OPTION3
OPTION1 is 2008, OPTION2 is Mar, OPTION3 is Tue
Result:
Tue Mar 25... (4 Replies)
Hi All,
I tried this but I am having trouble formulating this:
I have a file that looks like this (this is a sample file words can be different):
network
router
frame
network
router
computer
card
host
computer
card
One can see that in this file "network" and "router" occur... (3 Replies)
I have a text which I divided into sentences and printed one per row.
I want to get the list of the most common words (the, and, a) and print the 5 words after them (so 6 with the word itself). I have created an acceptfile with those rows and am using grep, but I have rows that have these words more... (2 Replies)
Couldn't find my PC on the network. The root of the evil was a bad patch cable. (0 Replies)
Discussion started by: useretail
Plucene::Analysis::PorterStemFilter(3pm) User Contributed Perl Documentation Plucene::Analysis::PorterStemFilter(3pm)
NAME
Plucene::Analysis::PorterStemFilter - Porter stemming on the token stream
SYNOPSIS
# isa Plucene::Analysis::TokenFilter
my $token = $porter_stem_filter->next;
DESCRIPTION
This class transforms the token stream as per the Porter stemming algorithm.
Note: the input to the stemming filter must already be in lower case, so you will need to use LowerCaseFilter or LowerCaseTokenizer farther
down the Tokenizer chain in order for this to work properly!
The Porter Stemmer implements the Porter algorithm for normalization of English words by stripping their suffixes and is used to generalize
the searches. For example, the Porter algorithm maps both 'search' and 'searching' (as well as 'searchnessing') to 'search' such that a
query for 'search' will also match documents that contain the word 'searching'.
Note that the Porter algorithm is specific to the English language and may give unpredictable results for other languages. Also, make sure
to use the same analyzer during the indexing and the searching.
You can find more information on the Porter algorithm at www.tartarus.org/~martin/PorterStemmer.
A nice online demonstration of the Porter algorithm is available at www.scs.carleton.ca/~dquesnel/java/stuff/PorterApplet.html.
METHODS
next
my $token = $porter_stem_filter->next;
Returns the next input token, after being stemmed.
perl v5.12.4 2011-08-14 Plucene::Analysis::PorterStemFilter(3pm)