11-16-2018
I foresee problems with the approach of excluding common words. "Damaged" is an important word, but also common in your data. "Not" is also common and kind of vital. And when your data changes, so will whatever words you exclude.
And how important a word is depends on context. Nothing is lost by deleting "left" from "door left open", but the meaning changes if you delete it from "left door open".
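To make that concrete, here is a minimal Python sketch. The drop_words helper and the one-word exclusion set are made up for illustration, not taken from anyone's script in this thread. It shows both phrases collapsing to the same text once "left" is excluded:

```python
def drop_words(text, excluded):
    """Remove every whitespace-separated word found in `excluded` (case-insensitive)."""
    return " ".join(w for w in text.split() if w.lower() not in excluded)

# Hypothetical exclusion set: pretend "left" was flagged as too common.
excluded = {"left"}

a = drop_words("door left open", excluded)   # "left" was just filler here
b = drop_words("left door open", excluded)   # "left" said WHICH door here

print(a)
print(b)
print(a == b)  # the two records can no longer be told apart
```

Once filtered, nothing in the output says which record originally named a specific door.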
You can build lists of exclusions and special words until the cows come home, and then one funny case will come along which blows it all out of the water. Add one more special case for that word, then special-case the special cases for any odd but valid ways that word might be used. Rinse and repeat until you lose your mind or your code gains sentience.
I'm not sure true English language processing can be implemented in a tinkertoy.
Deleting common words like "the" and "is", that's certainly doable.
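A rough sketch of that doable part, in Python. The STOP_WORDS list and the strip_stop_words name are assumptions for illustration only; the point is a small, fixed, hand-picked list of function words rather than anything derived from word frequency, so "not" and "damaged" survive:

```python
import re

# Hypothetical hand-picked list: function words only, never chosen by frequency.
STOP_WORDS = {"the", "is", "a", "an", "of"}

def strip_stop_words(line):
    """Return the line with the listed function words removed, keeping word order."""
    words = re.findall(r"[A-Za-z']+", line)  # crude tokenizer: letters and apostrophes
    return " ".join(w for w in words if w.lower() not in STOP_WORDS)

print(strip_stop_words("The door is not damaged"))
```

Because the list is fixed, the result does not shift when the data changes, but it also does nothing for context-dependent words like "left".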
Last edited by Corona688; 11-16-2018 at 12:19 PM.
LEARN ABOUT DEBIAN
american-english-small
american-english-small(5) Users' Manual american-english-small(5)
NAME
american-english-small - a list of English words
DESCRIPTION
/usr/share/dict/american-english-small is an ASCII file which contains an alphabetic list of words, one per line.
FILES
There may be any number of word lists in /usr/share/dict/. /etc/dictionaries-common/words is a symbolic link to the currently-chosen
/usr/share/dict/<language> file. /usr/share/dict/words is a symbolic link to /etc/dictionaries-common/words, and is the name by which
other software should refer to the system word list. See select-default-wordlist(8) for more information, and/or to change the
currently-chosen word list.
The directory /usr/share/dict can contain word lists for many languages, with name of the language in English, e.g., /usr/share/dict/french
and /usr/share/dict/danish contain respectively lists of French and Danish words if they exist. Such lists should be coded using the ISO
8859-1 character set encoding.
SEE ALSO
ispell(1), select-default-wordlist(8), and the Filesystem Hierarchy Standard.
HISTORY
The words lists are not specific, and may be generated from any number of sources.
The system word list used to be /usr/dict/words. For compatibility, software should check that location if /usr/share/dict/words does not
exist.
AUTHOR
Word lists are collected and maintained by various authors. The Debian English word lists are built from the SCOWL (Spell-Checker
Oriented Word Lists) package, whose upstream editor is Kevin Atkinson <kevina@users.sourceforge.net>.
Debian 16 June 2003 american-english-small(5)