Python Script for keyword and Stemming
Post 303025988 by Corona688 on Friday 16th of November 2018 11:00:06 AM
I foresee problems with the approach of excluding common words. "Damaged" is an important word, but also common in your data. "Not" is also common and kind of vital. And when your data changes, so will whatever words you exclude.

And how important a word is depends on context. Nothing is lost by deleting "left" from "door left open", but something is lost by deleting it from "left door open".
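
To make that concrete, here's a minimal sketch. The `drop_words` helper and the one-word exclusion list are made up for illustration, not taken from your script. Once "left" is excluded, both phrases collapse to the same words and the difference between them is gone:

```python
def drop_words(text, excluded):
    """Return the lowercased words of text, minus any word in excluded."""
    return [w for w in text.lower().split() if w not in excluded]


# Hypothetical exclusion list containing only the word under discussion.
excluded = {"left"}

a = drop_words("door left open", excluded)   # "left" describes the door's state
b = drop_words("left door open", excluded)   # "left" says which door
print(a)
print(b)
print(a == b)  # the two phrases are now indistinguishable
```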

You can build lists of exclusions and special words until the cows come home, and then one funny case will come along which blows it all out of the water. Add one more special case for that word, then special cases of that special case for any odd but valid ways the word might be used. Rinse and repeat until you lose your mind or your code gains sentience.

I'm not sure true English language processing can be implemented in a tinkertoy.

Deleting common words like "the" and "is", that's certainly doable.
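
Something like this would do for that much. It's only a sketch: the `STOP_WORDS` set and the `remove_stop_words` name are my own placeholders, and the list is deliberately tiny. Note that "not" is left out of the list on purpose, for the reason given above.

```python
import re

# Hypothetical stop-word list; extend it to suit your data.
# "not" is deliberately absent because it changes the meaning.
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "of", "to"}


def remove_stop_words(text, stop_words=STOP_WORDS):
    """Lowercase text, split it into words, and drop the stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    return [w for w in words if w not in stop_words]


print(remove_stop_words("The door is not damaged"))
# prints ['door', 'not', 'damaged']
```

This only handles the truly empty words. It does nothing about the context problem with words like "left".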

Last edited by Corona688; 11-16-2018 at 12:19 PM.
 
