Regex to identify unique words in a dictionary database
Hello,
I have a dictionary which I am building for the open-source community. The data structure is as shown in the example below.
Among the headwords, I need to identify only those that consist of a single word, and not those where the headword is made up of more than one word.
Thus
should be identified, but the following, which have more than one word, are not valid and should not be identified.
A regex in Perl or Unix would be really useful. Thanks a lot.
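A minimal sketch of one way to do this, written in Python rather than the Perl asked for (the pattern itself carries over unchanged). It assumes each entry sits on its own line as `HEADWORD=glosses`, which is an assumption since the sample data is not shown here, and it treats a headword as "single" when it contains no whitespace before the `=`. The names `SINGLE_HEADWORD` and `single_word_entries` are made up for illustration.

```python
import re

# Assumed entry format: HEADWORD=gloss1,gloss2,...
# A single-word headword has no whitespace (and no '=') before the first '='.
SINGLE_HEADWORD = re.compile(r"^[^\s=]+=")

def single_word_entries(lines):
    """Return the entries whose headword is a single word."""
    kept = []
    for line in lines:
        entry = line.strip()
        if SINGLE_HEADWORD.match(entry):
            kept.append(entry)
    return kept

if __name__ == "__main__":
    sample = ["apple=fruit", "ice cream=dessert", "pear=fruit"]
    for entry in single_word_entries(sample):
        print(entry)
```

In Python 3, `\s` also covers Unicode whitespace, so the same pattern should behave the same way on UTF-8 data in other scripts.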
Hi,
Is it possible to find the number of occurrences of a pattern between two parentheses?
For example, I have a file as below.
>>{
>>hi
>>GoodMorning
>>how are you?
>>}
>>is it good,
>>tell me yes, if it is good
In the above file, it's clear the occurrence of the word "Good"... (17 Replies)
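One possible approach, sketched in Python: pull out every `{ ... }` block with a non-greedy regex and count the matches inside each block only. This assumes the braces are not nested and that the match should be case-sensitive; `count_in_braces` is a hypothetical helper name.

```python
import re

def count_in_braces(text, word):
    """Count case-sensitive occurrences of `word` inside { ... } blocks."""
    total = 0
    # re.DOTALL lets '.' span newlines; '.*?' stops at the first closing brace.
    for block in re.findall(r"\{(.*?)\}", text, flags=re.DOTALL):
        total += len(re.findall(re.escape(word), block))
    return total

if __name__ == "__main__":
    sample = (
        "{\nhi\nGoodMorning\nhow are you?\n}\n"
        "is it good,\ntell me yes, if it is good\n"
    )
    print(count_in_braces(sample, "Good"))
```

Text outside the braces is never searched, so the two "good" lines after the closing brace do not affect the count.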
First of all, please have mercy on me. I am not a noob to programming, but I am about as noob as you can get with regex. That being said, I have a problem.
I've got a string that looks something like this:
Publication - Bob M. Jones, Tony X. Stark, and Fred D. Man, \"Really Awesome Article... (1 Reply)
I have a text file in UTF-8 format which has the following data structure
HEADWORD=gloss1,gloss2,gloss3 etc
I want to convert it so that all the glosses of the HeadWord appear on separate lines
HEADWORD=gloss1
HEADWORD=gloss2
HEADWORD=gloss3
An example will illustrate the requirement... (4 Replies)
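A short Python sketch of the conversion described above, assuming one entry per line, a single `=` between headword and glosses, and commas that only ever separate glosses. The function name `expand_glosses` is invented for this example.

```python
def expand_glosses(lines):
    """Turn HEADWORD=g1,g2,g3 into one HEADWORD=gloss line per gloss."""
    out = []
    for line in lines:
        line = line.rstrip("\n")
        head, sep, glosses = line.partition("=")
        if not sep:
            # No '=' found: pass the line through unchanged.
            out.append(line)
            continue
        for gloss in glosses.split(","):
            gloss = gloss.strip()
            if gloss:
                out.append(f"{head}={gloss}")
    return out

if __name__ == "__main__":
    for row in expand_glosses(["HEADWORD=gloss1,gloss2,gloss3"]):
        print(row)
```

For a real file, opening it with `encoding="utf-8"` and passing the file object as `lines` should be enough.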
Hi,
I have written the following Python snippet to store words that start with a capital letter as keys in a dictionary, with the number of times each word appears as the value.
#!/usr/bin/env python
import sys
import re
hash = {} # initialize an empty dictionary
for line in... (1 Reply)
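Since the snippet above is cut off, here is a separate minimal sketch of the same idea, not the poster's code. It assumes "capital letter" means ASCII `A-Z` and that a word is a run of ASCII letters; `count_capitalized` is a hypothetical name.

```python
import re
from collections import Counter

def count_capitalized(text):
    """Count words that begin with an ASCII capital letter."""
    # \b anchors the match at a word boundary so 'iPhone' is not counted.
    return Counter(re.findall(r"\b[A-Z][A-Za-z]*\b", text))

if __name__ == "__main__":
    counts = count_capitalized("Hello world. Hello Bob, meet bob.")
    for word, n in counts.items():
        print(word, n)
```

`Counter` takes the place of the hand-rolled dictionary and handles the "first time seen" case automatically.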
I am reworking a Marathi-English dictionary to be released as open source. My dictionary has the headword in Marathi, followed by its part of speech and then by its English glosses, as in the examples below:
अकरसणें v i To contract, shrink.
अकरा a Eleven.
अकराळ a Frightful, terrible.
विकराळ... (2 Replies)
Hi,
I have a huge unsorted text file. We want to identify the unique field values in a line and use those fields as a primary key for a table in an upstream system.
Basically, the process or script should fetch the values from each line that are unique compared to the rest of the lines in... (13 Replies)
I am working on Sindhi, which uses a Perso-Arabic script. Since it shares its Unicode block with over 400 other languages, the database quite often contains unwanted, illegal characters.
I have identified the character set of Sindhi which is given below:
For clarity's sake, each... (8 Replies)
Hi
In a file I have strings on multiple lines, like below:
<?=test.getObjectName("L", "testTBL","D") ?>
<?=test.getObjectName("L", "testTBL","testDB", "D") ?>
I want to use a regex to search for the pattern "<?=test.getObjectName...?>"
If the parentheses contain 3 parameters, then return the 2nd... (5 Replies)
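A rough Python sketch for pulling the parameters out of each call. It assumes every parameter is a double-quoted string with no embedded quotes or parentheses. Because the rule for calls with more than 3 parameters is cut off above, the sketch only returns the parameter lists and leaves the selection to the caller; `CALL`, `PARAM` and `extract_params` are illustrative names.

```python
import re

# Matches one <?=test.getObjectName(...) ?> call and captures the argument text.
CALL = re.compile(r'<\?=test\.getObjectName\((.*?)\)\s*\?>')
# Matches one double-quoted parameter and captures its contents.
PARAM = re.compile(r'"([^"]*)"')

def extract_params(text):
    """Return one list of parameters per getObjectName call found in text."""
    return [PARAM.findall(m.group(1)) for m in CALL.finditer(text)]

if __name__ == "__main__":
    sample = (
        '<?=test.getObjectName("L", "testTBL","D") ?>\n'
        '<?=test.getObjectName("L", "testTBL","testDB", "D") ?>\n'
    )
    for params in extract_params(sample):
        if len(params) == 3:
            print(params[1])  # 2nd parameter of a 3-parameter call
        else:
            print(params)     # rule for other counts is not given above
```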