Regex to identify unique words in a dictionary database
Post 302992051 by Don Cragun on Tuesday 21st of February 2017 02:01:48 AM
The regular expression (RE) you need depends on what tool you're using and what you want the RE to do.

If you are using awk and want an ERE that selects the lines of your file whose headword is a single word (no space before the first =), you might try:
Code:
awk '/^[^ =]*=/' file

As always, if you want to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk or nawk.
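To see what that command keeps and what it drops, here is a minimal sketch. The sample entries and the file names sample_dict.txt and single_headwords.txt are made up for illustration and are not from the original poster's data; the sketch only assumes the HEADWORD=gloss layout.

```shell
# Hypothetical sample data in HEADWORD=gloss form (not from the thread).
cat > sample_dict.txt <<'EOF'
apple=fruit
ice cream=dessert
run=move fast
no separator here
EOF

# Keep only lines where nothing but non-space, non-"=" characters
# precedes the first "=" (that is, a one-word headword).
awk '/^[^ =]*=/' sample_dict.txt > single_headwords.txt

# Show the selected lines.
cat single_headwords.txt
```

With this sample, the lines "ice cream=dessert" and "no separator here" are rejected because a space appears before any =. Spaces in the gloss do not matter, since the RE stops at the first =. Note that [^ =]* also accepts an empty headword, so a line such as "=gloss" would be selected; if that is not wanted, [^ =]+ requires at least one character before the =.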
 
