Regex to identify unique words in a dictionary database
Post 302992051 by Don Cragun on Tuesday 21st of February 2017 02:01:48 AM
The regular expression (RE) you need depends on what tool you're using and what you want the RE to do.

If you are using awk and want an ERE that selects the lines in your file whose headword is a single word (that is, no space appears before the first =), you might try:
Code:
awk '/^[^ =]*=/' file

As always, if you want to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk or nawk.
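
To see what the awk command above selects, here is a minimal sketch run against a small made-up file. The file name sample.txt and its four lines are assumptions for illustration only; your real dictionary file may be laid out differently.

```shell
# Hypothetical sample data (an assumption, not taken from the original thread):
# two single-word headwords, one two-word headword, and one line with no "=".
printf '%s\n' \
    'apple=fruit' \
    'ice cream=dessert' \
    'no separator here' \
    'run=move fast,operate' > sample.txt

# ^        anchor at the start of the line
# [^ =]*   zero or more characters that are neither a space nor "="
# =        the first "=" must follow immediately
awk '/^[^ =]*=/' sample.txt
# prints:
# apple=fruit
# run=move fast,operate
```

The line "ice cream=dessert" is rejected because a space comes before the =, and "no separator here" is rejected because it has no = at all. Spaces after the = (in the glosses) do not matter, since the ERE only constrains the text up to the first =. Note that * also accepts an empty headword (a line starting with =); if that is not wanted, writing the bracket expression twice, as in [^ =][^ =]*=, would require at least one character.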
 
