Here's a Perl program, though I couldn't test it with the actual data (Urdu and Hindi characters). It works for ASCII input (a=b,c,b,...).
Code:
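#!/usr/bin/perl
use warnings;
use strict;

# Read "key=val1,val2,..." lines and drop repeated values on each line.
open my $in, '<', 'file.txt' or die "Cannot open file.txt: $!";
while (my $line = <$in>) {
    chomp $line;
    # Split on the first '=' only, so a '=' inside the values is kept.
    my ($key, $values) = split /=/, $line, 2;
    # Pass lines without '=' through unchanged.
    if (!defined $values) { print "$line\n"; next; }
    my %seen;
    # Keep the first occurrence of each value, preserving input order.
    my @unique = grep { !$seen{$_}++ } split /,/, $values;
    print "$key=", join(',', @unique), "\n";
}
close $in;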
By the way, this program compares words exactly, so near-identical spellings such as आबादिओं,आबादियों or आज कल,आजकल or ऑबजेक्शन,ऑब्जेक्शन are treated as different words.
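If some of those variants should be merged, one option is to compare words on a normalized key instead of the raw text. The following is only a sketch under assumptions, not something tested against the real data: `normalize_word` and `dedupe_line` are hypothetical helper names, and the normalization chosen here (Unicode NFC plus removing whitespace) is just one possible rule. It would merge आज कल with आजकल, but it would not merge spellings that differ by a letter or a sign, such as आबादिओं and आबादियों.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;                          # this source file contains UTF-8 literals
use Unicode::Normalize qw(NFC);    # core module

# Hypothetical helper: build a comparison key by applying NFC and
# removing whitespace. This is an assumed rule, not the original logic.
sub normalize_word {
    my ($word) = @_;
    my $key = NFC($word);
    $key =~ s/\s+//g;
    return $key;
}

# Hypothetical helper: deduplicate one "key=val1,val2,..." line on the
# normalized key, keeping the first spelling seen for each key.
sub dedupe_line {
    my ($line) = @_;
    my ($key, $values) = split /=/, $line, 2;
    return $line unless defined $values;    # no '=': leave the line alone
    my %seen;
    my @unique = grep { !$seen{ normalize_word($_) }++ } split /,/, $values;
    return "$key=" . join(',', @unique);
}

binmode STDOUT, ':encoding(UTF-8)';
print dedupe_line('x=आज कल,आजकल,आज कल'), "\n";    # prints x=आज कल
```

When reading from a file instead of a literal, the input would need to be decoded as well, for example with open my $in, '<:encoding(UTF-8)', 'file.txt', so that the whitespace and normalization rules operate on characters rather than bytes.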