Hello,
I want to create a test bed for Urdu ligatural forms. One of the main components is to create a delimiter list. These are forms after which no connectors can be formed.
What I need is a tool which will take a running text or a list of words in a file and split them as soon as a delimiter is encountered. A sample will explain the process:
I am using Latin script to keep the example simple.
DELIMITERS: Let us assume that the delimiters are:
Code:
a,e,i,o,u
Each delimiter is separated by a comma.
INPUT:
Code:
baker
convoluted
perspicacity
EXPECTED OUTPUT:
Code:
ba ke r
co nvo lu te d
pe rspi ca ci ty
i.e. the string is split after each delimiter and a space is inserted.
Please note that if I had also put
Code:
aeo
as a delimiter, then a string such as:
Code:
archaeological
would be split as
Code:
a rchaeo lo gi ca l
At present I use a macro to do the job. But the process is extremely slow.
An AWK or PERL Script would be of great help, since my OS is Windows.
Many thanks
p.s. Just in case someone is interested in tweaking Urdu, a sample delimiter list is provided below:
[user@host ~]$ cat file
baker
convoluted
perspicacity
[user@host ~]$ cat test.pl
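#!/usr/bin/perl
use strict;
use warnings;

# A space is printed after each of these characters.
my @delims = qw/ a e i o u /;
my %is_delim = map { $_ => 1 } @delims;

open my $in, '<', 'file' or die "Cannot open file: $!";
while (my $str = <$in>) {
    chomp $str;
    # Walk the line one character at a time.
    for my $x (split //, $str) {
        print $is_delim{$x} ? "$x " : $x;
    }
    print "\n";
}
close $in;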
[user@host ~]$
[user@host ~]$ ./test.pl
ba ke r
co nvo lu te d
pe rspi ca ci ty
[user@host ~]$
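The script above compares one character at a time, so it would not treat a multi-character delimiter such as aeo from the question as a single unit. A minimal sketch of one way to cover that case follows, assuming that the longest delimiter should win when several match at the same position. The helper name split_after_delims is made up for illustration, and unlike the script above it does not add a space after a delimiter that ends the word.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: insert a space after every delimiter found in $text.
sub split_after_delims {
    my ($text, @delims) = @_;
    # Try longer delimiters first, so "aeo" is matched before "a".
    my $alt = join '|',
              map  { quotemeta($_) }
              sort { length($b) <=> length($a) } @delims;
    # (?=.) skips a delimiter at the very end, so no trailing space is added.
    $text =~ s/($alt)(?=.)/$1 /g;
    return $text;
}

my @delims = qw/ a e i o u aeo /;
print split_after_delims($_, @delims), "\n" for qw/ baker archaeological /;
# prints:
# ba ke r
# a rchaeo lo gi ca l
```

Sorting by length is only one possible rule for overlapping delimiters; if the real Urdu list needs a different priority, the order of the alternation would have to change accordingly.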
Hello,
It worked beautifully for the English samples. However, the moment I plugged in the Urdu delimiters, it did not work.
I suppose this is because Perl does not support UTF-8. I even tried saving the script as UTF-8 with no byte order mark, but that did not help.
The only change I made in the script was to replace the delimiter list with my own.
Code:
my @delims = qw / ا ڈ ذ ر ڑ ز ژ و ے إ ۓ ؤ /;
each separated by a space as in your case
Just for testing here is a small sample on which I tried
Even if the script is unfamiliar to you, you should see a space between the ligatural forms, but the Perl script prints the sample file unchanged.
How do you get around this issue?
Any help or suggestions, please.
Many thanks
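A likely cause is that Perl treats both the script source and the input file as raw bytes unless told otherwise. Each Urdu letter is two bytes in UTF-8, so splitting the line yields single bytes that never equal a two-byte delimiter, and the text comes out unchanged. Below is a minimal sketch of a character-aware variant, assuming the script and the input are both saved as UTF-8. The helper name split_after_delims and the idea of passing file names on the command line are illustrative choices, not part of the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;                             # string literals in this file are UTF-8 text
binmode STDOUT, ':encoding(UTF-8)';   # encode characters again on output

# Delimiter list taken from the post above.
my @delims = qw/ ا ڈ ذ ر ڑ ز ژ و ے إ ۓ ؤ /;
my %is_delim = map { $_ => 1 } @delims;

# Hypothetical helper: add a space after each delimiter character.
sub split_after_delims {
    my ($text) = @_;
    return join '', map { $is_delim{$_} ? "$_ " : $_ } split //, $text;
}

# Process every file named on the command line (assumed to be UTF-8).
for my $path (@ARGV) {
    # The encoding layer decodes bytes into characters while reading.
    open my $in, '<:encoding(UTF-8)', $path or die "Cannot open $path: $!";
    while (my $line = <$in>) {
        chomp $line;
        print split_after_delims($line), "\n";
    }
    close $in;
}
```

It could be run as perl test.pl file > out.txt, where out.txt is a placeholder name. Redirecting to a file and opening it in a UTF-8 aware editor is usually easier than reading Urdu in a Windows console.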