I'm not familiar with Perl's Unicode capabilities, but assuming the character U+2026 (horizontal ellipsis) is encoded as UTF-8, an approach that is not Unicode-aware will have to match a sequence of three bytes: 0xE2 0x80 0xA6.
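As a rough illustration of that byte-level approach, here is a minimal sketch in Python rather than Perl. The function name `replace_ellipsis_bytes` and the `...` replacement are assumptions made up for the example; the only fact taken from the answer is the three-byte sequence.

```python
# UTF-8 encoding of U+2026 (horizontal ellipsis): the bytes E2 80 A6.
ELLIPSIS_UTF8 = b"\xe2\x80\xa6"


def replace_ellipsis_bytes(data: bytes, replacement: bytes = b"...") -> bytes:
    """Replace every UTF-8 encoded U+2026 in raw bytes without decoding them."""
    return data.replace(ELLIPSIS_UTF8, replacement)


if __name__ == "__main__":
    raw = "wait\u2026 what".encode("utf-8")
    print(replace_ellipsis_bytes(raw))
```

Working on raw bytes like this only makes sense when the input is known to be UTF-8; a Unicode-aware approach would decode first and match the single character instead.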
Hi, I have a character string which contains some special characters, and I need to display it as a hex string.
For example, the sample i/p string: ×¥ïA Å gïÛý and
the o/p should be : D7A5EF4100C5010067EFDBFD
Any pointers or sample code, please? (5 Replies)
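One possible way to produce such a hex string, sketched in Python and assuming every character in the input is a single-byte (Latin-1) character, which is a guess based on the sample output above. The helper name `to_hex` is hypothetical.

```python
def to_hex(text: str) -> str:
    """Return the uppercase hex dump of a string of single-byte characters.

    Assumption: all code points are in the range 0-255, so Latin-1 maps
    each character onto exactly one byte.
    """
    return text.encode("latin-1").hex().upper()


if __name__ == "__main__":
    # First four characters of the sample input: U+00D7, U+00A5, U+00EF, 'A'
    print(to_hex("\u00d7\u00a5\u00efA"))  # D7A5EF41
```

If the string can contain characters above U+00FF, the `latin-1` encoding step raises an error and a different encoding would have to be chosen.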
I am not able to write the character corresponding to a hex value into a file using the format specifier.
Could you please help me with that?
>cat other
a|\xc2\xbo
>cat write.pl
#! /opt/third-party/bin/perl
open(FILE2, "< other") || die "Unable to open file other\n";
while (... (7 Replies)
Hi,
I'm trying to get one field out of many as follows:
A string of multiple fields separated with "/" characters:
"/ab=12/cd=34/12=ab/34=cd/ef=pick-this.one/gh=blah/ij=something/"
I want to pick up the field "ef=pick-this.one" which has no regular pattern except it starts with "ef=xxxx"... (3 Replies)
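A minimal sketch of one way to pick out such a field, in Python rather than Perl: split on `/` and keep the first field that starts with the wanted key. The function name `pick_field` and its `key` parameter are assumptions for illustration.

```python
from typing import Optional


def pick_field(record: str, key: str = "ef") -> Optional[str]:
    """Return the first '/'-separated field that starts with 'key=', if any."""
    prefix = key + "="
    for field in record.split("/"):
        if field.startswith(prefix):
            return field
    return None


if __name__ == "__main__":
    line = "/ab=12/cd=34/12=ab/34=cd/ef=pick-this.one/gh=blah/ij=something/"
    print(pick_field(line))  # ef=pick-this.one
```

Splitting first avoids having to describe the field's contents with a pattern, which matters here because the value has no regular shape.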
I am trying to remove URLs from text strings in Perl. I have the following, but it does not seem to work:
$remarks =~ s/www\.\s+\.com//gi;
In English, I want to look for www. then I want to delete the www. and everything after it until I hit a space (but not including the space).
It's not... (2 Replies)
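Note that the pattern above uses `\s+` (one or more whitespace characters) where the description calls for non-whitespace, which is `\S`. A minimal Python sketch of the described behavior follows; the name `strip_www` is hypothetical, and the pattern deliberately does not require a `.com` suffix, which is an assumption about what is wanted.

```python
import re

# 'www.' followed by any run of non-whitespace characters, case-insensitive.
WWW_RE = re.compile(r"www\.\S*", re.IGNORECASE)


def strip_www(text: str) -> str:
    """Delete 'www.' and everything after it up to, but not including, a space."""
    return WWW_RE.sub("", text)


if __name__ == "__main__":
    print(repr(strip_www("visit www.Example.com today")))  # 'visit  today'
```

The surrounding spaces are left in place, so two spaces remain where the URL was removed.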
Hello all. I need help...
How can I convert this 42ec93df826c804ea531c56594db453d54daad4b to normal text? Which converter do I have to use?
Thanks. (12 Replies)
Hi Everyone,
I am looking for a neat way to grep a non-empty string that basically contains a hostname, which might be in FQDN form or without the domain, for example:
hostname.internal.domainname.net
The file I am parsing contains blank lines (^$) and also series of "-" which in other places... (2 Replies)
Hi,
I want to convert the number 5860533159 to hexadecimal. I need to use Perl.
I used
$foo = 5860533159;
$hexval3 = sprintf("%#x", $foo);
I am getting the value 0xffffffff.
I need to get the value 0x15D50A3A7. When I converted it using Google calculator, I got the correct value, expected... (9 Replies)
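The number 5860533159 is larger than 4294967295 (0xffffffff), so the result above is consistent with the value being clamped to a 32-bit integer, although the post does not confirm the cause. For comparison, a short Python sketch of the expected conversion; Python integers are arbitrary precision, so no overflow occurs. The helper name `to_hex_literal` is an assumption.

```python
def to_hex_literal(value: int) -> str:
    """Format a non-negative integer as 0x-prefixed uppercase hexadecimal."""
    return "0x" + format(value, "X")


if __name__ == "__main__":
    print(to_hex_literal(5860533159))  # 0x15D50A3A7
```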
I have a script in which I need to skip comments, and I am able to achieve it partially...
IN text file:
{****************************
{test : test...test }
Script:
while (<$fh>)
{
push ( @data, $_);
}
if ( $data =~ m/(^{\*+$)/ ){
}
With the above match I am... (5 Replies)
cat clinvar_00-latest.vcf | perl -aF/\\t/ -lne '/CLNSRCID=(\d+)/ and print join("\t",@F,$1)' > OMIM.txt
The above code finds the text CLNSRCID=, but only outputs those records in which there is a numerical value only.
For example, the first match is CLNSRCID=103320.0001 in line 4 of the... (1 Reply)
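The post is truncated, so the exact goal is unclear. Assuming the aim is to capture the whole value including an optional fractional part such as `.0001`, one option is to extend the capture group. A Python sketch with a hypothetical helper name `clnsrcid`:

```python
import re
from typing import Optional

# Digits, optionally followed by a dot and more digits.
CLNSRCID_RE = re.compile(r"CLNSRCID=(\d+(?:\.\d+)?)")


def clnsrcid(line: str) -> Optional[str]:
    """Return the CLNSRCID value found in a line, or None if there is none."""
    match = CLNSRCID_RE.search(line)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(clnsrcid("X=1;CLNSRCID=103320.0001;Y=2"))  # 103320.0001
```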
Hi,
I'm looking to split the following hex string into rows of four elements.
I've tried the following but it doesn't seem to work. How can I tell sed to match based on a pair of number(s) and letter(s), and add a newline every 4 pairs?
In addition, I need to add another newline after every... (5 Replies)
Discussion started by: sand1234
unicode
unicode(3tcl) Unicode normalization unicode(3tcl)
NAME
unicode - Implementation of Unicode normalization
SYNOPSIS
package require Tcl 8.3
package require unicode 1.0
::unicode::fromstring string
::unicode::tostring uclist
::unicode::normalize form uclist
::unicode::normalizeS form string
DESCRIPTION
This is an implementation in Tcl of the Unicode normalization forms.
COMMANDS
::unicode::fromstring string
Converts string to list of integer Unicode character codes which is used in unicode for internal string representation.
::unicode::tostring uclist
Converts list of integers uclist back to Tcl string.
::unicode::normalize form uclist
Normalizes the Unicode character list uclist according to form and returns the normalized list. form takes one of the following
values: D (canonical decomposition), C (canonical decomposition, followed by canonical composition), KD (compatibility
decomposition), or KC (compatibility decomposition, followed by canonical composition).
::unicode::normalizeS form string
A shortcut to ::unicode::tostring [unicode::normalize $form [::unicode::fromstring $string]]. Normalizes Tcl string and returns
normalized string.
EXAMPLES
% ::unicode::fromstring "\u0410\u0411\u0412\u0413"
1040 1041 1042 1043
% ::unicode::tostring {49 50 51 52 53}
12345
%
% ::unicode::normalize D {7692 775}
68 803 775
% ::unicode::normalizeS KD "\u1d2c"
A
%
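For comparison only, the same normalization results can be reproduced with Python's standard `unicodedata` module. This sketch is not part of the Tcl package; the helper name `normalize_codes` is an assumption, and the form names are the standard NFD/NFC/NFKD/NFKC spellings rather than the D/C/KD/KC used above.

```python
import unicodedata
from typing import List


def normalize_codes(form: str, codes: List[int]) -> List[int]:
    """Normalize a list of Unicode code points and return the resulting list.

    form is one of "NFD", "NFC", "NFKD", "NFKC".
    """
    text = "".join(chr(code) for code in codes)
    return [ord(ch) for ch in unicodedata.normalize(form, text)]


if __name__ == "__main__":
    print(normalize_codes("NFD", [7692, 775]))      # [68, 803, 775]
    print(unicodedata.normalize("NFKD", "\u1d2c"))  # A
```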
REFERENCES
[1] "Unicode Standard Annex #15: Unicode Normalization Forms", (http://unicode.org/reports/tr15/)
AUTHORS
Sergei Golovan
BUGS, IDEAS, FEEDBACK
This document, and the package it describes, will undoubtedly contain bugs and other problems. Please report such in the category
stringprep of the Tcllib SF Trackers [http://sourceforge.net/tracker/?group_id=12883]. Please also report any ideas for enhancements
you may have for either package and/or documentation.
SEE ALSO
stringprep(3tcl)
KEYWORDS
normalization, unicode
COPYRIGHT
Copyright (c) 2007, Sergei Golovan <sgolovan@nes.ru>
stringprep 1.0.0 unicode(3tcl)