ustr-import(1) [centos man page]

ustr(1) 						    Ustr String Library, tools							   ustr(1)

NAME

       ustr-import - ustr string library import tool

SYNOPSIS

       ustr-import [--32|--64] [-d][d] [-c] [-b x] [-e 1|0] [-s 1|0] section

DESCRIPTION

       This tool lets you use the Ustr string library without incurring a dependency on the library itself, so API/ABI compatibility is 100%
       (nothing changes unless you change it) and installing your application doesn't require the library to be pre-installed.

OPTIONS

       --32   If you installed with multilib, this runs the 32 bit variant (and installs the variable multilib build code as ustr-conf.h).

       --64   If you installed with multilib, this runs the 64 bit variant (and installs the variable multilib build code as ustr-conf.h).

       -d     Turn debugging on; USTR_ASSERT() now runs code, etc.

       -d     Turn extra debugging on, including End of String (EOS) markers that take up space. Note that you can do -dd to add both at once.

       -c     Use C files. This requires that you alter the build system to compile the C files and link them into your application. The default
	      is to provide only headers that you can simply include.

       -b     Specify the default reference count byte size: 0, 1, 2 or 4 (or 8 on 64 bit platforms). Note that 2 bytes is the minimum if you have
	      explicit size storage.

       -e     Specify the default exact sized allocations flag. Without this flag, allocations are rounded up to the nearest half power of two.

       -s     Specify the default explicit size storage flag. Without this flag, allocations have an implicit size based on their length; with it,
	      a size value is stored with the string (thus taking significantly more space for small strings, but not requiring the string to be
	      reallocated when it grows or shrinks). Note that turning this on also increases the minimum sizes for
	      length and reference count storage.
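
       As a rough illustration of the rounding described for -e, the sketch below assumes that "half power of two" means sizes of the form
       2^n or 2^n + 2^(n-1) (1, 2, 3, 4, 6, 8, 12, 16, ...). That reading, the use of Python, and the helper name round_up_half_pow2 are
       assumptions made for illustration; they are not taken from the Ustr sources, and real allocations also carry per-string overhead.

```python
def round_up_half_pow2(n: int) -> int:
    """Smallest size >= n of the form 2**k or 2**k + 2**(k-1) (assumed scheme)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    power = 1
    while True:
        if power >= n:
            return power
        half_step = power + power // 2  # 3, 6, 12, 24, ... once power >= 2
        if power >= 2 and half_step >= n:
            return half_step
        power *= 2


if __name__ == "__main__":
    # Show how a few requested sizes would be rounded under this assumption.
    for size in (1, 5, 9, 13, 100):
        print(size, "->", round_up_half_pow2(size))
```

       With exact sized allocations turned on (-e 1), no such rounding would apply and the requested size would be used as given.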

SECTIONS

       all    All of the following sections are included.

       b      Working with binary numbers in NBO format.

       cmp    Comparing, strcmp() for Ustr's, although the Ustr versions are safer and much faster.

       cntl   Control options dynamically.

       fmt    Formatted output, sprintf() for Ustr's.

       gdb    Copy just the .gdbinit file to the local dir.

       io     Input Output.

       ins    Inserting data.

       main   The core functions, including strcat(), strdup() and delete for Ustr's. Always safer and often much faster.

       parse  Parsing integers, i.e. nice versions of strtol().

       pool   A bundled memory pool API, to use with the ustrp functions.

       replace
	      Replacing all occurrences of data.

       sc     Shortcut functions for Ustr's.

       set    Setting data, strcpy() for Ustr's.

       split  Split the data, strtok() / strsep() for Ustr's.

       spn    Spanning, strspn() / strcspn() for Ustr's.

       srch   Searching, strchr() / strrchr() / strstr() for Ustr's, although the Ustr versions are safer and much faster.

       sub    Substituting data.

       utf8   Working with UTF8.

FILES

       /ustr/include/ustr-conf.h /ustr/include/ustr-conf-debug.h
	      In multilib, this is the header that chooses the correct conf.h header based on the byte size.

       /ustr/include/ustr*.h
	      The default "extern" header files.

       /usr/share/ustr-*/ustr-*-internal.h
	      Internal functions, used to implement the public interfaces.

       /usr/share/ustr-*/ustr-*-code.h
	      The code behind the public interfaces.

       /usr/share/ustr-*/ustr-*-code.c
	      The C files, which use the code header files to create objects.

       /usr/share/ustr-*/.gdbinit
	      The GDB init file containing macros to help inspect Ustr's in the debugger.

SEE ALSO

       ustr(3), ustr_const(3)

ustr-import 1.0.4						    03-Aug-2007 							   ustr(1)

Check Out this Related Man Page

String(3)						User Contributed Perl Documentation						 String(3)

NAME

       Unicode::String - String of Unicode characters (UTF-16BE)

SYNOPSIS

	use Unicode::String qw(utf8 latin1 utf16be);

	$u = utf8("string");
	$u = latin1("string");
	$u = utf16be("string");

	print $u->utf32be;   # 4 byte characters
	print $u->utf16le;   # 2 byte characters + surrogates
	print $u->utf8;      # 1-4 byte characters

DESCRIPTION

       A "Unicode::String" object represents a sequence of Unicode characters.	Methods are provided to convert between various external formats
       (encodings) and "Unicode::String" objects, and methods are provided for common string manipulations.

       The functions utf32be(), utf32le(), utf16be(), utf16le(), utf8(), utf7(), latin1(), uhex(), uchr() can be imported from the
       "Unicode::String" module and will work as constructors initializing strings of the corresponding encoding.

       The "Unicode::String" objects overload various operators, which means that they in most cases can be treated like plain strings.

       Internally a "Unicode::String" object is represented by a string of 2 byte numbers in network byte order (big-endian). This representation
       is not visible by the API provided, but it might be useful to know in order to predict the efficiency of the provided methods.

   METHODS
   Class methods
       The following class methods are available:

       Unicode::String->stringify_as
       Unicode::String->stringify_as( $enc )
	   This method is used to specify which encoding will be used when "Unicode::String" objects are implicitly converted to and from plain
	   strings.

	   If an argument is provided it sets the current encoding.  The argument should be one of the following: "ucs4", "utf32", "utf32be",
	   "utf32le", "ucs2", "utf16", "utf16be", "utf16le", "utf8", "utf7", "latin1" or "hex".  The default is "utf8".

	   The stringify_as() method returns a reference to the current encoding function.

       $us = Unicode::String->new
       $us = Unicode::String->new( $initial_value )
	   This is the object constructor.  Without argument, it creates an empty "Unicode::String" object.  If an $initial_value argument is
	   given, it is decoded according to the specified stringify_as() encoding, UTF-8 by default.

	   In general it is recommended to import and use one of the encoding specific constructor functions instead of invoking this method.

   Encoding methods
       These methods get or set the value of the "Unicode::String" object by passing strings in the corresponding encoding.  If a new value is
       passed as argument it will set the value of the "Unicode::String", and the previous value is returned.  If no argument is passed then the
       current value is returned.

       To illustrate the encodings we show how the 2 character sample string of "µm" (micrometer) is encoded for each one.

       $us->utf32be
       $us->utf32be( $newval )
	   The string passed should be in the UTF-32 encoding with bytes in big endian order.  The sample "µm" is "\0\0\0\xB5\0\0\0m" in this
	   encoding.

	   Alternative names for this method are utf32() and ucs4().

       $us->utf32le
       $us->utf32le( $newval )
	   The string passed should be in the UTF-32 encoding with bytes in little endian order.  The sample "µm" is "\xB5\0\0\0m\0\0\0" in
	   this encoding.

       $us->utf16be
       $us->utf16be( $newval )
	   The string passed should be in the UTF-16 encoding with bytes in big endian order. The sample "µm" is "\0\xB5\0m" in this encoding.

	   Alternative names for this method are utf16() and ucs2().

	   If the string passed to utf16be() starts with the Unicode byte order mark in little endian order, the result is as if utf16le() was
	   called instead.

       $us->utf16le
       $us->utf16le( $newval )
	   The string passed should be in the UTF-16 encoding with bytes in little endian order.  The sample "µm" is "\xB5\0m\0" in this
	   encoding.  This is the encoding used by the Microsoft Windows API.

	   If the string passed to utf16le() starts with the Unicode byte order mark in big endian order, the result is as if utf16be() was called
	   instead.

       $us->utf8
       $us->utf8( $newval )
	   The string passed should be in the UTF-8 encoding. The sample "µm" is "\xC2\xB5m" in this encoding.

       $us->utf7
       $us->utf7( $newval )
	   The string passed should be in the UTF-7 encoding. The sample "µm" is "+ALU-m" in this encoding.

	   The UTF-7 encoding only uses plain US-ASCII characters for the encoding.  This makes it safe for transport through 8-bit stripping
	   protocols.  Characters outside the US-ASCII range are base64-encoded and '+' is used as an escape character.  The UTF-7 encoding is
	   described in RFC 1642.

	   If the (global) variable $Unicode::String::UTF7_OPTIONAL_DIRECT_CHARS is TRUE, then a wider range of characters are encoded as
	   themselves.	It is even TRUE by default.  The characters affected by this are:

	      ! " # $ % & * ; < = > @ [ ] ^ _ ` { | }

       $us->latin1
       $us->latin1( $newval )
	   The string passed should be in the ISO-8859-1 encoding. The sample "µm" is "\xB5m" in this encoding.

	   Characters outside the "\x00" .. "\xFF" range are simply removed from the return value of the latin1() method.  If you want more
	   control over the mapping from Unicode to ISO-8859-1, use the "Unicode::Map8" class.	This is also the way to deal with other 8-bit
	   character sets.

       $us->hex
       $us->hex( $newval )
	   The string passed should be plain ASCII where each Unicode character is represented by the "U+XXXX" string and separated by a single
	   space character.  The "U+" prefix is optional when setting the value.  The sample "µm" is "U+00b5 U+006d" in this encoding.
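
       The sample encodings listed above can be cross-checked outside of Perl. The following is a minimal sketch that uses Python's
       standard codecs rather than "Unicode::String" itself; the names SAMPLE and encodings_of are made up for this illustration, and
       the "hex" entry only mimics the "U+XXXX" format described above for characters up to U+FFFF.

```python
SAMPLE = "\u00b5m"  # MICRO SIGN followed by "m"


def encodings_of(text: str) -> dict:
    """Encode text in each format discussed above, using Python's codecs."""
    return {
        "utf32be": text.encode("utf-32-be"),
        "utf32le": text.encode("utf-32-le"),
        "utf16be": text.encode("utf-16-be"),
        "utf16le": text.encode("utf-16-le"),
        "utf8": text.encode("utf-8"),
        "utf7": text.encode("utf-7"),
        # errors="ignore" drops characters above U+00FF, which matches the
        # "simply removed" behaviour described for latin1().
        "latin1": text.encode("latin-1", errors="ignore"),
        "hex": " ".join(f"U+{ord(ch):04x}" for ch in text),
    }


if __name__ == "__main__":
    for name, value in encodings_of(SAMPLE).items():
        print(f"{name:8} {value!r}")
```

       Printing the byte strings with repr() makes the zero bytes of the UTF-16 and UTF-32 forms visible.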

   String Operations
       The following methods are available:

       $us->as_string
	   Converts a "Unicode::String" to a plain string according to the setting of stringify_as().  The default stringify_as() encoding is
	   "utf8".

       $us->as_num
	   Converts a "Unicode::String" to a number.  Currently only the digits in the range 0x30 .. 0x39 are recognized.  The plan is to
	   eventually support all Unicode digit characters.

       $us->as_bool
	   Converts a "Unicode::String" to a boolean value.  Only the empty string is FALSE.  A string consisting of only the character U+0030 is
	   considered TRUE, even if Perl considers "0" to be FALSE.

       $us->repeat( $count )
	   Returns a new "Unicode::String" where the content of $us is repeated $count times.  This operation is also overloaded as:

	     $us x $count

       $us->concat( $other_string )
	   Concatenates the string $us and the string $other_string.  If $other_string is not a "Unicode::String" object, then it is first passed
	   to the Unicode::String->new constructor function.  This operation is also overloaded as:

	     $us . $other_string

       $us->append( $other_string )
	   Appends the string $other_string to the value of $us.  If $other_string is not a "Unicode::String" object, then it is first passed to
	   the Unicode::String->new constructor function.  This operation is also overloaded as:

	     $us .= $other_string

       $us->copy
	   Returns a copy of the current "Unicode::String" object.  This operation is overloaded as the assignment operator.

       $us->length
	   Returns the length of the "Unicode::String".  Surrogate pairs are still counted as 2.

       $us->byteswap
	   This method will swap the bytes in the internal representation of the "Unicode::String" object.

	   Unicode reserves the character U+FEFF as a byte order mark.  This works because the swapped character, U+FFFE, is reserved to
	   not be valid.  For strings that have the byte order mark as the first character, we can guarantee to get the byte order right with the
	   following code:

	      $ustr->byteswap if $ustr->ord == 0xFFFE;

       $us->unpack
	   Returns a list of integers each representing a UCS-2 character code.

       $us->pack( @uchr )
	   Sets the value of $us as a sequence of UCS-2 characters with the character codes given as parameters.

       $us->ord
	   Returns the character code of the first character in $us.  The ord() method deals with surrogate pairs, which gives us a result-range
	   of 0x0 .. 0x10FFFF.	If the $us string is empty, undef is returned.

       $us->chr( $code )
	   Sets the value of $us to be a string containing the character assigned code $code.  The argument $code must be an integer in the range
	   0x0 .. 0x10FFFF.  If the code is greater than 0xFFFF then a surrogate pair is created.

       $us->name
	   In scalar context returns the official Unicode name of the first character in $us.  In array context returns the name of all characters
	   in $us.  Also see Unicode::CharName.

       $us->substr( $offset )
       $us->substr( $offset, $length )
       $us->substr( $offset, $length, $subst )
	   Returns a sub-string of $us.  Works similarly to the built-in substr() function.

       $us->index( $other )
       $us->index( $other, $pos )
	   Locates the position of $other within $us, possibly starting the search at position $pos.

       $us->chop
	   Chops off the last character of $us and returns it (as a "Unicode::String" object).
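
       The length(), ord() and chr() descriptions above all depend on how UTF-16 surrogate pairs work: a character above U+FFFF takes
       two 16-bit units, so it counts as 2 in a length, while ord() combines the pair into a single code. The sketch below shows that
       arithmetic in Python. It is not the module's implementation, and the helper names to_units and first_code_point are assumptions
       made for illustration.

```python
def to_units(code: int) -> list:
    """Split a code point into one or two UTF-16 code units."""
    if not 0 <= code <= 0x10FFFF:
        raise ValueError("code point out of range")
    if code > 0xFFFF:
        offset = code - 0x10000
        return [0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)]
    return [code]


def first_code_point(units: list):
    """Return the first code point, combining a leading surrogate pair."""
    if not units:
        return None  # mirrors the "undef for an empty string" rule
    high = units[0]
    if (0xD800 <= high <= 0xDBFF and len(units) > 1
            and 0xDC00 <= units[1] <= 0xDFFF):
        return 0x10000 + ((high - 0xD800) << 10) + (units[1] - 0xDC00)
    return high


if __name__ == "__main__":
    units = to_units(0x1F600)
    print([hex(u) for u in units], len(units), hex(first_code_point(units)))
```

       A string holding only U+1F600 therefore has two code units, yet its first character code is a single value above 0xFFFF.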

FUNCTIONS

       The following functions are provided.  None of these are exported by default.

       byteswap2( $str, ... )
	   This function will swap each pair of bytes in the strings passed as arguments.  If this function is called in void context, then it
	   will modify its arguments in-place.  Otherwise, the swapped strings are returned.

       byteswap4( $str, ... )
	   The byteswap4 function works similarly to byteswap2, but reverses the byte order within each group of 4 bytes.

       latin1( $str )
       utf7( $str )
       utf8( $str )
       utf16le( $str )
       utf16be( $str )
       utf32le( $str )
       utf32be( $str )
	   Constructor functions for the various Unicode encodings.  These return new "Unicode::String" objects.  The provided argument should be
	   encoded correspondingly.

       uhex( $str )
	   Constructs a new "Unicode::String" object from a string of hex values.  See hex() method above for description of the format.

       uchr( $num )
	   Constructs a new one character "Unicode::String" object from a Unicode character code.  This works similarly to Perl's built-in chr()
	   function.
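
       The effect of byteswap2() and byteswap4() on raw bytes can be pictured with the Python sketch below. It is an illustration
       only: the function name byteswap and its width parameter are hypothetical, and raising an error for input whose length is
       not a multiple of the width is a choice made here, not documented behaviour of the Perl functions.

```python
def byteswap(data: bytes, width: int) -> bytes:
    """Reverse the byte order inside each width-byte group of data."""
    if len(data) % width:
        raise ValueError("length must be a multiple of the group width")
    groups = (data[i:i + width] for i in range(0, len(data), width))
    return b"".join(group[::-1] for group in groups)


if __name__ == "__main__":
    utf16be = "\u00b5m".encode("utf-16-be")
    utf32be = "\u00b5m".encode("utf-32-be")
    # Swapping 2-byte groups turns UTF-16BE into UTF-16LE, and swapping
    # 4-byte groups does the same for UTF-32.
    print(byteswap(utf16be, 2) == "\u00b5m".encode("utf-16-le"))
    print(byteswap(utf32be, 4) == "\u00b5m".encode("utf-32-le"))
```

       Applying the same swap twice restores the original bytes.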

SEE ALSO

       Unicode::CharName, Unicode::Map8

       <http://www.unicode.org/>

       perlunicode

COPYRIGHT

       Copyright 1997-2000,2005 Gisle Aas.

       This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

perl v5.12.1							    2005-10-26								 String(3)
