Full Discussion: Character Sets
Shell Programming and Scripting: Character Sets, post 302092456 by Andrek on Tuesday 10th of October 2006, 03:43:03 AM
I don't know how helpful this is, but the "locale" command will list your current locale environment and show which character set is in place.
Perl also has a documentation page called perllocale (see "perldoc perllocale"), which may help.
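
As a rough sketch of the suggestion above, the snippet below prints the full locale environment and then only the character set in use. `locale charmap` is the standard keyword query for the code set; the function name `current_charset` is just an illustrative helper and is not part of the original post.

```shell
#!/bin/sh
# Sketch only: current_charset is a hypothetical helper name.

# Print the code set (character map) of the locale currently in effect.
current_charset() {
    locale charmap
}

# Show every locale category first, then only the character set.
locale
current_charset
```

The name that gets printed depends on the system and on the LANG and LC_* variables, so check it in the same session (for example the same terminal) where the problem shows up.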
 

iconv_unicode(5)					Standards, Environments, and Macros					  iconv_unicode(5)

NAME
     iconv_unicode - code set conversion tables for Unicode

DESCRIPTION
     The following code set conversions are supported:

     CODE SET CONVERSIONS SUPPORTED

     FROM Code Set                                         TO Code Set
     Code                                 FROM Filename    Target Code                          TO Filename
                                          Element                                               Element
     ----------------------------------------------------------------------------------------------------------
     ISO 8859-1 (Latin 1)                 8859-1           UTF-8                                UTF-8
     ISO 8859-2 (Latin 2)                 8859-2           UTF-8                                UTF-8
     ISO 8859-3 (Latin 3)                 8859-3           UTF-8                                UTF-8
     ISO 8859-4 (Latin 4)                 8859-4           UTF-8                                UTF-8
     ISO 8859-5 (Cyrillic)                8859-5           UTF-8                                UTF-8
     ISO 8859-6 (Arabic)                  8859-6           UTF-8                                UTF-8
     ISO 8859-7 (Greek)                   8859-7           UTF-8                                UTF-8
     ISO 8859-8 (Hebrew)                  8859-8           UTF-8                                UTF-8
     ISO 8859-9 (Latin 5)                 8859-9           UTF-8                                UTF-8
     ISO 8859-10 (Latin 6)                8859-10          UTF-8                                UTF-8
     Japanese EUC                         eucJP            UTF-8                                UTF-8
     Chinese/PRC EUC (GB 2312-1980)       gb2312           UTF-8                                UTF-8
     ISO-2022                             iso2022          UTF-8                                UTF-8
     Korean EUC                           ko_KR-euc        Korean UTF-8                         ko_KR-UTF-8
     ISO-2022-KR                          ko_KR-iso2022-7  Korean UTF-8                         ko_KR_UTF-8
     Korean Johap (KS C 5601-1987)        ko_KR-johap      Korean UTF-8                         ko_KR-UTF-8
     Korean Johap (KS C 5601-1992)        ko_KR-johap92    Korean UTF-8                         ko_KR-UTF-8
     Korean UTF-8                         ko_KR-UTF-8      Korean EUC                           ko_KR-euc
     Korean UTF-8                         ko_KR-UTF-8      Korean Johap (KS C 5601-1987)        ko_KR-johap
     Korean UTF-8                         ko_KR-UTF-8      Korean Johap (KS C 5601-1992)        ko_KR-johap92
     KOI8-R (Cyrillic)                    KOI8-R           UCS-2                                UCS-2
     KOI8-R (Cyrillic)                    KOI8-R           UTF-8                                UTF-8
     PC Kanji (SJIS)                      PCK              UTF-8                                UTF-8
     PC Kanji (SJIS)                      SJIS             UTF-8                                UTF-8
     UCS-2                                UCS-2            KOI8-R (Cyrillic)                    KOI8-R
     UCS-2                                UCS-2            UCS-4                                UCS-4
     UCS-2                                UCS-2            UTF-7                                UTF-7
     UCS-2                                UCS-2            UTF-8                                UTF-8
     UCS-4                                UCS-4            UCS-2                                UCS-2
     UCS-4                                UCS-4            UTF-16                               UTF-16
     UCS-4                                UCS-4            UTF-7                                UTF-7
     UCS-4                                UCS-4            UTF-8                                UTF-8
     UTF-16                               UTF-16           UCS-4                                UCS-4
     UTF-16                               UTF-16           UTF-8                                UTF-8
     UTF-7                                UTF-7            UCS-2                                UCS-2
     UTF-7                                UTF-7            UCS-4                                UCS-4
     UTF-7                                UTF-7            UTF-8                                UTF-8
     UTF-8                                UTF-8            ISO 8859-1 (Latin 1)                 8859-1
     UTF-8                                UTF-8            ISO 8859-2 (Latin 2)                 8859-2
     UTF-8                                UTF-8            ISO 8859-3 (Latin 3)                 8859-3
     UTF-8                                UTF-8            ISO 8859-4 (Latin 4)                 8859-4
     UTF-8                                UTF-8            ISO 8859-5 (Cyrillic)                8859-5
     UTF-8                                UTF-8            ISO 8859-6 (Arabic)                  8859-6
     UTF-8                                UTF-8            ISO 8859-7 (Greek)                   8859-7
     UTF-8                                UTF-8            ISO 8859-8 (Hebrew)                  8859-8
     UTF-8                                UTF-8            ISO 8859-9 (Latin 5)                 8859-9
     UTF-8                                UTF-8            ISO 8859-10 (Latin 6)                8859-10
     UTF-8                                UTF-8            Japanese EUC                         eucJP
     UTF-8                                UTF-8            Chinese/PRC EUC (GB 2312-1980)       gb2312
     UTF-8                                UTF-8            ISO-2022                             iso2022
     UTF-8                                UTF-8            KOI8-R (Cyrillic)                    KOI8-R
     UTF-8                                UTF-8            PC Kanji (SJIS)                      PCK
     UTF-8                                UTF-8            PC Kanji (SJIS)                      SJIS
     UTF-8                                UTF-8            UCS-2                                UCS-2
     UTF-8                                UTF-8            UCS-4                                UCS-4
     UTF-8                                UTF-8            UTF-16                               UTF-16
     UTF-8                                UTF-8            UTF-7                                UTF-7
     UTF-8                                UTF-8            Chinese/PRC EUC (GB 2312-1980)       zh_CN.euc
     UTF-8                                UTF-8            ISO 2022-CN                          zh_CN.iso2022-7
     UTF-8                                UTF-8            Chinese/Taiwan Big5                  zh_TW-big5
     UTF-8                                UTF-8            Chinese/Taiwan EUC (CNS 11643-1992)  zh_TW-euc
     UTF-8                                UTF-8            ISO 2022-TW                          zh_TW-iso2022-7
     Chinese/PRC EUC (GB 2312-1980)       zh_CN.euc        UTF-8                                UTF-8
     ISO 2022-CN                          zh_CN.iso2022-7  UTF-8                                UTF-8
     Chinese/Taiwan Big5                  zh_TW-big5       UTF-8                                UTF-8
     Chinese/Taiwan EUC (CNS 11643-1992)  zh_TW-euc        UTF-8                                UTF-8
     ISO 2022-TW                          zh_TW-iso2022-7  UTF-8                                UTF-8

EXAMPLES
     Example 1: The library module filename

     In the conversion library, /usr/lib/iconv (see iconv(3C)), the library module filename is composed of two symbolic elements separated by the percent sign (%). The first symbol specifies the code set that is being converted; the second symbol specifies the target code, that is, the code set to which the first one is being converted.

     In the conversion table above, the first symbol is termed the "FROM Filename Element". The second symbol, representing the target code set, is the "TO Filename Element".

     For example, the library module filename to convert from the Korean EUC code set to the Korean UTF-8 code set is

          ko_KR-euc%ko_KR-UTF-8

FILES
     /usr/lib/iconv/*.so
          conversion modules

SEE ALSO
     iconv(1), iconv(3C), iconv(5)

     Chernov, A., Registration of a Cyrillic Character Set, RFC 1489, RELCOM Development Team, July 1993.

     Chon, K., H. Je Park, and U. Choi, Korean Character Encoding for Internet Messages, RFC 1557, Solvit Chosun Media, December 1993.

     Goldsmith, D., and M. Davis, UTF-7 - A Mail-Safe Transformation Format of Unicode, RFC 1642, Taligent, Inc., July 1994.

     Lee, F., HZ - A Data Format for Exchanging Files of Arbitrarily Mixed Chinese and ASCII characters, RFC 1843, Stanford University, August 1995.

     Murai, J., M. Crispin, and E. van der Poel, Japanese Character Encoding for Internet Messages, RFC 1468, Keio University, Panda Programming, June 1993.

     Nussbacher, H., and Y. Bourvine, Hebrew Character Encoding for Internet Messages, RFC 1555, Israeli Inter-University, Hebrew University, December 1993.

     Ohta, M., Character Sets ISO-10646 and ISO-10646-J-1, RFC 1815, Tokyo Institute of Technology, July 1995.

     Ohta, M., and K. Handa, ISO-2022-JP-2: Multilingual Extension of ISO-2022-JP, RFC 1554, Tokyo Institute of Technology, December 1993.

     Reynolds, J., and J. Postel, ASSIGNED NUMBERS, RFC 1700, University of Southern California/Information Sciences Institute, October 1994.

     Simonson, K., Character Mnemonics & Character Sets, RFC 1345, Rationel Almen Planlaegning, June 1992.

     Spinellis, D., Greek Character Encoding for Electronic Mail Messages, RFC 1947, SENA S.A., May 1996.

     The Unicode Consortium, The Unicode Standard, Version 2.0, Addison Wesley Developers Press, July 1996.

     Wei, Y., Y. Zhang, J. Li, J. Ding, and Y. Jiang, ASCII Printable Characters-Based Chinese Character Encoding for Internet Messages, RFC 1842, AsiaInfo Services Inc., Harvard University, Rice University, University of Maryland, August 1995.

     Yergeau, F., UTF-8, a transformation format of Unicode and ISO 10646, RFC 2044, Alis Technologies, October 1996.

     Zhu, H., D. Hu, Z. Wang, T. Kao, W. Chang, and M. Crispin, Chinese Character Encoding for Internet Messages, RFC 1922, Tsinghua University, China Information Technology Standardization Technical Committee (CITS), Institute for Information Industry (III), University of Washington, March 1996.

NOTES
     ISO 8859 character sets using Latin alphabetic characters are distinguished as follows:

     ISO 8859-1 (Latin 1)
          For most West European languages, including:

          Albanian     Finnish      Italian
          Catalan      French       Norwegian
          Danish       German       Portuguese
          Dutch        Galician     Spanish
          English      Irish        Swedish
          Faeroese     Icelandic

     ISO 8859-2 (Latin 2)
          For most Latin-written Slavic and Central European languages:

          Czech        Polish       Slovak
          German       Rumanian     Slovene
          Hungarian    Croatian

     ISO 8859-3 (Latin 3)
          Popularly used for Esperanto, Galician, Maltese, and Turkish.

     ISO 8859-4 (Latin 4)
          Introduces letters for Estonian, Latvian, and Lithuanian. It is an incomplete predecessor of ISO 8859-10 (Latin 6).

     ISO 8859-9 (Latin 5)
          Replaces the rarely needed Icelandic letters in ISO 8859-1 (Latin 1) with the Turkish ones.

     ISO 8859-10 (Latin 6)
          Adds the last Inuit (Greenlandic) and Sami (Lappish) letters that were not included in ISO 8859-4 (Latin 4) to complete coverage of the Nordic area.

SunOS 5.10                          18 Apr 1997                          iconv_unicode(5)
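
The module naming rule from the EXAMPLES section of the man page, together with an actual conversion, might be sketched in shell as follows. The helper names `iconv_module_name` and `latin1_to_utf8` are hypothetical. The code set names accepted by iconv(1) also differ between systems: the table above lists 8859-1 for Solaris, while GNU iconv spells it ISO-8859-1. Treat the default used here as an assumption to adjust for your platform.

```shell
#!/bin/sh
# Sketch only: both function names are hypothetical helpers.

# Build a conversion module filename from the FROM and TO filename
# elements, joined by a percent sign as the EXAMPLES section describes.
iconv_module_name() {
    printf '%s%%%s\n' "$1" "$2"
}

# Convert standard input from Latin 1 to UTF-8 with iconv(1).
# The optional first argument overrides the FROM code set name,
# because the spelling is platform dependent (an assumption here).
latin1_to_utf8() {
    iconv -f "${1:-ISO-8859-1}" -t UTF-8
}

# Prints ko_KR-euc%ko_KR-UTF-8, the example given in the man page.
iconv_module_name ko_KR-euc ko_KR-UTF-8

# \351 is the single Latin 1 byte for e-acute; it becomes two bytes in UTF-8.
printf 'caf\351\n' | latin1_to_utf8
```

On a Solaris system matching this man page, the first call would presumably be `latin1_to_utf8 8859-1`, and the module consulted would live under /usr/lib/iconv with the ".so" suffix shown in FILES.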
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.