Issue with Keyboard or Char Encoding During Migration
Post 303046235 by hicksd8 on Tuesday 28th of April 2020 07:35:53 AM
Hi All,

As Neo says, I have been spending a bit of time on this migration integrity issue.

The irritating "Thingy" (white diamond with question mark in the middle) is officially the Unicode symbol called "Replacement character". A decoder inserts it as a placeholder for a byte sequence that it doesn't understand. IMHO, the issue here is simply that the migration script (or whatever process) SHOULD understand all the characters on our old site. Yes, we already have "Replacement characters" on the old site, which probably emanated from a long-ago upgrade from ASCII to Unicode, or from Unicode version x to Unicode version y. As Neo says, replacement character symbols on our old site must be ignored because there's nothing we can do about them now apart from manually editing them out as time goes on.
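For anyone who wants to see how one of these gets created, here is a minimal Python sketch. It assumes (my guess, not something confirmed for our old database) that the text was once stored in a single-byte encoding such as Latin-1 and later read as UTF-8. The helper name decode_lenient is made up for illustration.

```python
def decode_lenient(raw: bytes) -> str:
    """Decode bytes as UTF-8, substituting U+FFFD for anything invalid.

    Hypothetical helper, for illustration only.
    """
    return raw.decode("utf-8", errors="replace")


# Assumption: the text was originally saved in Latin-1, where "é" is the
# single byte 0xE9. On its own that byte is not valid UTF-8.
legacy_bytes = "café".encode("latin-1")

decoded = decode_lenient(legacy_bytes)
print(repr(decoded))              # the last character is the replacement character
print(decoded[-1] == "\ufffd")
```

Once the replacement character has been written back to storage, the original byte is gone, which is why the ones already on the old site can only be fixed by hand.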

However, I believe that the currently used (Discourse-provided??) process is stuffed because it doesn't understand some of the perfectly correct text on our old site. It even screws up a thread title on the old site containing the replacement character symbol. Look at this:

Post migration
How to grep i?1/2 symbol? - Shell Programming and Scripting - UNIX.COM Community

Pre migration
How to grep � symbol?

So the process doesn't even understand its own Unicode character set!!!!
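One plausible explanation for that title, and it is only an assumption on my part, is that the UTF-8 bytes of the replacement character were re-read as a single-byte encoding such as Latin-1. That yields three separate characters (ï, ¿, ½), which a later ASCII transliteration step could flatten to "i?1/2". A short Python sketch of the first step, with a made-up helper name:

```python
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER


def misread_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then decode the same bytes as Latin-1.

    Hypothetical helper: it only simulates the suspected mistake, it is
    not taken from the migration script.
    """
    return text.encode("utf-8").decode("latin-1")


title = f"How to grep {REPLACEMENT} symbol?"

# U+FFFD occupies three bytes in UTF-8.
print(REPLACEMENT.encode("utf-8"))   # b'\xef\xbf\xbd'

# Each of those bytes becomes its own character when read as Latin-1.
print(misread_as_latin1(title))      # How to grep ï¿½ symbol?
```

If that is what is happening, the old title itself is valid Unicode and the damage is done by whatever step reads the bytes with the wrong encoding.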

So FWIW, I've come to the conclusion that trying to modify our old DB is futile, as the process will probably find something else to screw up.

Indeed, if you follow the first link I posted on this thread further back, others are having the same issue.

That's my update thus far. I'll report back again as my investigation continues.

EDIT: Replacement character symbol is U+FFFD
 
