Thanks for telling me that <space> is being treated as a non-printable character. What I actually want, though, is for this search to find all non-UTF-8 characters, and I have no idea how to check for those.
The application's XML file accepts only UTF-8 characters; anything else prevents the jobs from running through the application. Is there any way to check that a file contains only UTF-8 characters? Can you please suggest one?
For example:
The highlighted character shown in the file is what I have in my application; when viewed in Unix it appears as ^Y. How can I identify such characters?
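One possible approach, offered as a sketch rather than a definitive answer: a round trip through iconv can flag byte sequences that are not valid UTF-8, and a byte-level grep can list stray control characters. Note that ^Y is byte 0x19, which is valid UTF-8 but is not a character XML 1.0 allows (of the control characters below 0x20, only tab, LF and CR are permitted), so a pure UTF-8 check would not catch it. The function names are hypothetical, the iconv behaviour assumed is that of an implementation such as glibc's that rejects invalid input, and -a assumes GNU grep.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the file decodes cleanly as UTF-8.
# Assumes an iconv that exits non-zero on an invalid input sequence.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# Hypothetical helper: print "line-number:content" for every line that holds
# a control byte other than tab (011), LF (012) or CR (015).
# ^Y is octal 031. LC_ALL=C makes grep compare raw bytes, and -a (GNU grep)
# forces the input to be treated as text.
find_control_chars() {
    pattern=$(printf '[\001-\010\013\014\016-\037]')
    LC_ALL=C grep -a -n "$pattern" "$1"
}
```

Running is_valid_utf8 first and find_control_chars second on each incoming file would separate the two kinds of problem: broken encoding versus valid but unwanted control bytes.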
I need to check incoming ftp'd files for characters that are not alphanumeric, <tab>, <cr>, or <lf>. Each file would have 10-20,000 lines with up to 3,000 characters per line. Should I use awk, sed, or grep, and what would the command look like to do such a search? Thanks much to anyone... (2 Replies)
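A minimal sketch of the grep route, taking the question literally: any byte that is not an ASCII letter, a digit, a tab or a CR is reported. LF never appears inside a line because grep treats it as the line terminator. The function name is hypothetical and -a assumes GNU grep.

```shell
#!/bin/sh
# Hypothetical helper: print "line-number:content" for each line containing
# a byte other than an ASCII letter, a digit, a tab or a carriage return.
# printf expands \t and \r into the real bytes inside the bracket expression.
find_unexpected_chars() {
    pattern=$(printf '[^[:alnum:]\t\r]')
    LC_ALL=C grep -a -n "$pattern" "$1"
}
```

Taken literally, this also flags spaces and punctuation; if those are acceptable, they would need to be added inside the brackets.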
How can I check whether a string variable contains at least (or only) 2 characters, and make sure that the string does not contain any numeric digits? I need to do this as simply as possible, and I am using the ksh shell. Thanks. (1 Reply)
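One simple way to sketch this is a case statement with shell patterns, which should work in ksh as well as other POSIX shells. The function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if $1 has at least two characters
# and contains no decimal digit.
is_valid_string() {
    case $1 in
        *[0-9]*) return 1 ;;   # contains a digit
        ??*)     return 0 ;;   # two or more characters
        *)       return 1 ;;   # empty or a single character
    esac
}
```

For the "only 2 characters" variant, the pattern ??* would become ?? so that longer strings fall through to the final branch.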
Sometimes obvious things... are not so obvious. I always thought it was possible to grep for non-printable characters, but apparently not with my version of GNU grep (5.2.1).
printf "Hello\tWorld" | grep -l '\t'
printf "Hello\tWorld" | grep -l '\x09'
printf "Hello\tWorld" | grep -l '\x{09}'
None of them... (3 Replies)
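A likely explanation, offered as a sketch: grep's basic and extended regular expressions do not define \t or \x09 as escapes for a tab, so none of the patterns above contains a real tab character. Building a literal tab with printf and passing that to grep sidesteps the problem. has_tab is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if standard input contains a tab character.
# printf turns \t into a real tab byte, which grep then matches literally.
has_tab() {
    tab=$(printf '\t')
    grep -q "$tab"
}
```

For example, printf "Hello\tWorld" | has_tab succeeds. With a GNU grep built with PCRE support, grep -P '\t' is another option.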
I have been using OKI Data Microline printers, models 590 and 591, to print a bar code using the following escape sequence:
\E^PA^H^C00^D^C^A^A^A\E^PB^H
The escape sequence is stored in a unix file which is edited using vi.
Now, we are considering Microline printer model 395C and the bar code... (3 Replies)
cp $l_options $srcdirfile $destdirfile
If I want to check whether there is a non-printable character in the variables
$l_options $srcdirfile $destdirfile
how can it be done? (2 Replies)
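A minimal sketch using a shell pattern with the [:print:] character class; the function name is hypothetical, and what counts as printable depends on the current locale.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if $1 contains at least one character
# that is not in the [:print:] class (tab and newline count as
# non-printable here; space does not).
has_nonprintable() {
    case $1 in
        *[![:print:]]*) return 0 ;;
        *)              return 1 ;;
    esac
}
```

Each variable would be tested on its own before the cp, for example has_nonprintable "$srcdirfile" && echo "non-printable character in source name".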
Hi,
We have a non-printable character "®" in our file and want to remove it. We tried tr -dc '' < oldfile > newfile, but this command removes all the newlines along with the non-printable character, so all the records end up on one line (it is changing the format of the... (2 Replies)
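With -dc, tr deletes every byte that is NOT in the given set, so the set has to list everything worth keeping, newline included. A minimal sketch under that reading, with a hypothetical function name:

```shell
#!/bin/sh
# Hypothetical helper: copy stdin to stdout, keeping only newlines and
# printable ASCII. In the C locale [:print:] is space through tilde, so
# "®" is dropped whether it arrives as one byte (Latin-1) or two (UTF-8).
strip_nonascii() {
    LC_ALL=C tr -cd '\n[:print:]'
}
```

Usage would look like strip_nonascii < oldfile > newfile. Note that tabs are removed as well unless \t is added to the set.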
Hi,
In a file, I have records as below:
123|62|absnb|267629
123|267|28728|uiuip
123|567|26761|2676
I want to remove the non-printable characters at the end of each record.
I guess there are certain characters there, but they are not visible.
I don't know exactly what characters they are.
I used... (2 Replies)
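Invisible characters at the end of a record are often carriage returns from DOS-style line endings, though that is only a guess here; od -c on a few lines would show what they really are. A minimal sketch that strips any run of non-printable bytes at the end of each line, with a hypothetical function name:

```shell
#!/bin/sh
# Hypothetical helper: remove trailing non-printable bytes from every line.
# LC_ALL=C makes [:print:] mean the ASCII range space through tilde.
strip_trailing_nonprint() {
    LC_ALL=C sed 's/[^[:print:]]*$//'
}
```

Non-printable bytes in the middle of a record are left alone, since the pattern is anchored to the end of the line.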
Hi,
I have a huge file (50 million rows) which has certain non-printable ASCII characters in it. I am cleaning the file by deleting those characters using the following command:
tr -cd '\11\12\15\40-\176' < unclean_file > clean_file
Please note that I am excluding the following -
tab,... (6 Replies)
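In the tr command above, the octal codes \11, \12 and \15 are tab, LF and CR, and \40-\176 is the range from space to tilde; with -cd everything outside that set is deleted. Before rewriting a file of this size it may be worth counting how many bytes would be removed. A minimal sketch, with a hypothetical function name:

```shell
#!/bin/sh
# Hypothetical helper: print how many bytes the cleaning command would
# delete. tr -d (without -c) removes the bytes that would be kept, so
# whatever is left over is exactly what the cleaning step would drop.
count_removed_bytes() {
    LC_ALL=C tr -d '\11\12\15\40-\176' < "$1" | wc -c | tr -d ' '
}
```

A result of 0 means the file is already clean and the rewrite can be skipped.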
Discussion started by: rishigc
caca_cp437_to_utf32
libcaca character set conversions(3caca) libcaca libcaca character set conversions(3caca)
NAME
libcaca character set conversions -
Functions
__extern uint32_t caca_utf8_to_utf32 (char const *, size_t *)
Convert a UTF-8 character to UTF-32.
__extern size_t caca_utf32_to_utf8 (char *, uint32_t)
Convert a UTF-32 character to UTF-8.
__extern uint8_t caca_utf32_to_cp437 (uint32_t)
Convert a UTF-32 character to CP437.
__extern uint32_t caca_cp437_to_utf32 (uint8_t)
Convert a CP437 character to UTF-32.
__extern char caca_utf32_to_ascii (uint32_t)
Convert a UTF-32 character to ASCII.
__extern int caca_utf32_is_fullwidth (uint32_t)
Tell whether a UTF-32 character is fullwidth.
Detailed Description
These functions perform conversions between usual character sets.
Function Documentation
__extern uint32_t caca_utf8_to_utf32 (char const *s, size_t *bytes)
Convert a UTF-8 character read from a string and return its value in the UTF-32 character set. If the second argument is not null, the total number of read bytes is written in it.
If a null byte was reached before the expected end of the UTF-8 sequence, this function returns zero and the number of read bytes is set to
zero.
This function never fails, but its behaviour with illegal UTF-8 sequences is undefined.
Parameters:
s A string containing the UTF-8 character.
bytes A pointer to a size_t to store the number of bytes in the character, or NULL.
Returns:
The corresponding UTF-32 character, or zero if the character is incomplete.
Referenced by caca_put_str().
__extern size_t caca_utf32_to_utf8 (char *buf, uint32_t ch)
Convert a UTF-32 character read from a string and write its value in the UTF-8 character set into the given buffer.
This function never fails, but its behaviour with illegal UTF-32 characters is undefined.
Parameters:
buf A pointer to a character buffer where the UTF-8 sequence will be written.
ch The UTF-32 character.
Returns:
The number of bytes written.
__extern uint8_t caca_utf32_to_cp437 (uint32_t ch)
Convert a UTF-32 character read from a string and return its value in the CP437 character set, or '?' if the character has no equivalent.
This function never fails.
Parameters:
ch The UTF-32 character.
Returns:
The corresponding CP437 character, or '?' if not representable.
__extern uint32_t caca_cp437_to_utf32 (uint8_t ch)
Convert a CP437 character read from a string and return its value in the UTF-32 character set, or zero if the character is a CP437 control character.
This function never fails.
Parameters:
ch The CP437 character.
Returns:
The corresponding UTF-32 character, or zero if not representable.
__extern char caca_utf32_to_ascii (uint32_t ch)
Convert a UTF-32 character into an ASCII character. When no equivalent exists, a graphically close equivalent is sought.
This function never fails, but its behaviour with illegal UTF-32 characters is undefined.
Parameters:
ch The UTF-32 character.
Returns:
The corresponding ASCII character, or a graphically close equivalent if found, or '?' if not representable.
__extern int caca_utf32_is_fullwidth (uint32_t ch)
Check whether the given UTF-32 character should be printed at twice the normal width (fullwidth characters). If the character is unknown or if its status cannot be decided, it is treated as a standard-width character.
This function never fails.
Parameters:
ch The UTF-32 character.
Returns:
1 if the character is fullwidth, 0 otherwise.
Referenced by caca_put_char(), and caca_put_str().
Author
Generated automatically by Doxygen for libcaca from the source code.
Version 0.99.beta18 Fri Apr 6 2012 libcaca character set conversions(3caca)