How to tell SED to emit output in 8-bit ASCII only?


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting How to tell SED to emit output in 8-bit ASCII only?
# 1  
Old 03-29-2009
Question How to tell SED to emit output in 8-bit ASCII only?

I have to mangle some "plain ASCII" text file (i.e. 8 bits/characters where the text DOES contain characters like Umlauts and accented characters from the upper 7-bits range, i.e. with hex codes in [128..254]).

For this I am trying to use SED which I downloaded as part of cygwin package (yes, I am doing this one Windoze...).

Alas, SED emits the result using Unicode-16 characters (i.e. 16 bits/characters), which the program for which the output is intended can't handle. Can one tell SED to NOT emit Unicode-16 characters but force it to emit 8-bit characters (Unicode-8) only?
Login or Register to Ask a Question

Previous Thread | Next Thread

7 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

sed non ascii value remove

Hi All, i am using the below perl command to remove the non ascii value,it is working fine. we need to find the similar solution using the sed command. but i tried it is not working and getting the error. perl -pe 's/]//g' test.txt sed is not working. sed -i 's/]//g'... (11 Replies)
Discussion started by: bmk123
11 Replies

2. Windows & DOS: Issues & Discussions

Which version of Windows Vista to install with a product key? 32-bit or 64-bit?

Hello everyone. I bought a dell laptop (XPS M1330) online which came without a hard drive. There is a Windows Vista Ultimate OEMAct sticker with product key at the bottom case. I checked dell website (here) for this model and it says this model supports both 32 and 64-bit version of Windows... (4 Replies)
Discussion started by: milhan
4 Replies

3. Shell Programming and Scripting

Convert Hex to Ascii in a Ascii file

Hi All, I have an ascii file in which few columns are having hex values which i need to convert into ascii. Kindly suggest me what command can be used in unix shell scripting? Thanks in Advance (2 Replies)
Discussion started by: HemaV
2 Replies

4. Shell Programming and Scripting

Bit of sed help in XML

Hi all, need some help seeing the bug in my SED Source XML <SMART_FOLDER JOBISN="1" SUB_APPLICATION="PMT-APB" MEMNAME="Job0" JOBNAME="PMT-APB" FOLDER_NAME="PMT-APB"> </SMART_FOLDER> My SED Command sed -e 's/\(<SMART_FOLDER \)\(.*FOLDER_NAME="PMT-APB"\)/\FOLDER_ORDER_METHOD="PCI" \2/' <... (0 Replies)
Discussion started by: J-Man
0 Replies

5. Shell Programming and Scripting

How to handle 64 bit arithmetic operation at 32 bit compiled perl interpreter?H

Hi, Here is the issue. From the program snippet I have Base: 0x1800000000, Size: 0x3FFE7FFFFFFFF which are of 40 and 56 bits. SO I used use bignum to do the math but summing them up I always failed having correct result. perl interpreter info, perl, v5.8.8 built for... (0 Replies)
Discussion started by: rrd1986
0 Replies

6. Shell Programming and Scripting

convert ascii values into ascii characters

Hi gurus, I have a file in unix with ascii values. I need to convert all the ascii values in the file to ascii characters. File contains nearly 20000 records with ascii values. (10 Replies)
Discussion started by: sandeeppvk
10 Replies

7. Programming

copying or concatinating string from 1st bit, leaving 0th bit

Hello, If i have 2 strings str1 and str2, i would like to copy/concatenate str2 to str1, from 1st bit leaving the 0th bit. How do i do it? (2 Replies)
Discussion started by: jazz
2 Replies
Login or Register to Ask a Question
set_ucodepage(3alleg4)						  Allegro manual					    set_ucodepage(3alleg4)

NAME
set_ucodepage - Sets 8-bit to Unicode conversion tables. Allegro game programming library. SYNOPSIS
#include <allegro.h> void set_ucodepage(const unsigned short *table, const unsigned short *extras); DESCRIPTION
When you select the U_ASCII_CP encoding mode, a set of tables are used to convert between 8-bit characters and their Unicode equivalents. You can use this function to specify a custom set of mapping tables, which allows you to support different 8-bit codepages. The `table' parameter points to an array of 256 shorts, which contain the Unicode value for each character in your codepage. The `extras' parameter, if not NULL, points to a list of mapping pairs, which will be used when reducing Unicode data to your codepage. Each pair con- sists of a Unicode value, followed by the way it should be represented in your codepage. The list is terminated by a zero Unicode value. This allows you to create a many->one mapping, where many different Unicode characters can be represented by a single codepage value (eg. for reducing accented vowels to 7-bit ASCII). Allegro will use the `table' parameter when it needs to convert an ASCII string to an Unicode string. But when Allegro converts an Unicode string to ASCII, it will use both parameters. First, it will loop through the `table' parameter looking for an index position pointing at the Unicode value it is trying to convert (ie. the `table' parameter is also used for reverse matching). If that fails, the `extras' list is used. If that fails too, Allegro will put the character `^', giving up the conversion. Note that Allegro comes with a default `table' and `extras' parameters set internally. The default `table' will convert 8-bit characters to `^'. The default `extras' list reduces Latin-1 and Extended-A characters to 7 bits in a sensible way (eg. an accented vowel will be reduced to the same vowel without the accent). SEE ALSO
set_uformat(3alleg4) Allegro version 4.4.2 set_ucodepage(3alleg4)