How to tell SED to emit output in 8-bit ASCII only?


 
Posted 03-29-2009

I have to mangle some "plain ASCII" text files, i.e. 8 bits per character, where the text DOES contain characters such as umlauts and accented letters from the upper half of the 8-bit range (byte values in [128..254], i.e. 0x80..0xFE).

For this I am trying to use sed, which I downloaded as part of the Cygwin package (yes, I am doing this on Windows...).

Alas, sed emits the result as UTF-16 (16 bits per character), which the program the output is intended for can't handle. Can one tell sed NOT to emit UTF-16 and force it to emit 8-bit characters (one byte per character) only?
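
A hedged sketch of one way to approach this, not a confirmed diagnosis: sed normally passes bytes through rather than choosing an output encoding, so 16-bit output often means the input file was already UTF-16 or that something after sed re-encoded it. The file names and the `s/Gr/gr/` substitution below are made up for illustration, and ISO-8859-1 is only an assumption about the 8-bit encoding the target program expects. `LC_ALL=C` asks sed to treat the data as raw bytes, and `iconv` converts a UTF-16 input to 8-bit first.

```shell
# Sample 8-bit input: "Grüße" in ISO-8859-1 (0xFC and 0xDF are the high-bit bytes).
printf 'Gr\374\337e\n' > latin1.txt

# LC_ALL=C makes sed work on raw bytes, so high-bit bytes pass through unchanged.
LC_ALL=C sed 's/Gr/gr/' latin1.txt > latin1_out.txt

# Inspect the result byte by byte; one byte per character is expected.
od -An -tx1 latin1_out.txt

# If the input is really UTF-16LE (assumption), convert it to 8-bit before sed.
printf 'G\000r\000\374\000\n\000' > utf16.txt
iconv -f UTF-16LE -t ISO-8859-1 utf16.txt | LC_ALL=C sed 's/Gr/gr/' > utf16_out.txt
od -An -tx1 utf16_out.txt
```

Running `od -An -tx1` (or `file`) on the original input before anything else shows whether the 16-bit characters are already present there or only appear after the sed step.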