Processing extended ASCII character file names in UNIX (Bash scripts) Post: 302169041

10 More Discussions You Might Find Interesting

1. Programming

Extended ascii

Hi all, I would like to change the extended ASCII codes (128 - 255). I tried changing LC_ALL and LANG in the current session (using values from locale -a), but to no avail. Thanks.

2. Shell Programming and Scripting

Weird Ascii characters in file names

Hi. I have files on my OS with weird file names containing unconventional ASCII characters. I would like to run them, but I can't refer to them. I know the ASCII codes of the problematic characters. I can't rename the files since they belong to a 3rd party program... but I want to run them. is there...

3. Shell Programming and Scripting

extended ascii problem

Hi, I would like to check whether text files contain extended ASCII characters. I really don't have any idea how to start. Your kind help would be very much appreciated. Thanks.

4. Shell Programming and Scripting

read in a file character by character - replace any unknown ASCII characters with spa

Can someone help me write a script / command to read in a file character by character, replace any unknown ASCII characters with a space, and then write the result out to a new filename? Thanks!

5. AIX

Printing extended ASCII

Hi All, I'm trying to send extended ASCII characters to my HP2055 as part of PCL printer control codes. What I want to do is select a bar code font, print the bar code and reset the printer to the default font. Selecting the bar code font works well. Printing the bar code goes almost OK too. ...

6. Shell Programming and Scripting

Preserve extended ascii character when running echo command inside bash script

Hi everyone, I'm echoing some text with extended ASCII characters as below: echo -e "Pr\xE9sentation du spectacle" > output or echo -e "Pr�sentation du spectacle" > output If I open the file created I see this text: Pr�sentation du spectacle The text is shown correctly in this created file when...

7. Shell Programming and Scripting

Identify extended ascii characters in a file

Hi, Is there a way to identify the lines in a file having extended ascii characters and display the same? For instance I have a file abc.txt having below data:
aaa|bbb|111|This is first line
aaa|bbb|222|This is sec�nd line
aaa|bbb|333|This is third line
aaa|bbb|444|This is fo�rth line...

8. Shell Programming and Scripting

Removal Extended ASCII using awk

Hi All, I am trying to remove selected extended ASCII characters (passed as an argument) using awk on an ad hoc basis. Can you please let me know how to do it? I have to implement this using awk only. Thanks & Regards

9. UNIX for Beginners Questions & Answers

Convert ascii character values to number that comes between the numbers in UNIX

I have a variable that contains multiple numeric values which also include overpunch characters (i.e. # $ % etc.), and we want to replace those characters with numbers. Here is an example: Code: 11500#.0# 28575$.5$ 527#.7# 42".2" 2794 .4 2279!.9! 1067&.7& 926#.6# 2279!.9! 885".5" 11714$.4$ 27361'.1'...

10. UNIX for Beginners Questions & Answers

Print byte position of extended ascii character

Hello, I am on AIX. When I encounter extended ASCII characters and special characters in a file, I need to print the byte position, the actual character and the line number. Is there a simple command that can give me the above result? Thanks in advance
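
Several of the threads above (3, 7 and 10) ask how to find lines containing extended ASCII bytes. The following is only a minimal sketch, not taken from any of those threads. It assumes a POSIX shell and a grep that compares raw bytes in the C locale (GNU grep does), and it treats every byte in the range 128-255 as "extended". The function name find_extended_ascii and the sample data are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print "line-number:line" for every line that
# contains at least one byte in the range 128-255 (octal 200-377).
find_extended_ascii() {
    # Build the bracket expression from raw bytes with printf octal escapes.
    pattern=$(printf '[\200-\377]')
    # LC_ALL=C makes grep match byte by byte instead of decoding characters.
    LC_ALL=C grep -n "$pattern" "$1"
}

# Small demonstration file; \351 is the Latin-1 byte for e-acute.
sample=$(mktemp)
printf 'aaa|bbb|111|This is first line\n' > "$sample"
printf 'aaa|bbb|222|This is sec\351nd line\n' >> "$sample"

# Only the second line contains a high byte, so only it is reported.
find_extended_ascii "$sample"
rm -f "$sample"
```

Setting LC_ALL=C only for the grep call keeps the rest of the session's locale untouched. On systems whose grep behaves differently with high bytes (for example some non-GNU implementations), the pattern may need adjusting.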

LEARN ABOUT DEBIAN

marc::charset

MARC::Charset(3pm)					User Contributed Perl Documentation					MARC::Charset(3pm)

NAME

       MARC::Charset - convert MARC-8 encoded strings to UTF-8

SYNOPSIS

	   # import the marc8_to_utf8 function
	   use MARC::Charset 'marc8_to_utf8';

	   # prepare STDOUT for utf8
	   binmode(STDOUT, 'utf8');

	   # print out some marc8 as utf8
	   print marc8_to_utf8($marc8_string);

DESCRIPTION

       MARC::Charset allows you to turn MARC-8 encoded strings into UTF-8 strings. MARC-8 is a single byte character encoding that predates
       Unicode, and allows you to put non-Roman scripts in MARC bibliographic records.

	   http://www.loc.gov/marc/specifications/spechome.html

EXPORTS

   ignore_errors()
       Tells MARC::Charset whether or not to ignore all encoding errors, and returns the current setting.  This is helpful if you have records
       that contain both MARC8 and UNICODE characters.

	   my $ignore = MARC::Charset->ignore_errors();

	   MARC::Charset->ignore_errors(1); # ignore errors
	   MARC::Charset->ignore_errors(0); # DO NOT ignore errors

   assume_unicode()
       Tells MARC::Charset whether or not to assume UNICODE when an error is encountered in ignore_errors mode and returns the current setting.
       This is helpful if you have records that contain both MARC8 and UNICODE characters.

	   my $setting = MARC::Charset->assume_unicode();

	   MARC::Charset->assume_unicode(1); # assume characters are unicode (utf-8)
	   MARC::Charset->assume_unicode(0); # DO NOT assume characters are unicode

   assume_encoding()
       Tells MARC::Charset whether or not to assume a specific encoding when an error is encountered in ignore_errors mode and returns the current
       setting.  This is helpful if you have records that contain both MARC8 and other characters.

	   my $setting = MARC::Charset->assume_encoding();

	   MARC::Charset->assume_encoding('cp850'); # assume characters are cp850
	   MARC::Charset->assume_encoding(''); # DO NOT assume any encoding

   marc8_to_utf8()
       Converts a MARC-8 encoded string to UTF-8.

	   my $utf8 = marc8_to_utf8($marc8);

       If you'd like to ignore errors pass in a true value as the 2nd parameter or call MARC::Charset->ignore_errors() with a true value:

	   my $utf8 = marc8_to_utf8($marc8, 'ignore-errors');

	 or

	   MARC::Charset->ignore_errors(1);
	   my $utf8 = marc8_to_utf8($marc8);

   utf8_to_marc8()
       Will attempt to translate utf8 into marc8.

	   my $marc8 = utf8_to_marc8($utf8);

       If you'd like to ignore errors, or characters that can't be converted to marc8 then pass in a true value as the second parameter:

	   my $marc8 = utf8_to_marc8($utf8, 'ignore-errors');

	 or

	   MARC::Charset->ignore_errors(1);
	   my $marc8 = utf8_to_marc8($utf8);

DEFAULT CHARACTER SETS

       If you need to alter the default character sets you can set the $MARC::Charset::DEFAULT_G0 and $MARC::Charset::DEFAULT_G1 variables to the
       appropriate character set code:

	   use MARC::Charset::Constants qw(:all);
	   $MARC::Charset::DEFAULT_G0 = BASIC_ARABIC;
	   $MARC::Charset::DEFAULT_G1 = EXTENDED_ARABIC;

SEE ALSO

       o   MARC::Charset::Constants

       o   MARC::Charset::Table

       o   MARC::Charset::Code

       o   MARC::Charset::Compiler

       o   MARC::Record

       o   MARC::XML

AUTHOR

       Ed Summers (ehs@pobox.com)

perl v5.12.4							    2011-08-05							MARC::Charset(3pm)
