extended ascii problem

# 1  
Old 10-27-2007

Hi, I would like to check whether text files contain extended ASCII characters. I really don't have any idea how to start. Your kind help would be very much appreciated. Thanks.
# 2  
Old 10-27-2007
Read an introduction to regular expressions.

In regular expressions (regexps) you can target "classes" of characters: [a-z], for instance, means any character from a to z, that is, every lowercase letter. To include the capital letters too, you would write "[a-zA-Z]".

You will find more of these classes in a manual.
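As a rough illustration of the character-class idea applied to this question, the sketch below uses a byte-range class to match anything outside 7-bit ASCII. It assumes that "extended ASCII" means any byte from 0x80 to 0xFF; the names EXTENDED and has_extended are made up for this example.

```python
import re

# Character class matching any single byte outside 7-bit ASCII (0x80-0xFF).
# Assumption: "extended ASCII" means exactly this byte range.
EXTENDED = re.compile(rb"[\x80-\xff]")

def has_extended(data: bytes) -> bool:
    """Return True if data contains at least one byte above 0x7F."""
    return EXTENDED.search(data) is not None

if __name__ == "__main__":
    print(has_extended(b"This is first line"))
    print(has_extended(b"This is sec\xa7nd line"))
```

The pattern is compiled against bytes rather than text so that the result does not depend on the locale or on how the file happens to be decoded.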

# 3  
Old 10-27-2007
Thanks, I'll dig into that. :)
# 4  
Old 10-28-2007
If you want to implement a small Python script, you could do something like this:

def findExtended(line):
    # Return True if any byte in the line is outside 7-bit ASCII (> 0x7f).
    for b in line:
        if b > 0x7f:
            return True
    return False

# Open in binary mode so raw bytes are compared, not decoded characters.
with open('chars', 'rb') as f:
    for line in f:
        if findExtended(line):
            print("found")


This might not be the fastest way around, but it works.
# 5  
Old 10-28-2007
For extended ASCII characters you could do a small check like this:

open(FILE, "<", $filename) or die "Unable to open file $filename <$!>\n";
binmode(FILE);

# Read one byte at a time; bytes 128-255 are outside 7-bit ASCII.
while ( read( FILE, $data, 1 ) == 1 ) {
  print "$data\n" if ( ord($data) > 127 );
}

close(FILE);

# 6  
Old 10-28-2007

Perhaps we can help you better if you tell us what you would do if and when you find extended ASCII characters in a file. Make a list of the files? Delete the file? Delete characters? Replace characters? ... cheers, drl
# 7  
Old 10-28-2007
Thanks for the help, guys. The problem now is that I don't have Python on the system I'm using. As for why, and what I would do with the files: the text files I want to check are the output of a conversion process, from a .STDF file to a text file. At times some corruption occurs after conversion and those extended ASCII characters appear. The conversion is done in batch and takes time to finish. To save time, I wish to filter out the files that converted correctly and reconvert only the corrupted ones. Cheers!
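For readers who do have Python 3.7 or later available (the poster above does not), the batch filtering described here could be sketched as follows. It assumes that a file counts as corrupted as soon as it contains any byte above 0x7F; the function names is_corrupted and split_files are hypothetical, not part of any existing tool.

```python
import sys

def is_corrupted(path, chunk_size=65536):
    """Return True if the file at path contains any byte above 0x7F."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                return False  # reached end of file without a hit
            # bytes.isascii() is True only if every byte is <= 0x7F.
            if not chunk.isascii():
                return True

def split_files(paths):
    """Partition paths into (good, bad) lists, preserving order."""
    good, bad = [], []
    for p in paths:
        (bad if is_corrupted(p) else good).append(p)
    return good, bad

if __name__ == "__main__":
    # Print only the files that would need to be reconverted.
    good, bad = split_files(sys.argv[1:])
    for p in bad:
        print(p)
```

Saved under a name of your choosing and run with the converted text files as arguments, it prints one path per line for each file that needs reconversion. Reading in chunks and stopping at the first hit keeps the check cheap even for large files.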
