I am trying to remove selected Extended ASCII characters (passed as arguments) using Awk on an ad hoc basis. Can you please let me know how to do it? I have to implement this using awk only.
What do you mean by Extended ASCII? Are you trying to remove a single character? Are you trying to remove individually specified characters with each character specified as a separate argument? Are you trying to remove a string of characters? Are you trying to remove individual characters included in a single argument string?
What do you mean by Awk(sic) based on ad hoc basis?
This is part of a script enhancement. The script would take ASCII values as input arguments, generally Extended ASCII (i.e. ASCII values >= 128), and remove them from the input file.
Since the place in the script that I need to modify is inside an awk script, I have to implement this within awk itself rather than with other commands such as tr or sed.
I asked 8 questions. You partially answered one of them (generically, but not specifically for this assignment).
Unless you convince us that this is not a homework assignment, show us that you have made an attempt at solving this, show us the part of your existing awk script that you're trying to modify, show us that you have some idea of what your input arguments need to look like, and provide us with some sample input and output for your script; this thread will be closed.
We are here to help you learn how to write code using the tools available on UNIX and Linux systems to perform various tasks. We are not here to act as your unpaid programming staff trying to guess at what you're trying to do, coaxing descriptions of the tasks that need to be performed out of you, and then designing and writing your code for you. And we most certainly are not here to do your homework assignments for you!
Last edited by Don Cragun; 01-01-2015 at 08:43 PM..
Reason: Fix typo.
Quote:
Unless you convince us that this is not a homework assignment, show us that you have made an attempt at solving this
This is not a homework assignment. It is part of a script which I am currently modifying. I am not very familiar with awk. I can do the same using tr or sed. I want to know if there is any function in awk that can do something similar. I was using the sub/gsub functions, but the manual only describes how to replace a pattern. Here I am not looking for a specific pattern, but a match of ANY of the characters.
Quote:
show us the part of your existing awk script that you're trying to modify,
The script is on a secured client network and cannot be copied.
Quote:
show us that you have some idea of what your input arguments need to look like, and provide us with some sample input and output for your script;
The input arguments would be a range of ASCII values and/or comma-separated ASCII values.
If any of the input ASCII values appears in any line of the input file, it has to be replaced with an empty string.
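The requirement as stated (ranges and/or comma-separated values, each matching byte deleted) could be sketched in awk roughly as follows. This is only an illustration under assumptions: the argument format "128-159,197", the variable name codes and the function name strip_bytes are all hypothetical, and LC_ALL=C is set on the assumption that awk should treat the input as single bytes rather than multi-byte characters.

```shell
#!/bin/sh
# Sketch only: strip_bytes and the "codes" argument format are hypothetical.
# LC_ALL=C asks awk to work on single bytes instead of multi-byte characters.
strip_bytes() {
    LC_ALL=C awk -v codes="$1" '
    BEGIN {
        n = split(codes, part, ",")
        for (i = 1; i <= n; i++) {
            # Each part is a single value ("197") or a range ("128-159").
            m = split(part[i], r, "-")
            lo = r[1] + 0
            hi = (m > 1) ? r[2] + 0 : lo
            for (c = lo; c <= hi; c++)
                drop[sprintf("%c", c)] = 1   # remember the byte with this code
        }
    }
    {
        out = ""
        # Copy the line byte by byte, skipping every byte marked for removal.
        for (i = 1; i <= length($0); i++) {
            ch = substr($0, i, 1)
            if (!(ch in drop))
                out = out ch
        }
        print out
    }'
}

# Bytes 197 and 160 (octal 305 and 240) are removed; this prints: ab
printf 'a\305\240b\n' | strip_bytes "197,160"
```

Copying byte by byte instead of building a bracket expression for gsub() avoids having to escape values that are special inside brackets, such as ], ^ or -. How sprintf("%c", n) behaves for n >= 128 depends on the awk version and locale, so this should be checked on the target system.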
It appears that your strings are UTF-8, not extended ASCII. Furthermore, printing your strings through od shows that the byte values that you said you wanted to remove are not present in your input string or output string samples:
shows us that the unsigned decimal byte values of the two bytes you want to remove are 197 and 160:
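The od commands and output from the original post are not preserved here. As a rough illustration of the kind of check being described, the sketch below feeds two bytes to od and prints them as unsigned decimal values. The helper name show_bytes is hypothetical, and the two input bytes are simply the values 197 and 160 written as octal escapes, not the poster's actual data.

```shell
#!/bin/sh
# Sketch only: show_bytes is a hypothetical helper name.
# -A n suppresses the offset column; -t u1 prints each byte as unsigned decimal.
show_bytes() {
    od -A n -t u1
}

# \305 and \240 are the octal escapes for decimal 197 and 160;
# the trailing newline shows up as 10. Column spacing varies between od versions.
printf '\305\240\n' | show_bytes
```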
If you are working with UTF-8 input and want "extended ASCII" output (where you may be removing 1 or more bytes out of a multi-byte UTF-8 character, but might not be removing complete characters), you may end up with an unintelligible mess. If you want to remove a specific set of UTF-8 characters, that is easy to do. If you want to remove all non-(7-bit)ASCII characters, that is easy to do on some systems (depending on how well your version of awk handles locales and multi-byte characters).
What OS (including version) and shell are you using?
What locale are you using when you run this script?
Is it OK to just remove all bytes from your input stream that have the high order bit set? If not, is there a specific list of UTF-8 characters you want to remove? If not, and you really want to remove individual bytes from strings containing multi-byte characters, this may be hard to do in some versions of awk.
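If removing every byte with the high-order bit set is acceptable, a minimal sketch could look like the following. It assumes an awk that accepts octal escapes inside bracket expressions and honors LC_ALL=C by matching single bytes; that holds for common awk versions but is not guaranteed everywhere. The function name strip_high_bit is hypothetical.

```shell
#!/bin/sh
# Sketch only: strip_high_bit is a hypothetical name.
# In the C locale, [\200-\377] is meant to match any single byte from 128 to 255.
strip_high_bit() {
    LC_ALL=C awk '{ gsub(/[\200-\377]/, ""); print }'
}

# "é" in UTF-8 is the two bytes \303\251; both are removed, printing: caf ok
printf 'caf\303\251 ok\n' | strip_high_bit
```

Because each multi-byte UTF-8 character consists entirely of bytes >= 128, this removes whole non-ASCII characters rather than leaving partial sequences behind.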
You said you know how to do what you want using sed. Show us the sed substitute command that does what you want and we can show you how to easily change that into an awk sub() or gsub() function call.
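For reference, the general shape of such a translation is sketched below. The poster's actual sed command is not shown in the thread, so the bracket expression [#@] is only a stand-in, and the function names are hypothetical. A bracket expression is also how a match of any one of several characters is written for gsub().

```shell
#!/bin/sh
# Sketch only: the characters # and @ are arbitrary stand-ins.

# sed form: delete every "#" or "@" on each line.
delete_with_sed() {
    sed 's/[#@]//g'
}

# awk equivalent: gsub() replaces every match in $0, then the line is printed.
delete_with_awk() {
    awk '{ gsub(/[#@]/, ""); print }'
}

printf 'a#b@c\n' | delete_with_sed   # prints: abc
printf 'a#b@c\n' | delete_with_awk   # prints: abc
```

sub() would replace only the first match on each line, like a sed substitute command without the g flag.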