09-10-2010
Non-ASCII char prevents conversion of manpage to plain text
Hello,
I would like to export manual pages to plain text files.
man CommandName | col -bx > CommandName.txt
The above command works on Mac OS X. However, it often fails on my old Linux system. The problem occurs if the source file of the manpage contains an escape sequence for a non-ASCII character, such as "\(co" for the copyright character (0xA9).
Whenever "col -bx" encounters a non-ASCII character (0x80 through 0xFF), it aborts further processing and displays the error message "Invalid or incomplete multibyte or wide character".
The man command on Mac OS X automatically converts non-ASCII characters into ASCII equivalents such as "(C)" for the copyright character. Therefore, col does not receive non-ASCII characters, and the job successfully completes.
On the other hand, the man command on my old Linux does not convert non-ASCII characters into ASCII equivalents. Therefore, col receives non-ASCII characters, and the job fails.
Please suggest an appropriate solution to this problem.
Is it possible to force the man command on my old Linux to convert non-ASCII characters into ASCII equivalents? Or is it possible to force the col command to accept non-ASCII characters?
Here are some commands whose manpages failed, each with the non-ASCII character that caused the failure.
find (curly quote, 0xB4)
hexdump (middle dot, 0xB7)
ln (copyright char, 0xA9)
Many thanks in advance.
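One possible workaround, sketched here rather than confirmed on the poster's system, is to filter the output of man through an ASCII fallback step before it reaches col. The helper name to_ascii is hypothetical, and the sketch assumes man emits ISO-8859-1 (Latin-1), so that each non-ASCII character is a single byte in the range 0x80 through 0xFF. Only the three characters listed above get an explicit ASCII equivalent; any other high byte becomes "?".

```shell
#!/bin/sh
# to_ascii: hypothetical helper, not a standard tool.
# Assumption: stdin is ISO-8859-1, one byte per non-ASCII character.
to_ascii() {
    # LC_ALL=C makes sed and tr treat the input as raw bytes.
    LC_ALL=C sed \
        -e "s/$(printf '\251')/(C)/g" \
        -e "s/$(printf '\267')/./g" \
        -e "s/$(printf '\264')/'/g" |
        LC_ALL=C tr '\200-\377' '[?*]'   # any remaining high byte becomes "?"
}

# 0xA9 copyright, 0xB7 middle dot, 0xB4 quote, 0xE9 unmapped byte
printf 'Copyright \251 2010 a\267b \264x\264 \351\n' | to_ascii
```

With this in place the export would read: man CommandName | to_ascii | col -bx > CommandName.txt. If man emits UTF-8 instead, the byte-wise mapping above does not apply; GNU iconv's "-t ASCII//TRANSLIT" target may be worth trying in that case.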
COL(1) General Commands Manual COL(1)
NAME
col - filter reverse line feeds
SYNOPSIS
col [ -bfh ]
DESCRIPTION
Col reads the standard input and writes the standard output. It performs the line overlays implied by reverse line feeds (ESC-7 in ASCII)
and by forward and reverse half line feeds (ESC-9 and ESC-8). Col is particularly useful for filtering multicolumn output made with the
`.rt' command of nroff and output resulting from use of the tbl(1) preprocessor.
Although col accepts half line motions in its input, it normally does not emit them on output. Instead, text that would appear between
lines is moved to the next lower full line boundary. This treatment can be suppressed by the -f (fine) option; in this case the output
from col may contain forward half line feeds (ESC-9), but will still never contain either kind of reverse line motion.
If the -b option is given, col assumes that the output device in use is not capable of backspacing. In this case, if several characters
are to appear in the same place, only the last one read will be taken.
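As a rough illustration of the -b behaviour for simple overstrikes (a character, a backspace, then the character that replaces it), the sketch below emulates it with sed instead of calling col. The function name strip_overstrike is hypothetical, and the emulation covers only this one case: it does not handle reverse line feeds, half line feeds, or tabs.

```shell
#!/bin/sh
# strip_overstrike: hypothetical helper that mimics "col -b" for the
# simple case of character + backspace + character.
strip_overstrike() {
    bs=$(printf '\b')
    # Delete every "any byte followed by backspace" pair, so only the
    # last character written at each position survives.
    LC_ALL=C sed "s/.$bs//g"
}

# nroff-style bold (H BS H) and underline (_ BS o) collapse to plain text.
printf 'H\bHi\bi _\bo_\bk\n' | strip_overstrike
```

Because sed runs here with LC_ALL=C, bytes in the range 0x80 through 0xFF pass through unchanged rather than being rejected.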
The control characters SO (ASCII code 016) and SI (017) are assumed to start and end text in an alternate character set. The character
set (primary or alternate) associated with each printing character read is remembered; on output, SO and SI characters are generated where
necessary to maintain the correct treatment of each character.
If the -h option is given, col converts white space to tabs to shorten printing time.
All control characters are removed from the input except space, backspace, tab, return, newline, ESC (033) followed by one of 7, 8, 9, SI,
SO, and VT (013). This last character is an alternate form of full reverse line feed, for compatibility with some other hardware
conventions. All other non-printing characters are ignored.
SEE ALSO
troff(1), tbl(1)
BUGS
Can't back up more than 128 lines.
No more than 800 characters, including backspaces, on a line.
7th Edition May 16, 1986 COL(1)