11-18-2005
Quote:
Originally Posted by jerardfjay
Is there any UNIX utility/command/executable that will convert multibyte characters to standard single-byte ASCII characters in a given file?
Try recode.
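If recode is not installed, here is a rough sketch of two alternatives. It assumes the input is UTF-8, since the thread does not say which multibyte encoding is involved. The function names are made up for illustration, and //TRANSLIT is an extension found in glibc and GNU libiconv, not something every iconv supports.

```shell
# Hypothetical helper: transliterate UTF-8 text on stdin to ASCII.
# ASSUMPTION: the input is UTF-8 and this iconv understands //TRANSLIT.
# The replacement chosen for each character can depend on the locale.
to_ascii_translit() {
    iconv -f UTF-8 -t ASCII//TRANSLIT
}

# Hypothetical helper: drop every byte that is not tab, newline,
# carriage return, or printable ASCII (octal 040 through 176).
# This discards multibyte characters instead of converting them.
strip_non_ascii() {
    LC_ALL=C tr -cd '\011\012\015\040-\176'
}
```

Either function reads standard input, for example `strip_non_ascii < input.txt > output.txt` (placeholder file names). The tr variant loses information, so it only fits cases where the non-ASCII characters can simply be thrown away.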
TR(1) General Commands Manual TR(1)
NAME
tr - translate characters
SYNOPSIS
tr [ -cds ] [ string1 [ string2 ] ]
DESCRIPTION
Tr copies the standard input to the standard output with substitution or deletion of selected characters. Input characters found in
string1 are mapped into the corresponding characters of string2. When string2 is short it is padded to the length of string1 by duplicat-
ing its last character. Any combination of the options -cds may be used: -c complements the set of characters in string1 with respect to
the universe of characters whose ASCII codes are 01 through 0377 octal; -d deletes all input characters in string1; -s squeezes all strings
of repeated output characters that are in string2 to single characters.
In either string the notation a-b means a range of characters from a to b in increasing ASCII order. The character `\' followed by 1, 2 or
3 octal digits stands for the character whose ASCII code is given by those digits. A `\' followed by any other character stands for that
character.
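As a small sketch of the notation described above (the function names are illustrative only and are not part of tr), the range form, the backslash-octal form, and the -d option can be tried like this:

```shell
# Map one ASCII range onto another: lowercase letters to uppercase.
to_upper() {
    tr 'a-z' 'A-Z'
}

# Backslash-octal notation: 011 is the ASCII code for tab.
# Quoting keeps the backslash away from the shell.
tabs_to_spaces() {
    tr '\011' ' '
}

# -d deletes every input character that appears in string1.
strip_digits() {
    tr -d '0-9'
}
```

For example, `printf 'hello\n' | to_upper` prints HELLO.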
The following example creates a list of all the words in `file1' one per line in `file2', where a word is taken to be a maximal string of
alphabetics. The second string is quoted to protect `\' from the Shell. 012 is the ASCII code for newline.
tr -cs A-Za-z '