Post 302568638 by Jin_ on Thursday 27th of October 2011 11:12:40 PM
Old 10-28-2011
Remove invalid database characters on a file

Hi All -

I'm building a script designed to remove characters that a non-Unicode database does not accept. Examples include: ï, ¿, ½, Â, é, etc.

I can easily remove those characters one by one with sed, but there's a problem when other Unicode characters turn up. Is there any way to remove all of them at once? I believe none of them can be typed on a standard keyboard.

Please help. Thanks.

Also, I can't match accented characters, such as ù (grave accent), with sed or grep.
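
One way to drop every such character in a single pass, instead of listing them one by one, is to delete every byte outside the ASCII range with tr. The following is only a minimal sketch under a few assumptions: the function name strip_non_ascii is hypothetical, the input is treated as raw bytes (hence LC_ALL=C), and the database is assumed to accept only tab, newline, carriage return and printable ASCII. Because multibyte characters such as ù or é are deleted outright rather than converted, this is not suitable if those letters have to survive in some single-byte encoding.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): reads stdin, writes stdout.
# -c complements the set, -d deletes, so everything NOT listed is removed.
# Kept bytes: tab (\11), newline (\12), carriage return (\15),
# and printable ASCII from space (\40) through tilde (\176).
strip_non_ascii() {
    LC_ALL=C tr -cd '\11\12\15\40-\176'
}

# Demo: the input holds a UTF-8 "é" (bytes \303\251) and a UTF-8 BOM (\357\273\277).
printf 'caf\303\251 \357\273\277ok\n' | strip_non_ascii
```

For a file, the same function could be used as `strip_non_ascii < input.txt > cleaned.txt`, where both file names are placeholders. Writing to a new file rather than over the input keeps the original available for comparison.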

Last edited by Jin_; 10-28-2011 at 02:29 AM..
 

UNICODE(1)                General Commands Manual                UNICODE(1)

NAME
       unicode - command line unicode database query tool

SYNOPSIS
       unicode [options] string

DESCRIPTION
       This manual page documents the unicode command. unicode is a command
       line unicode database query tool.

OPTIONS
       -h --help
              Show help and exit.

       -x --hexadecimal
              Assume string to be a hexadecimal number

       -d --decimal
              Assume string to be a decimal number

       -r --regexp
              Assume string to be a regular expression

       -s --string
              Assume string to be a sequence of characters

       -a --auto
              Try to guess type of string from one of the above (default)

       -mMAXCOUNT --max=MAXCOUNT
              Maximal number of codepoints to display, default: 20; use 0
              for unlimited

       -iCHARSET --io=IOCHARSET
              I/O character set. For maximal pleasure, run unicode on UTF-8
              capable terminal and specify IOCHARSET to be UTF-8. unicode
              tries to guess this value from your locale, so with properly
              set up locale, you should not need to specify it.

       -cADDCHARSET --charset-add=ADDCHARSET
              Show hexadecimal representation of displayed characters in
              this additional charset.

       -CUSE_COLOUR --colour=USE_COLOUR
              USE_COLOUR is one of on off auto
              --colour=on will use ANSI colour codes to colourise the output
              --colour=off won't use colours.
              --colour=auto will test if standard output is a tty, and use
              colours only when it is.
              --color is a synonym of --colour

       -v --verbose
              Be more verbose about displayed characters, e.g. display
              Unihan information, if available.

       -w --wikipedia
              Spawn browser pointing to Wikipedia entry about the character.

USAGE
       unicode tries to guess the type of an argument. For example, you can
       use any of the following to display information about U+00E1 LATIN
       SMALL LETTER A WITH ACUTE (á):

              unicode 00E1
              unicode U+00E1
              unicode á
              unicode 'latin small letter a with acute'

       You can specify a range of characters as arguments, unicode will show
       these characters in nice tabular format, aligned to 256-byte
       boundaries. Use two dots ".." to indicate the range, e.g.

              unicode 0450..0520

       will display the whole cyrillic and hebrew blocks (characters from
       U+0400 to U+05FF)

              unicode 0400..

       will display just characters from U+0400 up to U+04FF

BUGS
       Tabular format does not deal well with full-width, combining, control
       and RTL characters.

SEE ALSO
       ascii(1)

AUTHOR
       Radovan Garabik <garabik @ kassiopeia.juls.savba.sk>

                                  2003-01-31                     UNICODE(1)