Post 302163112 by kanu_pathak on Thursday 31st of January 2008 04:24:33 AM
Invalid Characters in the file.

I am working on AIX. We FTP files to a database. The flat files have thousands of records, and each record has some 50 to 60 characters (the fields have fixed character lengths). In addition to valid ASCII characters, some invalid characters like Å, å, Ä, ä or pipes creep in, and the data warehouse refuses to load them.
Example: AcuM-^?a 051706 ;
Above is a field in the record which has special characters like -, ^ and ?, which should not be there.

The record separator is a newline and there is no field separator.

How can I remove these invalid or special characters that creep into the records?
Please help me find the logic in shell scripting.
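
For illustration, here is a minimal sketch of the filtering logic in Python rather than shell. It assumes that "valid" means printable 7-bit ASCII (space through tilde) minus the pipe, which is a guess based on the description above and not a rule from the data warehouse. The names `ALLOWED`, `clean_record` and `clean_lines` are made up for this example. Note also that `M-^?` looks like the way `cat -v` displays a single byte 0xFF, so the field may contain one bad byte rather than the literal characters -, ^ and ?; that is an assumption too.

```python
# Assumption: only printable ASCII (0x20..0x7E) is valid, and the pipe
# is excluded because the post says pipes are rejected as well.
ALLOWED = frozenset(range(0x20, 0x7F)) - {ord("|")}


def clean_record(record, replacement=b""):
    """Return one record (bytes, without its newline) with invalid bytes handled.

    With the default replacement the invalid bytes are dropped, which
    shortens the record. Passing b" " instead keeps the byte positions,
    which may matter because the fields are fixed-width.
    """
    return b"".join(
        bytes([b]) if b in ALLOWED else replacement for b in record
    )


def clean_lines(lines, replacement=b""):
    """Yield cleaned, newline-terminated records from an iterable of byte lines."""
    for line in lines:
        # Strip trailing newline and carriage-return bytes before filtering.
        body = line.rstrip(b"\r\n")
        yield clean_record(body, replacement) + b"\n"
```

To use it as a filter, something like `sys.stdout.buffer.writelines(clean_lines(sys.stdin.buffer))` should work after `import sys`. Working on bytes rather than decoded text avoids having to know the file's encoding, but a character such as Å that is stored as two bytes in UTF-8 would then be replaced by two spaces instead of one.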
 

srec_mos_tech(5)                File Formats Manual                srec_mos_tech(5)

NAME
srec_mos_tech - MOS Technology file format

DESCRIPTION
The MOS Technology format allows binary files to be uploaded and downloaded between a computer system (such as a PC, Macintosh, or workstation) and an emulator or evaluation board for microcontrollers and microprocessors.

The Lines
Each line consists of 5 fields: the start character, the length field, the address field, the data field, and the checksum. The lines always start with a semicolon (;) character.

The Fields
+--+--------+---------+------+----------+------+
|; | Length | Address | Data | Checksum | CRLF |
+--+--------+---------+------+----------+------+

Length
    The record length field is a 2 character (1 byte) field that specifies the number of data bytes in the record. Typically this is 24 or less.

Address
    This is a 2-byte address that specifies where the data in the record is to be loaded into memory, big-endian.

Data
    The data field contains the executable code, memory-loadable data or descriptive information to be transferred.

Checksum
    The checksum is a 2-byte field that represents the least significant two bytes of the sum of the values represented by the pairs of characters making up the record's length, address, and data fields, big-endian.

End of File
The final line should have a data length of zero, and the data line count in the address field. The checksum is not the usual checksum; it is instead a repeat of the data line count.

Size Multiplier
In general, binary data will expand in size by approximately 2.54 times when represented with this format.

EXAMPLE
Here is an example MOS Technology format file. It contains the data "Hello, World" to be loaded at address 0.

;0C000048656C6C6F2C20576F726C640454
;0000010001

COPYRIGHT
srec_cat version 1.58
Copyright (C) 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011 Peter Miller

The srec_cat program comes with ABSOLUTELY NO WARRANTY; for details use the 'srec_cat -VERSion License' command. This is free software and you are welcome to redistribute it under certain conditions; for details use the 'srec_cat -VERSion License' command.

AUTHOR
Peter Miller
E-Mail: pmiller@opensource.org.au
WWW: http://miller.emu.id.au/pmiller/

KIM-1 User Manual - Appendix F - Paper Tape Format
(The following information is reproduced from http://users.telenet.be/kim1-6502/6502/usrman.html#F just in case it vanishes from the Web.)

The paper tape LOAD and DUMP routines store and retrieve data in a specific format designed to insure error free recovery. Each byte of data to be stored is converted to two half bytes. The half bytes (whose possible values are 0 to F HEX) are translated into their ASCII equivalents and written out onto paper tape in this form.

Each record outputted begins with a ";" character (ASCII 3B) to mark the start of a valid record. The next byte transmitted (18HEX) or (24 decimal) is the number of data bytes contained in the record. The record's starting address High (1 byte, 2 characters), starting address Lo (1 byte, 2 characters), and data (24 bytes, 48 characters) follow. Each record is terminated by the record's check-sum (2 bytes, 4 characters), a carriage return (ASCII 0D), line feed (ASCII 0A), and six "NULL" characters (ASCII 00). (NULL characters cause a blank area on the paper tape.)

The last record transmitted has zero data bytes (indicated by ;00). The starting address field is replaced by a four digit Hex number representing the total number of data records contained in the transmission, followed by the record's usual check-sum digits. An "XOFF" character ends the transmission.

;180000FFEEDDCCBBAA0099887766554433221122334455667788990AFC
;0000010001

During a "LOAD" all incoming data is ignored until a ";" character is received. The receipt of non ASCII data or a mismatch between a record's calculated check-sum and the check-sum read from tape will cause an error condition to be recognized by KIM. The check-sum is calculated by adding all data in the record except the ";" character.

The paper tape format described is compatible with all other MOS Technology, Inc. software support programs.

Reference Manual                SRecord                srec_mos_tech(5)
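
The record layout and checksum rule described above can be sketched in a few lines of Python. This is only an illustration built from the description in the man page, not code taken from srec_cat. The function names (`checksum`, `encode_record`, `encode`, `verify_record`) and the default of 24 data bytes per record are assumptions made for the example.

```python
def checksum(payload):
    """Low 16 bits of the sum of the length, address and data bytes."""
    return sum(payload) & 0xFFFF


def encode_record(address, data):
    """Build one data record: ';' + length + address + data + checksum.

    Assumes len(data) <= 255 and 0 <= address <= 0xFFFF; the calls
    below raise an exception otherwise.
    """
    payload = bytes([len(data)]) + address.to_bytes(2, "big") + data
    return ";" + payload.hex().upper() + format(checksum(payload), "04X")


def encode(data, start=0, chunk=24):
    """Split data into records and append the end-of-file record."""
    records = [
        encode_record(start + offset, data[offset:offset + chunk])
        for offset in range(0, len(data), chunk)
    ]
    # End of file: zero length, then the data line count in the address
    # field, then the count repeated in place of the checksum.
    count = format(len(records), "04X")
    records.append(";00" + count + count)
    return records


def verify_record(line):
    """Return True if a single record line is well formed."""
    line = line.strip()
    if not line.startswith(";"):
        return False
    raw = bytes.fromhex(line[1:])
    if len(raw) < 5:
        return False
    if raw[0] == 0:
        # End-of-file record: the last field repeats the line count.
        return len(raw) == 5 and raw[1:3] == raw[3:5]
    payload, stored = raw[:-2], int.from_bytes(raw[-2:], "big")
    return len(payload) == raw[0] + 3 and checksum(payload) == stored
```

With these definitions, `encode(b"Hello, World")` reproduces the two lines of the EXAMPLE section, and `verify_record` accepts the KIM-1 sample record ending in 0AFC. The sketch ignores the CRLF, NULL padding and XOFF characters that the paper tape format adds around each record.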