Shell Programming and Scripting. Post 302163112 by kanu_pathak on Thursday 31st of January 2008 04:24:33 AM
Invalid Characters in the file.

I am working on AIX. We FTP flat files to a database. Each file has thousands of records, and each record is some 50 to 60 characters long (the fields have fixed character lengths). In addition to valid ASCII characters, some invalid characters such as Å, å, Ä, ä or pipes creep in, and the data warehouse refuses to load them.
Example: AcuM-^?a 051706 ;
The above is a field from a record; it contains special characters such as -, ^ and ?, which should not be there.

The record separator is a newline and there is no field separator.

How can I remove these invalid or special characters from the records?
Please help me find the logic for this in shell scripting.
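
One possible approach, as a minimal sketch rather than a definitive answer: filter each file through tr so that only the bytes you consider valid survive. The function name strip_invalid is made up for illustration, and the "valid" set used here (newline plus printable ASCII from space to tilde, with pipes removed) is an assumption based on the description above. If the example is cat -v style output, M-^? stands for the single byte 0xFF, which this filter would delete.

```shell
#!/bin/sh
# Hypothetical helper: copy stdin to stdout, keeping only newline (octal 12)
# and printable ASCII from space (octal 40) to tilde (octal 176), then drop
# pipe characters. Assumption: this is the valid set for the warehouse load.
strip_invalid() {
    # LC_ALL=C makes tr work on single bytes rather than multibyte characters.
    LC_ALL=C tr -cd '\12\40-\176' | tr -d '|'
}

# Example use (file names are placeholders):
#   strip_invalid < input.dat > clean.dat
```

Note that deleting bytes shortens the record, so fields after the deleted byte shift left. If the loader depends on fixed column positions, replacing each invalid byte with a space may be the better choice.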
 

srec_intel16(5) 						File Formats Manual						   srec_intel16(5)

NAME
srec_intel16 - Intel Hexadecimal 16-bit file format specification

DESCRIPTION
This format is also known as the INHX16 format. This document describes the hexadecimal object file format for 16-bit microprocessors. This format is very similar to the srec_intel(5) format, except that the addresses are word addresses. The count field is a word count.

The hexadecimal representation of binary is coded in ASCII alphanumeric characters. For example, the 8-bit binary value 0011-1111 is 3F in hexadecimal. To code this in ASCII, one 8-bit byte containing the ASCII code for the character '3' (0011-0011 or 0x33) and one 8-bit byte containing the ASCII code for the character 'F' (0100-0110 or 0x46) are required. For each byte value, the high-order hexadecimal digit is always the first digit of the pair of hexadecimal digits. This representation (ASCII hexadecimal) requires twice as many bytes as the binary representation.

A hexadecimal object file is blocked into records, each of which contains the record type, length, memory load address and checksum in addition to the data. There are currently six (6) different types of records that are defined; not all combinations of these records are meaningful, however. The records are:

o Data Record
o End of File Record
o Extended Segment Address Record
o Start Segment Address Record
o Extended Linear Address Record
o Start Linear Address Record

General Record Format

+-------+--------+--------+--------+--------+--------+
|Record | Record |  Load  | Record |  Data  | Check  |
|Mark   | Length | Offset |  Type  |        |  sum   |
+-------+--------+--------+--------+--------+--------+

Record Mark
    Each record begins with a Record Mark field containing 0x3A, the ASCII code for the colon (":") character.

Record Length
    Each record has a Record Length field which specifies the number of 16-bit words of information or data which follows the Record Type field of the record. This field is one byte, represented as two hexadecimal characters. The maximum value of the Record Length field is hexadecimal 'FF' or 255.

Load Offset
    Each record has a Load Offset field which specifies the 16-bit starting load offset of the data words; this field is therefore only used for Data Records (if the words are loaded as bytes, the address needs to be doubled). In other records where this field is not used, it should be coded as four ASCII zero characters ("0000" or 0x30303030). This field is one 16-bit word, represented as four hexadecimal characters.

Record Type
    Each record has a Record Type field which specifies the record type of this record. The Record Type field is used to interpret the remaining information within the record. This field is one byte, represented as two hexadecimal characters. The encodings for all the current record types are:

    0  Data Record
    1  End of File Record
    5  Execution Start Address Record

Data
    Each record has a variable length Data field; it consists of zero or more 16-bit words, each encoded as a set of 4 hexadecimal digits, most significant digit first. The interpretation of this field depends on the Record Type field.

Checksum
    Each record ends with a Checksum field that contains the ASCII hexadecimal representation of the two's complement of the sum of the 8-bit bytes that result from converting each pair of ASCII hexadecimal digits to one byte of binary, from and including the Record Length field to and including the last byte of the Data field. Therefore, the sum of all the ASCII pairs in a record after converting to binary, from the Record Length field to and including the Checksum field, is zero modulo 256.

Data Record (8-, 16- or 32-bit formats)

+-------+--------+--------+--------+--------+--------+
|Record | Record |  Load  | Record |  Data  | Check  |
|Mark   | Length | Offset |  Type  |        |  sum   |
|(":")  |        |        |        |        |        |
+-------+--------+--------+--------+--------+--------+

The Data Record provides a set of hexadecimal digits that represent the ASCII code for data bytes that make up a portion of a memory image.
The contents of the individual fields within the record are:

Record Mark
    This field contains 0x3A, the hexadecimal encoding of the ASCII colon (":") character.

Record Length
    The field contains two ASCII hexadecimal digits that specify the number of 16-bit data words in the record. The maximum value is 255 decimal.

Load Offset
    This field contains four ASCII hexadecimal digits representing the word address at which the first word of the data is to be placed. (For an equivalent byte address, double it.)

Record Type
    This field contains 0x3030, the hexadecimal encoding of the ASCII characters "00", which specifies the record type to be a Data Record.

Data
    This field contains sets of four ASCII hexadecimal digits, one set for each 16-bit data word, most significant digit first.

Checksum
    This field contains the check sum on the Record Length, Load Offset, Record Type, and Data fields.

Execution Start Address Record

+-------+--------+--------+--------+--------+--------+
|Record | Record |  Load  | Record | EIP (4 | Check  |
|Mark   | Length | Offset |  Type  | bytes) |  sum   |
|(":")  |  (4)   |  (0)   |  (5)   |        |        |
+-------+--------+--------+--------+--------+--------+

The Execution Start Address Record is used to specify the execution start address for the object file. This is where the loader is to jump to begin execution once the hex load is complete.

The Execution Start Address Record can appear anywhere in a hexadecimal object file. If such a record is not present in a hexadecimal object file, a loader is free to assign a default execution start address.

The contents of the individual fields within the record are:

Record Mark
    This field contains 0x3A, the hexadecimal encoding of the ASCII colon (":") character.

Record Length
    The field contains 0x3032, the hexadecimal encoding of the ASCII characters "02", which is the length, in bytes, of the EIP register content within this record.

Load Offset
    This field contains 0x30303030, the hexadecimal encoding of the ASCII characters "0000", since this field is not used for this record.

Record Type
    This field contains 0x3035, the hexadecimal encoding of the ASCII characters "05", which specifies the record type to be a Start Address Record.

EIP
    This field contains eight ASCII hexadecimal digits that specify the address. The field is encoded big-endian (most significant digit first).

Checksum
    This field contains the check sum on the Record Length, Load Offset, Record Type, and EIP fields.

End of File Record

This shall be the last record in the file.

+-------+--------+--------+--------+--------+
|Record | Record |  Load  | Record | Check  |
|Mark   | Length | Offset |  Type  |  sum   |
|(":")  |  (0)   |  (0)   |  (1)   | (0xFF) |
+-------+--------+--------+--------+--------+

The End of File Record specifies the end of the hexadecimal object file. The contents of the individual fields within the record are:

Record Mark
    This field contains 0x3A, the hexadecimal encoding of the ASCII colon (":") character.

Record Length
    The field contains 0x3030, the hexadecimal encoding of the ASCII characters "00". Since this record does not contain any Data bytes, the length is zero.

Load Offset
    This field contains 0x30303030, the hexadecimal encoding of the ASCII characters "0000", since this field is not used for this record.

Record Type
    This field contains 0x3031, the hexadecimal encoding of the ASCII characters "01", which specifies the record type to be an End of File Record.

Checksum
    This field contains the check sum on the Record Length, Load Offset, and Record Type fields. Since all the fields are static, the check sum can also be calculated statically, and the value is 0x4646, the hexadecimal encoding of the ASCII characters "FF".

Size Multiplier

In general, binary data will expand in size by approximately 2.3 times when represented with this format.

EXAMPLE
Here is an example INHX16 file. It contains the data "Hello, World" to be loaded at address 0.

:0700000065486C6C2C6F5720726F646CFF0AA8
:00000001FF

COPYRIGHT
srec_cat version 1.58
Copyright (C) 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011 Peter Miller

The srec_cat program comes with ABSOLUTELY NO WARRANTY; for details use the 'srec_cat -VERSion License' command. This is free software and you are welcome to redistribute it under certain conditions; for details use the 'srec_cat -VERSion License' command.

AUTHOR
Peter Miller
E-Mail: pmiller@opensource.org.au
WWW: http://miller.emu.id.au/pmiller/

Reference Manual SRecord srec_intel16(5)
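
The record layout and checksum rule in the man page above can be sketched in plain shell. This is only an illustration, not part of SRecord: the function name inhx16_record_ok is hypothetical. It relies on two points from the format description: each 16-bit data word takes four hexadecimal digits, so a record has 10 + 4 * (Record Length) digits after the colon, and the byte values from the Record Length field through the Checksum field sum to zero modulo 256.

```shell
#!/bin/sh
# Hypothetical helper (not part of SRecord): check one INHX16 record.
# Succeeds when the record starts with ":", holds only hexadecimal digits,
# has a digit count matching its Record Length field, and has a valid checksum.
# Note: plain sh has no local variables, so rest, count_hex, pair and sum
# are global names.
inhx16_record_ok() {
    case $1 in
        :*) rest=${1#:} ;;               # strip the record mark
        *)  return 1 ;;
    esac
    case $rest in
        ''|*[!0-9A-Fa-f]*) return 1 ;;   # empty or non-hex content
    esac
    # Minimum: length (2) + offset (4) + type (2) + checksum (2) digits.
    [ "${#rest}" -ge 10 ] || return 1
    count_hex=${rest%"${rest#??}"}       # first two digits: word count
    # Each 16-bit data word is four hex digits.
    [ "${#rest}" -eq $(( 10 + 4 * 0x$count_hex )) ] || return 1

    sum=0
    while [ -n "$rest" ]; do
        pair=${rest%"${rest#??}"}        # next two hex digits = one byte
        rest=${rest#??}
        sum=$(( (sum + 0x$pair) % 256 ))
    done
    [ "$sum" -eq 0 ]
}

# Example use on the two records from the EXAMPLE section:
#   inhx16_record_ok ':0700000065486C6C2C6F5720726F646CFF0AA8' && echo ok
#   inhx16_record_ok ':00000001FF' && echo ok
```

For the data record in the example, the bytes 07 through 0A sum to 0x58 modulo 256, and adding the checksum A8 brings the total to zero. A file could be checked by reading it line by line and calling the function on each line; carriage returns left over from a DOS-style file would make every record fail the hexadecimal-digit test.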