Post 302715391 by Seshendranath on Sunday 14th of October 2012 11:56:48 PM
10-15-2012
What is the better way to validate records in a file?

Hi all,

We are validating the records of a delimited file.

The delimited file has data like this:
Code:
Aaaa|sdfhxfgh|sdgjhxfgjh|sdgjsdg|sgdjsg|
Aaaa|sdfhxfgh|sdgjhxfgjh|sdgjsdg|sgdjsg|
Aaaa|sdfhxfgh|sdgjhxfgjh|sdgjsdg|sgdjsg|
Aaaa|sdfhxfgh|sdgjhxfgjh|sdgjsdg|sgdjsg|

So we are checking whether the fields in the records of the file we received are within the valid length.

The structure of the file/table is configured in Teradata, and we fetch the column lengths from that file.
Ex:
Code:
col1 varchar(5),
col2 varchar(5),
col3 varchar(5),
col4 varchar(5)

We have to check that no column has a field longer than 5; if one does, we write the whole error record to the bad file.

In the script, each line read has col_nm col_order_num col_len:
col_nm = column name
col_order_num = order number of the column in that table (1, 2, 3, and so on)
col_len = length of the column
Code:
#------------------------------------------------------------
#  Reading through the file and checking for the column length
#------------------------------------------------------------
logNote "Reading through the temp file and checking for the column length"

while read col_nm col_order_num col_len
do
    typeset -i col_len
    typeset -i col_len_good

    col_len_good=`expr $col_len + 1`

    logNote "col_nm : $col_nm"
    logNote "col_order_num : $col_order_num"
    logNote "col_len : $col_len"
    logNote "col_len_good : $col_len_good"

    # FS is set in BEGIN so that the first record is also split on "|".
    # Records whose field is too long are appended to the bad file.
    awk 'BEGIN{FS="|";col_ord='$col_order_num';col_l='$col_len'}{if (length($col_ord) > col_l) print $0;}' "$Src_File" >> "$Src_File.bad"

    # Records whose field fits are kept for the next column check.
    awk 'BEGIN{FS="|";col_ord='$col_order_num';col_l='$col_len_good'}{if (length($col_ord) < col_l) print $0;}' "$Src_File" > "$Src_File.temp"

    rm -f "$Src_File"
    mv "$Src_File.temp" "$Src_File"

done < "$RPT_FILE"

================================
We are using this script, but it is very slow: the file is read twice and rewritten once for every column.
Can anyone come up with a better way, please?
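
One direction that might help is to check all columns in a single awk pass instead of rescanning the file per column. The following is only a sketch under some assumptions: a POSIX awk that supports -v (nawk or gawk), a layout file with the same "col_nm col_order_num col_len" lines as $RPT_FILE, and at least one line in that layout file. The function name validate_records and the separate good-file argument are hypothetical, not part of the script above.

```shell
#!/bin/sh
# Sketch: validate every column of every record in ONE awk pass.
# validate_records is a hypothetical helper name.
validate_records() {
    layout=$1   # column layout file: col_nm col_order_num col_len
    src=$2      # pipe-delimited data file
    good=$3     # output: records where every checked field fits
    bad=$4      # output: records with at least one over-long field

    awk -v good="$good" -v bad="$bad" '
        # First file (layout): remember the maximum length per column position.
        NR == FNR { maxlen[$2] = $3; next }
        # Second file (data): FS was switched to "|" on the command line.
        {
            out = good
            for (i in maxlen)
                if (length($i) > maxlen[i]) { out = bad; break }
            print > out
        }
    ' "$layout" FS='|' "$src"
}
```

It could be called as validate_records "$RPT_FILE" "$Src_File" "$Src_File.good" "$Src_File.bad"; unlike the loop above, this writes the good records to a separate file rather than replacing $Src_File, and an output file is only created if at least one record goes to it.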

Last edited by Scrutinizer; 10-15-2012 at 01:40 AM.. Reason: code tags
 
