File name and format validation


 
# 1  
Old 01-29-2010

Hi Gurus,

I last used Unix a long time ago. I need help writing a Unix shell script that can be automated to run every day at a specific time.

1.) This is the actual functional requirement:
Informatica should reject incoming files that have invalid file names or file formats.

2.) My file names will have the following format:
Code:
<interface_name>_<source_name>_<template-type>_<sequence_number>_<datetime-stamp>.csv

e.g. VJA_AP_BRSIN_00001_2601201012316.csv

3.) For example, suppose my file has the following fields:
Code:
Field          Type
SiteID         Num(3)
ICOMS Action   TBD
SKU            Char(50)
Serial         Char(16)

I need to validate both the name of each file and its format, so a shell script has to be written. About 50 files arrive in my incoming directory every day.

I think the field validation should cover:
A) The number of fields in the file should match the expected count
B) Data types and field value sizes should match
C) Check whether any NOT NULL field arrives as NULL
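
Checks A) and B) could be sketched with awk roughly as follows. This is only an illustration under assumptions: the column order (SiteID, ICOMS Action, SKU, Serial) is taken from the field list above, the file is assumed to contain only detail rows without quoted commas, and `check_fields` is a made-up function name. ICOMS Action is not checked because its type is still TBD.

```shell
# Sketch: verify field count and field sizes for a 4-column CSV.
# Assumed layout: SiteID Num(3), ICOMS Action (TBD), SKU Char(50), Serial Char(16).
check_fields() {
    awk -F, '
        NF != 4 {
            printf "line %d: expected 4 fields, found %d\n", NR, NF
            bad = 1
            next
        }
        # SiteID must be digits only and at most 3 characters long
        $1 !~ /^[0-9]+$/ || length($1) > 3 {
            printf "line %d: SiteID is not Num(3)\n", NR
            bad = 1
        }
        length($3) > 50 { printf "line %d: SKU exceeds Char(50)\n", NR; bad = 1 }
        length($4) > 16 { printf "line %d: Serial exceeds Char(16)\n", NR; bad = 1 }
        END { exit bad + 0 }    # non-zero exit status if any check failed
    ' "$1"
}
```

Calling `check_fields somefile.csv` prints one message per problem and returns a non-zero status if anything failed, so it can be used directly in an `if`.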

I would appreciate it if someone could help me ASAP.

---------- Post updated at 07:54 AM ---------- Previous update was at 07:49 AM ----------


Hi All,

I also need to check that the file has a header, detail records and a footer as part of my validation.

Last edited by Scott; 01-30-2010 at 06:40 AM.. Reason: Removed formatting, added code tags
# 2  
Old 01-29-2010
We can't do the whole job for you, but here are some ideas.

1. Reject invalid file names:

Code:
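BASE=/Mydirectory
# Pattern follows the sample name VJA_AP_BRSIN_00001_2601201012316.csv
PATTERN='^[A-Z]\{3\}_[A-Z]\{2\}_[A-Z]\{5\}_[0-9]\{5\}_[0-9]\{13\}\.csv$'
# Reading find output line by line keeps paths with spaces intact
find "$BASE" -type f | while IFS= read -r file
do
  FN=`basename "$file"`
  # -q silences grep; \. matches a literal dot and $ anchors the end of the name
  if echo "$FN" | grep -q "$PATTERN"; then
       echo "$file is valid file"
  else
       echo "$file is invalid file"
#     rm "$file"
  fi
done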

2. For the file format and the other requests, you need to provide us with some samples first.
# 3  
Old 01-30-2010
Help me

Hi rdcwayx,

Thanks for your reply.

I don't have the exact file with me right now, sorry about that.
But I know the details of the file.

The file is a comma-separated .csv file with header, detail and footer records.

The file name is VJA_AP_BRSIN_00001_2601201012316.csv

For example, the file has 6 fields, which are numeric, string and date fields.
Code:
HSiteID,ICOMS Action,SKU,serial,edate
01,ABC,Pending,23,4,19951227120556
02,DM,Pending,26,5,19951227120556
03,RP,delivered,28,,19951227120556
T3

The field validation is more involved, with 4 checks. I worked on small scripts a few years back, so I am asking for someone to help me with this...

A) After loading the source file VJA_AP_BRSIN_00001_2601201012316.csv into the target file xxx.csv, I need to check that the number of records matches the record count in the target. (The trailer says 3, so it should match the record count of the target file xxx.csv.)
B) Data types and field value sizes should match.
That is, the first field (site ID) should be numeric, the 2nd field (ICOMS) a string, the 3rd a string, the 4th numeric, the 5th numeric and the 6th a date.
C) Check whether any NOT NULL field arrives as NULL.
If any of the required fields arrives as NULL, we should write a log entry and send a message saying that the file is invalid and must be corrected.
D) Check whether the file arrived with a header, detail records and a footer. If not, log it with a message such as "no footer" or "no header".
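
Checks A) and D) might be sketched like this. It assumes, based only on the sample above, that the header is the first line and starts with `H`, and that the trailer is the last line and consists of `T` followed by the detail record count. It compares the trailer count with the detail rows of the source file itself; comparing against the target xxx.csv would work the same way with that file's row count. `check_structure` is a made-up name.

```shell
# Sketch: check header/trailer presence and compare the trailer count
# with the number of detail records.
# Assumptions: header = first line starting with "H",
#              trailer = last line of the form T<count>.
check_structure() {
    awk '
        NR == 1 { first = $0 }
                { last = $0 }
        END {
            if (NR == 0) { print "empty file"; exit 1 }
            has_header  = (first ~ /^H/)
            has_trailer = (last ~ /^T[0-9]+$/)
            if (!has_header)  { print "no header"; bad = 1 }
            if (!has_trailer) { print "no footer"; bad = 1 }
            if (has_trailer) {
                expected = substr(last, 2) + 0      # number after the T
                details  = NR - 1 - has_header      # lines that are neither header nor trailer
                if (details != expected) {
                    printf "trailer count %d does not match %d detail records\n", expected, details
                    bad = 1
                }
            }
            exit bad + 0
        }
    ' "$1"
}
```

Redirecting the output, for example `check_structure file.csv >> validation.log` (the log name is only a placeholder), gives the log messages asked for in D).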

Thank you very much.

---------- Post updated at 11:59 AM ---------- Previous update was at 04:39 AM ----------

One more validation is needed:

E) The number of columns in the source file should match the expected count (e.g. 6 for the file above),
and I need to check 'n' files at a time, all placed in one folder: Srcfile.
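
A possible sketch for E), combined with the data type check from B), run over every .csv file in a folder. The assumptions here are mine: header and trailer lines look like the sample (`H...` and `T<count>`), the date is a 14-digit value such as 19951227120556, empty values are left to the separate NULL check, and the function names `check_types` and `check_dir` are invented for the example.

```shell
# Sketch: column count and data type check for one file.
# Assumed types: 1 numeric, 2 string, 3 string, 4 numeric, 5 numeric, 6 date (14 digits).
check_types() {
    awk -F, '
        function fail(msg) { printf "%s line %d: %s\n", FILENAME, NR, msg; bad = 1 }
        NR == 1 && /^H/ { next }    # skip header (assumed to start with H)
        /^T[0-9]+$/     { next }    # skip trailer (assumed T<count>)
        NF != 6 { fail("expected 6 columns, found " NF); next }
        $1 != "" && $1 !~ /^[0-9]+$/ { fail("column 1 is not numeric") }
        $4 != "" && $4 !~ /^[0-9]+$/ { fail("column 4 is not numeric") }
        $5 != "" && $5 !~ /^[0-9]+$/ { fail("column 5 is not numeric") }
        $6 != "" && ($6 !~ /^[0-9]+$/ || length($6) != 14) { fail("column 6 is not a 14-digit date") }
        END { exit bad + 0 }
    ' "$1"
}

# Run the check on every .csv file in the given folder.
check_dir() {
    status=0
    for f in "$1"/*.csv; do
        [ -f "$f" ] || continue
        if check_types "$f"; then
            echo "$f: OK"
        else
            echo "$f: INVALID"
            status=1
        fi
    done
    return $status
}
```

With the folder named in the post this would be called as `check_dir Srcfile`.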

Thanks,
vsmeruga

Last edited by Scott; 01-30-2010 at 06:44 AM.. Reason: Please use code tags
# 4  
Old 02-03-2010
Hi, could someone please help me with this?
# 5  
Old 02-09-2010
Hi All,

Can you help me find the NULLs in the existing list of columns?
In the 2nd row column 4 is NULL, and in the 3rd row column 5 is NULL.

My file data looks like this:

01,ABC,Pending,23,4,19951227120556
02,DM,Pending,,5,19951227120556
03,RP,delivered,28,,19951227120556

The file name is ATRPU_RP_ATU_00008_05022010125056.csv

Thanks,
vsmeruga
# 6  
Old 02-09-2010
Quote:
Originally Posted by vsmeruga
Hi All,

Can you help me find the NULLs in the existing list of columns?
In the 2nd row column 4 is NULL, and in the 3rd row column 5 is NULL.

My file data looks like this:

01,ABC,Pending,23,4,19951227120556
02,DM,Pending,,5,19951227120556
03,RP,delivered,28,,19951227120556

The file name is ATRPU_RP_ATU_00008_05022010125056.csv

Thanks,
vsmeruga

Code:
awk -F, '{
  # report every empty field with its row and column number
  for (i = 1; i <= NF; i++)
    if ($i == "") print "Row No " NR " and Column No " i " is null"
}' infile

# 7  
Old 02-10-2010
Thanks. I will try it and learn how it works.

---------- Post updated at 05:01 AM ---------- Previous update was at 04:40 AM ----------

Hi Malcome

Thanks for the quick reply. Let me rephrase my question.

Actually, 3 of the fields in the file are mandatory. I should never receive NULL values for those 3 fields.

Mandatory fields are: Field1, Field4, Field5

If I find any NULL values in those 3 fields, I need to write this message to the log: "Mandatory field : Field Num coming as NULL. File cannot be processed"
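
The awk one-liner from post #6 could be narrowed to the three mandatory fields along these lines. This is a sketch, not a tested production script: the function name `check_mandatory` is made up, the row number in the message is an addition of mine, and header/trailer lines of the form `H...` and `T<count>` are assumed and skipped.

```shell
# Sketch: report NULL (empty) values in the mandatory fields 1, 4 and 5.
check_mandatory() {
    awk -F, '
        BEGIN { n = split("1 4 5", req, " ") }   # mandatory field numbers
        NR == 1 && /^H/ { next }                 # skip header (assumed format)
        /^T[0-9]+$/     { next }                 # skip trailer (assumed format)
        {
            for (k = 1; k <= n; k++)
                if ($(req[k]) == "") {
                    printf "Mandatory field : Field %d coming as NULL (row %d). File cannot be processed\n", req[k], NR
                    bad = 1
                }
        }
        END { exit bad + 0 }
    ' "$1"
}
```

A call such as `check_mandatory file.csv >> validation.log || echo "file rejected"` would log the messages and let the caller react to the non-zero status; `validation.log` is only a placeholder name.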

---------- Post updated at 06:19 AM ---------- Previous update was at 05:01 AM ----------

Hi, I am trying to execute the script below, saved as interface_main_script.sh,
and I get the following error:

Code:
user prompt: ksh interface_main_script.sh
interface_main_script.sh[6]: syntax error at line 14 : `elif' unexpected

Please let me know what my mistake is.

interface_main_script.sh:

Code:
#!/bin/ksh
#set -x

BASE=/grid/PowerCenter/stage/velocity_r3/inbound/ATRPU

for file in `find $BASE -type f `
do
  FN=`basename $file`
  intname = $FN |grep "^[A-Z]\{5\}"  
  if $intname= 'ATRPU' then 
	echo "ATRPU Interface"
       #./AT_VELUNIX01_SRV01_inbound_ftp.sh ${intname}
  elif  $intname= 'ATRGI' then 
	echo "ATRGI Interface"
       #./AT_VELUNIX01_SRV01_inbound_ftp.sh ${intname}
  else
	echo "Not valid Interface Files"
   fi
done
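
The `elif' unexpected` message most likely comes from the two `if` lines: a test needs `[ ... ]` and a `;` (or newline) before `then`, otherwise ksh reads `then` as just another argument and the `if` never gets its `then`. The assignment line has a similar problem, since a shell assignment takes no spaces around `=` and needs command substitution to capture output. A corrected sketch follows; taking the interface name as the text before the first underscore is my assumption, and `classify_interface` and `scan_dir` are made-up names.

```shell
# Sketch: classify a file by its interface prefix (text before the first "_").
classify_interface() {
    fn=$(basename "$1")
    intname=${fn%%_*}          # e.g. ATRPU from ATRPU_RP_ATU_00008_...csv
    if [ "$intname" = "ATRPU" ]; then
        echo "ATRPU Interface"
    elif [ "$intname" = "ATRGI" ]; then
        echo "ATRGI Interface"
    else
        echo "Not valid Interface Files"
    fi
}

# Apply the classification to every regular file below a directory.
scan_dir() {
    find "$1" -type f | while IFS= read -r file; do
        classify_interface "$file"
    done
}
```

For the directory in the script above the call would be `scan_dir /grid/PowerCenter/stage/velocity_r3/inbound/ATRPU`.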


Last edited by Franklin52; 02-10-2010 at 07:24 AM.. Reason: Please use code tags!