Data file:
Config File:
The regex in the config can match anything, including numbers and dates in different formats.
The data file is read using the delimiter given in the config.
The validator checks each column against the regex specified for that column.
If any column value doesn't match, the row should be flagged with the rule number of the failing rule.
E.g.
DHF,345,ED3(doesn't begin with D),Rule3
If there are multiple failures, then:
123,4565,FGJG,Rule2;Rule
Which language would be best for this, awk or perl?
Examples would be appreciated.
---------- Post updated at 02:42 PM ---------- Previous update was at 02:32 PM ----------
After validation, we might also need to convert the columns to the format specified in the config.
Config File:
Data File Output:
Notice the Sal column has changed from 1234 to 1234.00.
Regards,
Dikesh Shah.
I don't get it.
You say the 3rd field in Rule* lines in your config file is a regular expression, but none of your input fields match any of your regular expressions. (You also don't say what type of regular expression, but since you mentioned awk I'll assume that you want extended regular expressions.)
Rule1's [:upper:] would match a single character from the set ":", "e", "p", "r", and "u"; not three uppercase characters. To match "ABC", "AYX", and "DHF", you would need an ERE something like ^[[:upper:]]{3}$.
Similarly, Rule2's [:digit:] would match a single character from the set ":", "d", "g", "i", and "t". To match the values you have in field 2, you would need an ERE similar to ^[[:digit:]]+$.
And Rule3's [D]* matches every string that contains zero or more copies of the letter "D". (In other words, the ERE in Rule3 will match EVERY input string.) If you're looking for a "D" followed by a single digit or by one or more digits, you would need EREs similar to ^D[[:digit:]]$ or ^D[[:digit:]]+$, respectively.
You could simplify this somewhat if you were to specify that all EREs are anchored at both ends (i.e., the "^" at the beginning of the ERE and the "$" at the end of the ERE are assumed and should not be explicitly mentioned).
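To make the corrected EREs concrete, here is a minimal sketch of how they might be applied in awk. It is not a full solution: the rules are hard-coded rather than read from the config file, the function name validate_rows is made up for illustration, and it assumes that RuleN checks column N of a comma-delimited file. The row "abc,12x,D7" in the usage comment is a hypothetical test value, not taken from the original data.

```shell
# Sketch only: rules are hard-coded; a real validator would load them from the config.
# validate_rows is a hypothetical helper name; it reads comma-delimited rows on stdin.
validate_rows() {
  awk -F, '
  BEGIN {
    OFS = ","
    # Assumed mapping: RuleN is checked against column N.
    # First ERE is the same as ^[[:upper:]]{3}$, spelled out because some older awks lack {n}.
    re[1] = "^[[:upper:]][[:upper:]][[:upper:]]$"
    re[2] = "^[[:digit:]]+$"
    re[3] = "^D[[:digit:]]+$"
    nrules = 3
  }
  {
    failed = ""
    # Collect the names of all failing rules, separated by ";".
    for (i = 1; i <= nrules; i++)
      if ($i !~ re[i])
        failed = failed (failed == "" ? "" : ";") "Rule" i
    if (failed != "")
      print $0, failed      # append the failed rule list as an extra field
    else
      print $0              # rows that pass are printed unchanged
  }'
}

# Usage:
#   printf 'DHF,345,ED3\nabc,12x,D7\n' | validate_rows
```

Under these assumptions a row that fails more than one rule gets all the rule names in one trailing field, which is the "Rule…;Rule…" style asked for in the first post.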
Then there is the question of how your format string works. In what formatting language does the format string "0.00" transform "1234" to "1234.00"? I could understand "%d.00", "%s.00", or "%.2f", but it seems to me that "0.00" should change any input string to the string "0.00".
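If the config were to carry printf-style formats, the conversion could look roughly like the sketch below. The helper name format_col and its two parameters (column number and format string) are assumptions for illustration; the thread does not say which column Sal is, so the column is passed in.

```shell
# Sketch only: format_col is a hypothetical helper.
#   $1 = column number to reformat (assumed), $2 = printf-style format such as %.2f
format_col() {
  awk -F, -v col="$1" -v fmt="$2" '
  BEGIN { OFS = "," }
  {
    $col = sprintf(fmt, $col)   # rewrite the chosen column using the format
    print                       # assigning to a field rebuilds the row with OFS
  }'
}

# Usage:
#   printf 'ABC,1234\n' | format_col 2 '%.2f'
```

With "%.2f" the value 1234 comes out as 1234.00, which is the change described for the Sal column.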
Also note that using "-" as the field delimiter in your Config file severely restricts the EREs and format strings users can easily specify. This is especially a problem if you ever want to match dates of the form YYYY-MM-DD. Do you have the ability to change the format of the Config file? If you use the same EREs and format strings that awk uses, it would be much better to use <tab> as your config file field delimiter (in awk, "\t" can be easily used in both EREs and format strings anywhere a <tab> character is needed).
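As a rough illustration of the tab-separated idea, the sketch below reads a config and then validates a data file in a single awk run. The config layout is purely an assumption: one line "delimiter<tab>," and rule lines of the form "RuleN<tab>column<tab>ERE". The helper name validate_with_config is also made up, and the sketch expects a delimiter line to be present since no default has been specified.

```shell
# Sketch only: validate_with_config and the config layout are hypothetical.
#   $1 = config file (tab-separated), $2 = data file
validate_with_config() {
  awk '
  BEGIN { FS = "\t" }                       # the config is split on tabs
  NR == FNR {                               # first file: load the config
    if ($1 == "delimiter")
      delim = $2
    else if ($1 ~ /^Rule/) {
      n++; name[n] = $1; col[n] = $2; re[n] = $3
    }
    next
  }
  FNR == 1 { FS = delim; $0 = $0 }          # data file: switch delimiter, re-split line 1
  {
    failed = ""
    for (i = 1; i <= n; i++)
      if ($(col[i]) !~ re[i])
        failed = failed (failed == "" ? "" : ";") name[i]
    out = $0
    if (failed != "")
      out = out delim failed                # flag the row with the failed rule names
    print out
  }' "$1" "$2"
}

# Usage:
#   validate_with_config rules.cfg data.csv
```

Because the config fields are split on tabs, an ERE such as ^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$ for YYYY-MM-DD dates can contain "-" without being cut apart.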
Are there any other commands (besides "delimiter" and "Rule*") allowed in a Config file? Is there a default delimiter if a "delimiter" command is not specified in a Config file?