Search Results

Search: Posts Made By: SriniShoo
40,097
Posted By SriniShoo
Since both are flat files with data patterns, it is possible with awk
40,097
Posted By SriniShoo
It wouldn't degrade the performance, but compare performing an operation on one 100 GB file at a time with performing the task on 1 GB files in parallel.
That's the beauty of Hadoop.
But it is up to you to choose where...
40,097
Posted By SriniShoo
If you are dumping the data into HDFS, I assume the data is very big.
In that case you can run the awk code as streams and it would run faster.
Anyway, I shall provide the code in some time as I...
40,097
Posted By SriniShoo
I didn't see any issue when I tested the code given above.
I suggest you check the input files carefully and retry.
40,097
Posted By SriniShoo
awk -F "," 'NR == FNR {h = (h == "") ? $1 : (h FS $1); gsub("[)(]", "-", $2); split($2, a, "-"); d[NR] = a[1]; l[NR] = a[2]; n[NR] = ($3 == "NOT NULL") ? 1 : 0; next}
FNR == 1 {print h >...
40,097
Posted By SriniShoo
These 2 lines are the logic part...
The 1st line is the assignment part for the schema file.
The 2nd line is the validation part against the schema.
40,097
Posted By SriniShoo
You had issues splitting the data type field in the schema file using the delimiters '(' or ')' (the issue depends on the Unix flavor).
Hence, I have converted the parentheses to '-'.
Now, your schema file datatype...
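The parenthesis-to-dash conversion described here can be tried on a single schema line. This is only a minimal sketch, assuming the "name,Type(length),NOT NULL" layout seen in the conf sample of this thread; the variable names `line` and `parsed` are made up for illustration and are not from the original post.

```shell
# Hypothetical single schema line, laid out like the conf sample in this thread.
line='id,Numeric(2),NOT NULL'

# Turn '(' and ')' into '-', then split on '-' to separate type and length.
parsed=$(printf '%s\n' "$line" | awk -F "," '{
    gsub("[)(]", "-", $2)                  # Numeric(2) becomes Numeric-2-
    split($2, a, "-")                      # a[1] = type, a[2] = length
    flag = ($3 == "NOT NULL") ? 1 : 0      # 1 when the column is mandatory
    print a[1], a[2], flag
}')
echo "$parsed"
```

Using a bracket expression for the substitution avoids escaping the parentheses, which is where awk flavors tend to differ.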
40,097
Posted By SriniShoo
I ran the same code and below is the output
$ cat conf
id,Numeric(2),NOT NULL
name,String(20)
state,String(10),NOT NULL
street_No,Numeric(4)
$ cat data
abc,john,MI,201
22,Lilly,CA,405...
40,097
Posted By SriniShoo
awk -F "," 'NR == FNR {gsub("[)(]", "-", $2); split($2, a, "-"); d[NR] = a[1]; l[NR] = a[2]; n[NR] = ($3 == "NOT NULL") ? 1 : 0; next}
{for(i = 1; i <= NF; i++)
{if(((d[i] == "Numeric" && (($i +...
40,097
Posted By SriniShoo
Verified this code and it provided the expected output for the given input
awk -F "," 'NR == FNR{split($2, a, "\)|\("); d[NR] = a[1]; l[NR] = a[2]; n[NR] = ($3 == "NOT NULL") ? 1 : 0; next}
...
40,097
Posted By SriniShoo
post your questions here
40,097
Posted By SriniShoo
Are you sure cf and df have the same content as provided in the previous post?
40,097
Posted By SriniShoo
Hey Shree, you can't keep changing the requirement.
In your first code, there was a header, and hence I wrote the code to meet that requirement.

With the data you gave recently, I ran the below...
40,097
Posted By SriniShoo
The code given above is updated for Numeric and String.
Tell me if that doesn't work.
40,097
Posted By SriniShoo
In the first example you gave, the types were "integer" and "string".
The code is written to check for "integer" and "string", not "Numeric" and "String". Even case is important.
You can change the...
40,097
Posted By SriniShoo
In the below code, conf.csv is the name of schema file and file.csv is the name of data file
awk -F "," 'NR == FNR {a[NR] = $2; next}
FNR == 1 {print $0 > "goodrec"; print $0 > "badrec"}
FNR >...
40,097
Posted By SriniShoo
You will need to provide both the files:
schema file first and data file next.
40,097
Posted By SriniShoo
If both the schema file (conf) and the data file (file) are comma separated,
the code below will do the job and send eligible records to a file "goodrec" and bad data to "badrec".

awk -F "," 'NR == FNR {a[NR]...
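Since the command in this listing is cut off, here is a self-contained sketch of the same idea, not the poster's actual code. It assumes the conf layout shown earlier in the thread, a data file without a header, and three made-up data rows; the rules (length limit, digits-only for Numeric, empty allowed unless NOT NULL) are assumptions chosen for illustration.

```shell
# Work in a scratch directory so the sample files do not clobber anything.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Schema: column name, Type(length), optional NOT NULL flag.
cat > conf <<'EOF'
id,Numeric(2),NOT NULL
name,String(20)
state,String(10),NOT NULL
street_No,Numeric(4)
EOF

# Hypothetical data rows, no header line.
cat > data <<'EOF'
abc,john,MI,201
22,Lilly,CA,405
7,,TX,12345
EOF

awk -F "," '
NR == FNR {                              # first file: load the schema
    gsub("[)(]", "-", $2)
    split($2, a, "-")
    type[NR] = a[1]
    len[NR] = a[2] + 0                   # force a numeric length
    notnull[NR] = ($3 == "NOT NULL") ? 1 : 0
    next
}
{                                        # second file: validate each record
    ok = 1
    for (i = 1; i <= NF; i++) {
        if ($i == "") { if (notnull[i]) ok = 0; continue }
        if (length($i) > len[i]) ok = 0
        if (type[i] == "Numeric" && $i !~ /^[0-9]+$/) ok = 0
    }
    print > (ok ? "goodrec" : "badrec")
}' conf data
```

The schema file has to come first on the command line, because the `NR == FNR` block only fires while awk is reading its first input file.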
Showing results 1 to 18 of 18

 
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.