Checking for the file existence


 
# 1  
Old 05-07-2014

Hi,

I have written a script that validates a data file against a configuration file and moves the validated good and bad records into HDFS.
Suppose I receive another data file 15 minutes later. After validation, its good and bad records should also be stored in Hadoop, with a timestamp attached to the file names.
So whenever a new data file arrives, the good and bad records from its validation are stored in Hadoop under timestamped file names.
How can this be done in a Unix script?

Below is the code written so far:
Code:
#!/bin/bash 

awk -F "," 'NR == FNR{
h = (h == "") ? $1 : (h FS $1); 
gsub("[)(]", "-", $2);
split($2, a, "-"); 
d[NR] = a[1]; l[NR] = a[2]; n[NR] = ($3 == "NOT NULL") ? 1 : 0; next}  
FNR == 1 {print h > "goodrec"; print h > "badrec"} 
{
  for(i = 1; i <= NF; i++)
  {
   
     if(((d[i] == "Integer" && (($i + 0) == $i || $i == "")) ||  (d[i] == "String" && ($i + 0) != $i) || (d[i] == "Char"  && ($i + 0) != $i)) && (length($i) <= l[i])  && (length($i) >= n[i]))
      {f = 1} else {f = 0};
        if(f == 0) {print $0 > "badrec"; b++; next}
  }
    print $0 > "goodrec"; g++
}
END {
print "Parsing Success!";
print "Validated records are found on the Hadoop Path \"/user/hduser/Dataparse\""
    }' configfile.txt datafile.txt d1.txt

#Loading good and bad records on HDFS
hadoop fs -put /home/hduser/saptha/validate/badrec /user/hduser/Dataparse/
hadoop fs -put /home/hduser/saptha/validate/goodrec /user/hduser/Dataparse/

So I want a separate goodrec and badrec file for each data file, named like this:
Code:
goodrec_timestamp
badrec_timestamp

Thanks,
Shree
# 2  
Old 05-07-2014
So where are you stuck?
What about the data file that has been processed? Do you move it to some other location?
What are the criteria? Is it a fixed filename?
# 3  
Old 05-07-2014
Hi,

In the above code, just ignore the file d1.txt. The code is working perfectly fine. Suppose I get a second data file some time later: I want the good and bad records for that second file to be stored under a different name. It should not overwrite the existing validated files. So each time it should create goodrec and badrec files with a timestamp attached to the filename.

Below are the configfile and datafile:
configfile.txt:
Code:
id,Integer(2),NOT NULL
name,String(20)
state,String(5),NOT NULL
phone_no,Integer(4)
gender,Char(1)

datafile.txt
Code:
1,John,MI,4589,M
2,Lilly,FL,589,F
3,Richard,CA,2212,M
4,Cruse,VA,2222,M
5a,Taylor,,5888,M
6,Merry,TX,6969,F
7,,CO,5656,F
8,Tom,AL,5555,M
9,Sam,FL,2586,M
10,8888,OK,456
11,George11,MI,5555,M
12,Reet,MI,4589,M
13,8888a,FL,5899,F
14,Meera,NY,2546,F
15,Madav,,4454,M
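
A timestamp is one way to keep names unique. Another, sketched here purely as an illustration and not something posted in this thread, is to check whether the target name already exists before writing to it. `safe_name` is a hypothetical helper, and the `_1`, `_2` suffix scheme is an assumption.

```shell
#!/bin/bash
# safe_name (hypothetical helper): print a file name that does not exist yet.
# If the plain name is taken, try name_1, name_2, ... until one is free.
safe_name() {
    local base=$1 name=$1 n=1
    while [ -e "$name" ]; do
        name="${base}_${n}"
        n=$((n + 1))
    done
    printf '%s\n' "$name"
}
```

For example, `GOOD=$(safe_name goodrec)` could be passed into awk with `-v GOOD="$GOOD"`. The check and the later write are not atomic, so this only suits a single script run at a time.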

# 4  
Old 05-07-2014
What is the expected timestamp format?
If it is MMDDYYYYHHMi:
Code:
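awk -F "," -v DT="$(date +%m%d%Y%H%M)" '
# Build the timestamped output file names once.
BEGIN {GOOD = "goodrec_" DT; BAD = "badrec_" DT}
# First file (configfile.txt): collect the header names, and per column
# the declared type (d), maximum length (l) and NOT NULL flag (n).
NR == FNR {
  h = (h == "") ? $1 : (h FS $1)
  gsub("[)(]", "-", $2)
  split($2, a, "-")
  d[NR] = a[1]; l[NR] = a[2]; n[NR] = ($3 == "NOT NULL") ? 1 : 0
  next
}
# Start of each data file: write the header line to both output files.
FNR == 1 {print h > GOOD; print h > BAD}
# Data records: the first field that fails its type, length or NOT NULL
# check sends the whole record to BAD; otherwise it goes to GOOD.
{
  for (i = 1; i <= NF; i++)
  {
    if (((d[i] == "Integer" && (($i + 0) == $i || $i == "")) || (d[i] == "String" && ($i + 0) != $i) || (d[i] == "Char" && ($i + 0) != $i)) && (length($i) <= l[i]) && (length($i) >= n[i]))
      {f = 1} else {f = 0};
    if (f == 0) {print $0 > BAD; b++; next}
  }
  print $0 > GOOD; g++
}
END {
  print "Parsing Success!"
  print "Validated records are found on the Hadoop Path \"/user/hduser/Dataparse\""
}' configfile.txt datafile.txt d1.txt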

# 5  
Old 05-07-2014
Yes, MMDDYYYYHHMi is also fine for me. I'm not looking for a specific timestamp format. I just want a timestamp in the filename so that the good and bad record files for different data files can be told apart by time.
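
Since any format is acceptable, a year-first stamp that includes seconds is one option. This is an assumption, not something requested in the thread: such names sort chronologically in a plain listing, and two files validated within the same minute do not collide. `make_stamp` is a hypothetical helper name.

```shell
#!/bin/bash
# make_stamp (hypothetical helper): YYYYMMDDHHMMSS, e.g. for goodrec_<stamp>.
make_stamp() {
    date +%Y%m%d%H%M%S
}

DT=$(make_stamp)
echo "goodrec_${DT}"
echo "badrec_${DT}"
```

The same `DT` value can be handed to awk with `-v DT="$DT"` instead of calling `date` on the awk command line.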
# 6  
Old 05-07-2014
Try the code given above.
# 7  
Old 05-08-2014
Hi Srini,

The above code works perfectly. On the local system it creates the goodrec and badrec files with timestamps. But to store the files on HDFS, do I need to rewrite the code below every time for each new pair of good and bad record files? Can this be handled in the Unix script?
Code to store data on HDFS:
Code:
hadoop fs -put /home/hduser/saptha/validate/badrec /user/hduser/Dataparse/
hadoop fs -put /home/hduser/saptha/validate/goodrec /user/hduser/Dataparse/

So how can I pass the timestamped goodrec and badrec file names to the commands above?
Can it be done?

Thanks,
Shree
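
One possible approach, offered as a sketch rather than a tested answer from this thread: build the timestamp once in the shell, give it to awk with `-v`, and reuse the same variable in the `hadoop fs -put` commands. The paths are the ones quoted in the posts above; `upload_records`, `LOCAL_DIR` and `HDFS_DIR` are hypothetical names introduced for illustration.

```shell
#!/bin/bash
# Sketch only: paths are the ones quoted in this thread; adjust as needed.
LOCAL_DIR=/home/hduser/saptha/validate
HDFS_DIR=/user/hduser/Dataparse/

# Build the timestamp once so awk and hadoop agree on the file names.
DT=$(date +%m%d%Y%H%M)

# upload_records (hypothetical helper): put both timestamped files on HDFS.
# Returns non-zero if either local file is missing or a put fails.
upload_records() {
    local dt=$1 kind file
    for kind in goodrec badrec; do
        file="$LOCAL_DIR/${kind}_${dt}"
        if [ ! -f "$file" ]; then
            echo "missing: $file" >&2
            return 1
        fi
        hadoop fs -put "$file" "$HDFS_DIR" || return 1
    done
}

# In the real script, run the awk command from post #4 with
#   -v DT="$DT"
# in place of the inline $(date ...) call, and then:
#   upload_records "$DT"
```

Computing `DT` in the shell rather than on the awk command line keeps the local and HDFS names in step even if the minute changes between validation and upload.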
