I have a requirement to read only unread files from a directory and load them into a database.
Scenario: I receive a batch of files in my Unix directory every 15 minutes. My ETL process runs once a day, reads the files, and loads them into a DB table. I cannot move these files to a different location after extraction, because the source system FTPs the previous 15 days of files, so I would receive any file that is missing again. I therefore have to keep the files for at least 15 days.
Could you please advise how to write a script that reads only the unread files?
1. Rename the file after it is loaded into the db by adding a suffix, say .processed. Use this suffix to identify files that have already been read.
2. Create a list file containing the names of the files that have been processed (i.e., loaded into the db). Whenever you load a file into the db, add its name to this list file.
Then use this list file to tell read files from unread ones.
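Option 2 could be sketched roughly as follows. This is only a minimal sketch: the function names, the *.csv pattern, the paths, and the load_into_db step in the usage comment are all assumptions made for illustration, not something stated in the thread.

```shell
#!/bin/sh
# list_unprocessed DIR LIST
# Print the names of *.csv files in DIR that are not yet recorded in LIST.
# (The *.csv pattern is an assumption; adjust it to the real file names.)
list_unprocessed() {
    dir=$1
    list=$2
    [ -f "$list" ] || : > "$list"          # create an empty list on the first run
    for f in "$dir"/*.csv; do
        [ -e "$f" ] || continue            # the glob matched nothing
        name=${f##*/}                      # strip the directory part
        # -x: whole-line match, -F: fixed string, -q: quiet
        grep -qxF -- "$name" "$list" || printf '%s\n' "$name"
    done
}

# mark_processed NAME LIST
# Record NAME in LIST once it has been loaded successfully.
mark_processed() {
    printf '%s\n' "$1" >> "$2"
}

# Hypothetical usage (load_into_db stands for your own ETL step):
#   list_unprocessed /data/incoming /data/processed.list |
#   while IFS= read -r name; do
#       load_into_db "/data/incoming/$name" &&
#           mark_processed "$name" /data/processed.list
#   done
```

Because a name is recorded only after its load succeeds, a file whose load fails is picked up again on the next run.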
The 1st option may not be possible: if I rename the file, I will receive the same file again from the source, since they FTP the last 15 days of files.
2nd option: I don't understand how to identify processed and unprocessed files (the code is the difficulty, as I am not very good at Unix).
I've come up with one approach:
First, I create a file (FILE.ALL) listing all the files in the directory.
Second, I take the difference between FILE.ALL and FILE.BACKUP and write it to FILE.LIST.
After processing all the files in FILE.LIST, I append the contents of FILE.LIST to FILE.BACKUP.
So far so good. Now I need to remove from FILE.BACKUP the filenames that are older than 15 days.
The filename has the date in it, e.g. Filename_093013_xxxx.csv.
Could someone advise whether this is a good approach, and how to remove filenames from FILE.BACKUP by comparing dates?
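The FILE.ALL / FILE.BACKUP / FILE.LIST flow and the date-based pruning might look something like the sketch below. It rests on several assumptions that are not confirmed in the thread: that the date field is MMDDYY (so 093013 is 30 September 2013), that it is always the second underscore-separated field, that the century is 20xx, and that FILE.BACKUP already exists. The function names are made up for illustration.

```shell
#!/bin/sh
# new_files DIR BACKUP
# Print the .csv names in DIR that are not listed in BACKUP
# (the "FILE.ALL minus FILE.BACKUP" step, without a temporary FILE.ALL).
new_files() {
    ( cd "$1" && ls | grep '\.csv$' ) | grep -vxFf "$2"
}

# prune_backup BACKUP CUTOFF
# Keep only the names whose embedded date is on or after CUTOFF (YYYYMMDD).
# ASSUMPTION: names look like Filename_MMDDYY_xxxx.csv, years are 20YY.
prune_backup() {
    awk -F_ -v cutoff="$2" '{
        d = $2                                             # e.g. 093013
        ymd = "20" substr(d, 5, 2) substr(d, 1, 2) substr(d, 3, 2)
        if ((ymd "") >= (cutoff "")) print                 # compare as strings
    }' "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

# Hypothetical daily run (date -d is a GNU date extension):
#   new_files /data/incoming FILE.BACKUP > FILE.LIST
#   ... load every file named in FILE.LIST ...
#   cat FILE.LIST >> FILE.BACKUP
#   prune_backup FILE.BACKUP "$(date -d '15 days ago' +%Y%m%d)"
```

Rewriting the date as YYYYMMDD lets a plain string comparison stand in for a date comparison, which avoids parsing dates in the shell.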
I don't want to receive the files again; that is why I keep all the processed files in the same directory. I run a housekeeping job on files that were received more than 15 days ago.
If I understand you correctly, you have a Unix box to which another node is FTP'ing files regularly. You have a process on this Unix box which needs to read these files, but not the ones already processed. I assume that you can process these files in the chronological order in which they are received? If so, here's another option.
At the end of your processing job, put a command (for example, touch timestamp) to create a file called "timestamp" at the time of the process run.
At the start of the job, put a command (for example, find . -newer timestamp) to select only the files created (FTP'd onto the box) since the last run finished.
That way, all the historical files can be left in the directory and not be selected for processing.
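A minimal sketch of the timestamp idea, assuming the marker file lives outside the incoming directory and that the modification time of each file reflects when it arrived on the box. The function name and the paths in the usage comment are hypothetical.

```shell
#!/bin/sh
# files_since_last_run DIR STAMP
# Print the regular files under DIR modified after STAMP was last touched.
# On the very first run (no STAMP yet) every file is printed.
files_since_last_run() {
    if [ -f "$2" ]; then
        find "$1" -type f -newer "$2"
    else
        find "$1" -type f
    fi
}

# Hypothetical job outline:
#   files_since_last_run /data/incoming /data/state/timestamp > /tmp/todo.list
#   ... load every file named in /tmp/todo.list ...
#   touch /data/state/timestamp      # record when this run finished
```

One thing to keep in mind: a file that arrives while the job is running gets a modification time earlier than the final touch, so it would be skipped by the next run. A possible variation is to touch a second marker at the start of the job and rename it over "timestamp" only after the run succeeds.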
The above assumes that I have understood you completely; if not, do post back with the issues.
[a] access (read the file's contents) -atime
[b] change the status (modify the file or its attributes) -ctime
[c] modify (change the file's contents) -mtime
If your files are read-only, you can compare the atime and the mtime: a file whose atime is later than its mtime has been read since it was written.
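As a rough illustration of that comparison, the sketch below uses GNU stat (the -c option is not available in BSD or macOS stat). The function name is made up, and the whole approach assumes that the filesystem really records access times (it will not on a noatime mount) and that nothing other than the loader reads the files.

```shell
#!/bin/sh
# read_since_modified FILE
# Succeeds (exit 0) if FILE's access time is later than its modification
# time, i.e. the file appears to have been read after it was last written.
# ASSUMPTION: GNU stat; %X = atime, %Y = mtime, both in epoch seconds.
read_since_modified() {
    atime=$(stat -c %X "$1") || return 2
    mtime=$(stat -c %Y "$1") || return 2
    [ "$atime" -gt "$mtime" ]
}

# Hypothetical usage: print the files that still look unread.
#   for f in /data/incoming/*.csv; do
#       read_since_modified "$f" || printf '%s\n' "$f"
#   done
```

Backups, virus scanners, or a stray cat or grep also update the atime, so this check is more fragile than keeping an explicit list of processed files.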