Extract based on timestamp


 
# 8  
Old 12-27-2015
Hi Don,
It is working fine.

Thanks,
Mohan

---------- Post updated at 04:01 AM ---------- Previous update was at 02:57 AM ----------

Hi Don,
Your script is working fine and I am getting the required files, but I want to copy those files. I tried adding another pipe to do the copy, but it throws an error.

I am able to copy the files from S3 to S3 (all files, without using the filter) with the command below:
Code:
aws s3  cp $folder/  $target  --recursive

Moderator's Comments:
Mod Comment Please use CODE tags to display code; not bold tags.


Code:
logdir=/home/hadoop/logs/
logfile=segment_$DATE2_inc.logDATE2=$(date --date='1 day ago' +%Y-%m-%d)
date --date='1 day ago' -u -d  "$(date --date='1 day ago' -u "+%a %b %e 00:00:00 %Z %Y")" +%s000 >unixtime 

x=`eval cat unixtime`

echo "processing of $DATE2 files with Unix timestamp $x " >$logdir/$logfile
 
folder="s3://xx-logs/JaZ/$x"
echo "Source Path of S3 bucket $folder">>$logdir/$logfile
echo $folder
target="s3://xx/test/"
chk="$DATE2 20:30"
#Using another pipe to copy only the required files
aws s3 ls $folder/|awk -v start="$chk" '$1" "$2 > start'|aws s3 cp *.gz $target --recursive
#aws s3 cp $test  $target  --recursive
#aws s3  cp $folder/|awk -v start='$DATE2 20:30' '$1 " " $2 > start { print $0 }'/  $target  --recursive


[hadoop@ip-172-31-19-240 ~]$ ./xx_process_inc.sh
s3://xx-logs/JaZ/1451088000000

2015-12-26 20:39:56     334080 file1.gz
2015-12-26 20:44:08     320179 file2.gz
2015-12-26 20:44:13     316953  file3.gz
2015-12-26 21:42:07     305313  file4.gz
2015-12-27 00:42:06     189541  file5.gz
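As an aside, the folder name 1451088000000 above is just the epoch-millisecond timestamp for 00:00:00 UTC on the previous day. Assuming GNU date (which the script already relies on), the same value can be computed in a single call, without the temporary unixtime file or the eval:
Code:
# epoch milliseconds for yesterday at 00:00:00 UTC; %s000 appends three literal zeros
x=$(date -u --date='yesterday 00:00' +%s000)   # e.g. 1451088000000 on 2015-12-27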

Thanks,
Mohan

Last edited by Don Cragun; 12-27-2015 at 11:58 PM.. Reason: Change B tags to CODE tags.
# 9  
Old 12-28-2015
Quote:
Originally Posted by mohan705
Hi Don,
Your script is working fine and I am getting the required files, but I want to copy those files. I tried adding another pipe to do the copy, but it throws an error.

I am able to copy the files from S3 to S3 (all files, without using the filter) with the command below:
Code:
aws s3  cp $folder/  $target  --recursive

Moderator's Comments:
Mod Comment Please use CODE tags to display code; not bold tags.


Code:
logdir=/home/hadoop/logs/
logfile=segment_$DATE2_inc.logDATE2=$(date --date='1 day ago' +%Y-%m-%d)
date --date='1 day ago' -u -d  "$(date --date='1 day ago' -u "+%a %b %e 00:00:00 %Z %Y")" +%s000 >unixtime 

x=`eval cat unixtime`

echo "processing of $DATE2 files with Unix timestamp $x " >$logdir/$logfile
 
folder="s3://xx-logs/JaZ/$x"
echo "Source Path of S3 bucket $folder">>$logdir/$logfile
echo $folder
target="s3://xx/test/"
chk="$DATE2 20:30"
#Using another pipe to copy only the required files
aws s3 ls $folder/|awk -v start="$chk" '$1" "$2 > start'|aws s3 cp *.gz $target --recursive
#aws s3 cp $test  $target  --recursive
#aws s3  cp $folder/|awk -v start='$DATE2 20:30' '$1 " " $2 > start { print $0 }'/  $target  --recursive


[hadoop@ip-172-31-19-240 ~]$ ./xx_process_inc.sh
s3://xx-logs/JaZ/1451088000000

2015-12-26 20:39:56     334080 file1.gz
2015-12-26 20:44:08     320179 file2.gz
2015-12-26 20:44:13     316953  file3.gz
2015-12-26 21:42:07     305313  file4.gz
2015-12-27 00:42:06     189541  file5.gz

Thanks,
Mohan
I will assume that the DATE2=$(date --date='1 day ago' +%Y-%m-%d) assignment shown above is actually on a line by itself, and is not at the end of the logfile= line as shown...

I am always glad to hear that code that I have suggested is working. And, disappointed to see that many of the code improvements I suggested have been ignored.

Showing us a command that works is nice. Saying that when you modify it you get an error, without showing us what error(s) you get, leaves us with a lot of work to try to duplicate your environment so we can figure out what error you might be getting (or just leaves us with little desire to try to help you). Since most of us volunteering our time to help you here don't have Amazon Web Services accounts, the only way for us to figure out what is going wrong in your environment is for you to explicitly show us exactly what errors you are getting, and exactly what is produced by each step in your script, so we can see what is happening to the data as it passes through and where things are going wrong. If you want our help, please help us help you by giving us the data we need to analyze your problem(s).

It is immediately obvious that when you pipe a list of long-format aws ls output to a command that is not a filter (i.e., does not read from standard input), the command at the start of the pipeline will be ignored (unless it contains a syntax error that kills the entire pipeline). Even if that did work, treating a line selected and printed by awk, such as:
Code:
2015-12-26 20:44:13     316953  file3.gz

as the name of a file is clearly never going to work.
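The first problem is easy to demonstrate in isolation, with no AWS account needed, by piping text into any command that never reads standard input:
Code:
# pwd does not read standard input, so the piped list is silently discarded:
printf 'file1.gz\nfile2.gz\n' | pwd
# prints only the current working directory; the two file names go nowhere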

After looking up an aws cp man page, it would appear that you want to create an aws cp command something like:
Code:
aws s3 cp "$folder/" "$target" --recursive --exclude '*' \
    $(aws s3 ls $folder/ |
        awk -v start="$chk" '
            $1" "$2 > start{
                printf(" --include \"%s\"", $NF)
            }
            END{print ""}
        ')

using command substitution (not a pipeline) to add arguments to your aws cp command line. But, without actually seeing the exact format of the output produced by:
Code:
aws s3 ls $folder/

the error messages you got running the last script you showed us, and the exact output you get from:
Code:
aws s3 ls $folder/|
    awk -v start="$chk" '$1" "$2 > start{printf(" --include \"%s\"", $NF)};END{print ""}'

(all in code tags), this is just wild speculation.
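For illustration only: given the sample listing in the previous post and chk set to 2015-12-26 20:30, the intent is for that command substitution to build a single command along these lines (the exact arguments depend on the aws ls output format):
Code:
aws s3 cp "s3://xx-logs/JaZ/1451088000000/" "s3://xx/test/" --recursive --exclude '*' \
    --include "file1.gz" --include "file2.gz" --include "file3.gz" \
    --include "file4.gz" --include "file5.gz"

The --exclude '*' first removes everything from consideration, and each generated --include then adds back one file that passed the timestamp test.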
# 10  
Old 12-28-2015
Hi Don,
Thanks for your advice. By the way, I changed the code as you suggested; apologies for not publishing it in the post. The code below is working fine. I will check your script.

Code:
chk="$DATE2 20:30"
 

aws s3 ls $folder/|awk -v start="$chk" '$1" "$2 > start'|awk '{print $NF }'>/home/hadoop/file.txt

for file in `cat /home/hadoop/file.txt`; do
aws s3 cp $folder/$file $target
done

Thanks
# 11  
Old 12-28-2015
Quote:
Originally Posted by mohan705
Hi Don,
Thanks for your advice. By the way, I changed the code as you suggested; apologies for not publishing it in the post. The code below is working fine. I will check your script.

Code:
chk="$DATE2 20:30"
 

aws s3 ls $folder/|awk -v start="$chk" '$1" "$2 > start'|awk '{print $NF }'>/home/hadoop/file.txt

for file in `cat /home/hadoop/file.txt`; do
aws s3 cp $folder/$file $target
done

Thanks
You're welcome. (Note that hitting the Thanks button at the lower left corner of posts that you find helpful is also appreciated.)

Again, if you don't need the file /home/hadoop/file.txt elsewhere, your script will run faster if you don't create an unneeded file, write data to that file, and read the data back from that file. And feeding the output of an awk script into another awk script (especially when they're both using the same field separators) is almost always a waste of resources. For example:
Code:
chk="$DATE2 20:30"

aws s3 ls $folder/ |
awk -v start="$chk" '$1" "$2 > start{print $NF}' |
while read -r file; do
    aws s3 cp $folder/$file $target
done

will produce the same output as your script above, but run faster and consume fewer system resources.
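If the file names never contain whitespace, a variant with the same effect replaces the while loop with xargs, running one aws s3 cp per selected file:
Code:
aws s3 ls $folder/ |
awk -v start="$chk" '$1" "$2 > start{print $NF}' |
xargs -I{} aws s3 cp "$folder/{}" "$target"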
# 12  
Old 12-28-2015
Yes Don, it produces the same output, and the script ran faster compared to the previous version.

Thanks