remember processed files


 
# 1  
Old 09-24-2009

Hello dear community!

I have the following task to accomplish: there is a directory with approximately two thousand files. I have to write a script that randomly extracts 200 files on the first run. On the second run it should again extract 200 files, but those files must not overlap with the ones extracted during the first run. So I have to remember the names (or perhaps the inodes) of the files already extracted.

What do you think is the best way to do that? So far my plan is to create a new file with a list of the inodes of already extracted files. On subsequent runs of my script I'll then check whether the inodes of the randomly chosen files are already in the list. What do you think about this approach? Are there other, perhaps more elegant, ways to remember (or mark) which files have already been extracted?
# 2  
Old 09-24-2009
I don't have code for you, but you could just take a list of all the files and randomize it once. Then take the first 200, then the next 200, etc.
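
A minimal sketch of that idea, not tested against the original poster's setup. The function names, the list file and the offset file are all made up for illustration, and it assumes file names contain no newline characters. The shuffle uses plain awk and sort so it does not depend on any particular shuffle utility being installed.

```shell
#!/bin/sh
# Sketch only: make_shuffled_list and next_batch are hypothetical helper names.

# make_shuffled_list DIR LIST
# Write the names found in DIR to LIST in random order. Run this once.
# Keep LIST outside DIR so the list file does not show up in its own listing.
make_shuffled_list() {
    ls -1 "$1" |
        awk 'BEGIN { srand() } { printf "%.10f\t%s\n", rand(), $0 }' |
        sort -n | cut -f2- > "$2"
}

# next_batch LIST OFFSET_FILE SIZE
# Print the next SIZE names from LIST. OFFSET_FILE remembers, between runs,
# the number of the first line that has not been handed out yet.
next_batch() {
    list=$1 offset_file=$2 size=$3
    [ -r "$offset_file" ] || echo 1 > "$offset_file"
    from=$(cat "$offset_file")
    till=$((from + size - 1))
    sed -n "${from},${till}p" "$list"
    echo $((till + 1)) > "$offset_file"
}
```

Called as `make_shuffled_list /some/dir files.list` once, and then `next_batch files.list offset.txt 200` on every run, each run prints a batch that cannot overlap with earlier ones because the list is only ever read forwards.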
# 3  
Old 09-24-2009
Can you mv the files to a new directory?

Can you cp the files to a new directory and then diff the 'ls -1' output of the two directories?
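
If moving is allowed, something like the following might do. It is only a sketch: `extract_batch` and its arguments are hypothetical names, it assumes file names without newlines, and it assumes the target directory is not inside the source directory. Whatever is still in the source directory has, by definition, not been extracted yet, so nothing needs to be remembered.

```shell
#!/bin/sh
# Sketch only: extract_batch is a hypothetical helper name.

# extract_batch SRC DONE_DIR COUNT
# Move COUNT randomly chosen files from SRC to DONE_DIR and print their names.
extract_batch() {
    src=$1 done_dir=$2 count=$3
    mkdir -p "$done_dir"
    ls -1 "$src" |
        awk 'BEGIN { srand() } { printf "%.10f\t%s\n", rand(), $0 }' |
        sort -n | cut -f2- | head -n "$count" |
        while IFS= read -r name; do
            # Only report a name if the move actually succeeded
            mv -- "$src/$name" "$done_dir/" && printf '%s\n' "$name"
        done
}
```

Each call such as `extract_batch /some/dir /some/done 200` can only pick from files that earlier calls left behind.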
# 4  
Old 09-25-2009
What you need is not remembering file names.
You need the count of files.
Try this.

Code:
 
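#-- Work in a directory away from the original files, so that
#-- counter.txt and dummyy.txt do not end up in the listing
cd /dum/dumma/here/ || exit 1

#-- counter.txt holds the number of the first line not yet processed
if [ ! -r counter.txt ] ; then
    echo "1" > counter.txt
fi

typeset -i from=$(<counter.txt)
typeset -i till=$((from + 199))

#-- If you want, you can merge this line with "| sed"
#-- But this way, you have your own advantages
#-- /path/to/org/files is a placeholder for the directory with the original files
ls -1 /path/to/org/files > dummyy.txt

sed -n "${from},${till}p" dummyy.txt | do_some_thing.sh
echo $((till + 1)) > counter.txt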


Last edited by edidataguy; 09-25-2009 at 12:15 AM..
# 5  
Old 09-25-2009
Quote:
Originally Posted by sidorenko
What do you think about this approach
I'd create a numbered list of the given files (see code below), then use those numbers for random selection, and finally remove entries from the list as the corresponding files are extracted ...

Code:
ls -1 "$FOLDER" | nl -n ln > files.list
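
One possible way to do the selection and removal step on such a list, as a rough sketch. `pick_from_list` is a hypothetical name, and the code assumes the list has the `nl` layout of number, tab, file name. Rather than drawing random numbers directly, it shuffles the numbered entries, hands out the first few, and writes the rest back in numeric order.

```shell
#!/bin/sh
# Sketch only: pick_from_list is a hypothetical helper name.

# pick_from_list LIST COUNT
# Print COUNT randomly chosen file names from the numbered LIST
# and rewrite LIST without the chosen entries.
pick_from_list() {
    list=$1 count=$2
    shuffled=$(mktemp) || return 1
    # Prefix each entry with a random key, sort on it, drop the key again
    awk 'BEGIN { srand() } { printf "%.10f\t%s\n", rand(), $0 }' "$list" |
        sort -n | cut -f2- > "$shuffled"
    # The first COUNT shuffled entries are the selection: print only the names
    head -n "$count" "$shuffled" | cut -f2-
    # Everything else stays in the list, restored to numeric order
    tail -n +"$((count + 1))" "$shuffled" | sort -n > "$list"
    rm -f "$shuffled"
}
```

Running `pick_from_list files.list 200` repeatedly shrinks the list each time, so an entry can never be picked twice.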

# 6  
Old 09-25-2009
Thank you very much for your ideas. They were very useful to me. Indeed, creating a randomized list of files once and then working from it is much more efficient than randomizing the files every time my script runs. Thanks