Fastest way to traverse through large directories


 
# 1  
Old 12-21-2005
Fastest way to traverse through large directories

Hi!

I have thousands of sub-directories and hundreds of thousands of files in them. What is the fastest way to find out which files are older than a certain date? Is the "find" command the fastest, or is there some other way?

Right now I have a C program that traverses the tree and checks each file, but it is taking forever to run! Can somebody tell me a better way?

Thanks
# 2  
Old 12-21-2005
This is going to take a while no matter what you do. "find" should be able to do it at close to the highest possible speed. A custom C program should be able to beat "find", but not by much. I would not use csh for anything.
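
For reference, a minimal "find" sketch; the path /data and the 60-day cutoff are just placeholder assumptions:

# List regular files whose contents were last modified more than 60 days ago.
# -type f skips directories; -mtime +60 means "more than 60 whole 24-hour periods".
find /data -type f -mtime +60 -print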
# 3  
Old 12-21-2005
Grep?

Would grep be faster than find, or no? I don't know much about any of it, but just a thought: if you grepped for 10-2-04 or older (not exactly sure how to do that), shouldn't that turn up the files fast?
# 4  
Old 12-21-2005
Corrail -
He wants files with a -ctime or -mtime older than a certain date; grep matches file contents and doesn't look at file dates/times.
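
To compare against a specific calendar date (like the 10-2-04 mentioned above) instead of "so many days ago", one common trick is a reference file and find's -newer test; a sketch, with /data, /tmp/ref and the date as placeholder assumptions:

# Create a reference file whose modification time is 00:00 on 02 Oct 2004.
touch -t 200410020000 /tmp/ref

# Print files whose modification time is not newer than the reference,
# i.e. files last modified on or before that date.
find /data -type f ! -newer /tmp/ref -print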
# 5  
Old 12-21-2005
Thanks

Ok, thanks for clearing that up. I am working on teaching myself Unix and grep/awk/sed and things like that, so I bring up possible solutions to problems on here and then people can correct me and explain why, which helps A LOT. So thank you.
# 6  
Old 12-21-2005
Hi Guys,

Thanks for your input...

From what I hear here, I guess the approach I've taken is the best one. The C program is pretty fast, but with hundreds of thousands of files I'm looking for greased lightning. Every day about 20,000 files get added, and I'm supposed to archive the old files (those older than 60 days). Sixty days means about 1.2 million files, with the rate growing as time goes by.

I guess one way to beat the large number of files is to start deeper down the file structure, but then I'll have to run the application multiple times, once for each folder path that I select (something like the sketch at the end of this post).

Just to complete the picture, the list of files older than 60 days is then fed to NetBackup (the archiving application from Symantec), which moves the files to tape. I use NetBackup because it makes it easy to restore particular files whenever I need them.
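
A rough sketch of that per-directory split, assuming the tree lives under /data and that the merged list is handed to NetBackup by whatever mechanism you already use (all the paths here are made up):

#!/bin/sh
# One find per top-level directory, run in parallel, each writing its own
# list of files older than 60 days; the lists are then merged for the archive job.
cd /data || exit 1
for d in */ ; do
    find "$d" -type f -mtime +60 -print > "/tmp/old_${d%/}.list" &
done
wait                                      # wait for all background finds
cat /tmp/old_*.list > /tmp/archive_list   # single list handed to NetBackup

Note that running the finds in parallel only helps if the storage can keep up with several scans at once; on a single busy disk it may be no faster than one big find.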
 