size of a directory


 
# 1  
Old 10-13-2004
size of a directory

Hi friends,

I need a program to find the size of a directory. When I tried to get the size, it always gave the default space allocated for it. How can I find out the exact size of a directory using a C program?

Thanks in advance
Collins
# 2  
Old 10-14-2004
What do you mean by "exact size of a directory"?
# 3  
Old 10-14-2004
Or du(1) and fts(3).
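
A minimal sketch of the fts(3) route, assuming a BSD or glibc system that ships <fts.h>. The name dir_size_fts is made up for illustration. It adds up st_size of regular files, which is only one possible reading of "size"; du(1) reports allocated disk blocks instead, so the two numbers will usually differ.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fts.h>

/* Sum st_size of every regular file below root, using fts(3).
 * Returns the total in bytes, or -1 if the traversal could not be set up.
 * dir_size_fts is a hypothetical name, not part of any library. */
intmax_t dir_size_fts(const char *root)
{
    assert(root != NULL);
    char *const paths[] = { (char *)root, NULL };

    /* FTS_PHYSICAL: do not follow symlinks. FTS_XDEV: stay on one file system. */
    FTS *fts = fts_open(paths, FTS_PHYSICAL | FTS_XDEV, NULL);
    if (fts == NULL)
        return -1;

    intmax_t total = 0;
    FTSENT *ent;
    while ((ent = fts_read(fts)) != NULL) {
        switch (ent->fts_info) {
        case FTS_F:                 /* regular file: fts_statp holds its stat data */
            total += ent->fts_statp->st_size;
            break;
        case FTS_DNR:               /* unreadable directory */
        case FTS_ERR:               /* other error */
        case FTS_NS:                /* stat failed */
            fprintf(stderr, "%s: %s\n", ent->fts_path, strerror(ent->fts_errno));
            break;
        default:                    /* directories, symlinks, etc. are not counted */
            break;
        }
    }
    fts_close(fts);
    return total;
}
```

Calling dir_size_fts(".") from main() and printing the result with %jd is enough to try it. Files with several hard links are counted once per link in this sketch.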
# 4  
Old 10-14-2004
Yeah, right about fts... I wasn't sure whether it is standard or not. But AFAIK, Linux has it too.


But there is no significant overhead with du... This definitely is I/O bound. The biggest problem is that it is probably only there on UNIX systems. And now that I think about it... du/fts will probably be optimized to run much faster than whatever could be coded trivially... Heh, what overhead? Not to mention du/fts will probably get right the traversing of the hierarchy while it is being modified. Modularity.
# 5  
Old 10-15-2004
Quote:
Originally posted by Driver
> This definitely is I/O bound.

Irrelevant. Invoking an external program adds the time needed to create a process, execute the du program, communicate back the results and destroy the process. Whether this is going to be a problem or not depends on the application. And a portable directory traverser, once written, will compile even using only, say, DOS/Windows libraries, with minimal changes (many libraries provide the exact same functions prefixed with a ``_'' character, and/or declared in a header called direct.h instead of dirent.h).

If you mean that there is a portability issue with du, I agree with that. But spawning the sub-process will take almost no time compared to reading the directories. The overhead will never be noticeable.
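
For concreteness, the "invoke du" option might look roughly like the sketch below, assuming a POSIX system with du(1) on the PATH. du_kilobytes is a hypothetical helper. The fork/exec/pipe/wait steps are exactly the process overhead being argued about here; how much they cost relative to the directory reads is not measured by this code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Ask du(1) for the disk usage of path, in 1024-byte units (du -sk).
 * Returns -1 on any failure. du_kilobytes is a hypothetical helper. */
long long du_kilobytes(const char *path)
{
    assert(path != NULL);
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();                 /* create the child process */
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {                     /* child: send stdout into the pipe */
        close(fds[0]);
        if (dup2(fds[1], STDOUT_FILENO) == -1)
            _exit(127);
        close(fds[1]);
        /* No shell is involved, so the path needs no quoting. */
        execlp("du", "du", "-sk", "--", path, (char *)NULL);
        _exit(127);                     /* only reached if exec failed */
    }

    close(fds[1]);                      /* parent: read "<kilobytes>\t<path>" */
    long long kb = -1;
    FILE *in = fdopen(fds[0], "r");
    if (in == NULL) {
        close(fds[0]);
    } else {
        if (fscanf(in, "%lld", &kb) != 1)
            kb = -1;
        fclose(in);
    }

    int status;
    if (waitpid(pid, &status, 0) == -1) /* reap the child */
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;                      /* du reported an error */
    return kb;
}
```

Note that du exits with a non-zero status if it could not read part of the tree, and this sketch then discards the partial result and returns -1.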

Quote:
> And now that I think about it... du/fts will probably be optimized to
> run much faster than whatever could be coded trivially... Heh, what
> overhead? Not to mention du/fts will probably get right the
> traversing of the hierarchy while it is being modified. Modularity.

I think that you are dreaming this up in order to support your assertion that ``du'' is a superior solution :) If you can explain how a program can ``get right the traversing of the hierarchy while it is being modified'' (and what you actually mean by that), I'm all ears. As to whether it will be ``optimized to run much faster'', well, there's not a lot of room for any relevant differences. In part because it is ``I/O bound'', as you said earlier.

1) AFAIK, it keeps open FDs to the parent directories as it does the traversal. If some directories are moved, it will always use the right parent dir. There might be other things they did in fts, as IIRC there were security issues related to that: rm -r, chmod -R, find, etc. often use fts and could be run as root on user files. For example, what happens if root does rm -rf ~user/stuff and the user moves ~/stuff/a/b/c to /tmp while rm is traversing it? If using chdir ".." without checking if the parent dir is the same, it could end up traversing files starting at /. If not using "..", it will not traverse all the files in ~user/stuff because the path has changed. Keeping the open FDs requires keeping them in dynamic data structures, and this is a waste of time to code again.

2) ... Dude... It is BECAUSE this is I/O bound that adding some CPU overhead to reduce the I/O will give faster results. See for yourself: they have a lot of code in that library, and it is probably not there to make things slower (although a lot of it could be for security): http://www.freebsd.org/cgi/cvsweb.cg...libc/gen/fts.c
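
The idea in point 1, resolving every name relative to an already-open directory descriptor instead of a path string or chdir(".."), can be sketched with the POSIX.1-2008 *at() calls. This is not how fts.c itself is written; it is a separate illustration, and size_below_fd and dir_size_at are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Add up st_size of regular files below the directory open on fd.
 * Takes ownership of fd (it is closed before returning).
 * Returns -1 if the directory stream cannot be created. */
static intmax_t size_below_fd(int fd)
{
    DIR *dir = fdopendir(fd);
    if (dir == NULL) {
        close(fd);
        return -1;
    }

    intmax_t total = 0;
    struct dirent *de;
    while ((de = readdir(dir)) != NULL) {
        if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
            continue;

        struct stat st;
        /* The name is looked up inside the open directory, so renaming or
         * moving an ancestor cannot redirect the lookup somewhere else. */
        if (fstatat(dirfd(dir), de->d_name, &st, AT_SYMLINK_NOFOLLOW) == -1)
            continue;                   /* entry vanished: skip it */

        if (S_ISREG(st.st_mode)) {
            total += st.st_size;
        } else if (S_ISDIR(st.st_mode)) {
            /* O_NOFOLLOW guards against the entry being swapped for a symlink. */
            int sub = openat(dirfd(dir), de->d_name,
                             O_RDONLY | O_DIRECTORY | O_NOFOLLOW);
            if (sub == -1)
                continue;
            intmax_t n = size_below_fd(sub);
            if (n > 0)
                total += n;
        }
    }
    closedir(dir);                      /* also closes fd */
    return total;
}

/* Entry point: total size in bytes of regular files below root, or -1. */
intmax_t dir_size_at(const char *root)
{
    assert(root != NULL);
    int fd = open(root, O_RDONLY | O_DIRECTORY);
    if (fd == -1)
        return -1;
    return size_below_fd(fd);
}
```

The recursion never returns to a parent by name, so there is no chdir("..") to get wrong. The cost is one open descriptor per directory level, which this sketch does nothing to limit on very deep trees.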
# 6  
Old 10-15-2004
Quote:
Originally posted by Driver
> If using chdir ".." without checking if the parent dir is the same, it could end up traversing files starting at /.
> If not using "..", it will not traverse all the files in ~user/stuff because the path has changed.

If you really want to use two chdir() calls for every sub directory, then I agree that ``du'' will be faster. As soon as you call opendir(), the contents of the directory are likely going to be slurped into a buffer for subsequent use anyway. There is no way you can lose this data even if you chdir() into a sub directory and back before reading from the buffer. However, while this is an implementation detail, what you're getting at here does not matter for the program we are talking about.

> Keeping the open FDs requires keeping them in dynamic data structures, and this is a waste of time to code again.

We do not wish to delete files. We want to stat() them and add up the sizes. The results are never guaranteed to be accurate, either, no matter at which time you look at them.

It can still be much more accurate using a du that uses fts. E.g., it should be harder to trick the program into erroneously summing up everything under /.
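
The straightforward version described above (opendir(), readdir(), stat() each entry, add up the sizes) could be sketched as follows, assuming a POSIX system. dir_size_naive is a hypothetical name. It builds child paths as strings, so it makes no attempt to cope with the tree being rearranged while it runs.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Recursively add up st_size of regular files below path.
 * Returns the total in bytes, or -1 if path cannot be opened as a directory.
 * dir_size_naive is a hypothetical name. */
intmax_t dir_size_naive(const char *path)
{
    assert(path != NULL);
    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    intmax_t total = 0;
    struct dirent *de;
    while ((de = readdir(dir)) != NULL) {
        if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
            continue;

        /* Build "path/name"; overly long paths are skipped in this sketch. */
        char child[PATH_MAX];
        int n = snprintf(child, sizeof child, "%s/%s", path, de->d_name);
        if (n < 0 || (size_t)n >= sizeof child)
            continue;

        struct stat st;
        if (lstat(child, &st) == -1)    /* lstat: do not follow symlinks */
            continue;

        if (S_ISREG(st.st_mode)) {
            total += st.st_size;
        } else if (S_ISDIR(st.st_mode)) {
            intmax_t sub = dir_size_naive(child);
            if (sub > 0)
                total += sub;
        }
    }
    closedir(dir);
    return total;
}
```

It stat()s each entry and throws it away, with no bookkeeping beyond the running total. Swapping st.st_size for st.st_blocks * 512 would report allocated space rather than apparent size.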

Quote:
> 2) ... Dude... It is BECAUSE this is I/O bound that adding some CPU overhead to reduce the I/O will give faster results.

Well then, ``dude'', and why would fts be able to ``add some CPU overhead to reduce the I/O'' any more than a straightforward ad-hoc implementation?

Heh. That's something that puzzles me about most textbook algorithms too, dude. But it is definitely possible to reduce I/O by adding some CPU overhead; think of the trees and hashes used in databases...

Quote:
> See for yourself: they have a lot of code in that library, and it is probably not there to make things slower (although a lot of it could be for
> security): http://www.freebsd.org/cgi/cvsweb.c.../libc/gen/fts.c

Do you really believe that ``they'' make the only operating system that matters? No other system but various 4.4BSD derivatives and possibly glibc (but they probably rolled their own compatible implementation) will even possess this file to begin with.

Most fts should be similar, if not better (especially considering that this one is very old and really free, so it could have been forked a few times). Just consider this one a proof of concept if you want...
# 7  
Old 10-15-2004
Quote:
Originally posted by Driver
> Heh. That's something that puzzles me about most textbook algorithms too, dude. But it is definitely possible to reduce I/O
> by adding some CPU overhead; think of the trees and hashes used in databases...

I'm not ``puzzled'' about the concept of trading processing time versus I/O; I merely asked you to back up your claim that it is actually possible and meaningful to do so, in this particular case.
I pasted a link to FreeBSD's fts. It does it.

Quote:
> Most fts should be similar, if not better (especially considering that this one is very old and really free (so it could have been forked a few
> times)).

Most fts ... I thought we had already established that fts is a BSD thing :)
Most OSes will use something similar for rm/chmod/du/find, etc. I call these things fts. At the very least, their du must be implemented correctly.

Quote:
> Just consider this one a proof of concept if you want...

Proof of WHAT concept? It calls opendir(), readdir() and stat() as well, just like your own implementation would. You have yet to show what it is that's supposed to make this code less I/O intensive and thus faster than a straightforward manual implementation here! And keep in mind that an ad-hoc solution can also save all the bookkeeping work; it can stat() a file and throw it away, without all the ``bloat'' found in fts.c.

fts is bloated, but it is faster and secure. See the source code.

Quote:
Anyway. If you don't mind, I will leave this fruitless discussion in favor of something more productive.
Sure.