CentOS 7, 3.10.0-327.el7.x86_64. I've got multiple instances running on both VMware and VirtualBox. I've just tested on a different CentOS 7 guest and I get similar results: when I use the exact "-size" option, it returns files that don't meet the criteria I specified. P.S. Thanks for the assistance!
---------- Post updated at 09:54 PM ---------- Previous update was at 07:37 PM ----------
Quote:
Originally Posted by Don Cragun
Hi bodisha,
Guessing that the find that you're using behaves differently than the macOS/BSD find utility I'm using and that you really do only want to select files that are exactly of size 1G bytes, try:
Code:
find /home -type f -user database -size 1073741824c -ls
If you're looking for files that are at least 1G bytes, try:
Code:
find /home -type f -user database -size +1073741823c -ls
Here's a screenshot of the problem. It's the same version of CentOS 7 but on a different laptop. As you can see, when I use the "-size 1M" option I get more files than expected. When I use "-size 1000k" I get the results expected.
I have no idea what is going on with the CentOS (presumably GNU) find utility. With BSD and macOS find, -size 1024k and -size 1M should produce identical results.
...
I am using Debian 8 "Jessie" that has GNU find 4.4.2
After running your commands on similarly sized files, here's my guess about what is happening.
When you say "-size nS", where "n" is an integer specifying "units of space" and "S" is the suffix (M, k etc.), then the find command searches for files that have a rounded up size of "nS". That effectively means that the size of the file is > (n-1)S and <= nS.
So, as per this theory, "-size 1M" means files with "rounded up size of 1M", or sizes > 0M and <= 1M. 0M bytes = 0 bytes. Hence files with sizes 183, 2112 etc are displayed.
If you say "-size 2M", then it would mean files with "rounded up size of 2M", or sizes > 1M and <= 2M. That will not display anything, since there is no file with size > 1M or 1048576 bytes and <= 2M or 2097152 bytes.
Code:
$ find . -type f -size 2M -ls
$
Similar case could be argued for size 3M.
Code:
$ find . -type f -size 3M -ls
$
Now, in the case of "-size 1000k", notice that k = 1024 bytes, so it searches for files with a rounded-up size of 1000k, i.e. sizes > 999k or (999 * 1024 =) 1022976 bytes and <= 1000k or (1000 * 1024 =) 1024000 bytes.
That displays your one file.
Code:
$ find . -type f -size 1000k -ls
6029496 1000 -rw-r--r-- 1 r2d2 r2d2 1024000 Feb 17 02:32 ./large1.log
$
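The rounding theory above can be sketched with plain shell arithmetic. This is only a model of the guess in this post, not code taken from find itself; the function name rounded_units and the variables MIB and KIB are made up for illustration. It shows how many units each of the file sizes mentioned so far rounds up to.

```shell
# Model of the "round up to whole units" theory; not taken from find's source.
# rounded_units SIZE_IN_BYTES UNIT_IN_BYTES -> unit count after rounding up
rounded_units() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

MIB=$((1024 * 1024))   # 1M = 1048576 bytes
KIB=1024               # 1k = 1024 bytes

# File sizes from this thread: 183, 2112 and 1024000 bytes.
for size in 183 2112 1024000; do
    echo "$size bytes -> $(rounded_units "$size" "$MIB") unit(s) of 1M"
done
echo "1024000 bytes -> $(rounded_units 1024000 "$KIB") unit(s) of 1k"
```

Under this model, a test such as "-size nS" would match a file when its rounded unit count equals n, which is the same as saying its size is > (n-1)S and <= nS.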
To test this logic further, I created three files with sizes:
$
$ # size 1k = sizes in the range (0k, 1k] or (0, 1024]
$ find . -type f -size 1k -ls
6029492 4 -rw-r--r-- 1 r2d2 r2d2 183 Feb 17 02:29 ./graphical.txt
$
$ # size 2k = sizes in the range (1k, 2k] or (1024, 2048]
$ find . -type f -size 2k -ls
$
$ # size 3k = sizes in the range (2k, 3k] or (2048, 3072]
$ find . -type f -size 3k -ls
6029493 4 -rw-r--r-- 1 r2d2 r2d2 2112 Feb 17 02:29 ./strace.out
$
$ # size 4k = sizes in the range (3k, 4k] or (3072, 4096]
$ find . -type f -size 4k -ls
$
$ # size 5k = sizes in the range (4k, 5k] or (4096, 5120]
$ find . -type f -size 5k -ls
6029494 8 -rw-r--r-- 1 r2d2 r2d2 4659 Feb 17 02:29 ./vmstat.out
6029495 8 -rw-r--r-- 1 r2d2 r2d2 4660 Feb 17 02:30 ./vmstat1.out
$
$
So essentially, if a file is using up:
(a) 10.3 blocks i.e. 10 blocks + a fraction of the next block, then its size is considered to be 11 blocks
(b) 4 blocks of 1k + a fraction of the next 1k block, then its size is considered to be 5 blocks of 1k (i.e. 5k)
(c) 2 blocks of 1M + a fraction of the next 1M block, then its size is considered to be 3 blocks of 1M (i.e. 3M)
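For anyone who wants to repeat the test, here is a rough sketch that recreates files with the byte sizes shown in the listing above and runs the same -size tests on them. The scratch directory and the make_file helper are assumptions of mine, and the files are simply filled with zero bytes. What find prints will depend on which find you have, so no output is shown.

```shell
# Hypothetical scratch directory; file names and sizes copied from the listing above.
dir=$(mktemp -d)

# make_file NAME SIZE_IN_BYTES: create a zero-filled file of exactly that size
make_file() {
    dd if=/dev/zero of="$dir/$1" bs=1 count="$2" 2>/dev/null
}

make_file graphical.txt 183
make_file strace.out 2112
make_file vmstat.out 4659
make_file vmstat1.out 4660

# Run the same tests as in the post; results vary between find implementations.
for n in 1 2 3 4 5; do
    echo "-size ${n}k:"
    find "$dir" -type f -size "${n}k"
done
```

Remove the scratch directory afterwards with rm -r "$dir".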
Last edited by durden_tyler; 02-17-2017 at 05:09 AM..
@durden_tyler: Thanks, this is exactly (admittedly not that detailed) what I found when testing with my find (GNU findutils) 4.7.0-git on linux, hence my highlighting of the "rounding up to unit size" in the man page citation (commented by Don Cragun in post#3).
I see that with other versions on other systems, the size test is handled differently. In FreeBSD, for instance, rounding is done only for 512 byte blocks, and 1k means exactly 1024 bytes, -1k includes 1023 bytes but excludes 1024, +1k shows 1025 but not 1024.
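To make the difference concrete, here is a small shell model of the two behaviours described in this thread for a 1k test. It is only a sketch of the descriptions above (GNU: round up to whole units, then compare; FreeBSD: compare the exact byte count), not a reimplementation of either find. The function names gnu_match and bsd_match are invented for this example.

```shell
# Round SIZE up to whole UNIT-byte units (model of the GNU-style behaviour).
units_up() { echo $(( ($1 + $2 - 1) / $2 )); }

# gnu_match SIZE OP N UNIT: compare the rounded-up unit count with N
gnu_match() {
    units=$(units_up "$1" "$4")
    case $2 in
        +) [ "$units" -gt "$3" ] ;;
        -) [ "$units" -lt "$3" ] ;;
        =) [ "$units" -eq "$3" ] ;;
    esac
}

# bsd_match SIZE OP N UNIT: compare the exact byte count with N * UNIT
bsd_match() {
    limit=$(( $3 * $4 ))
    case $2 in
        +) [ "$1" -gt "$limit" ] ;;
        -) [ "$1" -lt "$limit" ] ;;
        =) [ "$1" -eq "$limit" ] ;;
    esac
}

# Print which model selects each size for -1k, 1k and +1k.
for size in 0 1023 1024 1025; do
    for op in - = +; do
        gnu_match "$size" "$op" 1 1024 && gnu_res=yes || gnu_res=no
        bsd_match "$size" "$op" 1 1024 && bsd_res=yes || bsd_res=no
        echo "size=$size test=${op}1k gnu=$gnu_res bsd=$bsd_res"
    done
done
```

In this model "=" stands for a plain "-size 1k" with no sign. The two models disagree on a 1023-byte file, for example: the rounding model counts it as one 1k unit, while the exact-bytes model only selects it with -1k.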
@durden_tyler: Thanks, this is exactly (admittedly not that detailed) what I found when testing with my find (GNU findutils) 4.7.0-git on linux
Now, this is funny - this is exactly what I thought to be the case, until Don said it can't be that way. The same reasoning led me to think that any file sized >0c is selected by -size 1G - because it is "rounded up" to the next full GB.
Now completely confused.
bakunin
Hi Bakunin,
Don't be confused. What we see here is another case where GNU utilities and BSD utilities behave differently. (And, some UNIX systems don't offer the extension at all.) You get exactly the same behavior on BSD, Linux, and UNIX systems for:
Code:
find file... ... -size [+|-]number[c] ...
which are the -size primary argument formats required by the POSIX standards, but the behavior of:
Code:
find file... ... -size [+|-]number[k|M|G|T|F] ...
where one of the optional size multipliers is supplied, depends on the system: you are likely to get a syntax error on some UNIX-branded systems, one of the two behaviors that we have discussed in this thread on Linux systems (and maybe on some UNIX-branded systems), and the other behavior on BSD-based systems and at least one UNIX-branded system.
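If you want the same result everywhere, one option is to do the multiplication yourself and hand find a byte count with the c suffix. A minimal sketch follows; the helper name to_bytes is made up here, and it only knows the binary multipliers k, M and G.

```shell
# to_bytes NUMBER SUFFIX: print NUMBER scaled by a binary multiplier (k, M or G)
to_bytes() {
    case $2 in
        k) echo $(( $1 * 1024 )) ;;
        M) echo $(( $1 * 1024 * 1024 )) ;;
        G) echo $(( $1 * 1024 * 1024 * 1024 )) ;;
        *) echo "unsupported suffix: $2" >&2; return 1 ;;
    esac
}

bytes=$(to_bytes 1 G)

# Print the byte-based commands rather than running them.
echo "exactly 1G:  find /home -type f -size ${bytes}c"
echo "at least 1G: find /home -type f -size +$((bytes - 1))c"
```

The printed commands use the same byte counts as the examples earlier in this thread (1073741824c and +1073741823c).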