Help with command to move files by X number to separate directories


 
# 1  
Old 03-25-2010

Hello,

I need help finding a script that will allow me to move files from one directory to another directory 10k files at a time.

I have a directory with 100k files in it. I need those 100k files split across separate directories, each holding 10k files.

Here is a diagram of what I'm trying to do.

Dir1
100000 files

Needs to be broken apart every 10000 files and those need to be in a new directory

NEWDIRA = 10000 files
NEWDIRB = 10000 files

If anyone knows of a command that would allow this to be run that would help greatly. Thanks

Geo_Bean
# 2  
Old 03-25-2010
The most important thing to take into account is to avoid using a for-loop: you would run into an "argument list too long" error. I have no directory with that many files at hand to test it, but the following should work (I still can't tell you about execution time, so test it carefully):

Code:
#! /bin/ksh
typeset iCnt=0
typeset fSrcDir="/path/to/source-dir"
typeset fTgtDir="/path/to/target-dir"
typeset iPart=1
typeset fFile=""

ls "$fSrcDir" | while read fFile ; do
     mv "$fSrcDir/$fFile" "${fTgtDir}/Part${iPart}"
     if [ $? -gt 0 ] ; then
          exit $?
     fi
     (( iCnt += 1 ))
     if [ $iCnt -ge 10000 ] ; then
          iCnt=0
          (( iPart += 1 ))
     fi
done

exit 0

I hope this helps.

bakunin
# 3  
Old 03-25-2010
Code:
#!/bin/ksh
cd dirA
ls >../list
x=1
y=1
mkdir ../dir$y
while read file
do
   mv $file ../dir$y
   x=`expr $x + 1` 
   if [ $x -gt 10000 ]
   then
       x=1
       y=`expr $y + 1`
       mkdir ../dir$y
   fi
done < ../list



---------- Post updated at 09:16 AM ---------- Previous update was at 09:07 AM ----------

Bakunin,
No fair, I saw it first, but had to go answer the phone.

Interesting that we both had the same solution, but totally different coding style.

Jack

Last edited by jgt; 03-25-2010 at 10:12 AM.. Reason: changed location of list file
# 4  
Old 03-25-2010
I like above solutions much better than mine, but I want to show my solution anyway, just to show that we have many solutions.

Code:
#!/usr/bin/ksh
input_dir="/home/temp/in"
output_dir="/home/temp/out"
amount_files_input=`ls -1 $input_dir | wc -l`

if [ "$amount_files_input" -ge 100000 ] ; then
   for i in {1..10000}
   do
      file_to_move=`ls -1 $input_dir | tail -1`
      mv $input_dir/$file_to_move $output_dir
   done
fi

and of course I'm a newbie still :)
# 5  
Old 03-25-2010

What if a folder holds thousands of files, let's say from the past 2 years?

How can I search for the files from 2009 and move just those to a separate folder?

The date format is: 2009-03-25....

Can 'xargs' be used here? If yes, how?
# 6  
Old 03-25-2010
Quote:
Originally Posted by ak835
What if a folder holds thousands of files, let's say from the past 2 years?

How can I search for the files from 2009 and move just those to a separate folder?

The date format is: 2009-03-25....

Can 'xargs' be used here? If yes, how?
Since this is presumably a one-time job, I would use something like:
ls -ltr >filelist
Then edit filelist and remove all the file names that you do not want to move.
Then write a script to read the edited file and move the files.
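
A minimal sketch of that last step, under some loud assumptions: the edited list holds one bare filename per line (so it would come from plain ls rather than ls -ltr, whose long format adds extra columns), no filename contains a newline, and the names move_listed, filelist and the demo paths are made up for illustration.

```shell
#!/bin/sh
# Sketch only: move every file named in a list into one directory.
# Assumes one bare filename per line and no embedded newlines.
# move_listed LIST SRC DEST  (hypothetical helper, not from the thread)
move_listed() {
    list=$1 src=$2 dest=$3
    mkdir -p "$dest" || return 1
    while IFS= read -r name; do
        [ -n "$name" ] || continue            # skip blank lines left by editing
        mv -- "$src/$name" "$dest/" || return 1
    done < "$list"
}

# Small demo in a scratch directory so nothing real is touched.
demo=$(mktemp -d) || exit 1
mkdir "$demo/src"
touch "$demo/src/2009-03-25.log" "$demo/src/2010-01-02.log"
printf '%s\n' '2009-03-25.log' > "$demo/filelist"
move_listed "$demo/filelist" "$demo/src" "$demo/2009"
```

The list is read with IFS= and -r so that leading whitespace and backslashes in names survive; only the file named in the list ends up in the destination.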

---------- Post updated at 11:20 AM ---------- Previous update was at 11:01 AM ----------

Quote:
Originally Posted by urandom
I like above solutions much better than mine, but I want to show my solution anyway, just to show that we have many solutions.

Code:
#!/usr/bin/ksh
input_dir="/home/temp/in"
output_dir="/home/temp/out"
amount_files_input=`ls -1 $input_dir | wc -l`

if [ "$amount_files_input" -ge 100000 ] ; then
   for i in {1..10000}
   do
      file_to_move=`ls -1 $input_dir | tail -1`
      mv $input_dir/$file_to_move $output_dir
   done
fi

and of course I'm a newbie still :)
Yes, it will work with fewer than 250 entries in the directory, but....
With thousands of entries in the directory, ls might well take significant time to execute.
I tried the "ls -1 $input_dir |tail -1" line on a directory with 168000 files.
Real time was 0.50 seconds*. You execute this command 10000 times, which comes to roughly 5000 seconds (about 83 minutes) spent in ls alone.

*on a dual processor quad core system with serial SCSI RAID10
# 7  
Old 03-25-2010
Quote:
Originally Posted by ak835
What if a folder holds thousands of files, let's say from the past 2 years?

How can I search for the files from 2009 and move just those to a separate folder?

The date format is: 2009-03-25....

Can 'xargs' be used here? If yes, how?
It's probably best to start your own thread for this question. Aside from having to deal with a lot of files, it really has nothing to do with the original poster's problem and will only muddle the discussion, in my opinion.

Regards,
Alister

---------- Post updated at 04:48 PM ---------- Previous update was at 11:25 AM ----------


Hello, bakunin:

Quote:
Originally Posted by bakunin
The most important thing to take into account is to avoid using a for-loop: you would run into an "argument list too long"-error.
That's actually bad advice in general (there is absolutely no need to avoid using a for loop because the list may be very very long), and it's terrible advice in this particular case, where a glob in a for loop's list is by far the simplest and safest way to handle a directory of files (field splitting is not a concern since the glob is expanded during the penultimate step in shell command line processing, only quote removal follows it).

Your warning regarding "argument list too long" scenarios does not apply to a shell expanding a wildcard, which is done internally and does not require an exec system call. Nor does it apply to the for loop since that is also internal. Nor does it apply to any commands within the for loop since they are fed the list items one at a time. For more info regarding ARG_MAX issues, the page "The maximum length of arguments for a new process" may be helpful.

To test for yourself, you can execute the following (if your system has jot; if not, perhaps you can tweak it to use seq, or even brace expansion):
Code:
$ for f in $(jot -w '%0100d' 100000); do touch "$f"; done
$ for f in *; do :; done

That will create 100,000 files, each with a 100 character filename, and then run a do-nothing loop which nevertheless has to expand the * wildcard.
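
For systems with neither jot nor seq, here is a rough equivalent in plain POSIX sh. It is only a sketch: make_files and the scratch directory are hypothetical names, and the count is kept small so it runs quickly (raise it to 100000 for the full-size test).

```shell
#!/bin/sh
# make_files DIR COUNT: create COUNT empty files in DIR, each with a
# 100-character zero-padded name (hypothetical helper for this test).
make_files() {
    dir=$1 count=$2 n=1
    while [ "$n" -le "$count" ]; do
        : > "$dir/$(printf '%0100d' "$n")" || return 1
        n=$((n + 1))
    done
}

scratch=$(mktemp -d) || exit 1
make_files "$scratch" 250          # use 100000 for the full-size test

# The do-nothing loop: the shell expands the glob internally, no exec involved.
seen=0
for f in "$scratch"/*; do
    seen=$((seen + 1))
done
echo "$seen files seen by the glob"
```

The loop counts the entries only to show that the glob expanded to every file; no external command ever receives the whole list as arguments.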

Regarding your solution, there are some caveats: it will not properly handle any files whose names contain leading whitespace, embedded newlines, or a trailing backslash. The whitespace and trailing backslash can be fixed by tweaking IFS and using read's -r option. The embedded newline however cannot be worked around (at least not with any POSIX-compliant functionality in ls/read that I'm aware of).
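
To make the IFS and read -r tweak concrete, here is a small sketch that lists the same scratch directory twice, once with a plain read and once with IFS= read -r. The directory and the two awkward filenames are made up for the demo.

```shell
#!/bin/sh
demo=$(mktemp -d) || exit 1
# Two awkward names: one with leading spaces, one ending in a backslash.
touch "$demo/  leading spaces" "$demo/trailing backslash\\"

# Plain read: leading whitespace is stripped and backslashes are
# treated as escape characters, so names come out changed or go missing.
plain=$(ls "$demo" | while read f; do printf '[%s]\n' "$f"; done)

# IFS= keeps the whitespace, -r keeps backslashes literal.
safe=$(ls "$demo" | while IFS= read -r f; do printf '[%s]\n' "$f"; done)

printf 'plain read:\n%s\n\nIFS= read -r:\n%s\n' "$plain" "$safe"
```

The second listing shows both names exactly as they are on disk; the first does not. Neither form helps with a newline inside a filename, since the loop still reads one line at a time.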

On an unrelated note, the following idiom will always return a 0 exit status, since when the exit command executes, the value of $? is the exit status of the [ command, which must have succeeded if the exit has been reached.
Quote:
Originally Posted by bakunin
Code:
mv "$fSrcDir/$fFile" "${fTgtDir}/Part${iPart}"
if [ $? -gt 0 ] ; then
    exit $?
fi

A simple, correct way would be:
Code:
mv "$fSrcDir/$fFile" "${fTgtDir}/Part${iPart}" || exit

exit will only execute if mv fails, and it will return mv's exit status (the last command run).
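
The difference between the two forms can be checked without moving anything. In this sketch, false stands in for a failing mv and each form runs in a subshell so the demo itself keeps going; the variable names are made up.

```shell
#!/bin/sh
set +e    # make sure errexit is off so the failing command does not abort the demo

# Form 1: test $? and then "exit $?". By the time exit runs,
# $? holds the status of the [ command, which succeeded.
(
    false
    if [ $? -gt 0 ]; then
        exit $?
    fi
)
lost=$?

# Form 2: exit with no argument reuses the status of the last command run.
( false || exit )
kept=$?

echo "if/exit \$? form: $lost, || exit form: $kept"
```

The first form reports 0 even though the command failed, while the second passes the failure status (1 for false) through to the caller.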

The solution you provided also assumes that all the target directories exist, since there is no mkdir anywhere (that may be intentional, I'm just pointing it out to save the original poster some time).

Please don't take the above criticisms personally; they are intended to be helpful. If my analysis is erroneous, I would appreciate being corrected.

Regards,
Alister

---------- Post updated at 04:57 PM ---------- Previous update was at 04:48 PM ----------

Hello, jgt:

Quote:
Originally Posted by jgt
Code:
#!/bin/ksh
cd dirA
ls >../list
x=1
y=1
mkdir ../dir$y
while read file
do
   mv $file ../dir$y
   x=`expr $x + 1` 
   if [ $x -gt 10000 ]
   then
       x=1
       y=`expr $y + 1`
       mkdir ../dir$y
   fi
done < ../list

This solution is broken with regard to whitespace. $file in the mv should be double-quoted. There are also problems with the way read is used, which will mangle filenames with leading whitespace, embedded newlines, or a trailing backslash.

I mention it only in case the original poster's monster directory has some susceptible filenames.
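
A quick way to see the quoting problem, assuming the default IFS: count how many arguments a command receives from an unquoted versus a quoted variable. The helper count_args and the sample filename are invented for the demo.

```shell
#!/bin/sh
# count_args prints how many arguments it was given (hypothetical helper).
count_args() { echo $#; }

file='my report.txt'
unquoted=$(count_args $file)      # field splitting breaks the name at the space
quoted=$(count_args "$file")      # the name stays a single argument

echo "unquoted: $unquoted argument(s), quoted: $quoted argument(s)"
```

With the unquoted form, mv would be handed "my" and "report.txt" as two separate source names, neither of which exists.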

Regards,
Alister

---------- Post updated at 05:23 PM ---------- Previous update was at 04:57 PM ----------

My attempt at a solution. It should handle any filenames without issue. The only downside is that it must execute mv once per file to move. However, I'll take the performance hit over possible breakage when mv'ing 100K files one at a time only takes about 5 minutes on a 3-year-old laptop with a slow drive. It operates on the current working directory and also creates the necessary destination directories (001, 002, ...) in the current working directory.
Code:
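#!/bin/sh

# Move every entry in the current directory into 001, 002, ..., 10000 per directory.
i=0
for f in *; do
    # Start a new destination directory at every 10000th file.
    if [ $((i%10000)) -eq 0 ]; then
        dest=$(printf '%03u' $((i/10000+1)))
        mkdir "$dest" || exit 2
    fi
    # -- keeps a filename that starts with a dash from being read as an option.
    mv -- "$f" "$dest" || exit 1
    # Plain assignment: POSIX does not require ++ in arithmetic expansion.
    i=$((i+1))
done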
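
Before pointing this at the real directory, it may be worth a rehearsal on a scratch directory with a small batch size. The following is only a sketch of the same approach with the directory and batch size passed in; split_dir and the demo filenames are hypothetical, not part of the script above.

```shell
#!/bin/sh
# split_dir DIR BATCH: move the entries of DIR into DIR/001, DIR/002, ...
# BATCH at a time. The body runs in a subshell so the cd does not leak out.
split_dir() (
    cd "$1" || exit 1
    batch=$2 i=0
    for f in *; do
        [ -e "$f" ] || continue              # empty directory: the glob matched nothing
        if [ $((i % batch)) -eq 0 ]; then
            dest=$(printf '%03d' $((i / batch + 1)))
            mkdir "$dest" || exit 2
        fi
        mv -- "$f" "$dest" || exit 1
        i=$((i + 1))
    done
)

# Rehearsal: 25 files with spaces in their names, 10 per directory.
scratch=$(mktemp -d) || exit 1
n=1
while [ "$n" -le 25 ]; do
    : > "$scratch/file $n"
    n=$((n + 1))
done
split_dir "$scratch" 10
ls "$scratch"
```

With 25 files and a batch size of 10, this should leave 001 and 002 with ten files each and 003 with the remaining five. The glob is expanded once, before any destination directory exists, so the new directories are not picked up by the loop on a first run.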

Creating 100,000 files with 100 character long filenames:
Code:
$ for f in $(jot -w '%0100d' 100000); do touch "$f"; done

A test run, followed by some simple checks:
Code:
$ ../mv10k.sh 

$ ls
001     002     003     004     005     006     007     008     009     010

$ for d in *; do echo $d: $(ls $d | wc -l) files; ls $d | head -n2; echo ...snip...; ls $d | tail -n2; echo =================; done
001: 10000 files
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000002
...snip...
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000009999
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000010000
=================
002: 10000 files
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000010001
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000010002
...snip...
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000019999
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000020000
=================
003: 10000 files
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000020001
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000020002
...snip...
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000029999
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000030000
=================
004: 10000 files
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000030001
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000030002
...snip...
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000039999
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000040000
=================
005: 10000 files
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000040001
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000040002
...snip...
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000049999
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000050000
=================
006: 10000 files
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000050001
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000050002
...snip...
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000059999
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000060000
=================
007: 10000 files
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000060001
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000060002
...snip...
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000069999
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000070000
=================
008: 10000 files
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000070001
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000070002
...snip...
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000079999
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000080000
=================
009: 10000 files
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000080001
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000080002
...snip...
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000089999
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000090000
=================
010: 10000 files
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000090001
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000090002
...snip...
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000099999
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000100000
=================

Regards,
Alister