Quick script to rename files


 
# 8  
Old 11-15-2014
Quote:
Originally Posted by Aia
@jonesal2
...
...
The regex capabilities of Perl or even Awk are superior to anything the shell has to offer.
True.
They're limited in the shell, but they can work in some cases...like a 'one-off' situation maybe.
Code:
#!/bin/sh
for fname in *_*_*_*_*_*_*
do
n1=$(expr "$fname" : '.*_\(.*_.*_.*_\).*_.*_.*')
n2=$(expr "$fname" : '.*_.*_.*_.*_.*_\(.*_.*\)')
echo "$n1$n2"
done


Last edited by ongoto; 11-15-2014 at 08:16 PM.. Reason: mispelled fname
# 9  
Old 11-15-2014
Quote:
Originally Posted by ongoto
True.
They're limited in the shell, but they can work in some cases...like a 'one-off' situation maybe.
Code:
#!/bin/sh
for fname in *_*_*_*_*_*_*
do
n1=$(expr "$fname" : '.*_\(.*_.*_.*_\).*_.*_.*')
n2=$(expr "$fname" : '.*_.*_.*_.*_.*_\(.*_.*\)')
echo "$n1$n2"
done

The only problem with this is that expr isn't a shell built-in. So, when processing 30,000 files, it creates an additional 60,000 processes.

If:
Code:
        rename $src, $dst unless $src eq $dst;

is a built-in in perl, this alone is a strong argument to use perl rather than awk or the shell script I suggested.

If there are other files in this directory with six or more underscores in their names, the OP needs to give us naming details so we can alter the EREs or filename matching patterns to correctly select the files to be processed.
# 10  
Old 11-15-2014
@ Don Cragun

I never thought of it like that, but you are right. That is a ton of overhead. The OP asked for a quick script and didn't elaborate much so I figured it was a small job.

I'll go along with Perl though. Its regex and string handling are the bomb. Some say Python is its replacement, but I dunno.
# 11  
Old 11-16-2014
Quote:
Originally Posted by Don Cragun
[..]
Code:
#!/bin/ksh
for i in *_*_*_*_*_*_*
do	printf '%s\n' "$i" | { IFS="_" read x s f y x o z
		echo mv "$i" "${s}_${f}_${y}_${o}_$z"
	}
done

Quote:
Originally Posted by Don Cragun
The only problem with this is that expr isn't a shell built-in. So, when processing 30,000 files, it creates an additional 60,000 processes.
[..]
The approach above will still require an additional process for every file, because of the pipe.

A here-doc or here-string would not have that drawback:
Code:
for i in *_*_*_*_*_*_\(*\)
do
  IFS=_ read x s f y x o z << EOF
$i
EOF
  echo mv "$i" "${s}_${f}_${y}_${o}_$z"
done

or in modern bash/ksh93/zsh
Code:
for i in *_*_*_*_*_*_\(*\)
do
  IFS=_ read x s f y x o z <<< "$i"
  echo mv "$i" "${s}_${f}_${y}_${o}_$z"
done

or use variable expansion:
Code:
for i in *_*_*_*_*_*_\(*\)
do
  first=${i%_*_*_*}
  last=${i#"$first"_}
  echo mv "$i" "${first#*_}_${last#*_}"
done

But at any rate, every mv command will still require one process per file.
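
To try the parameter-expansion variant without touching any files, it can be wrapped in a function that only prints the new name. This is a minimal sketch: the function name new_name is made up for illustration, and it assumes the argument contains at least six underscores.

```shell
# new_name is a hypothetical helper, not part of the scripts in this thread.
# It drops the first field and the third field from the end (the date in
# the example names), using parameter expansion only.
new_name() {
  first=${1%_*_*_*}       # everything before the last three fields
  last=${1#"$first"_}     # the last three fields
  printf '%s\n' "${first#*_}_${last#*_}"
}

new_name 'x_surname_firstname_y_20141115_OS_(z)'
```

Because no command substitution or pipe is involved, the function itself adds no processes; only the eventual mv would.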

----

Quote:
Originally Posted by Aia
@jonesal2


Perl and Awk do have a place on it, especially when that "easily be done" in the shell can unintentionally turn a file named

This_is_one_I_want_as_is into is_one_I_as_is, when all you want is to turn x_surname_firstname_y_20141115_OS_(z) into surname_firstname_y_OS_(z)

Suddenly you find yourself figuring out what kind of glob you can pass to the for loop to limit the range, or what kind of check you have to perform inside the body to deal with unwanted matches. The regex capabilities of Perl or even Awk are superior to anything the shell has to offer.

It really depends on how much precision is required. More precision means more complexity, so you would only use it if needed.

Globbing will do fine in the majority of situations and can be made more precise if the situation demands it.
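
As one illustration of tightening the glob, the date field can be spelled out as eight digit classes and tested with case inside the loop. This is only a sketch: it assumes the fifth field is always a YYYYMMDD date, as in the example names, and the function name wanted is hypothetical.

```shell
# wanted is a hypothetical helper. It succeeds only for names with exactly
# six underscores, an 8-digit fifth field and a parenthesised last field.
wanted() {
  case $1 in
    *_*_*_*_*_*_*_*) return 1 ;;   # seven or more underscores: reject
    *_*_*_*_[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]_*_\(*\)) return 0 ;;
    *) return 1 ;;
  esac
}

wanted 'x_surname_firstname_y_20141115_OS_(z)' && echo 'would rename'
wanted 'This_is_one_I_want_as_is' || echo 'would skip'
```

Inside the for loop this could be used as `wanted "$i" || continue` before the echo mv line. case is part of the shell, so the check costs no extra process.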

In situations where more precision is required than globbing can handle, I agree that regexes give you tighter control. You could use Perl for that, but modern bash, ksh93 and zsh also provide the possibility of using regexes.

But you would still need to check, because
Code:
$dst =~  s/\d+_(\w+_\w+_\d+)_\d+(_\w+_\(\d+\))/$1$2/;

may also not be tight enough (you might need front and back anchors, for example, or \w may match unwanted digits)

So you will find yourself figuring out what regex to use. In either case it is a good idea to print first and check.
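
For comparison, here is what an anchored regex could look like in bash itself. It is a sketch, not a translation of the Perl substitution: the pattern is an assumption about the naming scheme (seven fields without embedded underscores, an 8-digit fifth field, a parenthesised last field), and regex_name is a made-up function name. It needs bash, not plain sh.

```shell
#!/usr/bin/env bash
# regex_name is a hypothetical helper. It prints the new name only if the
# whole name matches the anchored pattern; otherwise it returns non-zero.
regex_name() {
  local re='^[^_]+_([^_]+_[^_]+_[^_]+)_[0-9]{8}_([^_]+_\([^_]*\))$'
  [[ $1 =~ $re ]] || return 1
  # BASH_REMATCH[1] holds fields 2-4, BASH_REMATCH[2] holds fields 6-7
  printf '%s\n' "${BASH_REMATCH[1]}_${BASH_REMATCH[2]}"
}

regex_name 'x_surname_firstname_y_20141115_OS_(z)'
regex_name 'This_is_one_I_want_as_is' || echo 'no match, left alone'
```

Keeping the pattern in a variable avoids quoting surprises on the right-hand side of =~. Printing the result first, as suggested above, still applies.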

----

On the other hand, I do agree with Don that the performance advantage is a strong argument for using Perl in this case. Since rename is a Perl builtin, the operation would not require an additional process for every mv command.

BTW. There appear to be some caveats to the rename function, so extra testing is required:

Quote:
Changes the name of a file; an existing file NEWNAME will be clobbered. Returns true for success, false otherwise.

Behavior of this function varies wildly depending on your system implementation. For example, it will usually not work across file system boundaries, even though the system mv command sometimes compensates for this. Other restrictions include whether it works on directories, open files, or pre-existing files. Check perlport and either the rename(2) manpage or equivalent system documentation for details.

For a platform independent move function look at the File::Copy module.
rename - perldoc.perl.org
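
The clobbering caveat applies to mv in the shell loops as well: if two source names reduce to the same new name, the second rename overwrites the first. A minimal guard might look like the sketch below. safe_mv is a hypothetical wrapper, and the existence test and the mv are separate steps, so it does not protect against another process creating the destination in between.

```shell
# safe_mv is a hypothetical wrapper: it refuses to rename when the
# destination already exists instead of silently overwriting it.
safe_mv() {
  if [ -e "$2" ] || [ -L "$2" ]; then
    printf 'skipping %s: %s already exists\n' "$1" "$2" >&2
    return 1
  fi
  mv -- "$1" "$2"
}
```

In the loops above, `safe_mv "$i" "$new"` would take the place of the mv call once the printed commands have been checked.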

Last edited by Scrutinizer; 11-16-2014 at 04:52 AM..
# 12  
Old 11-16-2014
Quote:
Originally Posted by Scrutinizer
[..]
All of the above are fine alternatives to the script I suggested, but the script I suggested and all of the above alternatives (other than using perl) use the same number of processes.

The standards say that the elements of a pipeline may be executed in the current shell execution environment (as ksh does) or in a subshell environment (as bash does). Neither of these creates a new process as long as the commands in that subshell are shell built-ins. This can be seen using a slight modification of the script I suggested that prints the PID at the start of the script and in the last element of the pipeline (and adds the escaped parentheses to the pattern):
Code:
echo $$
for i in *_*_*_*_*_*_\(*\)
do	printf '%s\n' "$i" | { IFS="_" read x s f y x o z
		echo $$
		echo mv "$i" "${s}_${f}_${y}_${o}_$z"
	}
done

Running this with ksh in a directory containing two matching files produces:
Code:
$ ksh tester
66443
66443
mv 123_surname_firstname_y_20141115_OS_(456) surname_firstname_y_OS_(456)
66443
mv x_surname_firstname_y_20141115_OS_(z) surname_firstname_y_OS_(z)
$

and bash (in the same directory) produces:
Code:
bash-3.2$ bash tester
66452
66452
mv 123_surname_firstname_y_20141115_OS_(456) surname_firstname_y_OS_(456)
66452
mv x_surname_firstname_y_20141115_OS_(z) surname_firstname_y_OS_(z)
bash-3.2$

So, if rename in the version of perl on your system does what you need, use perl (which will only need two processes to rename all of the files). Otherwise, any of the shell scripts Scrutinizer and I suggested should get the job done using (n + 1) processes where n is the number of files to be renamed.
# 13  
Old 11-16-2014
Quote:
Originally Posted by Don Cragun
[..]
Hi Don,

That pertains to the RHS of the pipeline, which can be executed in a subshell or in the foreground. The LHS is always executed in a subshell.

This can easily be checked:
Code:
for i in dash bash ksh zsh
do 
  shl=$i $i -c "{ u=1 ;} | { v=1;}; echo \"\$shl: \\\$u:\$u,\\\$v:\$v\""
done

Code:
dash: $u:,$v:
bash: $u:,$v:
ksh: $u:,$v:1
zsh: $u:,$v:1

This shows that ksh and zsh execute the RHS in the foreground. The RHS in the other shells and the LHS in all shells are executed in a subshell. In bash 4.2 and later there is a separate option, lastpipe (off by default), that can be turned on so that the RHS is also executed in the foreground.
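
The bash setting in question is the lastpipe shell option. The sketch below shows its effect on a variable set by read at the end of a pipeline. It assumes bash 4.2 or later and a non-interactive script, since lastpipe only takes effect while job control is off.

```shell
#!/usr/bin/env bash
# Default bash behaviour: read runs in a subshell, so v is lost afterwards.
unset v
printf '%s\n' one | read -r v
before=${v-unset}

# With lastpipe the last pipeline element runs in the current shell,
# so the value assigned by read survives.
shopt -s lastpipe
printf '%s\n' two | read -r v
after=${v-unset}

printf 'before: %s, after: %s\n' "$before" "$after"
```

Run as a script, the first read should leave v unset and the second should leave it set to "two".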

The process id within a subshell cannot be tested with $$. A subshell is a child process that inherits the environment of the parent shell, including the variable $$. Therefore in a subshell of the parent shell, $$ will represent the parent's $$.

Quote:
A subshell environment shall be created as a duplicate of the shell environment, except that signal traps that are not being ignored shall be set to the default action. Changes made to the subshell environment shall not affect the shell environment. Command substitution, commands that are grouped with parentheses, and asynchronous lists shall be executed in a subshell environment. Additionally, each command of a multi-command pipeline is in a subshell environment; as an extension, however, any or all commands in a pipeline may be executed in the current environment. All other commands shall be executed in the current shell environment.
Shell Command Language


The pid of a subshell can still be checked by starting a child process that is not a subshell and checking $PPID:
Code:
sh -c 'echo $PPID'

I used the following script to test:

Code:
for i in dash bash ksh zsh
do
  shl=$i $i -c "{ echo \"\$shl:LHS:\$(sh -c 'echo \$PPID')\">>tst.out;} | { echo \"\$shl:RHS:\$(sh -c 'echo \$PPID')\">>tst.out;}; echo \$shl:parent:\$\$ >> tst.out"
done

Code:
$ cat tst.out
dash:RHS:57607
dash:LHS:57606
dash:parent:57605
bash:LHS:57613
bash:RHS:57614
bash:parent:57610
ksh:LHS:57618
ksh:RHS:57617
ksh:parent:57617
zsh:LHS:57623
zsh:RHS:57621
zsh:parent:57621

Again, this shows that the RHS of a pipeline in ksh and zsh is executed in the foreground, but the rest are not.


Therefore, using a pipeline in the file-moving loop earlier in the thread requires (2n+1) or (3n+1) processes, including the ones for the mv command (depending on the shell that is used), while using a here-doc, here-string or parameter expansion leads to (n+1) processes.
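
To put numbers on those formulas, here they are evaluated for the 30,000 files mentioned earlier in the thread. process_counts is a hypothetical helper written only for this illustration.

```shell
# process_counts is a hypothetical helper that evaluates the three
# formulas for a given number of files n.
process_counts() {
  n=$1
  printf 'here-doc, here-string or parameter expansion: %d\n' "$((n + 1))"
  printf 'pipeline, RHS in current shell (ksh, zsh): %d\n' "$((2 * n + 1))"
  printf 'pipeline, both sides in subshells (dash, bash): %d\n' "$((3 * n + 1))"
}

process_counts 30000
```

For n = 30000 this gives 30001, 60001 and 90001 processes respectively.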

Last edited by Scrutinizer; 11-16-2014 at 11:12 AM..
# 14  
Old 11-16-2014
Thanks for all the replies, I wasn't expecting it to be that complex.

I sort of thought the script might simply search the filename string for the first underscore and delete it plus all the characters before it, then search for (what would then be) the third underscore delete it plus the next 8 characters (since the time stamp is always YYYYMMDD)?

Or am I being too simplistic?
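
The steps described above can be written with parameter expansion alone. The following is a literal sketch of that description, assuming every name has at least four underscores and an 8-character timestamp in the fifth field; simple_name is a made-up function name.

```shell
# simple_name is a hypothetical helper that follows the description
# literally: drop everything up to and including the first underscore,
# then drop the third remaining underscore plus the 8 characters after it.
simple_name() {
  rest=${1#*_}                      # remove the first field and its underscore
  f1=${rest%%_*}; rest=${rest#*_}   # peel off the next three fields
  f2=${rest%%_*}; rest=${rest#*_}
  f3=${rest%%_*}; rest=${rest#*_}   # rest now starts with the timestamp
  printf '%s\n' "${f1}_${f2}_${f3}${rest#????????}"
}

simple_name 'x_surname_firstname_y_20141115_OS_(z)'
```

It performs no validation at all, so it would also mangle names that merely happen to contain enough underscores. Pairing it with a pattern check, and printing before renaming, is still advisable.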

Last edited by jonesal2; 11-16-2014 at 04:33 PM.. Reason: typo