Rename text file with a specific pattern in directory


 
# 1  
Old 02-09-2017

I am trying to rename all text files in a directory that match a pattern. The current command below seems to be using the directory path in the new name and, since that name already exists, will not do the rename. I am not sure what I am missing. Thank you :).


Files to rename in /home/cmccabe/Desktop/test/vcf/overall/annovar
Code:
16-0000_File-A_variant_strandbias_readcount.vcf.hg19_multianno_dbremoved_removed_final_index_inheritence_import_classify.txt
16-0002_File-B_variant_strandbias_readcount.vcf.hg19_multianno_dbremoved_removed_final_index_inheritence_import_classify.txt
16-0005_File-C_variant_strandbias_readcount.vcf.hg19_multianno_dbremoved_removed_final_index_inheritence_import_classify.txt

desired output
Code:
16-0000_File-A_hg19multianno.txt
16-0002_File-B_hg19multianno.txt
16-0005_File-C_hg19multianno.txt

Code:
rename 's/(.*?_[^_]+).*/${1}_hg19multianno.txt/g' /home/cmccabe/Desktop/test/vcf/overall/annovar/*_classify.txt

# 2  
Old 02-10-2017
Hi,

I've come up with the following script which does what you need, I think:

Code:
#!/bin/bash

pattern1="variant_strandbias_readcount.vcf."
pattern2_old="hg19_multianno"
pattern2_new="hg19multianno"
pattern3="_dbremoved_removed_final_index_inheritence_import_classify"

for file in `/bin/ls *.txt`
do
        newname=`echo $file | /bin/sed s/$pattern1//g | /bin/sed s/$pattern2_old/$pattern2_new/g | /bin/sed s/$pattern3//g`
        /bin/mv -fv $file $newname
done

If I run it, this is what I get:

Code:
$ ls -1
16-0000_File-A_variant_strandbias_readcount.vcf.hg19_multianno_dbremoved_removed_final_index_inheritence_import_classify.txt
16-0002_File-B_variant_strandbias_readcount.vcf.hg19_multianno_dbremoved_removed_final_index_inheritence_import_classify.txt
16-0005_File-C_variant_strandbias_readcount.vcf.hg19_multianno_dbremoved_removed_final_index_inheritence_import_classify.txt
script.sh
$ ./script.sh 
'16-0000_File-A_variant_strandbias_readcount.vcf.hg19_multianno_dbremoved_removed_final_index_inheritence_import_classify.txt' -> '16-0000_File-A_hg19multianno.txt'
'16-0002_File-B_variant_strandbias_readcount.vcf.hg19_multianno_dbremoved_removed_final_index_inheritence_import_classify.txt' -> '16-0002_File-B_hg19multianno.txt'
'16-0005_File-C_variant_strandbias_readcount.vcf.hg19_multianno_dbremoved_removed_final_index_inheritence_import_classify.txt' -> '16-0005_File-C_hg19multianno.txt'
$ ls -1
16-0000_File-A_hg19multianno.txt
16-0002_File-B_hg19multianno.txt
16-0005_File-C_hg19multianno.txt
script.sh
$

Hope this helps.
# 3  
Old 02-10-2017
Note that several shortcuts can be used to speed up the above script and reduce the chance of failures due to system limits on the size of argument lists allowed when exec'ing a file...

The command:
Code:
for file in `/bin/ls *.txt`

produces exactly the same list of files to be processed as:
Code:
for file in *.txt

as long as three conditions hold. First, you don't have any directories located in the current working directory with names ending in .txt (which, although possible, would be unconventional). In cases where the given pattern does match a directory name, /bin/ls pattern will give you a list of the unhidden files in directories matching pattern, while just using pattern will give you the names of the directories instead of the names of the files in the directories. Second, none of the selected files have names containing any <space>, <tab>, or <newline> characters (which cause "file not found" errors when using ls *.txt, but will work correctly when just using *.txt). Third, the list of files is not too long: if there are a lot of files, /bin/ls *.txt can fail if the shell runs out of memory producing the list of filenames matching *.txt or if the list exceeds the ARG_MAX system limit, while just using *.txt will only fail if the shell runs out of memory producing the list of filenames.
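The difference for names containing a <space> can be seen with a small sketch. The scratch directory and the two file names are made up for the demonstration, and it uses plain ls rather than /bin/ls on the assumption that ls is on the PATH:

```shell
#!/bin/bash
# Sketch: compare how the two loop forms see a filename containing a space.
# The scratch directory and the file names are made up for this demonstration.
tmpdir=$(mktemp -d) || exit 1
cd "$tmpdir" || exit 1
touch 'plain.txt' 'two words.txt'

# Form 1: the output of ls is split into words by the shell.
ls_count=0
for file in `ls *.txt`
do
        ls_count=$((ls_count + 1))
done

# Form 2: the shell expands the pattern itself, one word per file.
glob_count=0
for file in *.txt
do
        glob_count=$((glob_count + 1))
done

echo "ls form saw $ls_count words; glob form saw $glob_count files"
cd / && rm -rf "$tmpdir"
```

With the two files above, the ls form loops three times (plain.txt, two, words.txt) and the glob form loops twice.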

The command:
Code:
        newname=`echo $file | /bin/sed s/$pattern1//g | /bin/sed s/$pattern2_old/$pattern2_new/g | /bin/sed s/$pattern3//g`

invokes three copies of /bin/sed when only one is needed. That takes more time, more swap space, more memory, more ... Try the following instead:
Code:
        newname=`echo $file | /bin/sed -e s/$pattern1//g -e s/$pattern2_old/$pattern2_new/g -e s/$pattern3//g`

to get exactly the same results taking less time, less swap space, less memory, less ... Note also that there is no need for the g flag on these substitutions, since you are only trying to remove one copy of each of these patterns. Each of these substitutions will fail if any of the variables being used in them contains any <space>, <tab>, or <newline> characters, because the unquoted expansion is split into several arguments before sed sees it. And, if any of the filenames being modified starts with a <hyphen> or contains any <space>, <tab>, <newline>, or <backslash> characters, echo might not produce the results you want. Therefore, I would suggest using:
Code:
        newname=`printf '%s\n' "$file" | /bin/sed -e "s/$pattern1//" -e "s/$pattern2_old/$pattern2_new/" -e "s/$pattern3//"`

instead. Note also that with most modern shells all of these changes could be performed in the shell with various variable substitutions instead of using command substitution to invoke sed. But, since we don't know what operating system or shell is being used by the submitter of this thread, I won't go there.
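For readers who do know they are running bash (the script in post #2 starts with #!/bin/bash), a minimal sketch of the variable-substitution approach might look like the following. It assumes bash, since the ${var/pattern/replacement} form is not part of POSIX sh, and it uses one of the file names from post #1 as sample input:

```shell
#!/bin/bash
# Sketch: build the new name with shell parameter expansion instead of sed.
# Assumes bash; ${var/pattern/replacement} is not available in plain POSIX sh.
pattern1="variant_strandbias_readcount.vcf."
pattern2_old="hg19_multianno"
pattern2_new="hg19multianno"
pattern3="_dbremoved_removed_final_index_inheritence_import_classify"

file="16-0000_File-A_variant_strandbias_readcount.vcf.hg19_multianno_dbremoved_removed_final_index_inheritence_import_classify.txt"

# Each step replaces the first match only; quoting the pattern variables
# makes the shell treat their contents literally rather than as glob patterns.
newname=${file/"$pattern1"/}
newname=${newname/"$pattern2_old"/$pattern2_new}
newname=${newname/"$pattern3"/}

printf '%s\n' "$newname"
```

For the sample name this prints 16-0000_File-A_hg19multianno.txt, and no sed process is started at all.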

And, for safety in case some of the characters listed above appear in filenames, I would also change:
Code:
        /bin/mv -fv $file $newname

to:
Code:
        /bin/mv -fv "$file" "$newname"
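Putting the suggestions from this post together, the whole script could be sketched as below. The function name rename_in_dir, the scratch directory, the existence check, and the -- before the file operands are additions made for this illustration and are not part of the original script; plain sed and mv are used on the assumption that both are on the PATH. The patterns are still read by sed as regular expressions, so each dot in pattern1 matches any character.

```shell
#!/bin/bash
# Sketch combining the suggestions above; rename_in_dir is a made-up helper name.
pattern1="variant_strandbias_readcount.vcf."
pattern2_old="hg19_multianno"
pattern2_new="hg19multianno"
pattern3="_dbremoved_removed_final_index_inheritence_import_classify"

rename_in_dir() {
        # Run in a subshell so the caller's working directory is unchanged.
        (
        cd "$1" || exit 1
        for file in *.txt
        do
                # Skip the literal pattern left behind when nothing matches.
                [ -e "$file" ] || continue
                newname=`printf '%s\n' "$file" | sed -e "s/$pattern1//" -e "s/$pattern2_old/$pattern2_new/" -e "s/$pattern3//"`
                # Leave the file alone when none of the patterns applied.
                if [ "$file" = "$newname" ]
                then
                        continue
                fi
                # The -- keeps a name starting with a hyphen from being read as an option.
                mv -fv -- "$file" "$newname"
        done
        )
}

# Demonstration in a scratch directory with one of the names from this thread.
demo=$(mktemp -d) || exit 1
touch "$demo/16-0000_File-A_variant_strandbias_readcount.vcf.hg19_multianno_dbremoved_removed_final_index_inheritence_import_classify.txt"
rename_in_dir "$demo"
ls "$demo"
```

To use it on real data, call rename_in_dir with the directory holding the files instead of the scratch directory.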

# 4  
Old 02-21-2017
Thank you both, works perfectly. Sorry for the delay, I was out of the country. Just out of curiosity, why did the rename not work as expected? Thank you :).