sed parser behaving strange on replacing multiple words in multiple files


 
# 1  
Old 12-08-2017

I have 4000 files like this:

Code:
$ cat clus_grp_seq10_g.phy

 18 1002
anig_OJJ65951_1     ATGGTTTCGCAGCGTGATAGAGAATTGTTTAGGGATGATATTCGCTCGCGAGGAACGAAGCTCAATGCTGCCGAGCGCGAGAGTCTGCTAAGGCCATATCTGCCAGATCCGTCTGACCTTCCACGCAGGCCACTTCAGCGGCGCAAGAAGGTTCCTCG
aver_OOF92921_1     ATGGTTTCGCAACGAGAT---------AGAGAATTGAATATCACGGCTTCCTCAGGGGTCTCTGGCATTATGCTGGTGCTCAGATGAGGTTTGGC
anid_EAW13573_1     ATGGTCTCACAGCGTGACAGAGAGTTGGCTGTTGAATACCAGGGCTATCTCAGGGGTTTGTGGCATTACGCTGGGGCCCAGATGCGATTTGGC
azon_EAW20028_1     ATGGCCCTAGCACGTGATAGAGAATTACTGAGGGACACTATTCGCACCCAAGGGACCGCACTTACTGCTGCCGATCGCGAAAATATCCTGAAGCCATATCTGCCGGATCCATCAGAACTTGCACGTCGGCCACTACAGCGACAGAAGAAAGC
awen_EED46037_1     ATGGTATCACAACGGGATAGAGTGGTGTGTCTGCC------------------------------------------------CTCTACAGGTCA------AAACAGTGCGAAATA---------AA
acar_EAL84889_1     ATGGCCCT
akaw_EAWE3573_1     ---------ATGGTCTCAC---------AGCGTGACAGAGAGT---------TGGCTGTTGAATACCAGGGCTATCTCAGGGGTTTGTGGCATTACGC

I want to replace 7 patterns (aver, anid, anig, acar, azon, awen, akaw) in all the files, so that each sequence name is cut back to its four-letter prefix. The resulting file should look like this (no change in file name):


Code:
$ cat clus_grp_seq10_g.phy

 18 1002
anig     ATGGTTTCGCAGCGTGATAGAGAATTGTTTAGGGATGATATTCGCTCGCGAGGAACGAAGCTCAATGCTGCCGAGCGCGAGAGTCTGCTAAGGCCATATCTGCCAGATCCGTCTGACCTTCCACGCAGGCCACTTCAGCGGCGCAAGAAGGTTCCTCG
aver     ATGGTTTCGCAACGAGAT---------AGAGAATTGAATATCACGGCTTCCTCAGGGGTCTCTGGCATTATGCTGGTGCTCAGATGAGGTTTGGC
anid     ATGGTCTCACAGCGTGACAGAGAGTTGGCTGTTGAATACCAGGGCTATCTCAGGGGTTTGTGGCATTACGCTGGGGCCCAGATGCGATTTGGC
azon     ATGGCCCTAGCACGTGATAGAGAATTACTGAGGGACACTATTCGCACCCAAGGGACCGCACTTACTGCTGCCGATCGCGAAAATATCCTGAAGCCATATCTGCCGGATCCATCAGAACTTGCACGTCGGCCACTACAGCGACAGAAGAAAGC
awen     ATGGTATCACAACGGGATAGAGTGGTGTGTCTGCC------------------------------------------------CTCTACAGGTCA------AAACAGTGCGAAATA---------AA
acar     ATGGCCCT
akaw     ---------ATGGTCTCAC---------AGCGTGACAGAGAGT---------TGGCTGTTGAATACCAGGGCTATCTCAGGGGTTTGTGGCATTACGC

I wrote a bash script for this:
Code:
#!/bin/bash
j=1
for ((i=0;i<=4000;i++));
do
echo "$j"

sed -e s/'aver_[^ ]*'/aver/g clus_grp_seq"$j"_g.phy | sed -e s/'anid_[^ ]*'/anid/g | sed -e s/'anig_[^ ]*'/anig/g | sed -e s/'acar_[^ ]*'/acar/g | sed -e s/'azon_[^ ]*'/azon/g | sed -e s/'awen_[^ ]*'/awen/g | sed -e s/'akaw_[^ ]*'/akaw/g -> clus_grp_seq"$j"_g.phy
wait
let j++
done

but the script is leaving several files completely blank. Some files in the sequence, such as clus_grp_seq2000_g.phy, do not exist in the folder; in that case a blank clus_grp_seq2000_g.phy is OK. But even when the file does exist in the folder, like clus_grp_seq10_g.phy shown above, the script sometimes produces a blank file.
Please let me know the problem or suggest an alternative solution.
# 2  
Old 12-08-2017
Without digging too deep, I can see that
- the single quoting of your sed commands is consistently off: quote the entire command (e.g. 's/aver_[^ ]*/aver/g'), not just the pattern.
- the redirection into the to-be-modified file truncates it before anything is read from it, so I'm very surprised that only some of the files end up completely blank.
- running 7 seds (i.e. creating 7 processes) for each of 4000 files is resource hungry and may become somewhat slow.
- you run a for loop from 0 to 4000 with i as the loop variable, so why use another variable j?
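
To illustrate the truncation point above, here is a minimal sketch, assuming a throwaway file named demo.phy (a made-up name) in a temporary directory. With a single command the shell always empties the output file before sed starts. In a multi-stage pipeline like the original one, the truncation is done by the last stage while the first sed is starting up, so the two likely race each other, which may explain why only some files came out blank.

```shell
#!/bin/bash
# Sketch only: demo.phy is a hypothetical file name used for illustration.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'aver_OOF92921_1     ATGG\n' > demo.phy

# The shell opens demo.phy for writing (truncating it to zero bytes)
# before sed is started, so sed finds nothing left to read.
sed 's/aver_[^ ]*/aver/g' demo.phy > demo.phy

# Arithmetic expansion strips any padding that wc may print.
size=$(( $(wc -c < demo.phy) ))
echo "bytes left in demo.phy: $size"
```

The usual way around this is to write to a different file and rename it afterwards, which is what the suggestion that follows does.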

How about
Code:
for FN in clus*phy; do sed '/aver\|anid\|anig\|acar\|azon\|awen\|akaw/ s/_[^ ]*//' "$FN" > "${FN}.tmp"; done

Move the .tmp files over the original ones when happy.
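
If you would rather do the substitution and the move in one go, something along these lines might work. This is a sketch under assumptions, not a tested drop-in: strip_ids is a hypothetical helper name, the names are assumed to start at the beginning of the line (hence the ^ anchor), and -E is used instead of the \| alternation, which is a GNU extension in basic regular expressions.

```shell
#!/bin/bash
# Sketch only: strip_ids is a hypothetical helper, not something from the thread.
strip_ids() {
    local fn
    for fn in "$@"; do
        [ -f "$fn" ] || continue    # skip an unmatched glob or a missing file
        # One sed process per file. -E enables extended regexes (GNU and BSD sed);
        # \1 keeps the four-letter prefix and drops "_..." up to the first space.
        sed -E 's/^(aver|anid|anig|acar|azon|awen|akaw)_[^ ]*/\1/' "$fn" > "$fn.tmp" &&
            mv "$fn.tmp" "$fn"      # replace the original only if sed succeeded
    done
}

# Demo on a throwaway copy so no real data is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf ' 18 1002\nanig_OJJ65951_1     ATGG\nacar_EAL84889_1     ATGGCCCT\n' > clus_grp_seq10_g.phy

strip_ids clus*phy
cat clus_grp_seq10_g.phy
```

Because the loop runs over the glob, files missing from the numbered sequence are never touched, so no blank placeholders are created for them. Try it on a copy of the folder first.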

Last edited by RudiC; 12-08-2017 at 06:28 AM..