Help with editing string elements


 
# 1  
Old 04-14-2011

Hi all,
I have a question.
I would like to replace the characters at certain positions in a sequence with replacement characters listed in another file. For example, given this sample file:
Code:
>S5_SK1.chr01
NNNNNNNNNNNNNNNNNNNCAGCATGCAATAAGGTGACATAGATATACCCACACACCACACCCTAACACTAACCCTAATCTAACCCTGGCCAACCTGTTT
CTCAACTTACCCTCCATTACCCTACCTCCACTCGTTACCCTGTCCCATTCAACCATACCACTCCGAACCACCATCCATCCCTCTACTTACTACCACTCAC

Excluding the first line, which begins with ">", I would like to change the character at position 10 from N to T, at position 100 from T to C, and at position 157 from A to G (positions are counted continuously across the sequence lines), based on the information in the following file:
Code:
chr01    10    T   
chr01    100   C    
chr01    157   G

So my expected output would be
Code:
>S5_SK1.chr01
NNNNNNNNNTNNNNNNNNNCAGCATGCAATAAGGTGACATAGATATACCCACACACCACACCCTAACACTAACCCTAATCTAACCCTGGCCAACCTGTTC
CTCAACTTACCCTCCATTACCCTACCTCCACTCGTTACCCTGTCCCATTCAACCATGCCACTCCGAACCACCATCCATCCCTCTACTTACTACCACTCAC

Can anyone suggest how I can do this, preferably using Perl, as I'm learning it? I would like to do this over multiple files, so I'd really appreciate your input.
Have a nice day.
Cheers
# 2  
Old 04-14-2011
Try:
Code:
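#!/usr/bin/perl
# Read the position file: field 2 is the 1-based position, field 3 the new letter.
open I,"position_file" or die "Cannot open position_file: $!";
@b=<I>;
%h=map {((split /\s+/)[1],(split /\s+/)[2]);} @b;
# Undefine the record separator so the next read slurps the whole file.
local $/;
open J,"main_file" or die "Cannot open main_file: $!";
$_=<J>;
# Cut out the ">" header line and keep it in $n.
s/(>.*)//;
$n=$1;
# Join the sequence lines and split them into one character per array element.
s/\n//g;
@F=split //,$_;
# Positions are 1-based, array indices are 0-based.
for $i (keys %h){
  $F[$i-1]=$h{$i}
}
$_=join "", @F;
# Re-wrap the sequence at 100 characters per line.
s/.{100}/$&\n/g;
print "$n\n";
print "$_\n";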

Let me know if something needs explanation.
This User Gave Thanks to bartus11 For This Post:
# 3  
Old 04-14-2011
Code:
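# Assumes line 1 is the ">" header and every sequence line is 100 characters wide.
while read -r chr pos base
do
     line=$(( (pos - 1) / 100 + 2 ))   # +2 skips the header line
     col=$(( (pos - 1) % 100 + 1 ))    # column within that line
     sed "${line}s/./$base/$col" infile >infile.tmp
     cat infile.tmp >infile
done <file2
rm infile.tmp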

Where file2 is the file containing the position list:
Code:
chr01 10 T
chr01 100 C
...

# 4  
Old 04-14-2011
@Bartus:
Hi Bartus, I'm not getting any output from this apart from two blank lines!
Any idea why that might be happening? My example main file is
Code:
>S5_SK1.chr01
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNCAGCATGCAATAAGGTGACATAGATATACCCACACACCACACCCTAACACTAACCCTAATCTAACCCTGGCCAACCTGTTT
CTCAACTTACCCTCCATTACCCTACCTCCACTCGTTACCCTGTCCCATTCAACCATACCACTCCGAACCACCATCCATCCCTCTACTTACTACCACTCAC
CCACCGTTACCCTCCAATTACCCATATCCAACTCCACTGCCACTTACCCTGCCATTCCTCTACCATCCACCATCTGCTACTCACTGTACTGTTGTTCTAC

and the position file is
Code:
chr01   62      C
chr01   63      A
chr01   70      G
chr01   80      C
chr01   100     A
chr01   200     T
chr01   300     G
chr01   400     C
chr01   550     A
chr01   599     A

Cheers, and thanks for looking into my question.
Have a nice day.
# 5  
Old 04-14-2011
Weird... It is working for me:
Code:
oracle@solaris:~/unix/dna$ cat file
>S5_SK1.chr01
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNCAGCATGCAATAAGGTGACATAGATATACCCACACACCACACCCTAACACTAACCCTAATCTAACCCTGGCCAACCTGTTT
CTCAACTTACCCTCCATTACCCTACCTCCACTCGTTACCCTGTCCCATTCAACCATACCACTCCGAACCACCATCCATCCCTCTACTTACTACCACTCAC
CCACCGTTACCCTCCAATTACCCATATCCAACTCCACTGCCACTTACCCTGCCATTCCTCTACCATCCACCATCTGCTACTCACTGTACTGTTGTTCTAC

Code:
oracle@solaris:~/unix/dna$ cat pos
chr01   62      C
chr01   63      A
chr01   70      G
chr01   80      C
chr01   100     A
chr01   200     T
chr01   300     G
chr01   400     C
chr01   550     A
chr01   599     A

Code:
oracle@solaris:~/unix/dna$ ./a.pl
>S5_SK1.chr01
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNCANNNNNNGNNNNNNNNNCNNNNNNNNNNNNNNNNNNNA
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNT
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNG
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNC
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNANNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNAN
NNNNNNNNNNNNNNNNNNNCAGCATGCAATAAGGTGACATAGATATACCCACACACCACACCCTAACACTAACCCTAATCTAACCCTGGCCAACCTGTTT
CTCAACTTACCCTCCATTACCCTACCTCCACTCGTTACCCTGTCCCATTCAACCATACCACTCCGAACCACCATCCATCCCTCTACTTACTACCACTCAC
CCACCGTTACCCTCCAATTACCCATATCCAACTCCACTGCCACTTACCCTGCCATTCCTCTACCATCCACCATCTGCTACTCACTGTACTGTTGTTCTAC

Code:
oracle@solaris:~/unix/dna$ cat a.pl
#!/usr/bin/perl
open I,"pos";
@b=<I>;
%h=map {((split /\s+/)[1],(split /\s+/)[2]);} @b;
local $/;
open J,"file";
$_=<J>;
s/(>.*)//;
$n=$1;
s/\n//g;
@F=split //,$_;
for $i (keys %h){
  $F[$i-1]=$h{$i}
}
$_=join "", @F;
s/.{100}/$&\n/g;
print "$n\n";
print "$_\n";

# 6  
Old 04-14-2011
Hi Bartus,
I worked out why it was not working: my file names were different from those in the script. So I changed the script to use $ARGV[0] for position_file and $ARGV[1] for main_file, and ran
Code:
./script.pl position_file main_file

and this worked.
Cheers, and have a nice day ahead!
Thanks
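
For anyone making the same change, here is a minimal sketch of a version that takes its file names from the command line and stops with a message when a file cannot be opened, instead of silently printing blank lines. The sub names read_positions and patch_file are made up for this example. It assumes, like the script above, that the first column of the position file is ignored, that the first line of each main file is the ">" header, and that the sequence is wrapped at 100 characters. Accepting more than one main file is an extra, not something the script above does.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read the position file and return a hash reference of position => letter.
sub read_positions {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!\n";
    my %h;
    while (my $line = <$fh>) {
        my (undef, $pos, $letter) = split ' ', $line;
        $h{$pos} = $letter if defined $letter;    # skips blank or short lines
    }
    close $fh;
    return \%h;
}

# Return the edited contents of one sequence file as a single string.
sub patch_file {
    my ($path, $positions, $width) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!\n";
    chomp(my $header = <$fh>);                    # the ">" line
    my $seq = join '', map { chomp; $_ } <$fh>;   # remaining lines, joined
    close $fh;
    for my $pos (keys %$positions) {
        next if $pos < 1 || $pos > length($seq);  # ignore out-of-range positions
        substr($seq, $pos - 1, 1) = $positions->{$pos};
    }
    # Re-wrap at $width characters per line, with one newline at the end.
    return join("\n", $header, $seq =~ /.{1,$width}/g) . "\n";
}

# Usage: ./script.pl position_file main_file [more_main_files...]
if (@ARGV >= 2) {
    my ($pos_file, @main_files) = @ARGV;
    my $positions = read_positions($pos_file);
    print patch_file($_, $positions, 100) for @main_files;
} else {
    warn "Usage: $0 position_file main_file [more_main_files...]\n";
}
```

It is run the same way as before, with the position file first. Note that every main file gets the same set of edits; matching the chr01 column against each header would need extra code.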

---------- Post updated at 11:46 AM ---------- Previous update was at 11:26 AM ----------

@Bartus:
Could you also explain the map part of the code and the use of Perl's built-in variables throughout the code? I'd appreciate that.
Cheers
# 7  
Old 04-14-2011
The map function is used to create a hash from the contents of the @b array (which contains the lines of position_file). map runs its block once for each element of @b, and inside the block the current element is available as "$_", so (split /\s+/)[1] and (split /\s+/)[2] operate on it, extracting the second and third fields (position and letter) from each line. map returns those two values for every line, and assigning the resulting list to %h populates the hash with the position as the key and the letter as its value.
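
A small stand-alone sketch of that step may make it easier to see. The three lines in @b are copied from the example position file in post #1 rather than read from disk, and the %loop hash is only there to show the same thing written as an ordinary loop.

```perl
use strict;
use warnings;

# Stand-in for the lines read from the position file.
my @b = ("chr01    10    T\n", "chr01    100   C\n", "chr01    157   G\n");

# map runs the block once per line with $_ set to that line; split with no
# string argument splits $_. Each pass returns a (position, letter) pair,
# and the flat list of pairs is assigned to the hash.
my %h = map { ((split /\s+/)[1], (split /\s+/)[2]) } @b;

# The same thing written as an explicit loop.
my %loop;
for my $line (@b) {
    my (undef, $pos, $letter) = split /\s+/, $line;
    $loop{$pos} = $letter;
}

# Print the pairs in position order.
print "$_ => $h{$_}\n" for sort { $a <=> $b } keys %h;
```

Both %h and %loop end up with the keys 10, 100 and 157 pointing at T, C and G.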

About the second part of your question: which built-in variables do you have in mind?
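
In the meantime, here is a rough sketch of the four built-in variables the script relies on: $_, $/, $1 and $&. The sample text and the 4-character line width are made up to keep the example small, and the data is read from an in-memory string rather than a real file.

```perl
use strict;
use warnings;

# Made-up sample data standing in for a small sequence file.
my $text = ">hdr\nACGT\nTTAA\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

# $/ is the input record separator. Left undefined (via local), a single
# read returns the whole file instead of one line.
my $whole = do { local $/; <$fh> };
close $fh;

# $_ is the default variable: s/// works on it when no other target is given.
# $1 holds whatever the first pair of parentheses in the last match captured.
my $header;
for ($whole) {          # inside the loop, $_ is an alias for $whole
    s/(>.*)//;          # remove the header line...
    $header = $1;       # ...and keep it
    s/\n//g;            # drop the line breaks
}

# $& is the entire text matched by the last successful pattern match.
(my $wrapped = $whole) =~ s/.{4}/$&\n/g;   # newline after every 4 characters

print "$header\n$wrapped";
```

The script in this thread does the same things directly on $_ at file level; the for loop here is only a way to get $_ pointing at a named variable under "use strict".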