Split strings


 
# 1  
Old 01-29-2011
Split strings

Hello

I started learning bash about one or two weeks ago, so please help me.

I have about 200,000 strings like this one:

Code:
 ATGCCAGGGGAGCCCAGAAGGTAAAACTTGATCTGAAATGTATGTTTATATATAATTTAGGTAATCAATTGGCATGTGAA

and I need to separate the letters with spaces to get:

Code:
 A T G C C A G G G G A G C C C A G A A G G T A A A A C T T G A T C T G A A A T G T A T G T T T A T A T A T A A T T T A G G T A A T C A A T T G G C A T G T G A A

I need this because I want to process each letter as a column with awk, in order to get the percentage frequency of each letter at each position. So if you can help me with the awk code too, I would appreciate it very much.
# 2  
Old 01-29-2011
Hi.

Does this give an idea? ... cheers, drl
Code:
echo abc | awk -F "" '{ print $2 }'
b
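
A note on portability: splitting a line into single characters with an empty field separator is an extension (gawk documents it as one, and POSIX leaves the behavior unspecified), so it may not work in every awk. A minimal sketch of the same idea using only substr, which every awk provides; the function name nth_char is just a placeholder for illustration:

```shell
# Hypothetical helper: print the character at position $1 of each input line.
nth_char() {
    # substr(string, start, length) extracts one character without any field splitting
    awk -v pos="$1" '{ print substr($0, pos, 1) }'
}

echo abc | nth_char 2
```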

# 3  
Old 01-29-2011
I had already managed to split the strings with this sed code:

Code:
 sed -e 's/A/A /g' -e 's/T/T /g' -e 's/G/G /g' -e 's/C/C /g'
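
The same result can probably be had with a single expression that matches any character, so it does not depend on the alphabet being A, T, G and C. A sketch, with split_letters as a made-up name; the second expression only trims the trailing space that the first one leaves behind:

```shell
# Hypothetical helper: put a space between every character of each input line.
split_letters() {
    # 's/./& /g' appends a space to every character; 's/ $//' drops the last one
    sed -e 's/./& /g' -e 's/ $//'
}

printf 'ATGCCAGG\n' | split_letters
```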

But with

Code:
 awk -F ""

it seems that splitting the strings is not necessary xD
Thanks, drl.
# 4  
Old 01-29-2011
So why do you need to split the string? Why not process it as is in awk?
You don't need the spaces to split it.
# 5  
Old 01-29-2011
Because I didn't know it wasn't necessary.
# 6  
Old 01-29-2011
OK, so have you done any C programming or 4GL?
The idea is an array of 26 slots to hold the counters. Then we walk the string in a loop and, depending on the character at the current position, increment the array slot for that character. awk reads the input one line at a time.

Does this sound like a solution or do you have another way?
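
The plan above could be sketched roughly like this in awk. The assumptions here are mine, not from the thread: one sequence per line, only the letters A, C, G and T are reported, the counters are keyed by position and letter (awk arrays are associative, so no fixed size of 26 is needed), and position_freq is a made-up name.

```shell
# Hypothetical helper: read one sequence per line on stdin and print,
# for every position, the percentage of each letter seen at that position.
position_freq() {
    # LC_ALL=C keeps the decimal separator a dot regardless of locale
    LC_ALL=C awk '
    {
        n = length($0)
        if (n > maxlen) maxlen = n
        for (i = 1; i <= n; i++) {
            c = substr($0, i, 1)   # character at position i
            count[i, c]++          # tally per (position, letter)
            total[i]++             # number of sequences that reach position i
        }
    }
    END {
        nletters = split("A C G T", letters, " ")
        for (i = 1; i <= maxlen; i++)
            for (j = 1; j <= nletters; j++)
                printf "%d %s %.2f\n", i, letters[j], 100 * count[i, letters[j]] / total[i]
    }'
}

printf 'AT\nAG\n' | position_freq
```

Each output line holds the position, the letter, and its percentage at that position. For a file it would be called as position_freq < sequences.txt, where the file name is only an example.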
# 7  
Old 01-29-2011
I don't have any previous programming experience.
I'm an undergraduate student of Biochemistry and I have to process large genomic databases to accomplish my thesis goals. That's why I started learning Bash a few days ago, and I'll start learning Python soon.