Post 302432062 by Xterra on Wednesday 23rd of June 2010 06:36:26 PM
Trimming sequences based on specific pattern

My files look like this:
Quote:
>GHXCZCC01AJ8CJ
TTGATGTGCTTGGTGTGTATCATTTCTGGGAAGCCCTACGCCCCGGGGC
>GHXCZCC01APUO5
T-ATGTGCCGTTGGTGTGTATCAGCTGGATTTCTGGGACGCAGCCCTACCCGGGGCGA
>GHXCZCC01AQSRP
TTGATGTTA---AGCTGGATTTTCTGGGACGCCCCGGGGAGCCCTA
>GHXCZCC01AQSRP
TTGTTGCCAGCTAGCTGAGCCCTAGATTTTCTGGGGCCCCGGGG
>GHXCZCC01AQSRP
TTGATGTTGCCCAGCCCTATAGCTGGATTTTCTGGGACGCCCCGGGGTGC
I need to cut each sequence right after the last "A" of the following pattern (highlighted here for easier identification; in the actual file the pattern is not highlighted):
Quote:
AGCCCTA
The expected result should look like this:
Quote:
>GHXCZCC01AJ8CJ
TTGATGTGCTTGGTGTGTATCATTTCTGGGAAGCCCTA
>GHXCZCC01APUO5
T-ATGTGCCGTTGGTGTGTATCAGCTGGATTTCTGGGACGCAGCCCTA
>GHXCZCC01AQSRP
TTGATGTTA---AGCTGGATTTTCTGGGACGCCCCGGGGAGCCCTA
>GHXCZCC01AQSRP
TTGTTGCCAGCTAGCTGAGCCCTA
>GHXCZCC01AQSRP
TTGATGTTGCCCAGCCCTA
Thus, every sequence would end with AGCCCTA. Everything to the left of that pattern, as well as the identifier lines (e.g. >GHXCZCC01AJ8CJ), should be kept intact; only what follows the pattern should be removed.
Thanks in advance
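
One possible approach, sketched with sed, is shown below. It is not an answer taken from the thread. The function name trim_at_pattern is made up for illustration, and it rests on assumptions the post does not spell out: each sequence sits on a single line, identifier lines start with ">", the cut is made after the first occurrence of AGCCCTA on a line, and lines without the pattern are left unchanged.

```shell
# trim_at_pattern: hypothetical helper name, not from the original post.
# Lines starting with ">" (identifiers) are passed through untouched.
# On every other line, the first AGCCCTA is kept and everything after it is dropped.
# Lines that do not contain the pattern are printed as-is (an assumption).
trim_at_pattern() {
    sed '/^>/!s/\(AGCCCTA\).*/\1/' "$@"
}
```

It could be used as `trim_at_pattern input.fa > trimmed.fa`, where both file names are placeholders. If a sequence could contain the pattern more than once and the cut should happen at the last occurrence instead, the expression would need to change (for example to `s/\(.*AGCCCTA\).*/\1/`, which relies on the greedy match).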
 
