Top Forums Shell Programming and Scripting Removing duplicate sequences and modifying a text file Post 302964555 by RudiC on Friday 15th of January 2016 12:35:55 PM
I guess the missing space in record 1 is accidental. Try:
Code:
awk '{sub($1 FS,">")} !T[$1]++' RS=">" ORS="" file1
>gene=XLOC_000001
AATTGTGGTGAAATGACTTCTGTTAACGGAGACATCGATGATTGTTGTTACTATTTGTTCTCAGGATTCA
TTTGTCCGGTTCATACCCCGGACGGCGCCCCTTGCGGGCTGCTCAATCACCTGACAATGAACTGTATCGT
CACGAAGCATCCGGATCGCAAATTAAAGGCTGCGCTACCAACGGTGCTGGTGGATCTAGGAATGCTTCCG
TTGTCTGTTGCGAATAATTGGAAGGACTCGTACACGGTAATGCTGAATGGTAAAGTGATCGGCCTGATCG
AAGATAATATTGTTGATAAGGTGGCCCGCAAACTAAGGCAGCTGAAGATAATTGGTGAAGAGGTGCCGAA
CACGTTGGAGATCGCGCTGGTGCCGAAGAGGAAGG
>gene=XLOC_000002
TGGGTGAAGGTGCTGTGAGCCGTAAAACTTGTAAAAAGTGGTTTCAGAAGTTTCGGAATGGCGATTTCGA
TCTTACTGATCGCGAACGCAGTGGAATGCCGAGAAAAGTTGAAGACGAGGAACTGGAGCAACTATTGAAC
GAGAATCCTTGTAAGACGCAACAAGAACTTGCTGAGCAACTTGGTGTAACTCAACAAGCTATTTCCGTTC
GCTTAAAAAAGCTTGGAAGAATTTCCAAGGCAGGCCGTTGGGTTCCTCATGTGTTCAGCCCCAAACACAA
AGCGAGACGCTGTGACATTAGAATAACTAACCATGGTCAGTCAGTTTGCTTACGGCTTATGTCTTAAAGC
AAGGTTGTAAACAAGAACTTATCTCTTGTCTATGATCTTGCTTTAAAATATAAATAGTAATTAAATTGAC
CAACTACGATCGTTTATTGGAAGAATAATCGATCGTGGTTGGTTAGGTTATGTTTCACAATACGTCGTAT
GTCGCTGTCGG
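For anyone wondering how the one-liner works, here is a small self-contained sketch. The original file1 is not shown in this post, so the sample below assumes headers of the form ">ID gene=NAME"; the IDs (id1, id2, id3), the shortened sequences, the temporary file and the `result` variable are all made up for illustration.

```shell
# Hypothetical sample input (assumption: headers look like ">ID gene=NAME").
file1=$(mktemp)
cat > "$file1" <<'EOF'
>id1 gene=XLOC_000001
AATTGTGG
TTTGTCCG
>id2 gene=XLOC_000001
AATTGTGG
TTTGTCCG
>id3 gene=XLOC_000002
TGGGTGAA
EOF

# RS=">"   : split the input at ">", so one record = header line + sequence lines
# ORS=""   : each record already ends in a newline, so print nothing extra
# sub()    : replace the first field plus the following space with ">"
# !T[$1]++ : after sub(), $1 is ">gene=..."; print a record only the first
#            time that header is seen
result=$(awk '{sub($1 FS, ">")} !T[$1]++' RS=">" ORS="" "$file1")
printf '%s\n' "$result"

rm -f "$file1"
```

Two things to keep in mind: the duplicate check is keyed on the rewritten header only, not on the sequence text, and $1 is used as a dynamic regex inside sub(), so an ID containing regex metacharacters may not match literally.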