Post 302404203 by patrick87 on Tuesday 16th of March 2010 12:39:12 AM
How to extract headers and their content where a header value is above a threshold

Input file:
Code:
>position_10 sample:68711 coords:5453-8666 number:3 type:complete len:344
MSINQYSSDFHYHSLMWQQQQQQQQHQNDVVEEKEALFEKPLTPSDVGKLNRLVIPKQHA
ERYFPLAAAAADAVEKGLLLCFEDEEGKPWRFRYSYWNSSQSYVLTKGWSRYVKEKHLDA
NRTS*
>position_4 sample:68711 coords:553-866 number:4 type:partial len:483
MSGVVRSSPGSSQPPPPPPHHPPSSPVPVTSTPVIPPIRRHLAFASTKPPFHPSDDYHRF
KITPSDVENDESDYWLLSNAEISMTDIWKTDSGIDWDYGIADVSTPPPGMGEIAPTAVDS
TPR*
>position_7 sample:68711 coords:453-86 number:2 type:partial len:214
KAAETLEVQKRRIYDITNVLEGIDLIEKPFKNRILWKGVDACPGDEDADVSVLQLQAEIE
NLALEEQALDNQIRWLFVTEEDIKSLPGFQNQTLIAVKAPHGTTLEVPDPDEAADHPQRR
TDSGIDWDYGIADVSTPPPGMGEIAPTAVDSTPR*
>position_11 sample:68711 coords:53-86 number:1 type:complete len:558
MLGDFIIRLLVLILGYTYPAFECFKTVEKNKVDIEELRFWCQYWILLALISSFERVGDFF
RAPRPLNKSLSALRSLEKQTSRGRKWPPPTPPPTPGRDSAGTFNGDDGVNIPDTIPGSPL
TDARAKLRRSNSRTQPAA*
.
.

Output file:
Code:
>position_10 sample:68711 coords:5453-8666 number:3 type:complete len:344
MSINQYSSDFHYHSLMWQQQQQQQQHQNDVVEEKEALFEKPLTPSDVGKLNRLVIPKQHA
ERYFPLAAAAADAVEKGLLLCFEDEEGKPWRFRYSYWNSSQSYVLTKGWSRYVKEKHLDA
NRTS*
>position_11 sample:68711 coords:53-86 number:1 type:complete len:558
MLGDFIIRLLVLILGYTYPAFECFKTVEKNKVDIEELRFWCQYWILLALISSFERVGDFF
RAPRPLNKSLSALRSLEKQTSRGRKWPPPTPPPTPGRDSAGTFNGDDGVNIPDTIPGSPL
TDARAKLRRSNSRTQPAA*
.
.

I would like to extract each record (header line plus its sequence content) that matches both criteria below:
1. The header must contain the word "complete" (e.g. type:complete).
2. The len value must be greater than or equal to 300 (e.g. len:344, len:558, etc.).
It seems that perl, awk, or sed should be able to achieve this.
Thanks a lot for any advice.

Last edited by patrick87; 03-16-2010 at 04:32 AM..
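
One possible approach with awk, offered as a minimal sketch rather than a definitive answer. It assumes every record starts with a ">" header line, that the header carries the fields exactly as type:complete and len:N, and that every line up to the next header belongs to that record. The function name filter_complete and the optional second argument (minimum length, default 300) are made up for illustration; the sample sequences are shortened to the last line of each record from the post.

```shell
#!/bin/sh
# filter_complete FILE [MIN_LEN]
# Hypothetical helper: print only the records whose header contains
# type:complete and whose len: value is >= MIN_LEN (default 300).
filter_complete() {
    awk -v min="${2:-300}" '
        /^>/ {
            # New record: decide once, on the header line, whether to keep it.
            keep = 0
            if ($0 ~ /type:complete( |$)/ && match($0, /len:[0-9]+/)) {
                # Skip the 4 characters of "len:" and compare numerically.
                keep = (substr($0, RSTART + 4, RLENGTH - 4) + 0 >= min + 0)
            }
        }
        keep    # print the header and every following line while keep is set
    ' "$1"
}

# Sample input: the four headers from the post, sequences shortened.
sample=$(mktemp)
cat > "$sample" <<'EOF'
>position_10 sample:68711 coords:5453-8666 number:3 type:complete len:344
NRTS*
>position_4 sample:68711 coords:553-866 number:4 type:partial len:483
TPR*
>position_7 sample:68711 coords:453-86 number:2 type:partial len:214
TDSGIDWDYGIADVSTPPPGMGEIAPTAVDSTPR*
>position_11 sample:68711 coords:53-86 number:1 type:complete len:558
TDARAKLRRSNSRTQPAA*
EOF

filter_complete "$sample"
rm -f "$sample"
```

On the sample above this keeps the position_10 and position_11 records and drops the two partial ones. To write the result to a file, redirect the output, e.g. filter_complete input_file > output_file (both file names are placeholders).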
 
