Full Discussion: Parsing a list
Post 302861867 by narachaid on Wednesday 9th of October 2013 05:49:41 PM
Parsing a list

Hello,

I have a very long list in a file (see input below). I only need the first "chunk" of each line, up to the first space, and want to omit the rest. The leading > sign also needs to be removed. Can anyone help me, please?

Thank you so much!

INPUT:
Code:
>gi|24976465|gb|AL935113.1|AL935113 AL935113 Homo sapiens library
>gi|24978364|gb|AL93981336.1|AL93981336 AL93981336 Homo sapiens library
>gi|24973415|gb|AL931542.1|AL931542 AL931542 Homo sapiens library
>gi|24939375|gb|AL93376241.1|AL93376241 AL93376241 Homo sapiens library
>gi|24937965|gb|AL9343716.1|AL9343716 AL9343716 Homo sapiens library

OUTPUT:
Code:
gi|24976465|gb|AL935113.1|AL935113
gi|24978364|gb|AL93981336.1|AL93981336
gi|24973415|gb|AL931542.1|AL931542
gi|24939375|gb|AL93376241.1|AL93376241
gi|24937965|gb|AL9343716.1|AL9343716
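
One possible approach, offered as a minimal sketch rather than a definitive answer: strip the leading > and then delete everything from the first space onward with sed. The function name first_chunk is hypothetical and used only for illustration; the sketch assumes each record sits on one line and that the first space marks the end of the wanted "chunk".

```shell
#!/bin/sh
# first_chunk: hypothetical helper name, not part of any existing tool.
# Reads the named files (or standard input when none are given) and prints,
# for each line, the text before the first space with a leading '>' removed.
first_chunk() {
    # 1st expression: drop a '>' at the start of the line, if present.
    # 2nd expression: delete the first space and everything after it.
    sed -e 's/^>//' -e 's/ .*$//' "$@"
}

# Demo on one line taken from the INPUT above.
printf '%s\n' '>gi|24976465|gb|AL935113.1|AL935113 AL935113 Homo sapiens library' | first_chunk
# prints: gi|24976465|gb|AL935113.1|AL935113
```

To process a whole file, pass its name as an argument and redirect the output, for example first_chunk input.txt > output.txt (both file names are placeholders). If a line could contain tabs instead of spaces, the second expression would need adjusting.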


Last edited by Scott; 10-09-2013 at 06:51 PM.. Reason: Added Code Tags [6th time]
 
