Regular expression -tab Post: 302534827

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Regular Expression + Aritmetical Expression

Is it possible to combine a regular expression with an arithmetical expression? For example, taking an 8-digit character sequence and casting each output of a grep, comparing to a constant. THX!

2. UNIX for Dummies Questions & Answers

Regular expression

Hello, I have a string of the form " &x.y.z" I would like to grep all the lines when "in" can be found in either x or y. How to write the corresponding regular expression ? I have tried the following but it does not work: grep -i " *&.*(in)*.*.*(in)*.*" Any ideas ? Thank you Max

3. Shell Programming and Scripting

regular expression

Hi all, My log file is like 19:40:22 INFO :Total time taken to Service External Request---15ms 19:40:22 INFO : External service failed with status KO 19:40:22 FATAL: External service failed with status KO 19:40:22 DEBUG : Batch started with 19:40:22 ERROR: Member: dmidecode.x86_64...

4. Linux

Regular expression to extract "y" from "abc/x.y.z" .... i need regular expression

Regular expression to extract "y" from "abc/x.y.z"

5. UNIX for Dummies Questions & Answers

ls with regular expression

I currently list and sort all the files in a directory which begin with an Upper Case C and end with the extension '#finished#'. Here is the command I use: ls -tr $currentDir/*.#finished# what i need to do now is list all the files in a directory that begin with upper case C and end...

6. Shell Programming and Scripting

Integer expression expected: with regular expression

CA_RELEASE has a value of 6. I need to check if that this is a numeric value. if not error. source $CA_VERSION_DATA if * ] then echo "CA_RELESE $CA_RELEASE is invalid" exit -1 fi + source /etc/ncgl/ca_version_data ++ CA_PRODUCT_ID=samxts ++ CA_RELEASE=6 ++ CA_WEEK_NO=7 ++...

7. Programming

Perl: How to read from a file, do regular expression and then replace the found regular expression

Hi all, How am I read a file, find the match regular expression and overwrite to the same files. open DESTINATION_FILE, "<tmptravl.dat" or die "tmptravl.dat"; open NEW_DESTINATION_FILE, ">new_tmptravl.dat" or die "new_tmptravl.dat"; while (<DESTINATION_FILE>) { # print...

8. UNIX for Dummies Questions & Answers

Help with regular expression

Hi All, Need some help with a regular expression that I cant seem to work on. Say there is the following text - Input We all (need some) help some day (ps: when it rains) and I need to save some hay now. During the rainy season (ps: when it might rain heavily) Seattle gets drenched but not...

9. Shell Programming and Scripting

Regular expression

Hi I need to write a regular expression for a language. That language can have {a, b, c} as alphabet and it must contain minimum one "a" and minimum one "b". Can u help me to write it plz! thanks in advance!

10. UNIX for Advanced & Expert Users

sed: -e expression #1, char 0: no previous regular expression

Hello All, I'm trying to extract the lines between two consecutive elements of an array from a file. My array looks like: problem_arr=(PRS111 PRS213 PRS234) j=0 while } ] do k=`expr $j + 1` sed -n "/${problem_arr}/,/${problem_arr}/p" problemid.txt ---some operation goes...

LEARN ABOUT DEBIAN

map2slim

MAP2SLIM(1p)						User Contributed Perl Documentation					      MAP2SLIM(1p)

NAME

       map2slim - maps gene associations to a 'slim' ontology

SYNOPSIS

	 cd go
	 map2slim GO_slims/goslim_generic.obo ontology/gene_ontology.obo gene-associations/gene_association.fb

DESCRIPTION

       Given a GO slim file, and a current ontology (in one or more files), this script will map a gene association file (containing annotations
       to the full GO) to the terms in the GO slim.

       The script can be used either to create a new gene association file containing the most pertinent GO slim accessions, or in count mode, in
       which case it will give distinct gene product counts for each slim term.

       The association file format is described here:

       <http://www.geneontology.org/GO.annotation.shtml#file>

ARGUMENTS

       -b bucket slim file
	   This argument adds bucket terms to the slim ontology; see the documentation below for an explanation. The new slim ontology file,
	   including bucket terms, will be written to the bucket slim file.

       -outmap slim mapping file
	   This will generate a mapping file for every term in the full ontology showing both the most pertinent slim term and all slim terms that
	   are ancestors. If you use this option, do NOT supply a gene-associations file

       -shownames
	   (Only works with -outmap)

	   Show the names of the terms in the slim mapping file

       -c  This will force map2slim to give counts of the assoc file, rather than map it

       -t  When used in conjunction with -c, this will indent the output with tabs so that the indentation reflects the tree hierarchy in the slim file

       -o out file
	   This will write the mapped assocs (or counts) to the specified file, rather than to the screen

DOWNLOAD

       This script is part of the go-perl package, available from CPAN

       http://search.cpan.org/~cmungall/go-perl/ <http://search.cpan.org/~cmungall/go-perl/>

       This script will not work without installing go-perl

   MAPPING ALGORITHM
       GO is a DAG, not a tree. This means that there is often more than one path from a GO term up to the root Gene_Ontology node; the path may
       intersect multiple terms in the slim ontology - which means that one annotation can map to multiple slim terms!

       (note you need to view this online to see the image below - if you are not viewing this on the http://www.geneontology.org site, you can
       look at the following URL: http://geneontology.cvs.sourceforge.net/*checkout*/geneontology/go-dev/go-perl/doc/map2slim.gif
       <http://geneontology.cvs.sourceforge.net/*checkout*/geneontology/go-dev/go-perl/doc/map2slim.gif> )

       A hypothetical example: blue circles show terms in the GO slim, and yellow circles show terms in the full ontology. The full ontology
       subsumes the slim, so the blue terms are also in the ontology.

	 GO ID	MAPS TO SLIM ID        ALL SLIM ANCESTORS
	 =====	===============        ==================
	 5	2+3		       2,3,1
	 6	3 only		       3,1
	 7	4 only		       4,3,1
	 8	3 only		       3,1
	 9	4 only		       4,3,1
	 10	2+3		       2,3,1

       The 2nd column shows the most pertinent ID(s) in the slim, i.e. the direct mapping. The 3rd column shows all ancestors in the slim.

       Note in particular the mapping of ID 9: although this has two paths to the root through the slim via 3 and 4, 3 is discarded because it is
       subsumed by 4.

       On the other hand, 10 maps to both 2 and 3 because these are both the first slim ID in the two valid paths to the root, and neither
       subsumes the other.

       The algorithm used is:

       To map any one term in the full ontology:

       1. Find all valid paths through to the root node in the full ontology.

       2. For each path, take the first slim term encountered in the path.

       3. Discard any redundant slim terms in this set, i.e. slim terms subsumed by other slim terms in the set.
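
       The mapping steps can be sketched in Python as follows. This is only an illustration, not the code map2slim itself uses: the PARENTS
       graph is a hypothetical DAG chosen so that it reproduces the table above (the original image is not reproduced here), and the function
       names are made up for this example.

```python
# Hypothetical child -> parents edges; terms 1-4 are in the slim, 5-10 only in the full ontology.
PARENTS = {
    1: [], 2: [1], 3: [1], 4: [3],
    5: [2, 3], 6: [3], 7: [4], 8: [6], 9: [7, 8], 10: [5],
}
SLIM = {1, 2, 3, 4}


def paths_to_root(term, parents):
    """Return every path from term up to a root, each as a list of term IDs."""
    if not parents[term]:
        return [[term]]
    return [[term] + rest
            for parent in parents[term]
            for rest in paths_to_root(parent, parents)]


def ancestors(term, parents):
    """Return all proper ancestors of term in the full ontology."""
    seen, stack = set(), list(parents[term])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def map_to_slim(term, parents, slim):
    """Most pertinent slim terms: first slim term on each path, minus redundant ones."""
    firsts = set()
    for path in paths_to_root(term, parents):
        for candidate in path:
            if candidate in slim:
                firsts.add(candidate)
                break
    # A slim term is redundant if it is an ancestor of another term in the set.
    return {s for s in firsts
            if not any(s in ancestors(other, parents) for other in firsts if other != s)}


def all_slim_ancestors(term, parents, slim):
    """All slim terms that are the term itself or one of its ancestors."""
    return (ancestors(term, parents) | {term}) & slim
```

       With this graph, map_to_slim(9, PARENTS, SLIM) yields only term 4, and map_to_slim(10, PARENTS, SLIM) yields terms 2 and 3, matching
       the 2nd column of the table.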

   BUCKET TERMS
       If you run the script with the -b option, bucket terms will be added. For any term P in the slim, if P has at least one child C, a bucket
       term P' will be created under P. This is a catch-all term for mapping any term in the full ontology that is a descendant of P, but NOT a
       descendant of any child of P in the slim ontology.

       For example, the slim generic.0208 has the following terms and structure:

	   %DNA binding ; GO:0003677
	    %chromatin binding ; GO:0003682
	    %transcription factor activity ; GO:0003700, GO:0000130

       After adding bucket terms, it will look like this:

	  %DNA binding ; GO:0003677
	   %chromatin binding ; GO:0003682
	   %transcription factor activity ; GO:0003700 ; synonym:GO:0000130
	   @bucket:Z-OTHER-DNA binding ; slim_temp_id:12

       Terms from the full ontology that are other children of DNA binding, such as single-stranded DNA binding and its descendants, will map to
       the bucket term.

       The bucket term has a slim ID which is transient and is there only to facilitate the mapping. It should not be used externally.

       The bucket term has the prefix Z-OTHER; the Z is a hack to make sure that the term is always listed last in the alphabetic ordering.

       The algorithm is slightly modified if bucket terms are used. The bucket term has an implicit relationship to all OTHER siblings not in the
       slim.
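
       The bucket rule described above can be sketched in Python. This is a rough illustration under assumptions, not map2slim's own
       implementation: the two "hypothetical ..." terms, the dictionaries and the function names are invented for the example, and the rule
       is applied literally as worded (descendant of P, not a descendant of any slim child of P).

```python
# Full ontology as child -> parents; the two "hypothetical ..." terms are made up for illustration.
FULL_PARENTS = {
    "DNA binding": [],
    "chromatin binding": ["DNA binding"],
    "transcription factor activity": ["DNA binding"],
    "single-stranded DNA binding": ["DNA binding"],
    "hypothetical ssDNA child": ["single-stranded DNA binding"],
    "hypothetical chromatin child": ["chromatin binding"],
}

# Slim ontology as parent -> children that are also in the slim.
SLIM_CHILDREN = {
    "DNA binding": ["chromatin binding", "transcription factor activity"],
    "chromatin binding": [],
    "transcription factor activity": [],
}


def ancestors(term, parents):
    """Return all proper ancestors of term in the full ontology."""
    seen, stack = set(), list(parents[term])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def bucket_terms(slim_children):
    """Create one bucket per slim term P that has at least one child in the slim."""
    return {p: "Z-OTHER-" + p for p, kids in slim_children.items() if kids}


def maps_to_bucket(term, p, full_parents, slim_children):
    """True if term is a descendant of P but not a slim child of P or a descendant of one."""
    anc = ancestors(term, full_parents)
    if p not in anc:
        return False
    return not any(term == child or child in anc for child in slim_children[p])
```

       Here bucket_terms(SLIM_CHILDREN) creates a single bucket, "Z-OTHER-DNA binding", and single-stranded DNA binding (with its
       descendants) falls into it, while anything under chromatin binding does not.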

       Do I need bucket terms?

       Nowadays most slim files are entirely or nearly 'complete', that is, there are no gaps. This means that the -b option will not produce
       noticeably different results. For example, you may see a bucket term OTHER-binding created with nothing annotated to it, because all the
       children of binding in the GO are represented in the slim file.

       The bucket option is really only necessary for some of the older archived slim files, which are static and were generated in a fairly
       ad-hoc way; they tend to accumulate 'gaps' over time (e.g. GO will add a new child of binding, but the static slim file won't be up to
       date, so any gene products annotated to this new term will map to OTHER-binding in the slim).

   GRAPH MISMATCHES
       Note that the slim ontology file(s) may be out of date with respect to the current ontology.

       Currently map2slim does not flag graph mismatches between the slim graph and the graph in the full ontology file; it takes the full
       ontology as being the real graph. However, the slim ontology will be used to format the results if you select -t -c as options.

   OUTPUT
       In normal mode, a standard format gene-association file will be written. The GO ID column (5) will contain GO slim IDs. The mapping
       corresponds to the 2nd column in the table above. Note that the output file may contain more lines than the input file. This is because
       some full GO IDs have more than one pertinent slim ID.
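
       The effect on the association file can be sketched in Python. This is an illustration under assumptions rather than the script's real
       code: the GO IDs and gene names below are made up to mirror the table above (5 maps to 2+3, 7 maps to 4), the sample rows are shortened
       to six tab-separated columns, and dropping rows with no mapping is a choice of this sketch only.

```python
# Made-up mapping from full GO IDs to their most pertinent slim IDs.
SLIM_MAP = {
    "GO:0000005": ["GO:0000002", "GO:0000003"],
    "GO:0000007": ["GO:0000004"],
}

# Shortened, made-up association rows; the GO ID sits in column 5 (index 4).
SAMPLE_LINES = [
    "!comment line",
    "FB\tFBgn0000001\tgeneA\t\tGO:0000005\tref",
    "FB\tFBgn0000002\tgeneB\t\tGO:0000007\tref",
]


def map_assoc_lines(lines, slim_map):
    """Rewrite column 5 with slim IDs, emitting one output row per pertinent slim ID."""
    out = []
    for line in lines:
        if line.startswith("!"):
            # Comment lines are passed through unchanged.
            out.append(line)
            continue
        cols = line.rstrip("\n").split("\t")
        # Rows whose GO ID has no mapping are dropped (an assumption of this sketch).
        for slim_id in slim_map.get(cols[4], []):
            out.append("\t".join(cols[:4] + [slim_id] + cols[5:]))
    return out
```

       Calling map_assoc_lines(SAMPLE_LINES, SLIM_MAP) turns the two data rows into three, because the first row's GO ID has two pertinent
       slim IDs.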

       COUNT MODE

       map2slim can be run with the -c option, which will give the counts of distinct gene products mapped to each slim term. The columns are as
       follows:

       GO Term
	   The first column is the GO ID followed by the term name (the term name is provided as it is found in both the full GO and slim
	   ontologies - these will usually be the same but occasionally the slim file will lag behind changes in the GO file)

       Count of gene products for which this is the most relevant slim term
	   the number of distinct gene products for which this is the most pertinent/direct slim ID. By most direct we mean that either the
	   association is made directly to this term, OR the association is made to a child of this slim term AND there is no child slim term
	   which the association maps to.

	   For most slims, this count will be equivalent to the number of associations directly mapped to this slim term. However, some older slim
	   files are "spotty" in that they admit "gaps". For example, if the slim has all children of "biological process" with the exception of
	   "behavior" then all annotations to "behavior" or its children will be counted here

	   see example below

       Count of gene products inferred to be associated with slim term
	   The number of distinct gene products which are annotated to any descendant of this slim ID (or annotated directly to the slim ID).

       obsoletion flag
       GO ontology

       To take an example; if we use -t and -c like this:

	 map2slim -t -c GO_slims/goslim_generic.obo ontology/gene_ontology.obo gene-associations/gene_association.fb

       Then part of the results may look like this:

	GO:0008150 biological_process (biological_process)     34      10025	       biological_process
	 GO:0007610 behavior (behavior)        632     632	       biological_process
	 GO:0000004 biological process unknown (biological process unknown)    832     832	       biological_process
	 GO:0007154 cell communication (cell communication)    333     1701	       biological_process
	  GO:0008037 cell recognition (cell recognition)       19      19	       biological_process

       19 products were mapped to GO:0008037 or one of its children. (GO:0008037 is a leaf node in the slim, so the two counts are identical).

       On the other hand, GO:0008150 only gets 34 products for which this is the most relevant term. This is because most annotations would map to
       some child of GO:0008150 in the slim, such as GO:0007610 (behavior). These 34 gene products are either annotated directly to GO:0008150, or
       to some child of this term which is not in the slim. This can point to 'gaps' in the slim. Note that running map2slim with the -b option
       will 'plug' these gaps with artificial filler terms.
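
       The two count columns can be sketched in Python. This is a simplified illustration, not map2slim's implementation: it reuses the small
       numeric IDs from the hypothetical table in the MAPPING ALGORITHM section, and the gene names, dictionaries and function name are
       invented for the example.

```python
from collections import defaultdict

# Most pertinent slim IDs per annotated term (2nd column of the earlier table, plus slim term 1 itself).
DIRECT_MAP = {1: {1}, 5: {2, 3}, 6: {3}, 7: {4}, 9: {4}}

# All slim ancestors per annotated term (3rd column of the earlier table).
ALL_MAP = {1: {1}, 5: {1, 2, 3}, 6: {1, 3}, 7: {1, 3, 4}, 9: {1, 3, 4}}

# Made-up (gene product, annotated term) pairs.
ANNOTATIONS = [("geneA", 5), ("geneB", 7), ("geneB", 9), ("geneC", 1), ("geneD", 6)]


def count_products(annotations, direct_map, all_map):
    """Return {slim_id: (direct_count, inferred_count)} over distinct gene products."""
    direct = defaultdict(set)
    inferred = defaultdict(set)
    for product, term in annotations:
        for slim_id in direct_map[term]:
            direct[slim_id].add(product)      # most relevant slim term(s)
        for slim_id in all_map[term]:
            inferred[slim_id].add(product)    # every slim ancestor
    return {s: (len(direct[s]), len(inferred[s])) for s in inferred}
```

       Sets are used so that a gene product annotated several times under the same slim term (geneB here) is counted once. As with
       GO:0008150 in the example output, the root term 1 has a small direct count but a large inferred count.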

AUTHOR

       Chris Mungall BDGP

SEE ALSO

       http://www.godatabase.org/dev

       GO::Parser

       GO::Model::Graph

perl v5.14.2							    2010-05-12							      MAP2SLIM(1p)
