Assigning the same frequency to more than one words in a file Post: 302850841

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Splitting Concatenated Words in Input File with Words from a Master File

Hello, I have a complex problem. I have a file in which words have been joined together: Theboy ranslowly I want to be able to correctly split the words using a lookup file in which all the words occur: the boy ran slowly slow put child ly The lookup file which is meant for look up...

2. Shell Programming and Scripting

count frequency of words in a file

I need to write a shell script "cmn" that, given an integer k, print the k most common words in descending order of frequency. Example Usage: user@ubuntu:/$ cmn 4 < example.txt :b:

3. Shell Programming and Scripting

Splitting concatenated words in input file with words from the same file

Dear all, I am working with names and I have a large file of names in which some words are written together (upto 4 or 5) and their corresponding single forms are also present in the word-list. An example would make this clear annamarie mariechristine johnsmith johnjoseph smith john smith...

4. Shell Programming and Scripting

Script to sort large file with frequency

Hello, I have a very large file of around 2 million records which has the following structure: I have used the standard awk program to sort: # wordfreq.awk --- print list of word frequencies { # remove punctuation #gsub(/_]/, "", $0) for (i = 1; i <= NF; i++) freq++ } END { for (word...

5. Shell Programming and Scripting

Sorting a file with frequency on length

Hello, I have a file which has the following structure word space Frequency The file is around 30,000 headwords each along with its frequency. The words have different lengths. What I need is a PERL or AWK script which can sort the file on length of the headword and once the file is sorted on...

6. Shell Programming and Scripting

Creating Frequency of words from a file by accessing a corpus

Hello, I have a large file of syllables /strings in Urdu. Each word is on a separate line. Example in English: be at for if being attract I need to identify the frequency of each of these strings from a large corpus (which I cannot attach unfortunately because of size limitations) and...

7. Shell Programming and Scripting

How count the number of two words associated with the two words occurring in the file?

Hi , I need to count the number of errors associated with the two words occurring in the file. It's about counting the occurrences of the word "error" for where is the word "index.js". As such the command should look like. Please kindly help. I was trying: grep "error" log.txt | wc -l

8. UNIX for Dummies Questions & Answers

Replace the words in the file to the words that user type?

Hello, I would like to change my setting in a file to the setting that user input. For example, by default it is ONBOOT=ON When user key in "YES", it would be ONBOOT=YES -------------- This code only adds in the entire user input, but didn't replace it. How do i go about...

9. Shell Programming and Scripting

Frequency of Words in a File, sed script from 1980

tr -cs A-Za-z\' '\n' | tr A-Z a-z | sort | uniq -c | sort -k1,1nr -k2 | sed ${1:-25} < book7.txt This is not my script, it can be found way back from 1980 but once it worked fine to give me the most used words in a text file. Now the shell is complaining about an error in sed sed: -e...

10. Shell Programming and Scripting

Replace particular words in file based on if finds another words in that line

Hi All, I need one help to replace particular words in file based on if finds another words in that file . i.e. my self is peter@king. i am staying at north sydney. we all are peter@king. How to replace peter to sham if it finds @king in any line of that file. Please help me...

LEARN ABOUT DEBIAN

bio::das::segmenti

Bio::Das::SegmentI(3pm) 				User Contributed Perl Documentation				   Bio::Das::SegmentI(3pm)

NAME

       Bio::Das::SegmentI - DAS-style access to a feature database

SYNOPSIS

	 # Get a Bio::Das::SegmentI object from a Bio::DasI database...

	 $segment = $das->segment(-name=>'Landmark',
				  -start=>$start,
				  -end => $end);

	 @features = $segment->overlapping_features(-type=>['type1','type2']);
	 # each feature is a Bio::SeqFeatureI-compliant object

	 @features = $segment->contained_features(-type=>['type1','type2']);

	 @features = $segment->contained_in(-type=>['type1','type2']);

	 $stream = $segment->get_feature_stream(-type=>['type1','type2','type3']);
	 while (my $feature = $stream->next_seq) {
	    # do something with feature
	 }

	 $count = $segment->features_callback(-type=>['type1','type2','type3'],
					      -callback => sub { ... }
					      );

DESCRIPTION

       Bio::Das::SegmentI is a simplified alternative interface to sequence annotation databases used by the distributed annotation system. In
       this scheme, the genome is represented as a series of landmarks.  Each Bio::Das::SegmentI object ("segment") corresponds to a genomic
       region defined by a landmark and a start and end position relative to that landmark.  A segment is created using the Bio::DasI segment()
       method.

       Features can be filtered by the following attributes:

	 1) their location relative to the segment (whether overlapping,
		 contained within, or completely containing)

	 2) their type

	 3) other attributes using tag/value semantics

       Access to the feature list uses three distinct APIs:

	 1) fetching entire list of features at a time

	 2) fetching an iterator across features

	 3) a callback
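
       The three access styles listed above can be sketched without a live database.  In the following minimal sketch, MockSegment and
       MockIterator are hypothetical stand-ins written for illustration only; they are not part of BioPerl, and the feature names are made
       up.  The return value of the callback form (a count of visited features) is also an assumption.

```perl
use strict;
use warnings;

# MockSegment is a hypothetical stand-in, NOT part of BioPerl. It only
# imitates the three access styles: whole list, iterator, and callback.
package MockSegment;

sub new {
    my ($class, @features) = @_;
    return bless { features => [@features] }, $class;
}

sub features {
    my ($self, %args) = @_;
    my @all = @{ $self->{features} };

    # Style 2: hand back an iterator object with a next_seq() method.
    return MockIterator->new(@all) if $args{-iterator};

    # Style 3: invoke the callback on each feature; a false return
    # value aborts retrieval. Returning a count is an assumption.
    if (my $cb = $args{-callback}) {
        my $count = 0;
        for my $f (@all) {
            $count++;
            last unless $cb->($f, $self);
        }
        return $count;
    }

    # Style 1: return the whole list at once.
    return @all;
}

package MockIterator;

sub new {
    my ($class, @items) = @_;
    return bless { queue => [@items] }, $class;
}

# Returns the next feature, or undef once the list is exhausted.
sub next_seq {
    my ($self) = @_;
    return shift @{ $self->{queue} };
}

package main;

my $segment = MockSegment->new(qw(exon1 intron1 exon2));

# 1) fetch the entire list of features
my @list = $segment->features;

# 2) fetch an iterator across the features
my $iter = $segment->features(-iterator => 1);
my @from_iter;
while (defined(my $f = $iter->next_seq)) {
    push @from_iter, $f;
}

# 3) a callback that stops retrieval when it reaches 'intron1'
my @seen;
my $visited = $segment->features(
    -callback => sub { push @seen, $_[0]; return $_[0] ne 'intron1' },
);

print "list:     @list\n";
print "iterator: @from_iter\n";
print "callback: @seen ($visited visited)\n";
```

       The callback form is the only one that can stop early, which is why it suits filtering that simple type or attribute matching
       cannot express.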

FEEDBACK

   Mailing Lists
       User feedback is an integral part of the evolution of this and other Bioperl modules. Send your comments and suggestions preferably to one
       of the Bioperl mailing lists.  Your participation is much appreciated.

	 bioperl-l@bio.perl.org

   Support
       Please direct usage questions or support issues to the mailing list:

       bioperl-l@bioperl.org

       rather than to the module maintainer directly. Many experienced and responsive experts will be able to look at the problem and quickly
       address it. Please include a thorough description of the problem with code and data examples if at all possible.

   Reporting Bugs
       Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their resolution.  Bug reports can be submitted
       via the web:

	 https://redmine.open-bio.org/projects/bioperl/

AUTHOR - Lincoln Stein
       Email lstein@cshl.org

APPENDIX

       The rest of the documentation details each of the object methods. Internal methods are usually preceded with a _

   seq_id
	Title	: seq_id
	Usage	: $ref = $s->seq_id
	Function: return the ID of the landmark
	Returns : a string
	Args	: none
	Status	: Public

   display_name
	Title	: display_name
	Usage	: $name = $s->display_name
	Function: return the human-readable name for the landmark
	Returns : a string
	Args	: none
	Status	: Public

       This defaults to the same as seq_id.

   start
	Title	: start
	Usage	: $s->start
	Function: start of segment
	Returns : integer
	Args	: none
	Status	: Public

       This is a read-only accessor for the start of the segment.  Alias to low() for Gadfly compatibility.

   end
	Title	: end
	Usage	: $s->end
	Function: end of segment
	Returns : integer
	Args	: none
	Status	: Public

       This is a read-only accessor for the end of the segment. Alias to high() for Gadfly compatibility.

   length
	Title	: length
	Usage	: $s->length
	Function: length of segment
	Returns : integer
	Args	: none
	Status	: Public

       Returns the length of the segment.  Always a positive number.

   seq
	Title	: seq
	Usage	: $s->seq
	Function: get the sequence string for this segment
	Returns : a string
	Args	: none
	Status	: Public

       Returns the sequence for this segment as a simple string.

   ref
	Title	: ref
	Usage	: $ref = $s->ref([$newlandmark])
	Function: get/set the reference landmark for addressing
	Returns : a string
	Args	: new landmark (optional)
	Status	: Public

       This method is used to examine/change the reference landmark used to establish the coordinate system.  By default, the landmark cannot be
       changed and therefore this has the same effect as seq_id().  The new landmark might be an ID, or another Das::SegmentI object.

   absolute
	Title	: absolute
	Usage	: $s->absolute([$new_value])
	Function: get/set absolute addressing mode
	Returns : flag
	Args	: new flag (optional)
	Status	: Public

       Turn on and off absolute-addressing mode.  In absolute addressing mode, coordinates are relative to some underlying "top level" coordinate
       system (such as a chromosome). ref() returns the identity of the top level landmark, and start() and end() return locations relative to
       that landmark.  In relative addressing mode, coordinates are relative to the landmark sequence specified at the time of segment creation or
       later modified by the ref() method.

       The default is to return false and to do nothing in response to attempts to set absolute addressing mode.
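
       The difference between the two addressing modes comes down to a coordinate shift.  The following is a minimal sketch, assuming
       1-based coordinates and a landmark that lies on the forward strand of the top-level sequence; rel_to_abs() is a hypothetical helper,
       not a method of this interface, and the positions are illustrative only.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bio::Das::SegmentI. Assumes 1-based
# coordinates and a landmark on the forward strand of the top-level
# sequence (for example, a chromosome).
sub rel_to_abs {
    my ($landmark_abs_start, $rel_pos) = @_;
    return $landmark_abs_start + $rel_pos - 1;
}

# Illustrative numbers only: a landmark that begins at position 10001
# of its chromosome, and a segment covering its first 500 bases.
my $landmark_start = 10_001;
my ($rel_start, $rel_end) = (1, 500);

my $abs_start = rel_to_abs($landmark_start, $rel_start);
my $abs_end   = rel_to_abs($landmark_start, $rel_end);

print "relative $rel_start..$rel_end = absolute $abs_start..$abs_end\n";
```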

   features
	Title	: features
	Usage	: @features = $s->features(@args)
	Function: get features that overlap this segment
	Returns : a list of Bio::SeqFeatureI objects
	Args	: see below
	Status	: Public

       This method will find all features that intersect the segment in a variety of ways and return a list of Bio::SeqFeatureI objects.  The
       feature locations will use coordinates relative to the reference sequence in effect at the time that features() was called.

       The returned list can be limited to certain types, attributes or range intersection modes.  Types of range intersection are one of:

	  "overlaps"	  the default
	  "contains"	  return features completely contained within the segment
	  "contained_in"  return features that completely contain the segment

       Two types of argument lists are accepted.  In the positional argument form, the arguments are treated as a list of feature types.  In the
       named parameter form, the arguments are a series of -name=>value pairs.

	 Argument    Description
	 --------   ------------

	 -types      An array reference to type names in the format
		     "method:source"

	 -attributes A hashref containing a set of attributes to match

	 -rangetype  One of "overlaps", "contains", or "contained_in".

	 -iterator   Return an iterator across the features.

	 -callback   A callback to invoke on each feature

       The -attributes argument is a hashref containing one or more attributes to match against:

	 -attributes => { Gene => 'abc-1',
			  Note => 'confirmed' }

       Attribute matching is simple string matching, and multiple attributes are ANDed together.  More complex filtering can be performed using
       the -callback option (see below).
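
       The AND semantics of attribute matching can be sketched as follows.  attributes_match() is a hypothetical helper, not part of
       BioPerl, and the sketch assumes each feature carries at most one string value per tag.

```perl
use strict;
use warnings;

# Hypothetical helper: true only if every requested attribute is present
# on the feature with exactly the same string value (logical AND).
sub attributes_match {
    my ($feature_attrs, $wanted) = @_;
    for my $tag (keys %$wanted) {
        return 0 unless defined($feature_attrs->{$tag})
                     && $feature_attrs->{$tag} eq $wanted->{$tag};
    }
    return 1;
}

my $wanted = { Gene => 'abc-1', Note => 'confirmed' };

my $both_match = attributes_match({ Gene => 'abc-1', Note => 'confirmed' }, $wanted);
my $one_match  = attributes_match({ Gene => 'abc-1', Note => 'predicted' }, $wanted);
my $missing    = attributes_match({ Gene => 'abc-1' },                      $wanted);

print "both tags match:   ", ($both_match ? "kept" : "dropped"), "\n";
print "only one matches:  ", ($one_match  ? "kept" : "dropped"), "\n";
print "one tag missing:   ", ($missing    ? "kept" : "dropped"), "\n";
```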

       If -iterator is true, then the method returns an object reference that implements the next_seq() method.  Each call to next_seq() returns a
       new Bio::SeqFeatureI object.

       If -callback is passed a code reference, the code reference will be invoked on each feature returned.  The code will be passed two
       arguments consisting of the current feature and the segment object itself, and must return a true value. If the code returns a false value,
       feature retrieval will be aborted.

       -callback and -iterator are mutually exclusive options.  If -iterator is defined, then -callback is ignored.

       NOTE: the following methods all build on top of features(), and do not need to be explicitly implemented.

	   overlapping_features()
	   contained_features()
	   contained_in()
	   get_feature_stream()

   overlapping_features
	Title	: overlapping_features
	Usage	: @features = $s->overlapping_features(@args)
	Function: get features that overlap this segment
	Returns : a list of Bio::SeqFeatureI objects
	Args	: see below
	Status	: Public

       This method is identical to features() except that it defaults to finding overlapping features.

   contained_features
	Title	: contained_features
	Usage	: @features = $s->contained_features(@args)
	Function: get features that are contained in this segment
	Returns : a list of Bio::SeqFeatureI objects
	Args	: see below
	Status	: Public

       This method is identical to features() except that it defaults to a range type of 'contains'.

   contained_in
	Title	: contained_in
	Usage	: @features = $s->contained_in(@args)
	Function: get features that contain this segment
	Returns : a list of Bio::SeqFeatureI objects
	Args	: see below
	Status	: Public

       This method is identical to features() except that it defaults to a range type of 'contained_in'.
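
       The three range types behind overlapping_features(), contained_features(), and contained_in() reduce to simple interval
       comparisons.  The following is a minimal sketch, assuming inclusive integer coordinates; range_matches() is a hypothetical helper,
       not a BioPerl function, and the coordinates are illustrative only.

```perl
use strict;
use warnings;

# Hypothetical helper, not a BioPerl function: decide whether a feature
# spanning [$fs, $fe] matches a segment spanning [$ss, $se] under the
# given range type. Coordinates are assumed to be inclusive integers.
sub range_matches {
    my ($rangetype, $ss, $se, $fs, $fe) = @_;
    # overlaps: the two intervals share at least one position
    return ($fs <= $se && $fe >= $ss) ? 1 : 0 if $rangetype eq 'overlaps';
    # contains: the feature lies completely within the segment
    return ($fs >= $ss && $fe <= $se) ? 1 : 0 if $rangetype eq 'contains';
    # contained_in: the feature completely contains the segment
    return ($fs <= $ss && $fe >= $se) ? 1 : 0 if $rangetype eq 'contained_in';
    die "unknown range type: $rangetype";
}

# A segment covering 100..200 tested against three made-up features.
my @segment  = (100, 200);
my %features = (
    partial => [150, 250],   # hangs off the right end of the segment
    inner   => [120, 180],   # lies completely inside the segment
    outer   => [50, 300],    # completely surrounds the segment
);

for my $name (sort keys %features) {
    for my $type (qw(overlaps contains contained_in)) {
        my $hit = range_matches($type, @segment, @{ $features{$name} });
        printf "%-8s %-13s %s\n", $name, $type, $hit ? "yes" : "no";
    }
}
```

       Every feature that satisfies 'contains' or 'contained_in' also satisfies 'overlaps', so 'overlaps' is the broadest of the three.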

   get_feature_stream
	Title	: get_feature_stream
	Usage	: $iterator = $s->get_feature_stream(@args)
	Function: get an iterator across the segment
	Returns : an object that implements next_seq()
	Args	: see below
	Status	: Public

       This method is identical to features() except that it always generates an iterator.

       NOTE: This is defined in the interface in terms of features().  You do not have to implement it.

   factory
	Title	: factory
	Usage	: $factory = $s->factory
	Function: return the segment factory
	Returns : a Bio::DasI object
	Args	: none
	Status	: Public

       This method returns a Bio::DasI object that can be used to fetch more segments.  This is typically the Bio::DasI object from which the
       segment was originally generated.

   primary_tag
	Title	: primary_tag
	Usage	: $tag = $s->primary_tag
	Function: identifies the segment as type "DasSegment"
	Returns : a string named "DasSegment"
	Args	: none
	Status	: Public, but see below

       This method provides Bio::Das::Segment objects with a primary_tag() field that identifies them as being of type "DasSegment".  This allows
       the Bio::Graphics engine to render segments in the same way as features.

       This does not need to be implemented.  It is defined by the interface.

   strand
	Title	: strand
	Usage	: $strand = $s->strand
	Function: identifies the segment strand as 0
	Returns : the number 0
	Args	: none
	Status	: Public, but see below

       This method provides Bio::Das::Segment objects with a strand() field that identifies them as being strandless.  This allows the
       Bio::Graphics engine to render segments in the same way as features.

       This does not need to be implemented.  It is defined by the interface.

perl v5.14.2							    2012-03-02						   Bio::Das::SegmentI(3pm)
