Match col 1 of File 1 with col 1 File 2 and create a 3rd file

Grinder::KmerCollection(3pm)				User Contributed Perl Documentation			      Grinder::KmerCollection(3pm)

NAME

       Grinder::KmerCollection - A collection of kmers from sequences

SYNOPSIS

	 my $col = Grinder::KmerCollection->new( -k    => 10,
						 -file => 'seqs.fa' );

DESCRIPTION

       Manage a collection of kmers found in various sequences. Store information about what sequence a kmer was found in and its starting
       position on the sequence.
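
The idea of a kmer collection can be illustrated without the module. The sketch below is not Grinder's implementation: index_kmers is a hypothetical helper that works on plain strings rather than Bio::Seq objects, and it assumes 1-based start positions.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Grinder): index every kmer of length $k
# found in a hashref of plain sequence strings keyed by sequence ID.
# Returns a hashref: kmer -> sequence ID -> arrayref of start positions.
# Assumption: start positions are reported 1-based.
sub index_kmers {
    my ($k, $seqs) = @_;
    my %by_kmer;
    for my $id (sort keys %$seqs) {
        my $seq = uc $seqs->{$id};
        # The range is empty when the sequence is shorter than $k
        for my $start (0 .. length($seq) - $k) {
            my $kmer = substr $seq, $start, $k;
            push @{ $by_kmer{$kmer}{$id} }, $start + 1;
        }
    }
    return \%by_kmer;
}

# Print one line per (kmer, sequence ID) pair with its start positions
my $index = index_kmers(3, { seq1 => 'ACGTACG', seq2 => 'TACG' });
for my $kmer (sort keys %$index) {
    for my $id (sort keys %{ $index->{$kmer} }) {
        print "$kmer\t$id\t@{ $index->{$kmer}{$id} }\n";
    }
}
```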

AUTHOR

       Florent Angly <florent.angly@gmail.com>

APPENDIX

       The rest of the documentation details each of the object methods. Internal methods are usually preceded with a _

   new
	Title	: new
	Usage	: my $col = Grinder::KmerCollection->new( -k => 10, -file => 'seqs.fa', -revcom => 1 );
	Function: Build a new kmer collection
	Args	: -k	    set the kmer length (default: 10 bp)
		  -revcom   count kmers before and after reverse-complementing sequences
			    (default: 0)
		  -seqs     count kmers in the provided arrayref of sequences (Bio::Seq
			    objects)
		  -ids	    if specified, index the sequences provided to -seqs using
			    the IDs in this arrayref instead of the sequences'
			    $seq->id() method
		  -file     count kmers in the provided file of sequences
		  -weights  if specified, assign the abundance of each sequence from the
			    values in this arrayref

	Returns : Grinder::KmerCollection object

   k
	Usage	: $col->k;
	Function: Get the length of the kmers
	Args	: None
	Returns : Positive integer

   weights
	Usage	: $col->weights({'seq1' => 3, 'seq10' => 0.45});
	Function: Get or set the weight of each sequence. Each sequence is given a
		  weight of 1 by default.
	Args	: hashref where the keys are sequence IDs and the values are the weight
		  of the corresponding sequence (e.g. its relative abundance)
	Returns : Grinder::KmerCollection object

   collection_by_kmer
	Usage	: $col->collection_by_kmer;
	Function: Get the collection of kmers, indexed by kmer
	Args	: None
	Returns : A hashref of hashref of arrayref:
		     hash->{kmer}->{ID of sequences with this kmer}->[starts of kmer on sequence]

   collection_by_seq
	Usage	: $col->collection_by_seq;
	Function: Get the collection of kmers, indexed by sequence ID
	Args	: None
	Returns : A hashref of hashref of arrayref:
		     hash->{ID of sequences with this kmer}->{kmer}->[starts of kmer on sequence]
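
The two nested structures described for collection_by_kmer and collection_by_seq hold the same information with the two outer keys swapped. A minimal sketch of that relationship, using a hypothetical invert_index helper and hand-written sample data (neither comes from Grinder):

```perl
use strict;
use warnings;

# Hypothetical helper: turn  kmer -> ID -> [starts]
#                      into  ID -> kmer -> [starts].
sub invert_index {
    my ($by_kmer) = @_;
    my %by_seq;
    for my $kmer (keys %$by_kmer) {
        for my $id (keys %{ $by_kmer->{$kmer} }) {
            # Copy the position list so the two structures share no arrays
            $by_seq{$id}{$kmer} = [ @{ $by_kmer->{$kmer}{$id} } ];
        }
    }
    return \%by_seq;
}

# Sample data in the by-kmer layout
my $by_kmer = {
    ACG => { seq1 => [1, 5], seq2 => [2] },
    TAC => { seq1 => [4],    seq2 => [1] },
};
my $by_seq = invert_index($by_kmer);

# Start positions of kmer ACG on seq1, looked up through the by-sequence layout
print join(',', @{ $by_seq->{seq1}{ACG} }), "\n";
```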

   add_file
	Usage	: $col->add_file('seqs.fa');
	Function: Process the kmers in the given file of sequences.
	Args	: filename
	Returns : Grinder::KmerCollection object

   add_seqs
	Usage	: $col->add_seqs([$seq1, $seq2]);
	Function: Process the kmers in the given sequences.
	Args	: * arrayref of Bio::Seq objects
		  * arrayref of IDs to use for the indexing of the sequences
	Returns : Grinder::KmerCollection object

   filter_rare
	Usage	: $col->filter_rare( 2 );
	Function: Remove kmers occurring at less than the (weighted) abundance specified
	Args	: integer
	Returns : Grinder::KmerCollection object

   filter_shared
	Usage	: $col->filter_shared( 2 );
	Function: Remove kmers occurring in fewer than the number of sequences specified
	Args	: integer
	Returns : Grinder::KmerCollection object
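
The two filters can be sketched on the kmer -> ID -> [starts] layout. The helpers drop_rare and drop_unshared below are hypothetical, and the sketch assumes that the weighted abundance of a kmer is the sum over sequences of (number of occurrences x sequence weight), with a default weight of 1. The module may compute it differently.

```perl
use strict;
use warnings;

# Hypothetical helper: delete kmers whose weighted abundance is below $min.
# Assumption: abundance = sum over sequences of occurrences * weight.
sub drop_rare {
    my ($by_kmer, $weights, $min) = @_;
    for my $kmer (keys %$by_kmer) {
        my $abundance = 0;
        for my $id (keys %{ $by_kmer->{$kmer} }) {
            my $w = exists $weights->{$id} ? $weights->{$id} : 1;
            $abundance += $w * scalar @{ $by_kmer->{$kmer}{$id} };
        }
        delete $by_kmer->{$kmer} if $abundance < $min;
    }
    return $by_kmer;
}

# Hypothetical helper: delete kmers found in fewer than $min sequences.
sub drop_unshared {
    my ($by_kmer, $min) = @_;
    for my $kmer (keys %$by_kmer) {
        delete $by_kmer->{$kmer}
            if scalar(keys %{ $by_kmer->{$kmer} }) < $min;
    }
    return $by_kmer;
}

my $index = {
    ACG => { seq1 => [1, 5], seq2 => [2] },
    CGT => { seq1 => [2] },
    TAC => { seq1 => [4], seq2 => [1] },
};
drop_rare($index, {}, 2);       # all weights default to 1
drop_unshared($index, 2);
print join(' ', sort keys %$index), "\n";
```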

   counts
	Usage	: $col->counts
	Function: Calculate the total count of each kmer. Counts are affected by the
		  weights you gave to the sequences.
	Args	: * restrict sequences to search to specified sequence ID (optional)
		  * starting position from which counting should start (optional)
		  * 0 to report counts (default), 1 to report frequencies (normalize to 1)
	Returns : * arrayref of the different kmers
		  * arrayref of the corresponding total counts
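
How weights and the frequency option could interact is sketched below. The helper kmer_counts is hypothetical and covers only the third argument of counts() (report counts or frequencies). It assumes each occurrence contributes the weight of its sequence, with a default weight of 1, and that frequencies are counts divided by their sum.

```perl
use strict;
use warnings;

# Hypothetical helper: total (weighted) count of each kmer in a
# kmer -> ID -> [starts] structure. With a true $as_freq the counts
# are normalized so that they sum to 1.
sub kmer_counts {
    my ($by_kmer, $weights, $as_freq) = @_;
    my (@kmers, @counts);
    my $total = 0;
    for my $kmer (sort keys %$by_kmer) {
        my $count = 0;
        for my $id (keys %{ $by_kmer->{$kmer} }) {
            my $w = exists $weights->{$id} ? $weights->{$id} : 1;
            $count += $w * scalar @{ $by_kmer->{$kmer}{$id} };
        }
        push @kmers,  $kmer;
        push @counts, $count;
        $total += $count;
    }
    @counts = map { $_ / $total } @counts if $as_freq && $total > 0;
    return (\@kmers, \@counts);
}

my $index = {
    ACG => { seq1 => [1, 5], seq2 => [2] },
    TAC => { seq1 => [4],    seq2 => [1] },
};
my ($kmers, $freqs) = kmer_counts($index, { seq1 => 3, seq2 => 0.5 }, 1);
printf "%s\t%.2f\n", $kmers->[$_], $freqs->[$_] for 0 .. $#$kmers;
```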

   sources
	Usage	: $col->sources()
	Function: Return the sources of a kmer and their (weighted) abundance.
	Args	: * kmer to get the sources of
		  * sources to exclude from the results (optional)
		  * 0 to report counts (default), 1 to report frequencies (normalize to 1)
	Returns : * arrayref of the different sources
		  * arrayref of the corresponding total counts
		  If the kmer requested does not exist, the arrays will be empty.

   kmers
	Usage	: $col->kmers('seq1');
	Function: This is the inverse of sources(). Return the kmers found in a sequence
		  (given its ID) and their (weighted) abundance.
	Args	: * sequence ID to get the kmers of
		  * 0 to report counts (default), 1 to report frequencies (normalize to 1)
	Returns : * arrayref of the different kmers
		  * arrayref of the corresponding total counts
		  If the sequence ID requested does not exist, the arrays will be empty.

   positions
	Usage	: $col->positions()
	Function: Return the positions of the given kmer on a given sequence. An error
		  is reported if the kmer requested does not exist
	Args	: * desired kmer
		  * desired sequence with this kmer
	Returns : Arrayref of the different positions. The array will be empty if the
		  desired combination of kmer and sequence was not found.

perl v5.14.2							    2012-01-17					      Grinder::KmerCollection(3pm)
