Print the 1st column and the value in 2nd or 3rd column if that is different from the values in 1st
Post 302959119 by Syeda Sumayya on Thursday 29th of October 2015 03:06:14 AM

I have a file that looks like this:

Code:
DIP-17571N|refseq:NP_651151   DIP-17460N|refseq:NP_511165|uniprotkb:P45890      DIP-17571N|refseq:NP_651151
DIP-19241N|refseq:NP_524261    DIP-19241N|refseq:NP_524261       DIP-17151N|refseq:NP_524316|uniprotkb:O16797
DIP-19588N|refseq:NP_731165     DIP-19588N|refseq:NP_731165       DIP-19589N|refseq:NP_647684
DIP-20632N|refseq:NP_476602     DIP-492N|refseq:NP_477499|uniprotkb:P23647        DIP-20632N|refseq:NP_476602
DIP-23436N|refseq:NP_536784     DIP-23436N|refseq:NP_536784       DIP-23130N|refseq:NP_652017
DIP-18269N|refseq:NP_523724     DIP-20786N|refseq:NP_649297       DIP-18269N|refseq:NP_523724
DIP-20861N|refseq:NP_647634    DIP-20861N|refseq:NP_647634       DIP-19344N|refseq:NP_572751
DIP-23837N|refseq:NP_573057   DIP-23837N|refseq:NP_573057       DIP-5N|refseq:NP_476859|uniprotkb:P07207
DIP-59926N|refseq:NP_228099     DIP-59926N|refseq:NP_228099       DIP-59927N|refseq:NP_228100
DIP-23655N|refseq:NP_648922    DIP-17971N|refseq:NP_648929       DIP-23655N|refseq:NP_648922
DIP-22713N|refseq:NP_524108    DIP-21138N|refseq:NP_722721       DIP-22713N|refseq:NP_524108
DIP-21320N|refseq:NP_730973     DIP-17533N|refseq:NP_611700       DIP-21320N|refseq:NP_730973
DIP-22051N|refseq:NP_573109     DIP-28047N        DIP-22051N|refseq:NP_573109

For each line, I want to print the 1st column and, side by side with it, whichever of the 2nd or 3rd column holds a value different from the 1st column.

This is how I want the output to look:
Code:
DIP-17571N|refseq:NP_651151   DIP-17460N|refseq:NP_511165|uniprotkb:P45890
DIP-19241N|refseq:NP_524261   DIP-17151N|refseq:NP_524316|uniprotkb:O16797
DIP-19588N|refseq:NP_731165    DIP-19589N|refseq:NP_647684
DIP-20632N|refseq:NP_476602    DIP-492N|refseq:NP_477499|uniprotkb:P23647 
DIP-23436N|refseq:NP_536784    DIP-23130N|refseq:NP_652017
DIP-18269N|refseq:NP_523724    DIP-20786N|refseq:NP_649297

and so on...

Any help would be highly appreciated.
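
One possible approach, sketched in Python rather than taken from the thread: split each line on whitespace, keep the 1st column, and append whichever of the 2nd and 3rd columns is not identical to it. The function name `first_and_differing`, the tab-separated output, and the choice to skip lines where nothing differs or fewer than three columns are present are all assumptions made for illustration.

```python
def first_and_differing(line):
    """Return column 1 plus any of columns 2-3 that differ from it.

    Assumptions (not stated in the post): columns are separated by
    runs of whitespace, output is tab-separated, and a line is
    skipped (None is returned) when it has fewer than three columns
    or when neither column 2 nor column 3 differs from column 1.
    """
    fields = line.split()
    if len(fields) < 3:
        return None
    first = fields[0]
    # Keep only the values in columns 2 and 3 that are not equal to column 1.
    others = [value for value in fields[1:3] if value != first]
    if not others:
        return None
    return "\t".join([first] + others)


# A few lines copied from the sample input above.
sample = """\
DIP-17571N|refseq:NP_651151   DIP-17460N|refseq:NP_511165|uniprotkb:P45890      DIP-17571N|refseq:NP_651151
DIP-19241N|refseq:NP_524261    DIP-19241N|refseq:NP_524261       DIP-17151N|refseq:NP_524316|uniprotkb:O16797
DIP-22051N|refseq:NP_573109     DIP-28047N        DIP-22051N|refseq:NP_573109
"""

for row in sample.splitlines():
    result = first_and_differing(row)
    if result is not None:
        print(result)
```

If both the 2nd and 3rd columns differ from the 1st, this sketch prints both of them; the sample data does not show that case, so the behaviour there is a guess at what would be wanted.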
 
