Post 302959119 by Syeda Sumayya on Thursday 29th of October 2015 03:06:14 AM
Print the 1st column and the value in the 2nd or 3rd column if it differs from the 1st

I have a file that looks like this:

Code:
DIP-17571N|refseq:NP_651151   DIP-17460N|refseq:NP_511165|uniprotkb:P45890      DIP-17571N|refseq:NP_651151
DIP-19241N|refseq:NP_524261    DIP-19241N|refseq:NP_524261       DIP-17151N|refseq:NP_524316|uniprotkb:O16797
DIP-19588N|refseq:NP_731165     DIP-19588N|refseq:NP_731165       DIP-19589N|refseq:NP_647684
DIP-20632N|refseq:NP_476602     DIP-492N|refseq:NP_477499|uniprotkb:P23647        DIP-20632N|refseq:NP_476602
DIP-23436N|refseq:NP_536784     DIP-23436N|refseq:NP_536784       DIP-23130N|refseq:NP_652017
DIP-18269N|refseq:NP_523724     DIP-20786N|refseq:NP_649297       DIP-18269N|refseq:NP_523724
DIP-20861N|refseq:NP_647634    DIP-20861N|refseq:NP_647634       DIP-19344N|refseq:NP_572751
DIP-23837N|refseq:NP_573057   DIP-23837N|refseq:NP_573057       DIP-5N|refseq:NP_476859|uniprotkb:P07207
DIP-59926N|refseq:NP_228099     DIP-59926N|refseq:NP_228099       DIP-59927N|refseq:NP_228100
DIP-23655N|refseq:NP_648922    DIP-17971N|refseq:NP_648929       DIP-23655N|refseq:NP_648922
DIP-22713N|refseq:NP_524108    DIP-21138N|refseq:NP_722721       DIP-22713N|refseq:NP_524108
DIP-21320N|refseq:NP_730973     DIP-17533N|refseq:NP_611700       DIP-21320N|refseq:NP_730973
DIP-22051N|refseq:NP_573109     DIP-28047N        DIP-22051N|refseq:NP_573109

For each row, I want to print the 1st column and, next to it, whichever of the 2nd or 3rd column differs from the 1st.

This is how I want the output to look:
Code:
DIP-17571N|refseq:NP_651151   DIP-17460N|refseq:NP_511165|uniprotkb:P45890
DIP-19241N|refseq:NP_524261   DIP-17151N|refseq:NP_524316|uniprotkb:O16797
DIP-19588N|refseq:NP_731165    DIP-19589N|refseq:NP_647684
DIP-20632N|refseq:NP_476602    DIP-492N|refseq:NP_477499|uniprotkb:P23647 
DIP-23436N|refseq:NP_536784    DIP-23130N|refseq:NP_652017
DIP-18269N|refseq:NP_523724    DIP-20786N|refseq:NP_649297

and so on...

Any help would be highly appreciated.
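
One possible approach, as a minimal Python sketch rather than a definitive answer: split each row on whitespace, then keep the 1st column together with whichever of the 2nd or 3rd column differs from it. The function names `pick_different` and `filter_lines` are made up for illustration. Two assumptions are not settled by the sample data: if both the 2nd and 3rd columns differ from the 1st, the 2nd is printed, and rows where all three columns are identical are skipped.

```python
def pick_different(line):
    """Return (first, other) for one row, or None if nothing differs.

    'other' is the 2nd column if it differs from the 1st,
    otherwise the 3rd column.
    """
    fields = line.split()  # split on any run of spaces or tabs
    if len(fields) < 3:
        return None  # assumption: ignore blank or short rows
    first, second, third = fields[:3]
    # Assumption: prefer column 2 when it differs, else fall back to column 3.
    other = second if second != first else third
    if other == first:
        return None  # all three columns are the same
    return first, other


def filter_lines(lines):
    """Yield 'first<TAB>other' for every row that has a differing column."""
    for line in lines:
        pair = pick_different(line)
        if pair is not None:
            yield "\t".join(pair)


# Three rows copied from the input shown in the post.
sample = [
    "DIP-17571N|refseq:NP_651151   DIP-17460N|refseq:NP_511165|uniprotkb:P45890      DIP-17571N|refseq:NP_651151",
    "DIP-19241N|refseq:NP_524261    DIP-19241N|refseq:NP_524261       DIP-17151N|refseq:NP_524316|uniprotkb:O16797",
    "DIP-22051N|refseq:NP_573109     DIP-28047N        DIP-22051N|refseq:NP_573109",
]

for row in filter_lines(sample):
    print(row)
```

Because `filter_lines` accepts any iterable of lines, an open file handle can be passed to it in place of the `sample` list to process the whole file.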
 
