Differential substring removal using coordinates (Shell Programming and Scripting)
Post 302584223 by ahamed101 on Thursday 22nd of December 2011 11:57:56 AM
Try this...
Code:
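awk 'NR==FNR { split($2, c, ","); split($3, b, ","); a[$1] = c[2] " " b[1]; next }   # file2: per ID, keep 2nd value of $2 and 1st value of $3
$1 in a { split(a[$1], d, " "); print substr($2, d[1]+1, d[2]-d[1]-1) }             # file1: print the text strictly between those two positions
' file2 file1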

On Solaris, use nawk instead of awk!

--ahamed
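
To see what the one-liner does, here is a small run on made-up data. The file layout is an assumption inferred from the code itself, not taken from the original question: file2 holds an ID followed by two comma-separated coordinate pairs, and file1 holds an ID followed by the string. The IDs, strings, and coordinates are all hypothetical.

```shell
# Hypothetical sample data; the layout is inferred from the awk code above.
tmp=$(mktemp -d)
printf 'id1 ABCDEFGHIJKLMNOP\nid2 QRSTUVWXYZ\n' > "$tmp/file1"   # ID, string
printf 'id1 1,4 9,12\n' > "$tmp/file2"                            # ID, pair 1, pair 2

# Same logic as the posted command: for each ID found in file2, print the part
# of the string that lies strictly between position 4 (c[2]) and position 9 (b[1]).
result=$(awk 'NR==FNR { split($2, c, ","); split($3, b, ","); a[$1] = c[2] " " b[1]; next }
$1 in a { split(a[$1], d, " "); print substr($2, d[1]+1, d[2]-d[1]-1) }
' "$tmp/file2" "$tmp/file1")

echo "$result"   # prints EFGH
rm -rf "$tmp"
```

Here substr starts at position 5 and takes 9-4-1 = 4 characters, which gives EFGH. id2 has no entry in file2, so nothing is printed for it.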
 

GLAM2-PURGE(1)							   glam2 Manual 						    GLAM2-PURGE(1)

NAME
glam2-purge - Removes redundant sequences from a FASTA file

SYNOPSIS
glam2-purge file score [options]

DESCRIPTION
glam2-purge is a modified version of Andrew Neuwald's purge program that removes redundant sequences from a FASTA file. This is recommended in order to prevent highly similar sequences distorting the search for motifs. Purge works with either DNA or protein sequences and creates an output file such that no two sequences have a (gapless) local alignment score greater than a threshold specified by the user. The output file is named <file>.<score>. The alignment score is based on the BLOSUM62 matrix for proteins, and on a +5/-1 scoring scheme for DNA. Purge can also be used to mask tandem repeats. It uses the XNU program for this purpose.

OPTIONS
-n   Sequences are DNA (default: protein).
-b   Use blast heuristic method (default for protein).
-e   Use an exhaustive method (default for DNA).
-q   Keep first sequence in the set.
-x   Use xnu to mask protein tandem repeats.

SEE ALSO
glam2(1), glam2format(1), glam2mask(1), glam2scan(1), xnu(1)

The full Hypertext documentation of GLAM2 is available online at http://bioinformatics.org.au/glam2/ or on this computer in /usr/share/doc/glam2/.

REFERENCES
Purge was written by Andy Neuwald and is described in more detail in Neuwald et al., "Gibbs motif sampling: detection of bacterial outer membrane protein repeats", Protein Science, 4:1618-1632, 1995. Please cite it if you use Purge.

If you use GLAM2, please cite: MC Frith, NFW Saunders, B Kobe, TL Bailey (2008) Discovering sequence motifs with arbitrary insertions and deletions, PLoS Computational Biology (in press).

AUTHORS
Andrew Neuwald
    Author of purge, renamed glam2-purge in Debian.
Martin Frith
    Modified purge to be ANSI standard C and improved the user interface.
Timothy Bailey
    Modified purge to be ANSI standard C and improved the user interface.
Charles Plessy <plessy@debian.org>
    Formatted this manpage in DocBook XML for the Debian distribution.

COPYRIGHT
The source code and the documentation of Purge and GLAM2 are released in the public domain.

GLAM2 1056							   05/19/2008							    GLAM2-PURGE(1)