Sponsored Content
Top Forums UNIX for Dummies Questions & Answers Reporting characters after string Post 302970780 by Scrutinizer on Monday 11th of April 2016 09:35:51 PM
Old 04-11-2016
OK, that should be:
Code:
rec=substr(rec, RSTART+RLENGTH+len)

could you try again ?

Also since you are using GNU awk, you could try replacing the function with
Code:
  function reverse_complement(s,        t,i,n,F) {
    n=split(s,F,"")
    for(i=1;i<=n;i++)
      t=C[F[i]] t
    return t
  }

which should also work for mawk..

A mawk package is available for Bio-Linux you should be able to just install it, if it is not installed on your system by default. Could you try and test with that and see if the results are correct and what the performance results are? The difference can be extraordinary in certain cases..

Last edited by Scrutinizer; 04-12-2016 at 12:23 AM..
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Removing characters from a string

I need help to strip out the first two characters of the variable $FileName. Please help. FileName=`find . -mtime +0 -name '*'` Contents of variable $FileName: ./SRIZVI4.MCR_IDEAS_REPORT.LAST.052705.075405.csv I want to strip out "./" and place the contents in another variable. How do I... (3 Replies)
Discussion started by: mh53j_fe
3 Replies

2. Shell Programming and Scripting

Looking for a string in files and reporting matches

Can someone please help me figure out what the command syntax I need to use is? Here is what I am wanting to do. I have hundreds of thousands of files I need to look for a specific search string in. These files are spread across multiple subdirectories from one main directory. I would like... (4 Replies)
Discussion started by: btrotter
4 Replies

3. Shell Programming and Scripting

Add string after another string with special characters

Hello everyone, I'm writing a script to add a string to an XML file, right after a specified string that only occurs once in the file. For testing purposes I created a file 'testfile' that looks like this: 1 2 3 4 5 6 6 7 8 9 And this is the script as far as I've managed: ... (2 Replies)
Discussion started by: heliode
2 Replies

4. Programming

string with invalid characters

This is a pretty straight-forward question. Within a program of mine, I have a string that's going to be used as a filename, but it might have some invalid characters in it that wouldn't be valid in a filename. If there are any invalid characters, I want to get rid of them and essentially squeeze... (4 Replies)
Discussion started by: cleopard
4 Replies

5. Shell Programming and Scripting

get certain characters in a string

Hi Everyone, I have a.txt 12341" <sip:191@vo.my>;asdf=q" 116aaaa<sip:00091@vo.my>;penguin would like to get the output 191 00091 Please advice. Thanks (4 Replies)
Discussion started by: jimmy_y
4 Replies

6. UNIX for Dummies Questions & Answers

Count the characters in a string

Hi all, I like to know how to get the count of each character in a given word. Using the commands i can easily get the output. How do it without using the commands ( in shell programming or any programming) if you give outline of the program ( pseudo code ) i used the following commands ... (3 Replies)
Discussion started by: itkamaraj
3 Replies

7. Programming

C++ Special Characters in a String?

Hello. How can i put all of the special characters on my keyboard into a string in c++ ? I tried this but it doesn't work. string characters("~`!@#$%^&*()_-+=|\}]{ How can i accomplish this? Thanks in advance. (1 Reply)
Discussion started by: cbreiny
1 Replies

8. Shell Programming and Scripting

remove characters from string based on occurrence of a string

Hello Folks.. I need your help .. here the example of my problem..i know its easy..i don't all the commands in unix to do this especiallly sed...here my string.. dwc2_dfg_ajja_dfhhj_vw_dec2_dfgh_dwq desired output is.. dwc2_dfg_ajja_dfhhj it's a simple task with tail... (5 Replies)
Discussion started by: victor369
5 Replies

9. UNIX for Beginners Questions & Answers

Extract characters from a string name

Hi All, I am trying to extract only characters from a string value eg: abcdedg1234.cnf How can I extract only characters abcdedg and assign to a variable. Please help. Thanks (2 Replies)
Discussion started by: abhi_123
2 Replies

10. Shell Programming and Scripting

Outputting characters after a given string and reporting the characters in the row below --sed

I have this fastq file: @M04961:22:000000000-B5VGJ:1:1101:9280:7106 1:N:0:86 GGGGGGGGGGGGCATGAAAACATACAAACCGTCTTTCCAGAAATTGTTCCAAGTATCGGCAACAGCTTTATCAATACCATGAAAAATATCAACCACACCA +test-1 GGGGGGGGGGGGGGGGGCCGGGGGFF,EDFFGEDFG,@DGGCGGEGGG7DCGGGF68CGFFFGGGG@CGDGFFDFEFEFF:30CGAFFDFEFF8CAF;;8... (10 Replies)
Discussion started by: Xterra
10 Replies
bup-margin(1)						      General Commands Manual						     bup-margin(1)

NAME
bup-margin - figure out your deduplication safety margin SYNOPSIS
bup margin [options...] DESCRIPTION
bup margin iterates through all objects in your bup repository, calculating the largest number of prefix bits shared between any two entries. This number, n, identifies the longest subset of SHA-1 you could use and still encounter a collision between your object ids. For example, one system that was tested had a collection of 11 million objects (70 GB), and bup margin returned 45. That means a 46-bit hash would be sufficient to avoid all collisions among that set of objects; each object in that repository could be uniquely identified by its first 46 bits. The number of bits needed seems to increase by about 1 or 2 for every doubling of the number of objects. Since SHA-1 hashes have 160 bits, that leaves 115 bits of margin. Of course, because SHA-1 hashes are essentially random, it's theoretically possible to use many more bits with far fewer objects. If you're paranoid about the possibility of SHA-1 collisions, you can monitor your repository by running bup margin occasionally to see if you're getting dangerously close to 160 bits. OPTIONS
--predict Guess the offset into each index file where a particular object will appear, and report the maximum deviation of the correct answer from the guess. This is potentially useful for tuning an interpolation search algorithm. --ignore-midx don't use .midx files, use only .idx files. This is only really useful when used with --predict. EXAMPLE
$ bup margin Reading indexes: 100.00% (1612581/1612581), done. 40 40 matching prefix bits 1.94 bits per doubling 120 bits (61.86 doublings) remaining 4.19338e+18 times larger is possible Everyone on earth could have 625878182 data sets like yours, all in one repository, and we would expect 1 object collision. $ bup margin --predict PackIdxList: using 1 index. Reading indexes: 100.00% (1612581/1612581), done. 915 of 1612581 (0.057%) SEE ALSO
bup-midx(1), bup-save(1) BUP
Part of the bup(1) suite. AUTHORS
Avery Pennarun <apenwarr@gmail.com>. Bup unknown- bup-margin(1)
All times are GMT -4. The time now is 02:14 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy