[ask]filtering file to indexing...

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Split a file using 2-D indexing system

I have a file and want to split it using a 2-D index system for example if the file is p.dat with 6 data sets separated by ">". I want to set nx=3, ny=2. I need to create files p.dat.1.1 p.dat.1.2 p.dat.1.3 p.dat.2.1 p.dat.2.2 p.dat.2.3 I have tried using a single index and want...

2. Shell Programming and Scripting

Filtering first file columns based on second file column

Hi friends, I have one file like below. (.csv type) SNo,data1,data2 1,1,2 2,2,3 3,3,2 and another file like below. Exclude data1 where Exclude should be treated as column name in file2. I want the output shown below. SNo,data2 1,2 2,3 3,2 Where my data1 column got removed from...

3. UNIX for Dummies Questions & Answers

Filtering records from 1 file based on some manipulation doen on second file

Hi, I am looking for an awk script which should help me to meet the following requirement: File1 has records in following format INF: FAILEd RECORD AB1234 INF: FAILEd RECORD PQ1145 INF: FAILEd RECORD AB3215 INF: FAILEd RECORD AB6114 ............................

4. Shell Programming and Scripting

indexing a file

hello guys, I have a file like this: input.dat Push-to-talk No Coonection IP support Support for IP telephony Yes Built-in SIP stack Yes Support via software Yes Microsoft Support for Microsoft Exchange Yes UMA

5. Shell Programming and Scripting

indexing list of words in a file

Hey all, I'm doing a project currently and want to index words in a webpage. So there would be a file with webpage content and a file with list of words, I want an output file with true and false that would show which word exists in the webpage. example: Webpage content data.html ...

6. Shell Programming and Scripting

filtering the rows in a file

hi all, please help on this isssue, i have a file which contains something like this and i want to seprate the servers which has vasd.pid ,i need only server names. i want output something like this which vasd.pid . server1 server3 server4

7. UNIX for Dummies Questions & Answers

Filtering a file

I have a list of directories looking something like; /usr/local/1/in /usr/local/1/out /usr/local/1/archive /usr/local/2/in /usr/local/2/out /usr/local/2/archive /usr/local/3/in /usr/local/3/out /usr/local/3/archive Is there a way I can filter the out and archive directories so I...

8. UNIX for Dummies Questions & Answers

Filtering Log file

Hi, Iam trying to filter a log file in the below format |fffff|hhhhh|ffff|dd|mm|yy|hh|min||dd|mm|yy|hh|min the first set of |dd|mm|yy|hh|min is when the application ran the second set of |dd|mm|yy|hh|min when it ended. I will be removing the last of the months in the log file to...

9. Shell Programming and Scripting

Indexing or Filtering code- Pattern Search by comparing two files

So here is goes to the Gurus of shell programming......I have tried a lot of different ways and its a very challenging code to write but i am enjoying it as i troubleshoot and hopefully someone can provide me a better option....Thank you in advance for your time and support....Much appreciated... ...

10. UNIX for Dummies Questions & Answers

problem in filtering the file

-------------------------------------------------------------------------------- Hi, Plz help me out with this. I have some requirement like this..... I have a file like this... * CS sent email (11.20) CALYPSO 1031276 9076673 CDSHY FAILED Nov 19 2007 7:28AM OASYS: Unable to find CUSTOMER...

LEARN ABOUT DEBIAN

bup-margin

bup-margin(1)						      General Commands Manual						     bup-margin(1)

NAME

       bup-margin - figure out your deduplication safety margin

SYNOPSIS

       bup margin [options...]

DESCRIPTION

       bup margin  iterates  through  all  objects  in	your  bup repository, calculating the largest number of prefix bits shared between any two
       entries.  This number, n, identifies the longest subset of SHA-1 you could use and still encounter a collision between your object ids.

       For example, one system that was tested had a collection of 11 million objects (70 GB), and bup margin returned 45.  That  means  a  46-bit
       hash  would be sufficient to avoid all collisions among that set of objects; each object in that repository could be uniquely identified by
       its first 46 bits.

       The number of bits needed seems to increase by about 1 or 2 for every doubling of the number of objects.  Since SHA-1 hashes have 160 bits,
       that  leaves 115 bits of margin.  Of course, because SHA-1 hashes are essentially random, it's theoretically possible to use many more bits
       with far fewer objects.

       If you're paranoid about the possibility of SHA-1 collisions, you can monitor your repository by running bup margin occasionally to see	if
       you're getting dangerously close to 160 bits.

OPTIONS

       --predict
	      Guess  the offset into each index file where a particular object will appear, and report the maximum deviation of the correct answer
	      from the guess.  This is potentially useful for tuning an interpolation search algorithm.

       --ignore-midx
	      don't use .midx files, use only .idx files.  This is only really useful when used with --predict.

EXAMPLE

	      $ bup margin
	      Reading indexes: 100.00% (1612581/1612581), done.
	      40
	      40 matching prefix bits
	      1.94 bits per doubling
	      120 bits (61.86 doublings) remaining
	      4.19338e+18 times larger is possible

	      Everyone on earth could have 625878182 data sets
	      like yours, all in one repository, and we would
	      expect 1 object collision.

	      $ bup margin --predict
	      PackIdxList: using 1 index.
	      Reading indexes: 100.00% (1612581/1612581), done.
	      915 of 1612581 (0.057%)

SEE ALSO

       bup-midx(1), bup-save(1)

BUP

       Part of the bup(1) suite.

AUTHORS

       Avery Pennarun <apenwarr@gmail.com>.

Bup unknown-															     bup-margin(1)

Shell Programming and Scripting