Hi All!
I have obtained the following output from the tool "pdftohtml".
So, my input is as follows:
<text top="246" left="160" width="84" height="16" font="3">Business purpose</text>
<text top="260" left="506" width="220" height="16" font="3">giving the right information and new insights... (3 Replies)
Hi All,
Does anyone have any idea how to extract each <info> tag to a separate file? I have 1000 raw files, which come in every 15 minutes (I am using bash).
I tried the script below, but it took hours to finish, which is inefficient.
perl -n -e '/^<info>/ and open FH,">file".$n++;... (2 Replies)
Hi All,
I have a large XML file of invoices. The file looks like this:
<INVOICES>
<INVOICE>
<NAME>Customer A</NAME>
<INVOICE_NO>1234</INVOICE_NO>
</INVOICE>
<INVOICE>
<NAME>Customer A</NAME>
<INVOICE_NO>2345</INVOICE_NO>
</INVOICE>
<INVOICE>
<NAME>Customer A</NAME>... (9 Replies)
Hi All,
I need your assistance with another XML tag related issue. I have an XML file as below:
<INVOICES>
<INVOICE>
<BILL>
<BILL_NO>1234</BILL_NO>
<BILL_DATE>01 JAN 2011</BILL_DATE>
</BILL>
<NAMEINFO>
<NAME>ABC</NAME>
</NAMEINFO>
</INVOICE>
<INVOICE>
<BILL>
<BILL_NO>5678</BILL_NO>... (12 Replies)
Hi,
Here is a sample xml file and expected output.
I need to extract the element/tag name (not value) and xpath (sample output.txt).
The main problem is that I have posted a simple XML file here, where I can clearly see the number of elements, but in reality I have an XML file which has over 500... (18 Replies)
Hello,
Hope you are doing fine. I have a log file which looks as follows:
Some junk text1
Date: Thu Mar 15 13:38:46 CDT 2012 DATA SENT SUCCESSFULL:
Some jun text 2
Date: Thu Mar 15 13:38:46 CDT 2012 DATA SENT SUCCESSFULL: ... (3 Replies)
Hi ,
I have a file like below:
cat input.txt
'Pattern2' => 'blahdalskdahdlahldahdlakhdlahdlkajdlkaadasdadadadadadadasda
ajlalnalndklandlaksdlkaddd'
'Pattern2' => 'aohaonalkndlanldandlandklasnldnaldnak'
............
........
.....
Here is what I am trying to do:
I want to grep for... (3 Replies)
In the awk below, which executes as is, I am trying to add a condition that will extract the text or
value after the FR= for each line of file1 compared to file2.
As is, the lines between the two files are either a match, Missing in file 1, or Missing in file2,
but I cannot add the... (1 Reply)
Hi, I have one file:
:16R::GENL
:20C::RELA//SET//ABC123456
:22F::XYZYESR
:20C::MITI//NETT/QWERTY12345
:16S::GENL
:16R::GENL
:20C::RELA//SET//XYZ23456
:22F::XYZYESR
:16S::GENL
The requirement is, if :20C::MITI// is present in any block, then replace the data of :20C::MITI// in... (8 Replies)
Discussion started by: Soumyadip Dutta
bup-margin
bup-margin(1) General Commands Manual bup-margin(1)
NAME
bup-margin - figure out your deduplication safety margin
SYNOPSIS
bup margin [options...]
DESCRIPTION
bup margin iterates through all objects in your bup repository, calculating the largest number of prefix bits shared between any two
entries. This number, n, is the length of the longest SHA-1 prefix you could use and still encounter a collision between your object ids.
For example, one system that was tested had a collection of 11 million objects (70 GB), and bup margin returned 45. That means a 46-bit
hash would be sufficient to avoid all collisions among that set of objects; each object in that repository could be uniquely identified by
its first 46 bits.
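The calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not bup's own code: the function names `shared_prefix_bits` and `longest_shared_prefix` are hypothetical, and the ids are assumed to be unique, equal-length byte strings. It relies on the fact that, once the ids are sorted, the pair sharing the longest prefix must be adjacent.

```python
def shared_prefix_bits(a, b):
    """Count the leading bits that two equal-length byte strings share."""
    bits = 0
    for x, y in zip(a, b):
        diff = x ^ y
        if diff == 0:
            bits += 8          # whole byte matches
            continue
        # The highest set bit of diff marks the first differing bit.
        bits += 8 - diff.bit_length()
        break
    return bits


def longest_shared_prefix(ids):
    """Largest number of prefix bits shared between any two ids.

    Assumes the ids are unique; sorting puts the closest pair side by side,
    so only neighbours need to be compared.
    """
    ordered = sorted(ids)
    return max(
        (shared_prefix_bits(a, b) for a, b in zip(ordered, ordered[1:])),
        default=0,
    )


if __name__ == "__main__":
    # Three one-byte "ids": 0xA0 and 0xB0 share their first 3 bits.
    sample = [bytes([0b10100000]), bytes([0b10110000]), bytes([0b00000001])]
    n = longest_shared_prefix(sample)
    print(n, "matching prefix bits;", n + 1, "bits would identify each id")
```

With a result of n, a prefix of n + 1 bits is enough to tell every object in the set apart, which is the 45 versus 46 relationship in the example above.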
The number of bits needed seems to increase by about 1 or 2 for every doubling of the number of objects. Since SHA-1 hashes have 160 bits,
that leaves 115 bits of margin. Of course, because SHA-1 hashes are essentially random, it's theoretically possible to use many more bits
with far fewer objects.
If you're paranoid about the possibility of SHA-1 collisions, you can monitor your repository by running bup margin occasionally to see if
you're getting dangerously close to 160 bits.
OPTIONS
--predict
Guess the offset into each index file where a particular object will appear, and report the maximum deviation of the correct answer
from the guess. This is potentially useful for tuning an interpolation search algorithm.
--ignore-midx
Don't use .midx files; use only .idx files. This is only really useful when used with --predict.
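To make the idea behind --predict concrete, here is a rough Python sketch of guessing an offset in a sorted list of ids and measuring the worst miss. It is an assumption about the general technique, not bup's implementation: the function name `max_guess_deviation` is hypothetical, the guess simply scales the first 8 bytes of each id to the list length, and the sample ids are generated with hashlib rather than read from index files.

```python
import hashlib


def max_guess_deviation(sorted_ids):
    """Worst-case distance between a guessed and an actual position.

    Assumes ids are roughly uniformly distributed, so an id whose leading
    bytes are a fraction f of the way through the hash space is guessed to
    sit at position f * n in the sorted list.
    """
    n = len(sorted_ids)
    worst = 0
    for actual, oid in enumerate(sorted_ids):
        fraction = int.from_bytes(oid[:8], "big") / 2**64
        guess = int(fraction * n)
        worst = max(worst, abs(guess - actual))
    return worst


if __name__ == "__main__":
    # Illustrative data only: SHA-1 digests of small integers.
    ids = sorted(hashlib.sha1(str(i).encode()).digest() for i in range(10_000))
    worst = max_guess_deviation(ids)
    print(f"{worst} of {len(ids)} ({100 * worst / len(ids):.3f}%)")
```

A small maximum deviation suggests that an interpolation search could start very close to the right entry and need only a short local scan.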
EXAMPLE
$ bup margin
Reading indexes: 100.00% (1612581/1612581), done.
40
40 matching prefix bits
1.94 bits per doubling
120 bits (61.86 doublings) remaining
4.19338e+18 times larger is possible
Everyone on earth could have 625878182 data sets
like yours, all in one repository, and we would
expect 1 object collision.
$ bup margin --predict
PackIdxList: using 1 index.
Reading indexes: 100.00% (1612581/1612581), done.
915 of 1612581 (0.057%)
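The figures printed by the first run above can be checked by hand. The sketch below reproduces them in Python, assuming that "bits per doubling" means the matching prefix bits divided by log2 of the object count. That formula is an inference from the printed numbers, not something this page states, and the variable names are illustrative.

```python
import math

objects = 1612581        # object count from the "Reading indexes" line
matching_bits = 40       # matching prefix bits reported by bup margin
sha1_bits = 160          # length of a SHA-1 hash

# Assumption: each doubling of the repository costs the same number of bits.
doublings_so_far = math.log2(objects)
bits_per_doubling = matching_bits / doublings_so_far

remaining_bits = sha1_bits - matching_bits
remaining_doublings = remaining_bits / bits_per_doubling
times_larger = 2 ** remaining_doublings

print(f"{bits_per_doubling:.2f} bits per doubling")
print(f"{remaining_bits} bits ({remaining_doublings:.2f} doublings) remaining")
print(f"{times_larger:g} times larger is possible")
```

Under that assumption the script gives about 1.94 bits per doubling and about 61.86 remaining doublings, in line with the output shown.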
SEE ALSO
bup-midx(1), bup-save(1)
BUP
Part of the bup(1) suite.
AUTHORS
Avery Pennarun <apenwarr@gmail.com>.
Bup unknown-bup-margin(1)