Somebody HELP!
I have a huge log file (text), 76298035 bytes.
It's a log file of IMEIs and IMSIs that I get from my EIR node.
Here is what the contents of the file look like:
000000,
1 33016382000913 652020100423994
1 33016382002353 652020100430743
1 33017035101003 652020100441736... (4 Replies)
hi friends, :)
If I need to find files with the extensions .c++, .C++, .cpp, .Cpp, .CPp, .cPP, .CpP, .cpP, .c, .C,
what is the pattern for finding them?
:confused: (2 Replies)
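For a case-insensitive match on those extensions, a minimal sketch, assuming your find supports -iname (GNU and BSD find both do); the starting directory . is a placeholder:

find . -type f \( -iname '*.c' -o -iname '*.cpp' -o -iname '*.c++' \)

-iname matches the name case-insensitively, so '*.cpp' already covers .Cpp, .CPp, .cPP and the other mixed-case variants.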
Good day, great gurus,
I'm new to Perl, and to programming in general. I'm trying to retrieve a column of data from my text file, which spans a non-specific number of lines. So I did a regexp that will pick out the columns. However, my pattern would vary. I tried using a foreach loop unsuccessfully.... (2 Replies)
>testfile
while read x
do
    # assumed test (the original condition was missing): the line mentions abc.dml
    if [[ "$x" == *abc.dml* ]]; then
        echo "$x" >> testfile
    fi
    # assumed test: the line mentions xyz.dml
    if [[ "$x" == *xyz.dml* ]]; then
        echo "$x" >> testfile
    fi
done < list_of_files
Is there any efficient way to search for abc.dml and xyz.dml? (2 Replies)
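If the goal is simply to keep every line of list_of_files that mentions abc.dml or xyz.dml, a single grep avoids the loop entirely (a sketch under that assumption):

grep -E 'abc\.dml|xyz\.dml' list_of_files > testfile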
hello
I have two files:
temp.txt
and temp_unique.text
The second file consists of the unique fields from the temp.txt file.
The strings stored are in the following form:
4,4
17,12
15,65
4,4
14,41
15,65
65,89
1254,1298
I'm able to run the following script to get the total count of a... (3 Replies)
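A minimal sketch for the counting part, assuming the goal is to report how many times each pair listed in temp_unique.text occurs in temp.txt (-F treats the pair as a fixed string, -x matches whole lines only):

while read pair
do
    # count full-line occurrences of this pair in temp.txt
    count=$(grep -c -F -x "$pair" temp.txt)
    echo "$pair $count"
done < temp_unique.text

If only per-line counts are needed, sort temp.txt | uniq -c produces the same information in a single pass.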
Hello Linux Masters,
I am not a Linux expert, therefore I need help from the Linux gurus.
Well, I have a requirement where I need to search all files based on a first pattern, and after searching all the files, then search for a second pattern in only those files which I have extracted based on the first pattern.... (1 Reply)
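A minimal sketch of that two-stage search, assuming GNU grep; 'pattern1', 'pattern2' and /path/to/files are placeholders. The first grep lists the files that contain the first pattern, and the second grep is then run only on those files:

grep -rl 'pattern1' /path/to/files | xargs grep -l 'pattern2'

If the file names can contain spaces, GNU grep's -Z together with xargs -0 keeps the pipeline safe: grep -rlZ 'pattern1' /path/to/files | xargs -0 grep -l 'pattern2'.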
Hi,
I need to correct line breaks in huge files (more than 1MM records in a file) and then format them properly.
Except for the header and trailer, each record starts with 'D'.
Requirement: Scan the whole file, except the header and trailer records, and see if any of the records start with... (19 Replies)
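A minimal awk sketch, assuming every well-formed data record starts with 'D' and that a line starting with anything else is just the continuation of a record broken by a stray newline; header and trailer handling is simplified, and input.txt / fixed.txt are placeholders:

awk 'NR == 1 { printf "%s", $0; next }    # header line starts the output
     /^D/    { printf "\n%s", $0; next }  # a line starting with D begins a new record
             { printf "%s", $0 }          # anything else is glued onto the previous record
     END     { print "" }' input.txt > fixed.txt

Note that a trailer record not starting with 'D' would also be glued onto the last data record, so the trailer may need separate handling.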
Hi All,
I want to search for a certain string in thousands of files, and these files are distributed over different directories created daily. For that I created a small script in bash, but while running it I get the below error:
/ms.sh: xrealloc: subst.c:5173: cannot allocate... (17 Replies)
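The xrealloc failure suggests the shell itself is running out of memory, quite possibly while expanding a very large list of file names in one go. A minimal sketch that streams the names instead of collecting them, assuming GNU or BSD find/xargs; /path/to/dirs, '*.log' and 'SEARCH_STRING' are placeholders:

find /path/to/dirs -type f -name '*.log' -print0 |
    xargs -0 grep -l 'SEARCH_STRING'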
Hello Friends,
I have the below scenario in my current project. Suggest which tool (Perl, Python, etc.) is best for this scenario, or should I go for a programming language (C/Java)?
(1) I will be having a very big file (information about 200 million subscribers will be stored in it). This... (5 Replies)
Discussion started by: panyam
LEARN ABOUT DEBIAN
bup-margin
bup-margin(1) General Commands Manual bup-margin(1)
NAME
bup-margin - figure out your deduplication safety margin
SYNOPSIS
bup margin [options...]
DESCRIPTION
bup margin iterates through all objects in your bup repository, calculating the largest number of prefix bits shared between any two
entries. This number, n, identifies the longest subset of SHA-1 you could use and still encounter a collision between your object ids.
For example, one system that was tested had a collection of 11 million objects (70 GB), and bup margin returned 45. That means a 46-bit
hash would be sufficient to avoid all collisions among that set of objects; each object in that repository could be uniquely identified by
its first 46 bits.
The number of bits needed seems to increase by about 1 or 2 for every doubling of the number of objects. Since SHA-1 hashes have 160 bits,
that leaves 115 bits of margin. Of course, because SHA-1 hashes are essentially random, it's theoretically possible to use many more bits
with far fewer objects.
If you're paranoid about the possibility of SHA-1 collisions, you can monitor your repository by running bup margin occasionally to see if
you're getting dangerously close to 160 bits.
OPTIONS
--predict
Guess the offset into each index file where a particular object will appear, and report the maximum deviation of the correct answer
from the guess. This is potentially useful for tuning an interpolation search algorithm.
--ignore-midx
don't use .midx files, use only .idx files. This is only really useful when used with --predict.
EXAMPLE
$ bup margin
Reading indexes: 100.00% (1612581/1612581), done.
40
40 matching prefix bits
1.94 bits per doubling
120 bits (61.86 doublings) remaining
4.19338e+18 times larger is possible
Everyone on earth could have 625878182 data sets
like yours, all in one repository, and we would
expect 1 object collision.
$ bup margin --predict
PackIdxList: using 1 index.
Reading indexes: 100.00% (1612581/1612581), done.
915 of 1612581 (0.057%)
SEE ALSO
bup-midx(1), bup-save(1)
BUP
Part of the bup(1) suite.
AUTHORS
Avery Pennarun <apenwarr@gmail.com>.