parsing file names and then grouping similar files Post: 302384080

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Finding & Moving Oldest File by Parsing/Sorting Date Info in File Names

I'm trying to write a script that will look in an /exports folder for the oldest export file and move it to a /staging folder. "Oldest" in this case is actually determined by date information embedded in the file names themselves. Also, the script should only move a file from /exports to...

2. Shell Programming and Scripting

Help with parsing mailbox folder list (identify similar folders)

List sample: user/xxx/Archives/2010 user/xxx/BLARG user/xxx/BlArG user/xxx/Burton user/xxx/DAY user/yyy/Trainees/Nutrition interns user/yyy/Trainees/Primary Care user/yyy/Trainees/Psychiatric NP interns user/yyy/Trainees/Psychiatric residents user/yyy/Trainees/Psychology...

3. Shell Programming and Scripting

Script to move files with similar names to folder

I have in directory /media/AUDIO/WAVE many .mp3 files with names like: my filename_01of02.mp3 my filename_02of02.mp3 Your File_01of06.mp3 Your File_02of06.mp3 etc.... In the same directory, /media/AUDIO/WAVE, I have many folders with names like 9780743579490 9780743579491 etc.. Inside...

4. Shell Programming and Scripting

Merging two columns from two files with similar names into a loop

I have two files like this: fileA.net A B C fileA.dat 1 2 3 and I want the output output_expected A 1 B 2 C 3 I know that the easier way is to do a paste fileA.net fileA.dat, but the problem is that I have 10,000 couple of files (fileB.net with fileB.dat; fileC.net with...

5. UNIX for Dummies Questions & Answers

Find the average based on similar names in the first column

I have a table, say this: name1 num1 num2 num3 num4 name2 num5 num6 num7 num8 name3 num1 num3 num4 num9 name2 num8 num9 num1 num2 name2 num4 num5 num6 num4 name4 num4 num5 num7 num8 name5 num1 num3 num9 num7 name5 num6 num8 num3 num4 I want a code that will sort my data according...

6. Windows & DOS: Issues & Discussions

Issue: Unzipping file containing files/folders with a similar name

Hi, I have a zip file created on a Linxux server that I need to extract on a Windows machine... The zip file containing folders with the same name but they each have a different case, one if camel case and the other is just capitalised. When I extract using 7zip, I get prompted if I want to...

7. Shell Programming and Scripting

Finding files in directory with similar names

So, I have a directory tree that has many files named thusly: X_REVY.PDF I need to find any files that have the same X portion (which can be nearly anything) as any another file (in any directory) but have different Y portions (which can be any number from 1-99). I then need it to return...

8. Shell Programming and Scripting

Exclude certain file names while selectingData files coming in different names in a file name called

Data files coming in different names in a file name called process.txt. 1. shipments_yyyymmdd.gz 2 Order_yyyymmdd.gz 3. Invoice_yyyymmdd.gz 4. globalorder_yyyymmdd.gz The process needs to discard all the below files and only process two of the 4 file names available ...

9. Answers to Frequently Asked Questions

Why Parsing Can't be Done With sed ( or similar tools)

Regularly we have questions like: i have an XML (C, C++, ...) file with this or that property and i want to extract the content of this or that tag (function, ...). How do i do it in sed? Yes, in some (very limited) cases this is possible, but in general this can't be done. That is: you can do...

10. Shell Programming and Scripting

To group the text (rows) by similar columns-names in a file

As part of some report generation, I've written a script to fetch the values from DB. But, unluckily, for certain Time ranges(1-9.99,10-19.99 etc), I don't have data in DB. In such cases, I would like to write zero (0) instead of empty. The desired output will be exported to csv file. ...

LEARN ABOUT DEBIAN

bup-margin

bup-margin(1)						      General Commands Manual						     bup-margin(1)

NAME

       bup-margin - figure out your deduplication safety margin

SYNOPSIS

       bup margin [options...]

DESCRIPTION

       bup margin  iterates  through  all  objects  in	your  bup repository, calculating the largest number of prefix bits shared between any two
       entries.  This number, n, identifies the longest subset of SHA-1 you could use and still encounter a collision between your object ids.

       For example, one system that was tested had a collection of 11 million objects (70 GB), and bup margin returned 45.  That  means  a  46-bit
       hash  would be sufficient to avoid all collisions among that set of objects; each object in that repository could be uniquely identified by
       its first 46 bits.

       The number of bits needed seems to increase by about 1 or 2 for every doubling of the number of objects.  Since SHA-1 hashes have 160 bits,
       that  leaves 115 bits of margin.  Of course, because SHA-1 hashes are essentially random, it's theoretically possible to use many more bits
       with far fewer objects.

       If you're paranoid about the possibility of SHA-1 collisions, you can monitor your repository by running bup margin occasionally to see	if
       you're getting dangerously close to 160 bits.

OPTIONS

       --predict
	      Guess  the offset into each index file where a particular object will appear, and report the maximum deviation of the correct answer
	      from the guess.  This is potentially useful for tuning an interpolation search algorithm.

       --ignore-midx
	      don't use .midx files, use only .idx files.  This is only really useful when used with --predict.

EXAMPLE

	      $ bup margin
	      Reading indexes: 100.00% (1612581/1612581), done.
	      40
	      40 matching prefix bits
	      1.94 bits per doubling
	      120 bits (61.86 doublings) remaining
	      4.19338e+18 times larger is possible

	      Everyone on earth could have 625878182 data sets
	      like yours, all in one repository, and we would
	      expect 1 object collision.

	      $ bup margin --predict
	      PackIdxList: using 1 index.
	      Reading indexes: 100.00% (1612581/1612581), done.
	      915 of 1612581 (0.057%)

SEE ALSO

       bup-midx(1), bup-save(1)

BUP

       Part of the bup(1) suite.

AUTHORS

       Avery Pennarun <apenwarr@gmail.com>.

Bup unknown-															     bup-margin(1)

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Finding & Moving Oldest File by Parsing/Sorting Date Info in File Names

Discussion started by: nikosey

2. Shell Programming and Scripting

Help with parsing mailbox folder list (identify similar folders)

Discussion started by: spacegoose

3. Shell Programming and Scripting

Script to move files with similar names to folder

Discussion started by: glev2005

4. Shell Programming and Scripting

Merging two columns from two files with similar names into a loop

Discussion started by: valente