UNIX for Advanced & Expert Users: Find and eliminate duplicate tokens
Post 302355470 by MWPita on Tuesday 22nd of September 2009 06:50:42 PM
Quote:
Originally Posted by Scrutinizer
A little bit smaller:
Code:
 cut -d']' -f1 prueba2.txt|sort|uniq -d|wc -l

Nice one, at least it's smaller than mine, hehe
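
For anyone reading along, here is a minimal sketch of what the quoted pipeline does, step by step. The sample lines and the function names count_dup_tokens and list_dup_tokens are made up for illustration; the real layout of prueba2.txt is not shown in this post, so the only assumption is that the token is whatever comes before the first ']' on each line.

```shell
#!/bin/sh
# Hypothetical helpers wrapping the quoted one-liner.

# Count how many distinct tokens occur more than once.
count_dup_tokens() {
    # cut     : keep field 1, i.e. the text before the first ']'
    # sort    : bring equal tokens next to each other (uniq needs that)
    # uniq -d : print one copy of each token that is repeated
    # wc -l   : count those repeated tokens
    cut -d']' -f1 "$1" | sort | uniq -d | wc -l
}

# Same pipeline without the final count, to see which tokens repeat.
list_dup_tokens() {
    cut -d']' -f1 "$1" | sort | uniq -d
}

# Made-up sample data, NOT the real prueba2.txt.
sample=$(mktemp)
printf '%s\n' '[a] one' '[b] two' '[a] three' '[c] four' '[b] five' '[a] six' > "$sample"

count_dup_tokens "$sample"
list_dup_tokens "$sample"

rm -f "$sample"
```

With that sample, "[a" appears three times and "[b" twice, so two tokens are reported as duplicated. Note that uniq -d counts each repeated token once, no matter how many extra copies it has.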
 

bup-margin(1)						      General Commands Manual						     bup-margin(1)

NAME
       bup-margin - figure out your deduplication safety margin

SYNOPSIS
       bup margin [options...]

DESCRIPTION
       bup margin iterates through all objects in your bup repository, calculating the largest number of prefix bits shared between any two entries. This number, n, identifies the longest subset of SHA-1 you could use and still encounter a collision between your object ids.

       For example, one system that was tested had a collection of 11 million objects (70 GB), and bup margin returned 45. That means a 46-bit hash would be sufficient to avoid all collisions among that set of objects; each object in that repository could be uniquely identified by its first 46 bits.

       The number of bits needed seems to increase by about 1 or 2 for every doubling of the number of objects. Since SHA-1 hashes have 160 bits, that leaves 115 bits of margin. Of course, because SHA-1 hashes are essentially random, it's theoretically possible to use many more bits with far fewer objects.

       If you're paranoid about the possibility of SHA-1 collisions, you can monitor your repository by running bup margin occasionally to see if you're getting dangerously close to 160 bits.

OPTIONS
       --predict
              Guess the offset into each index file where a particular object will appear, and report the maximum deviation of the correct answer from the guess. This is potentially useful for tuning an interpolation search algorithm.

       --ignore-midx
              don't use .midx files, use only .idx files. This is only really useful when used with --predict.

EXAMPLE
       $ bup margin
       Reading indexes: 100.00% (1612581/1612581), done.
       40
       40 matching prefix bits
       1.94 bits per doubling
       120 bits (61.86 doublings) remaining
       4.19338e+18 times larger is possible

       Everyone on earth could have 625878182 data sets like yours, all in one repository, and we would expect 1 object collision.

       $ bup margin --predict
       PackIdxList: using 1 index.
       Reading indexes: 100.00% (1612581/1612581), done.
       915 of 1612581 (0.057%)

SEE ALSO
       bup-midx(1), bup-save(1)

BUP
       Part of the bup(1) suite.

AUTHORS
       Avery Pennarun <apenwarr@gmail.com>.

Bup unknown-                                                         bup-margin(1)
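
The derived figures in the EXAMPLE output can be roughly reproduced from just two inputs, the object count and the number of matching prefix bits. The sketch below is not taken from bup itself: the function name margin_estimate is hypothetical, and the formula (doublings = log2 of the object count, with the remaining bits divided by the bits-per-doubling rate) is an assumption inferred from the numbers shown in the man page.

```shell
#!/bin/sh
# Hypothetical helper, NOT part of bup: re-derive the summary lines of
# "bup margin" from the object count and the matching prefix bits.
# Assumption: "doublings" means log2(number of objects).
margin_estimate() {
    # $1 = number of objects, $2 = matching prefix bits reported by bup margin
    awk -v objects="$1" -v bits="$2" 'BEGIN {
        doublings      = log(objects) / log(2)   # log2(objects)
        per_doubling   = bits / doublings        # bits used per doubling so far
        remaining      = 160 - bits              # SHA-1 has 160 bits
        doublings_left = remaining / per_doubling
        printf "%.2f bits per doubling\n", per_doubling
        printf "%d bits (%.2f doublings) remaining\n", remaining, doublings_left
        printf "%g times larger is possible\n", 2 ^ doublings_left
    }'
}

# Values from the EXAMPLE section: 1612581 objects, 40 matching prefix bits.
margin_estimate 1612581 40
```

Under those assumptions the first two lines come out as in the example (1.94 bits per doubling, 120 bits and 61.86 doublings remaining). The last line lands close to the 4.19338e+18 shown, though the trailing digits may differ slightly with rounding. Passing 11000000 and 45 gives the 115 bits of margin mentioned in the DESCRIPTION.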