Sponsored Content
Top Forums Shell Programming and Scripting Select distinct rows in a file by last column Post 302777651 by apenkov on Friday 8th of March 2013 08:32:06 AM
Old 03-08-2013
Thanks a lot!!
 

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

select distinct row from a file

Hi, buddies out there. I have a text file ( only one column ) which I created using vi editor. The file contains duplicate rows and I would like to select distinct rows, how to go on it using unix command: file content = apple apple orange watermelon apple orange Can it be done... (7 Replies)
Discussion started by: merry susana
7 Replies

2. UNIX for Dummies Questions & Answers

Select Distinct on multiple fields

How do I create a script that provides a count of distinct values of a particular field in a file utilizing commonly available UNIX commands (sh or awk)? Field1|Field2|Field3|Field4 AAA|BBB|CCC|DDD 111|222|333|777 AAA|EEE|ZZZ|EEE 111|555|333|444 AAA|EEE|CCC|DDD 111|222|555|444 For... (2 Replies)
Discussion started by: Refresher
2 Replies

3. Shell Programming and Scripting

awk to select rows based on condition on column

I have got a file like this 003ABC00281020091005000100042.810001 ... (8 Replies)
Discussion started by: Maruti
8 Replies

4. Shell Programming and Scripting

Select distinct values from a flat file

Hi , I have a similar problem. Please can anyone help me with a shell script or a perl. I have a flat file like this fruit country apple germany apple india banana pakistan banana saudi mango india I want to get a output like fruit country apple ... (7 Replies)
Discussion started by: smalya
7 Replies

5. UNIX for Dummies Questions & Answers

how to get distinct counts in a column of a file

If i have a file sample.txt with more than 10 columns and 11th column as following data. would it be possible to get the distinct counts of values in single shot,Thank you. Y Y N N N P P o Expected Result: Value count Y 2 N 3 P 2 (2 Replies)
Discussion started by: Ariean
2 Replies

6. Shell Programming and Scripting

Select rows where the 3rd column value is over xx

Hi All, I've got a text file which is Postcode,Postcode, Travel_time and I want to select all of the rows where the 3rd column value is over 255. Can someone show me the magic on how to do this? Originally it did contain 4 columns but i've managed to strip out the first 3 using: cat textfile |... (3 Replies)
Discussion started by: gman
3 Replies

7. UNIX for Dummies Questions & Answers

merging rows into new file based on rows and first column

I have 2 files, file01= 7 columns, row unknown (but few) file02= 7 columns, row unknown (but many) now I want to create an output with the first field that is shared in both of them and then subtract the results from the rest of the fields and print there e.g. file 01 James|0|50|25|10|50|30... (1 Reply)
Discussion started by: A-V
1 Replies

8. Shell Programming and Scripting

How to find DISTINCT rows and combine in one row?

Hi , i need to display only one of duplicated values and merged them in one record only when tag started with 3100.2.128.8 3100.2.97.1=192.168.0.12 3100.2.128.8=418/66/03e9/0044801 3100.2.128.8=418/66/03ea/0044601 3100.2.128.8=418/66/03e9/0044801 3100.2.128.8=418/66/03ea/0044601... (5 Replies)
Discussion started by: OTNA
5 Replies

9. Shell Programming and Scripting

Script to select the rows from the feed file based on the input value provided

Hi Folks, I have the below feed file named abc1.txt in which you can see there is a title and below is the respective values in the rows and it is completely pipe delimited file ,. ... (3 Replies)
Discussion started by: punpun66
3 Replies

10. UNIX for Dummies Questions & Answers

Select distinct sequences from fasta file and list

Hi How can I extract sequences from a fasta file with respect a certain criteria? The beginning of my file (containing in total more than 1000 sequences) looks like this: >H8V34IS02I59VP SDACNDLTIALLQIAREVRVCNPTFSFRWHPQVKDEVMRECFDCIRQGLG YPSMRNDPILIANCMNWHGHPLEEARQWVHQACMSPCPSTKHGFQPFRMA... (6 Replies)
Discussion started by: Marion MPI
6 Replies
bup-margin(1)						      General Commands Manual						     bup-margin(1)

NAME
bup-margin - figure out your deduplication safety margin SYNOPSIS
bup margin [options...] DESCRIPTION
bup margin iterates through all objects in your bup repository, calculating the largest number of prefix bits shared between any two entries. This number, n, identifies the longest subset of SHA-1 you could use and still encounter a collision between your object ids. For example, one system that was tested had a collection of 11 million objects (70 GB), and bup margin returned 45. That means a 46-bit hash would be sufficient to avoid all collisions among that set of objects; each object in that repository could be uniquely identified by its first 46 bits. The number of bits needed seems to increase by about 1 or 2 for every doubling of the number of objects. Since SHA-1 hashes have 160 bits, that leaves 115 bits of margin. Of course, because SHA-1 hashes are essentially random, it's theoretically possible to use many more bits with far fewer objects. If you're paranoid about the possibility of SHA-1 collisions, you can monitor your repository by running bup margin occasionally to see if you're getting dangerously close to 160 bits. OPTIONS
--predict Guess the offset into each index file where a particular object will appear, and report the maximum deviation of the correct answer from the guess. This is potentially useful for tuning an interpolation search algorithm. --ignore-midx don't use .midx files, use only .idx files. This is only really useful when used with --predict. EXAMPLE
$ bup margin Reading indexes: 100.00% (1612581/1612581), done. 40 40 matching prefix bits 1.94 bits per doubling 120 bits (61.86 doublings) remaining 4.19338e+18 times larger is possible Everyone on earth could have 625878182 data sets like yours, all in one repository, and we would expect 1 object collision. $ bup margin --predict PackIdxList: using 1 index. Reading indexes: 100.00% (1612581/1612581), done. 915 of 1612581 (0.057%) SEE ALSO
bup-midx(1), bup-save(1) BUP
Part of the bup(1) suite. AUTHORS
Avery Pennarun <apenwarr@gmail.com>. Bup unknown- bup-margin(1)
All times are GMT -4. The time now is 03:56 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy