Full Discussion: Faster search needed
Post 302671169 by Klashxx on Friday 13th of July 2012 07:47:30 AM
Works fine on an HP-UX box:
Code:
# cat file1
111,222,333,444,555,666
777,888,999,000,111,222
111,222,333,444,555,888
# cat file2
666,AAA
222,BBB
888,CCC
# sort -t, -k 6,6 -o file1 file1
# sort -t, -k 1,1 -o file2 file2
# cat file1 file2
777,888,999,000,111,222
111,222,333,444,555,666
111,222,333,444,555,888
222,BBB
666,AAA
888,CCC
# join -t, -o 1.1,1.2,1.3,1.4,1.5,1.6,2.2 -j1 6 -j2 1 file1 file2
777,888,999,000,111,222,BBB
111,222,333,444,555,666,AAA
111,222,333,444,555,888,CCC
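For anyone trying this on another system, here is a minimal sketch of the same lookup, not a definitive version. It uses the POSIX `-1`/`-2` spelling of the join-field options (the `-j1 6 -j2 1` form above is an older spelling that some implementations still accept) and sets `LC_ALL=C` so that `sort` and `join` agree on collation. The scratch directory name and the `result`/`result_awk` variables are made up for the illustration. The second half is a different technique, a one-pass `awk` hash lookup that skips the sort step; it keeps file1's original line order and holds all of file2 in memory, so whether it is actually faster depends on the data.

```shell
#!/bin/sh
# Sketch only: same data as in the post, POSIX option spelling.
LC_ALL=C
export LC_ALL

# Hypothetical scratch directory so the example does not touch real files.
workdir=${TMPDIR:-/tmp}/join_demo.$$
mkdir "$workdir" || exit 1

printf '%s\n' \
    '111,222,333,444,555,666' \
    '777,888,999,000,111,222' \
    '111,222,333,444,555,888' > "$workdir/file1"
printf '%s\n' '666,AAA' '222,BBB' '888,CCC' > "$workdir/file2"

# join needs both inputs sorted on their join fields.
sort -t, -k 6,6 "$workdir/file1" > "$workdir/file1.sorted"
sort -t, -k 1,1 "$workdir/file2" > "$workdir/file2.sorted"

# Match field 6 of file1 against field 1 of file2 and append
# the second column of file2 to each matching line.
result=$(join -t, -1 6 -2 1 \
    -o 1.1,1.2,1.3,1.4,1.5,1.6,2.2 \
    "$workdir/file1.sorted" "$workdir/file2.sorted")
printf '%s\n' "$result"

# Alternative technique (not from the post): load file2 into an awk
# array keyed on its first field, then look up field 6 of each file1 line.
result_awk=$(awk -F, '
    NR == FNR { map[$1] = $2; next }      # first file: build the lookup table
    $6 in map { print $0 "," map[$6] }    # second file: print matches only
' "$workdir/file2" "$workdir/file1")
printf '%s\n' "$result_awk"

rm -rf "$workdir"
```

The `join` version prints lines in sorted key order (222, 666, 888), while the `awk` version prints them in the order they appear in file1.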

 

bup-margin(1)                    General Commands Manual                    bup-margin(1)

NAME
       bup-margin - figure out your deduplication safety margin

SYNOPSIS
       bup margin [options...]

DESCRIPTION
       bup margin iterates through all objects in your bup repository, calculating the largest number of prefix bits shared between any two entries. This number, n, identifies the longest subset of SHA-1 you could use and still encounter a collision between your object ids.

       For example, one system that was tested had a collection of 11 million objects (70 GB), and bup margin returned 45. That means a 46-bit hash would be sufficient to avoid all collisions among that set of objects; each object in that repository could be uniquely identified by its first 46 bits.

       The number of bits needed seems to increase by about 1 or 2 for every doubling of the number of objects. Since SHA-1 hashes have 160 bits, that leaves 115 bits of margin. Of course, because SHA-1 hashes are essentially random, it's theoretically possible to use many more bits with far fewer objects.

       If you're paranoid about the possibility of SHA-1 collisions, you can monitor your repository by running bup margin occasionally to see if you're getting dangerously close to 160 bits.

OPTIONS
       --predict
              Guess the offset into each index file where a particular object will appear, and report the maximum deviation of the correct answer from the guess. This is potentially useful for tuning an interpolation search algorithm.

       --ignore-midx
              Don't use .midx files, use only .idx files. This is only really useful when used with --predict.

EXAMPLE
       $ bup margin
       Reading indexes: 100.00% (1612581/1612581), done.
       40
       40 matching prefix bits
       1.94 bits per doubling
       120 bits (61.86 doublings) remaining
       4.19338e+18 times larger is possible

       Everyone on earth could have 625878182 data sets
       like yours, all in one repository, and we would
       expect 1 object collision.

       $ bup margin --predict
       PackIdxList: using 1 index.
       Reading indexes: 100.00% (1612581/1612581), done.
       915 of 1612581 (0.057%)

SEE ALSO
       bup-midx(1), bup-save(1)

BUP
       Part of the bup(1) suite.

AUTHORS
       Avery Pennarun <apenwarr@gmail.com>.
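
The margin arithmetic in the bup-margin DESCRIPTION (160 SHA-1 bits minus the longest shared prefix) can be sketched as a small shell helper. `margin_bits` is a hypothetical function written for this illustration and is not part of bup. It takes the bit count as an argument and does not run bup itself.

```shell
#!/bin/sh
# Sketch only: remaining SHA-1 margin for a given shared-prefix bit count.
# margin_bits is a hypothetical helper, not a bup command.
SHA1_BITS=160

margin_bits() {
    # $1: longest shared prefix in bits, as reported by "bup margin"
    echo $(( SHA1_BITS - $1 ))
}

margin_bits 45   # the 11-million-object example: prints 115
margin_bits 40   # the run shown under EXAMPLE: prints 120
```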