07-31-2014
RudiC gave you code that meets all of your stated requirements except that you don't like it. In what way does sort fail to meet your needs?
When we see statements like this after a seemingly good solution is proposed, it often means that the submitter is working on a homework assignment and is not allowed to use the tools proposed. Is this a homework assignment?
Are there other requirements that you haven't stated? If there are multiple lines for a given key in the first file, do you really want a copy of every line with that value in the second file duplicated in your new file for each occurrence of that value in the first file?
Is it important that your program run very slowly for medium-sized files and run at a snail's pace for large files?
Is it important that your output file maintain the (possibly unsorted) order of lines in the 1st file and for all copies of lines from the 2nd file?
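To make the duplication and ordering questions concrete, here is a minimal sketch. It assumes whitespace-separated lines keyed on the first field; the function name `merge_by_key` and the sample data are made up for illustration and are not taken from the thread. It keeps the first file's order, and every matching line from the second file is repeated for each occurrence of the key in the first file. Indexing the second file once in a dictionary avoids rescanning it for every line of the first file, which is what makes naive approaches crawl on large inputs.

```python
from collections import defaultdict

def merge_by_key(first_lines, second_lines):
    """For each line of the first file, in its original order, emit that line
    followed by every line of the second file sharing the same first field."""
    # Index the second file once: key -> lines in their original order.
    by_key = defaultdict(list)
    for line in second_lines:
        fields = line.split()
        if fields:
            by_key[fields[0]].append(line)

    merged = []
    for line in first_lines:
        merged.append(line)
        fields = line.split()
        if fields:
            # .get() avoids creating empty entries for unmatched keys.
            merged.extend(by_key.get(fields[0], []))
    return merged

# Hypothetical sample data: key k1 appears twice in the first file.
first = ["k1 a", "k2 b", "k1 c"]
second = ["k1 x", "k1 y", "k2 z"]
result = merge_by_key(first, second)
for line in result:
    print(line)
```

With this sample, the two `k1` lines from the second file appear twice in the output, once after each `k1` line of the first file. If that is not what you want, say so.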
What OS are you using?
The people who read your posts and try to help you find solutions for your problems are volunteers. We are not on your payroll. This problem may be urgent for you, but you can't expect volunteers to ignore other things that they might want to do to satisfy your personal requirements. There are special forums that can be used for "Urgent" requests. In the regular forums, requests that demand an urgent response are inappropriate. Is this another indication that you have a homework assignment that is due soon and you haven't figured out how to do it yet?