08-13-2014
What are the exact contents of each file?
Right now, you're literally running millions of greps, comparing each ID to every other ID. That's a huge waste of time, obviously. You only really need to find out whether each ID has a match, not test it against every other ID one at a time.
Because it seems to me that you're just looking for IDs that are in both files. If each ID only occurs once in each file, pull the IDs out of each file, combine them into one file, sort them, and only count IDs that are duplicated in the combined file.
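A minimal sketch of that combine, sort, and count-duplicates idea. It assumes one record per line with the ID in the first whitespace-separated field, and that no ID repeats within a single file; the function name common_ids and the file names are placeholders, since the real file layout hasn't been posted yet.

```shell
# common_ids FILE1 FILE2
# Prints every ID that appears in both files.
# Assumptions (not confirmed by the thread): the ID is the first
# whitespace-separated field, and each ID occurs at most once per file.
common_ids() {
    # Pull the ID column out of both files into one stream, sort it,
    # and let uniq -d print only the IDs that occur more than once.
    awk '{ print $1 }' "$1" "$2" | sort | uniq -d
}
```

Something like `common_ids file1 file2 | wc -l` would then give the count. The whole job is three processes no matter how many IDs there are.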
Or you can figure out how to use the "join" utility. That may not be any faster at the comparison itself than what you're already doing, but it does have the huge advantage of not having to fork() and exec() a new process millions of times.
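A rough sketch of the join route, under the same assumption that the ID is the first whitespace-separated field. join needs both inputs sorted on the join field, so this sorts temporary copies first; join_ids and the temp-file handling are only illustrative.

```shell
# join_ids FILE1 FILE2
# Prints the IDs present in both files using join(1).
# Assumption: the ID is the first whitespace-separated field of each line.
join_ids() {
    ji_a=$(mktemp) && ji_b=$(mktemp) || return 1
    # join expects both inputs sorted on the join field; sorting and
    # joining under the same locale keeps the two orderings consistent.
    LC_ALL=C sort -k 1b,1 "$1" > "$ji_a"
    LC_ALL=C sort -k 1b,1 "$2" > "$ji_b"
    # -o 1.1 prints only the join field (the ID) for each matched pair.
    LC_ALL=C join -o 1.1 "$ji_a" "$ji_b"
    rm -f "$ji_a" "$ji_b"
}
```

Dropping the `-o 1.1` would print the matching records from both files side by side instead of just the IDs.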
NUMSUM(1) User Contributed Perl Documentation NUMSUM(1)
NAME
numsum - numsum program file
SYNOPSIS
numsum [-iIcdhrsvxy] <FILE>
| numsum [-iIcdhrsvxy] (Input on STDIN from pipeline.)
numsum [-iIcdhrsvxy] (Input on STDIN. Use Ctrl-D to stop.)
DESCRIPTION
numsum will take all the numbers on stdin and return the sum of those numbers. Currently it only processes the first number on each line.
Besides positive numbers, it also handles negative numbers and numbers with decimals.
OPTIONS
-i Only return the integer portion of the final sum.
-I Only return the decimal portion of the final sum.
-c Print out the sum of each column.
-r Print out the sum of each row.
-x <n> Specify a comma separated list of columns to print.
-y <n> Specify a comma separated list of rows to print.
-s <string> Specify a string to use as a separator for columns.
This defaults to consecutive whitespace (\s+).
-h Help: You're looking at it.
-V Increase verbosity.
-d Debug mode. For developers.
-q Quiet mode, don't print any warnings.
EXAMPLES
Simply add up the numbers in a file.
$ numsum numbers.txt
4315
Enter your own numbers on STDIN. The last number is the answer.
$ numsum
4
21
98
100
223
Use it in a command pipeline.
$ ls -1s | grep .mp3 | numsum -c -x 5
72288
Add up the total byte count in a http log file.
$ cat access_log | awk '{print $10}' | numsum
or
$ numsum -c -x 10 access_log
Add up the columns of numbers of a file.
$ cat columns
1 6 11 16 21
2 7 12 17 22
3 8 13 18 23
4 9 14 19 24
5 10 15 20 25
$ numsum -c columns
15 40 65 90 115
Add up the 1st, 2nd and 5th columns only.
$ numsum -c -x 1,2,5 columns
15 40 115
Add up the rows of numbers of a file.
$ numsum -r columns
55
60
65
70
75
Add up the 2nd and 4th rows.
$ numsum -r -y 2,4 columns
60
70
SEE ALSO
numaverage(1), numbound(1), numinterval(1), numnormalize(1), numgrep(1), numprocess(1), numrandom(1), numrange(1), numround(1)
COPYRIGHT
numsum is part of the num-utils package, which is copyrighted by Suso Banderas and released under the GPL license. Please read the COPYING
and LICENSE files that came with the num-utils package.
Developers can read the GOALS file and contact me about providing
submissions or help for the project.
MORE INFO
More info on numsum can be found at:
http://suso.suso.org/programs/num-utils/
perl v5.10.1 2009-10-31 NUMSUM(1)