I have a large number of input files with two columns of numbers.
For example:
I only wish to retain lines where the numbers fullfil two requirements. E.g:
[X]=83
1000<=[Y]<=2000
To do this I use the following command:
PROBLEM: My inputfiles contain >50 million lines, so the awk command is too slow (it takes >2 minutes and I have thousands of inputfiles). Is there a way to make it faster? I have been told that it would be faster if I use Perl.
Last edited by Scrutinizer; 07-04-2012 at 04:46 PM..
Reason: code tags also for data sample
If I just wanted to get andred08 from the following ldap dn
would I be best to use AWK or CUT?
uid=andred08,ou=People,o=example,dc=com
It doesn't make a difference if it's just one ldap search I am getting it from but when there's a couple of hundred people in the group that retruns all... (10 Replies)
I am processing some terabytes of information on a computer having 8 processors (each with 4 cores) with a 16GB RAM and 5TB hard drive implemented as a RAID. The processing doesn't seem to be blazingly fast perhaps because of the IO limitation.
I am basically running a perl script to read some... (13 Replies)
Hi -- I have the following SQL query in my UNIX shell script -- but the subquery in the second section is very slow. I know there must be a way to do this with a union or something which would be better. Can anyone offer an alternative to this query? Thanks.
select
count(*)
from
... (2 Replies)
Hi,
I have a script below for extracting xml from a file.
for i in *.txt
do
echo $i
awk '/<.*/ , /.*<\/.*>/' "$i" | tr -d '\n'
echo -ne '\n'
done
.
I read about using multi threading to speed up the script.
I do not know much about it but read it on this forum.
Is it a... (21 Replies)
Can someone help me edit the below script to make it run faster?
Shell: bash
OS: Linux Red Hat
The point of the script is to grab entire chunks of information that concerns the service "MEMORY_CHECK".
For each chunk, the beginning starts with "service {", and ends with "}".
I should... (15 Replies)
awk "/May 23, 2012 /,0" /var/tmp/datafile
the above command pulls out information in the datafile. the information it pulls is from the date specified to the end of the file.
now, how can i make this faster if the datafile is huge? even if it wasn't huge, i feel there's a better/faster way to... (8 Replies)
I have the below command which is referring a large file and it is taking 3 hours to run. Can something be done to make this command faster.
awk -F ',' '{OFS=","}{ if ($13 == "9999") print $1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12 }' ${NLAP_TEMP}/hist1.out|sort -T ${NLAP_TEMP} |uniq>... (13 Replies)
I have nginx web server logs with all requests that were made and I'm filtering them by date and time.
Each line has the following structure:
127.0.0.1 - xyz.com GET 123.ts HTTP/1.1 (200) 0.000 s 3182 CoreMedia/1.0.0.15F79 (iPhone; U; CPU OS 11_4 like Mac OS X; pt_br)
These text files are... (21 Replies)
Discussion started by: brenoasrm
21 Replies
LEARN ABOUT DEBIAN
numsum
NUMSUM(1) User Contributed Perl Documentation NUMSUM(1)NAME
numsum - numsum program file
SYNOPSIS
numsum [-iIcdhrsvxy] <FILE>
| numsum [-iIcdhrsvxy] (Input on STDIN from pipeline.)
numsum [-iIcdhrsvxy] (Input on STDIN. Use Ctrl-D to stop.)
DESCRIPTION
numsum will take all the numbers on stdin and return the sum of those numbers. Currently it only processes the first number on each line.
Besides positive numbers, it also handles negative numbers and numbers with decimals.
OPTIONS -i Only return the integer portion of the final sum.
-I Only return the decimal portion of the final sum.
-c Print out the sum of each column.
-r Print out the sum of each row.
-x <n> Specify a comma seperated list of columns to print.
-y <n> Specify a comma seperated list of rows to print.
-s <string> Specify a string to use as a seperator for columns.
This defaults to be consecutive whitespace (s+).
-h Help: You're looking at it.
-V Increase verbosity.
-d Debug mode. For developers
-q Quiet mode, don't print any warnings.
EXAMPLES
Simply add up the numbers in a file.
$ numsum numbers.txt
4315
Enter your own numbers on STDIN. The last number is the answer.
$ numsum
4
21
98
100
223
Use it in a command pipeline.
$ ls -1s | grep .mp3 | numsum -c -x 5
72288
Add up the total byte count in a http log file.
$ cat access_log | awk {'print $10'} numsum
or
numsum -c -x 10 access_log
Add up the columns of numbers of a file.
$ cat columns
1 6 11 16 21
2 7 12 17 22
3 8 13 18 23
4 9 14 19 24
5 10 15 20 25
$ numsum -c columns
15 40 65 90 115
Add up the 1st, 2nd and 5th columns only.
$ numsum -c -x 1,2,5 columns
15 40 115
Add up the rows of numbers of a file.
$ numsum -r columns
55
60
65
70
75
Add up the 2nd and 4th rows.
$ numsum -r -y 2,4 columns
60
70
SEE ALSO numaverage(1), numbound(1), numinterval(1), numnormalize(1), numgrep(1), numprocess(1), numrandom(1), numrange(1), numround(1)COPYRIGHT
numsum is part of the num-utils package, which is copyrighted by Suso Banderas and released under the GPL license. Please read the COPYING
and LICENSE files that came with the num-utils package
Developers can read the GOALS file and contact me about providing
submitions or help for the project.
MORE INFO
More info on numsum can be found at:
http://suso.suso.org/programs/num-utils/
perl v5.10.1 2009-10-31 NUMSUM(1)