Using grep -f works, but with two files as large as indicated it will take serious time and may eventually run out of memory. Try extracting the overlap first (sort both files together and keep the duplicated lines with uniq -d), then use the resulting file in a similar way (uniq -u) to extract the values unique to either original file.
Comparison of both approaches on ~20k-line files. EDIT: times below are for two files with roughly 4E6 entries each and about 1E6 overlapping lines (on a two-processor Linux host):
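A minimal sketch of the sort/uniq alternative (file names and contents are invented for illustration; the approach assumes neither input file contains duplicate lines of its own):

```shell
# Sketch of the sort/uniq approach; file names are made up for illustration.
tmp=$(mktemp -d)
printf '%s\n' a b c > "$tmp/file1"
printf '%s\n' b c d > "$tmp/file2"

# Lines present in both files: concatenate, sort, keep only duplicated lines.
sort "$tmp/file1" "$tmp/file2" | uniq -d > "$tmp/overlap"

# Lines unique to file1: merge file1 with the overlap, keep non-duplicates.
sort "$tmp/file1" "$tmp/overlap" | uniq -u > "$tmp/only1"

paste -sd, "$tmp/overlap"    # b,c
paste -sd, "$tmp/only1"      # a
```

Unlike grep -f, this never holds a pattern list in memory; sort can spill to temporary files, which is why it scales to multi-million-line inputs.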
Hello
I want to write a script that checks for any errors (say "-error") in the log file and then sends an email to the concerned person, and the concerned person will correct the error.
The next time the script runs, even though the error has been corrected, it will ... (1 Reply)
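A hedged sketch of such a check, alerting only when the set of error lines has changed since the last run (log contents, paths, and the mailer stand-in are all invented; a real script would call something like mailx instead):

```shell
# Sketch: alert on "-error" lines only when they differ from the last run.
tmp=$(mktemp -d)
LOG="$tmp/app.log"               # invented path
STATE="$tmp/errors.last"         # remembers what was already reported
printf '%s\n' 'ok step 1' 'step 2 -error bad input' > "$LOG"

send_mail() {                    # stand-in for: mailx -s "$1" someone@example.com
    echo "MAIL: $1"
}

grep -e '-error' "$LOG" > "$tmp/errors.now" || :
if [ -s "$tmp/errors.now" ] && ! cmp -s "$tmp/errors.now" "$STATE"; then
    send_mail "errors found in $LOG" < "$tmp/errors.now"
    cp "$tmp/errors.now" "$STATE"
fi
```

Because the state file holds the previously reported errors, a corrected error drops out of the comparison and no repeat mail is sent.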
Hi all,
I'm having some trouble with a shell script that I have put together to search our web pages for links to PDFs.
The first thing I did was:
ls -R | grep .pdf > /tmp/dave_pdfs.out
Which generates a list of all of the PDFs on the server. For the sake of argument, say it looks like... (8 Replies)
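One way to cross-check such a list against the web pages is to grep the document root for each PDF name; a small self-contained sketch (all paths and page contents are invented for the example):

```shell
# Sketch: for each PDF in the list, check whether any HTML page links to it.
tmp=$(mktemp -d)
mkdir "$tmp/htdocs"                                      # invented doc root
printf '<a href="guide.pdf">guide</a>\n' > "$tmp/htdocs/index.html"
printf '%s\n' guide.pdf orphan.pdf > "$tmp/pdfs.out"     # the generated list

while IFS= read -r pdf; do
    if grep -rlF -- "$pdf" "$tmp/htdocs" > /dev/null; then
        echo "linked: $pdf"
    else
        echo "orphan: $pdf"
    fi
done < "$tmp/pdfs.out" > "$tmp/report"

cat "$tmp/report"
```

grep -F treats each PDF name as a fixed string, so dots in filenames are not misread as regex wildcards.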
I have a file that is 20 - 80+ MB in size that is a certain type of log file.
It logs one of our processes and this process is multi-threaded. Therefore the log file is kind of a mess. Here's an example:
The logfile looks like: "DATE TIME - THREAD ID - Details", and a new file is created... (4 Replies)
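Logs of that shape can be demultiplexed into one file per thread with a single awk pass; a sketch assuming the "DATE TIME - THREAD ID - Details" layout described (the sample lines are invented):

```shell
# Sketch: split a "DATE TIME - THREAD ID - details" log into per-thread files.
tmp=$(mktemp -d)
cat > "$tmp/app.log" <<'EOF'
2024-01-01 10:00:00 - T1 - start
2024-01-01 10:00:01 - T2 - start
2024-01-01 10:00:02 - T1 - done
EOF

# With " - " as the field separator, $2 is the thread id.
awk -F' - ' -v dir="$tmp" '{ print > (dir "/thread_" $2 ".log") }' "$tmp/app.log"

wc -l "$tmp"/thread_*.log
```

If the log has very many distinct thread ids, call close() on each file after writing to avoid hitting the open-file limit.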
I need help with the following script. I want to grep the SQL errors, insert them into the error table, and exit the shell script if there is any error; otherwise keep running the scripts.
Here is my script
#!/bin/csh -f
source .orapass
set user = $USER
set pass = $PASS
cd /opt/data/scripts
echo... (2 Replies)
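For the error check itself, one common pattern is to grep the SQL*Plus log for ORA-/SP2- codes and stop on a hit; a minimal sh sketch (the log content is fabricated, and the error-table insert is site-specific):

```shell
# Sketch: collect Oracle/SQL*Plus error lines; a hit means we should stop.
tmp=$(mktemp -d)
cat > "$tmp/sql.log" <<'EOF'
1 row created.
ORA-00001: unique constraint violated
EOF

if grep -E 'ORA-[0-9]+|SP2-[0-9]+' "$tmp/sql.log" > "$tmp/errs"; then
    echo "sql errors found - stopping"
    # a real script would insert $tmp/errs into the error table and exit 1
fi
```

grep's exit status does the branching: 0 (match) takes the error path, 1 (no match) lets the remaining scripts run.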
Hi guys - below is my script that checks for the current file, its size and timestamp.
However, I added a "grep" feature to it (line in red), but I am not getting the desired result.
What I am trying to achieve in the output:
1. Show me the file name, timestamp, size and grep'ed words
It would be a... (2 Replies)
Hi all,
I have a problem with searching hundreds of CSV files; the problem is that the search takes too long (over 5 min).
The CSV files are ","-delimited and have 30 fields per line, but I always grep the same 4 fields - so is there a way to grep just those 4 fields to speed up the search?
Example:... (11 Replies)
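awk can restrict the match to specific fields, so the other 26 are never inspected and false hits in unrelated columns disappear; a sketch with invented data and field numbers:

```shell
# Sketch: match only field 3 of a CSV instead of grepping the whole line.
tmp=$(mktemp -d)
cat > "$tmp/data.csv" <<'EOF'
alice,x,HIT,1
bob,HIT,miss,2
carol,y,HIT,3
EOF

# grep HIT would also have matched bob's field 2; the field test does not.
awk -F, '$3 == "HIT"' "$tmp/data.csv" > "$tmp/matches"
cat "$tmp/matches"
```

For several fields, combine tests, e.g. `$3 == p || $7 == p`, or use `~` for a regex match against just those fields.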
Hello everybody,
I'm still slowly treading my way into bash scripting (without any prior programming experience) and hence my code is mostly what some might call "creative" if they meant well :D
I have created a script that serves its purpose but it does so very slowly, since it needs to work... (4 Replies)
This is my first experience writing a unix script. I've created the following script. It does what I want it to do, but I need it to be a lot faster. Is there any way to speed it up?
cat 'Tax_Provision_Sample.dat' | sort | while read p; do fn=`echo $p|cut -d~ -f2,4,3,8,9`; echo $p >> "$fn.txt";... (20 Replies)
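The per-line subshell (echo | cut) is what usually makes such a loop slow; a single awk pass does the same split, illustrated here with invented sample data. The field numbers mirror the cut -d~ -f2,4,3,8,9 in the question (note that cut emits fields in file order, i.e. 2,3,4,8,9):

```shell
# Sketch: one awk process instead of a subshell per input line.
tmp=$(mktemp -d)
cat > "$tmp/sample.dat" <<'EOF'
a~K1~b~c~d~e~f~g~h
z~K1~b~c~d~e~f~g~h
a~K2~b~c~d~e~f~g~h
EOF

sort "$tmp/sample.dat" |
awk -F'~' -v dir="$tmp" '{
    fn = dir "/" $2 FS $3 FS $4 FS $8 FS $9 ".txt"
    print >> fn
    close(fn)          # reopen in append mode next time; avoids fd exhaustion
}'

ls "$tmp"/*.txt
```

Eliminating the per-line fork typically turns minutes of wall-clock time into seconds on files of this size.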
Hi,
I've written a ksh script that reads a file and parses/filters/formats each line. The script runs as expected, but it runs for 24+ hours on a file that has 2 million lines. And sometimes the input file has 10 million lines, which means it can run for more than 2 days and still not finish.... (9 Replies)
Hello experts,
we have input files with 700K lines each (one generated every hour), and we need to convert them as below and then move them to another directory.
Sample INPUT:-
# cat test1
1559205600000,8474,NormalizedPortInfo,PctDiscards,0.0,Interface,BG-CTA-AX1.test.com,Vl111... (7 Replies)
Discussion started by: prvnrk (7 Replies)
LEARN ABOUT DEBIAN
bup-random
bup-random(1) General Commands Manual bup-random(1)
NAME
bup-random - generate a stream of random output
SYNOPSIS
bup random [-S seed] [-fv] <numbytes>
DESCRIPTION
bup random produces a stream of pseudorandom output bytes to stdout. Note: the bytes are not generated using a cryptographic algorithm and
should never be used for security.
Note that the stream of random bytes will be identical every time bup random is run, unless you provide a different seed value. This is
intentional: the purpose of this program is to be able to run repeatable tests on large amounts of data, so we want identical data every
time.
bup random generates about 240 megabytes per second on a modern test system (Intel Core2), which is faster than you could achieve by reading data from most disks. Thus, it can be helpful when running microbenchmarks.
OPTIONS
<numbytes>
the number of bytes of data to generate. Can be used with the suffixes k, M, or G to indicate kilobytes, megabytes, or gigabytes, respectively.
-S, --seed=seed
use the given value to seed the pseudorandom number generator. The generated output stream will be identical for every stream
seeded with the same value. The default seed is 1. A seed value of 0 is equivalent to 1.
-f, --force
generate output even if stdout is a tty. (Generating random data to a tty is generally considered ill-advised, but you can do it if
you really want.)
-v, --verbose
print a progress message showing the number of bytes that have been output so far.
EXAMPLES
$ bup random 1k | sha1sum
2108c55d0a2687c8dacf9192677c58437a55db71 -
$ bup random -S1 1k | sha1sum
2108c55d0a2687c8dacf9192677c58437a55db71 -
$ bup random -S2 1k | sha1sum
f71acb90e135d98dad7efc136e8d2cc30573e71a -
$ time bup random 1G >/dev/null
Random: 1024 Mbytes, done.
real 0m4.261s
user 0m4.048s
sys 0m0.172s
$ bup random 1G | bup split -t --bench
Random: 1024 Mbytes, done.
bup: 1048576.00kbytes in 18.59 secs = 56417.78 kbytes/sec
1092599b9c7b2909652ef1e6edac0796bfbfc573
BUP
Part of the bup(1) suite.
AUTHORS
Avery Pennarun <apenwarr@gmail.com>.
Bup unknown-bup-random(1)