The best tool for this is a database (MySQL, Oracle, etc.). Create an indexed table from your "big file" and update it once a month. You gain scalability: you can write one small db app and run it as many separate parallel processes, or threads.
Otherwise you would need an in-memory hash of 200 million records to do real-time lookups. Not that this is impossible; it just seems like an unstable, error-prone approach to me.
Plus it may not scale well as load increases.
So, with no database you need major hash support in your app, and tons of free memory,
probably way more than 4GB.
Perl, Ruby, or C will work either with or without a db. Shell/awk will not work well at all.
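To make the indexed-table idea concrete, here is a rough sketch in Python. It is only an illustration under several assumptions: it uses SQLite (bundled with Python) as a stand-in for MySQL or Oracle, it assumes the "big file" holds one tab-separated key/value pair per line, and the names `build_index`, `lookup`, and the `records` table are made up for this example.

```python
import sqlite3


def build_index(db_path, big_file):
    """Load tab-separated key/value lines into an indexed SQLite table.

    Assumption: each line of big_file looks like "key<TAB>value".
    """
    conn = sqlite3.connect(db_path)
    # PRIMARY KEY gives the table a unique index on key,
    # so a lookup does not have to scan every row.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records (key TEXT PRIMARY KEY, value TEXT)"
    )
    with open(big_file, encoding="utf-8") as fh:
        # Stream the file line by line instead of holding it all in memory.
        rows = (line.rstrip("\n").split("\t", 1) for line in fh if "\t" in line)
        with conn:  # one transaction for the whole load
            # OR REPLACE lets the monthly reload overwrite existing keys.
            conn.executemany("INSERT OR REPLACE INTO records VALUES (?, ?)", rows)
    return conn


def lookup(conn, key):
    """Return the value stored for key, or None if the key is absent."""
    row = conn.execute(
        "SELECT value FROM records WHERE key = ?", (key,)
    ).fetchone()
    return row[0] if row else None
```

You would call `build_index("lookup.db", "bigfile.txt")` once a month (both file names are placeholders), and each worker process can then open the same database file and call `lookup` on its own connection. With a real server such as MySQL the table and query look much the same; only the connection code differs.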
Discussion started by: AlokKumbhare
12 Replies
pegasus-config
PEGASUS-CONFIG(1)                                                PEGASUS-CONFIG(1)
NAME
pegasus-config - The authority for where parts of the Pegasus system exist on the filesystem. pegasus-config can be used to find libraries
such as the DAX generators.
SYNOPSIS
pegasus-config [-h] [--help] [-V] [--version] [--noeoln]
[--perl-dump] [--perl-hash] [--python-dump] [--sh-dump]
[--bin] [--conf] [--java] [--perl] [--python]
[--python-externals] [--schema] [--classpath]
[--local-site] [--full-local]
DESCRIPTION
pegasus-config is used to find locations of Pegasus system components. The tool is used internally in Pegasus and by users who need to find
paths for DAX generator libraries and schemas.
OPTIONS
-h, --help
Prints help and exits.
-V, --version
Prints Pegasus version information.
--perl-dump
Dumps all settings in perl format as separate variables.
--perl-hash
Dumps all settings in perl format as single perl hash.
--python-dump
Dumps all settings in python format.
--sh-dump
Dumps all settings in shell format.
--bin
Print the directory containing Pegasus binaries.
--conf
Print the directory containing configuration files.
--java
Print the directory containing the jars.
--perl
Print the directory to include into your PERL5LIB.
--python
Print the directory to include into your PYTHONPATH.
--python-externals
Print the directory to the external Python libraries.
--schema
Print the directory containing schemas.
--classpath
Builds a classpath containing the Pegasus jars.
--noeoln
Do not produce an end-of-line after output. This is useful when being called from non-shell backticks in scripts. However, order is
important for this option: if you intend to use it, specify it first.
--local-site [d]
Create a site catalog entry for site "local". This is only an XML snippet, without a root element or XML headers. The optional argument
"d" points to the mount point to use. If not specified, defaults to the user's $HOME directory.
--full-local [d]
Create a complete site catalog with only site "local". The optional argument
"d" points to the mount point to use. If not specified, defaults to the user's $HOME directory.
EXAMPLE
To set the PYTHONPATH variable in your shell for using the Python DAX API:
export PYTHONPATH=`pegasus-config --python`
To set the same path inside Python:
import subprocess
config = subprocess.Popen("pegasus-config --python-dump", stdout=subprocess.PIPE, shell=True).communicate()[0]
exec(config)
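As an alternative to executing the dumped settings, the directory printed by --python can be added to the module search path directly. The following is a minimal sketch, not part of Pegasus: the helper name add_tool_path is hypothetical, and it assumes the command prints a single directory followed by an end-of-line.

```python
import subprocess
import sys


def add_tool_path(cmd=("pegasus-config", "--python")):
    """Run cmd, treat its output as a directory, and prepend it to sys.path.

    Hypothetical helper: assumes cmd prints exactly one directory.
    """
    out = subprocess.check_output(list(cmd), universal_newlines=True)
    # Drop the end-of-line that is printed unless --noeoln is given.
    path = out.strip()
    if path and path not in sys.path:
        sys.path.insert(0, path)
    return path
```

Calling add_tool_path() with no arguments runs pegasus-config --python; check_output raises CalledProcessError if the command exits with a non-zero status, so a failing call is not silently ignored.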
To set the PERL5LIB variable in your shell for using the Perl DAX API:
export PERL5LIB=`pegasus-config --perl`
To set the same path inside Perl:
eval `pegasus-config --perl-dump`;
die("Unable to eval pegasus-config output: $@") if $@;
This sets a number of lexically scoped my variables with the prefix "pegasus_" and expands Perl's search path for this script.
Alternatively, you can fail early and collect all Pegasus-related variables into a single global %pegasus variable for convenience:
BEGIN {
eval `pegasus-config --perl-hash`;
die("Unable to eval pegasus-config output: $@") if $@;
}
AUTHOR
Pegasus Team http://pegasus.isi.edu
05/24/2012 PEGASUS-CONFIG(1)