05-10-2011
What Operating System and version do you have?
What Shell do you prefer?
How big is the keywords file?
How big is the total of the 3k data files?
Are these all normal unix text files with a reasonable record size?
You appear to be attempting 45,000,000 serial file passes (15,000 x 3,000).
Is this a one-off or something which will be run again and again?
Do you have a full-works database engine such as Oracle?
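For reference while those answers come in, here is a minimal sketch of a single-pass alternative. The original question is not shown here, so it rests on assumptions: that the goal is to find which data files contain any of the keywords, that the keywords are fixed strings stored one per line, and that the data files sit under one directory. The function name and paths are hypothetical.

```shell
#!/bin/sh
# Sketch only: read each data file once and test all keywords in that pass,
# instead of running one search per keyword per file.

# files_with_keywords KEYWORDS_FILE DATA_DIR
# Prints the path of every regular file under DATA_DIR that contains at
# least one of the fixed strings listed (one per line) in KEYWORDS_FILE.
files_with_keywords() {
    # -F  treat each keyword as a fixed string, not a regular expression
    # -f  read the keywords from a file
    # -l  print only the names of files that contain a match
    # "-exec ... {} +" batches many file names into each grep invocation.
    find "$2" -type f -exec grep -F -l -f "$1" {} +
}
```

Called as `files_with_keywords keywords.txt ./data`, this reads each of the roughly 3k data files once, rather than once per keyword. Keep the keywords file outside the data directory so it does not match itself.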
EXTRACT(1) General Commands Manual EXTRACT(1)
NAME
extract - determine meta-information about a file
SYNOPSIS
extract [ -bghLnvV ] [ -H hash-algorithm ] [ -i ] [ -l library ] [ -p type ] [ -x type ] file ...
DESCRIPTION
This manual page documents version 0.6.0 of the extract command.
extract tests each file specified in the argument list in an attempt to infer meta-information from it. Each file is subjected to the
meta-data extraction libraries from libextractor.
libextractor classifies meta-information (also referred to as keywords) into types. A list of all types can be obtained with the -L option.
OPTIONS
-b Display the output in BibTeX format.
-g Use grep-friendly output (all keywords on a single line for each file). Use the verbose option to print the filename first,
followed by the keywords. Use the verbose option twice to also display the keyword types. This option will not print keyword types
or non-textual metadata.
-h Print a brief summary of the options.
-i Run plugins in-process (for debugging). By default, each plugin is run in its own process.
-l libraries
Use the specified libraries to extract keywords. The general format of libraries is [[-]LIBRARYNAME[:[-]LIBRARYNAME]*] where
LIBRARYNAME is a libextractor compatible library and typically of the form jpeg. The minus before the libraryname indicates that
this library should be removed from the existing list. To run only a few selected plugins, use -l in combination with -n.
-L Print a list of all known keyword types.
-n Do not use the default set of extractors (typically all standard extractors, currently mp3, ogg, jpg, gif, png, tiff, real, html,
pdf and mime-types); use only the extractors specified with the -l option.
-p type
Print only the keywords matching the specified type. By default, all keywords that are found and not removed as duplicates are
printed.
-v Print the version number and exit.
-V Be verbose. This option can be specified multiple times to increase verbosity further.
-x type
Exclude keywords of the specified type from the output. By default, all keywords that are found and not removed as duplicates are
printed.
SEE ALSO
libextractor(3) - description of the libextractor library
EXAMPLES
$ extract test/test.jpg
comment - (C) 2001 by Christian Grothoff, using gimp 1.2 1
mimetype - image/jpeg
$ extract -V -x comment test/test.jpg
Keywords for file test/test.jpg:
mimetype - image/jpeg
$ extract -p comment test/test.jpg
comment - (C) 2001 by Christian Grothoff, using gimp 1.2 1
$ extract -nV -l png.so -p comment test/test.jpg test/test.png
Keywords for file test/test.jpg:
Keywords for file test/test.png:
comment - Testing keyword extraction
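The "type - value" lines shown above lend themselves to scripting. The following is a rough sketch of a wrapper that prints one mime type per file. The function name `mimetype_of` is hypothetical, and the type name passed to -p is assumed to be spelled `mimetype` as in the sample output; check the list printed by `extract -L` before relying on it.

```shell
#!/bin/sh
# mimetype_of FILE...
# Prints "FILE<TAB>MIMETYPE" for each file, relying on the
# "mimetype - image/jpeg" line format shown in the examples above.
mimetype_of() {
    for f in "$@"; do
        # -p mimetype: assumed type name; verify with `extract -L`.
        # sed strips the "mimetype - " prefix; head keeps the first hit.
        mt=$(extract -p mimetype "$f" | sed -n 's/^mimetype - //p' | head -n 1)
        # Fall back to "unknown" when extract reports no mime type.
        printf '%s\t%s\n' "$f" "${mt:-unknown}"
    done
}
```

Usage would be along the lines of `mimetype_of test/test.jpg test/test.png`.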
LEGAL NOTICE
libextractor and the extract tool are released under the GPL. libextractor is a GNU package.
BUGS
A couple of file-formats (on the order of 10^3) are not recognized...
AUTHORS
extract was originally written by Christian Grothoff <christian@grothoff.org> and Vidyut Samanta <vids@cs.ucla.edu>. Use
<libextractor@gnu.org> to contact the current maintainer(s).
AVAILABILITY
You can obtain the original author's latest version from http://www.gnu.org/software/libextractor/
libextractor 0.6.0 Dec 20, 2009 EXTRACT(1)