UNIX for Beginners Questions & Answers: Problem with extract PDFs from huge files.
Post 303045996 by Neo on Tuesday 21st of April 2020 07:21:42 AM
Quote:
We have some binary files
What kind of binary file, exactly? What is the file extension?
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

How to extract data from a huge file?

Hi, I have a huge file of bibliographic records in some standard format. I need a script to do some repeatable task as follows: 1. Needs to create folders as the strings starts with "item_*" from the input file 2. Create a file "contents" in each folders having "license.txt(tab... (5 Replies)
Discussion started by: srsahu75
5 Replies

2. Shell Programming and Scripting

How to extract a piece of information from a huge file

Hello All, I need some assistance to extract a piece of information from a huge file. The file is like this one : database information ccccccccccccccccc ccccccccccccccccc ccccccccccccccccc ccccccccccccccccc os information cccccccccccccccccc cccccccccccccccccc... (2 Replies)
Discussion started by: Marcor
2 Replies

3. Shell Programming and Scripting

How to extract a subset from a huge dataset

Hi, All I have a huge file which has 450G. Its tab-delimited format is as below x1 A 50020 1 x1 B 50021 8 x1 C 50022 9 x1 A 50023 10 x2 D 50024 5 x2 C 50025 7 x2 F 50026 8 x2 N 50027 1 : : Now, I want to extract a subset from this file. In this subset, column 1 is x10, column 2 is... (3 Replies)
Discussion started by: cliffyiu
3 Replies

4. Shell Programming and Scripting

Compare 2 folders to find several missing files among huge amounts of files.

Hi, all: I've got two folders, say, "folder1" and "folder2". Under each, there are thousands of files. It's quite obvious that there are some files missing in each. I just would like to find them. I believe this can be done by "diff" command. However, if I change the above question a... (1 Reply)
Discussion started by: jiapei100
1 Reply

5. Shell Programming and Scripting

Problem running Perl Script with huge data files

Hello Everyone, I have a perl script that reads two types of data files (txt and XML). These data files are huge and large in number. I am using something like this : foreach my $t (@text) { open TEXT, $t or die "Cannot open $t for reading: $!\n"; while(my $line=<TEXT>){ ... (4 Replies)
Discussion started by: ad23
4 Replies

6. Shell Programming and Scripting

Three Difference File Huge Data Comparison Problem.

I got three different file: Part of File 1 ARTPHDFGAA . . Part of File 2 ARTGHHYESA . . Part of File 3 ARTPOLYWEA . . (4 Replies)
Discussion started by: patrick87
4 Replies

7. Shell Programming and Scripting

Search pdfs in command line

Hi, I'm trying to search for a particular phrase in a large number of PDFs in a particular directory. What I've done so far only prints out the line, but I haven't been able to display in which file the phrase appears. find . -name '*.pdf' -exec pdftotext {} - \; | grep "search phrase" ... (2 Replies)
Discussion started by: lost.identity
2 Replies

8. UNIX for Advanced & Expert Users

Performance problem with removing duplicates in a huge file (50+ GB)

I'm trying to remove duplicate data from an input file with unsorted data which is of size >50GB and write the unique records to a new file. I'm trying and already tried out a variety of options posted in similar threads/forums. But no luck so far.. Any suggestions please ? Thanks !! (9 Replies)
Discussion started by: Kannan K
9 Replies

9. Shell Programming and Scripting

Extract few content from a huge list of files

I have a huge list of files (about 300,000) which have a pattern like this. .I 1 .U 87049087 .S Am J Emerg .M Allied Health Personnel/*; Electric Countershock/*; .T Refibrillation managed by EMT-Ds: .P ARTICLE. .W Some patients converted from ventricular fibrillation to organized... (1 Reply)
Discussion started by: shoaibjameel123
1 Reply

10. Shell Programming and Scripting

Bash script monitor directory and subdirectories for new pdfs

I need bash script that monitor folders for new pdf files and create xml file for rss feed with newest files on the list. I have some script, but it reports errors. #!/bin/bash SYSDIR="/var/www/html/Intranet" HTTPLINK="http://TYPE.IP.ADDRESS.HERE/pdfs" FEEDTITLE="Najnoviji dokumenti na... (20 Replies)
Discussion started by: markus1981
20 Replies
tracker-extract(1)						   User Commands						tracker-extract(1)

NAME
       tracker-extract - Extract metadata from a file.

SYNOPSIS
       tracker-extract [OPTION...] FILE...

DESCRIPTION
       tracker-extract reads the file and mimetype provided in stdin and extracts the metadata from this file; then it displays the metadata on the standard output.

       NOTE: If a FILE is not provided then tracker-extract will run for 30 seconds waiting for DBus calls before quitting.

OPTIONS
       -?, --help
              Show summary of options.

       -v, --verbosity=N
              Set verbosity to N. This overrides the config value. Values include 0=errors, 1=minimal, 2=detailed and 3=debug.

       -f, --file=FILE
              The FILE to extract metadata from. The FILE argument can be either a local path or a URI. It also does not have to be an absolute path.

       -m, --mime=MIME
              The MIME type to use for the file. If one is not provided, it will be guessed automatically.

       -d, --disable-shutdown
              Disable shutting down after 30 seconds of inactivity.

       -i, --force-internal-extractors
              Use this option to force internal extractors over 3rd parties like libstreamanalyzer.

       -m, --force-module=MODULE
              Force a particular module to be used. This is here as a convenience for developers wanting to test their MODULE file. Only the MODULE name has to be specified, not the full path. Typically, a MODULE is installed to /usr/lib/tracker-0.7/extract-modules/. This option can be used with or without the .so part of the name too, for example, you can use --force-module=foo

              Modules are shared objects which are dynamically loaded at run time. These files must have the .so suffix to be loaded and must contain the correct symbols to be authenticated by tracker-extract. For more information see the libtracker-extract reference documentation.

       -V, --version
              Show binary version.

EXAMPLES
       Using command line to extract metadata from a file:

              $ tracker-extract -v 3 -f /path/to/some/file.mp3

       Using a specific module to extract metadata from a file:

              $ tracker-extract -v 3 -f /path/to/some/file.mp3 -m mymodule

ENVIRONMENT
       TRACKER_EXTRACTORS_DIR
              This is the directory which tracker uses to load the shared libraries from (used for extracting metadata for specific file types). These are needed on each invocation of tracker-store. If unset it will default to the correct place. This is used mainly for testing purposes. The default location is /usr/lib/tracker-0.10/extract-modules/.

       TRACKER_EXTRACTOR_RULES_DIR
              This is the directory which tracker uses to load the rules files from. The rules files describe extractor modules and their supported MIME types. The default location is /usr/share/tracker/extract-rules/.

       TRACKER_USE_CONFIG_FILES
              Don't use GSettings, instead use a config file similar to how settings were saved in 0.10.x. That is, a file which is much like an .ini file. These are saved to $HOME/.config/tracker/

SEE ALSO
       tracker-store(1), tracker-sparql(1), tracker-stats(1), tracker-info(1).

       /usr/lib/tracker-0.10/extract-modules/

       /usr/share/tracker/extract-rules/

GNU                                                                July 2007                                                  tracker-extract(1)
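
Since the thread is about PDFs, the -f and -v options described above could be applied to a whole directory tree roughly as follows. This is only a minimal sketch, not something taken from the man page: the function name extract_pdf_metadata and its directory argument are made up for illustration, and it assumes tracker-extract is installed and on PATH.

```shell
#!/bin/sh
# Hypothetical helper (not part of tracker): run tracker-extract on every
# PDF found below a directory and label each block of output with its path.
extract_pdf_metadata() {
    dir=${1:-.}   # directory to scan; defaults to the current directory
    # -v 0 keeps logging to errors only, -f names the file to inspect
    # (both options are taken from the man page above).
    find "$dir" -type f -name '*.pdf' -exec sh -c '
        for f in "$@"; do
            printf "== %s ==\n" "$f"
            tracker-extract -v 0 -f "$f"
        done
    ' sh {} +
}
```

Calling extract_pdf_metadata /path/to/pdfs > metadata.txt would then collect everything in one file. Passing the file names through find's -exec ... {} + form, instead of parsing find's output, keeps paths with spaces intact.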