We have a problem.
We have some binary files of about 25 GB each. Each of these files contains many (millions of) embedded PDF files.
How can we extract them from such huge files? With small files I managed it with the command:
Each PDF file begins with %PDF-1.? and ends with %%EOF,
but this doesn't work on such big files, so we need another way to extract them.
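One possible approach, sketched here in Python rather than as a shell one-liner, is to memory-map the big file and scan for the header and trailer markers, so the 25 GB never has to be loaded into memory at once. The function name `extract_pdfs`, the output naming scheme, and the rule "a PDF runs from one header to the last %%EOF before the next header" are assumptions made for illustration, not something given in the post. It also assumes the PDFs are stored uncompressed and contiguously, and that the input file is not empty.

```python
import mmap
import os


def extract_pdfs(path, out_dir, header=b"%PDF-1.", trailer=b"%%EOF"):
    """Carve embedded PDFs out of a large binary file.

    Returns the number of PDF files written to out_dir.
    """
    os.makedirs(out_dir, exist_ok=True)
    count = 0
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as data:
        start = data.find(header)
        while start != -1:
            # The next header (or the end of the file) bounds this PDF.
            next_start = data.find(header, start + len(header))
            limit = next_start if next_start != -1 else len(data)
            # Take the LAST %%EOF before that bound, because a PDF that
            # was updated incrementally contains several %%EOF markers.
            end = data.rfind(trailer, start, limit)
            if end != -1:
                count += 1
                out_path = os.path.join(out_dir, f"{count:08d}.pdf")
                with open(out_path, "wb") as out:
                    out.write(data[start:end + len(trailer)])
            start = next_start
    return count
```

A call such as `extract_pdfs("big.bin", "pdfs")` (both names hypothetical) would write `pdfs/00000001.pdf`, `pdfs/00000002.pdf`, and so on. With millions of results it is probably worth spreading the output over subdirectories instead of one flat folder.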
Hi,
I have a huge file of bibliographic records in a standard format. I need a script to do a repeatable task as follows:
1. Create a folder for each string in the input file that starts with "item_*"
2. Create a file "contents" in each folders having "license.txt(tab... (5 Replies)
Hello All,
I need some assistance to extract a piece of information from a huge file.
The file is like this one :
database information
ccccccccccccccccc
ccccccccccccccccc
ccccccccccccccccc
ccccccccccccccccc
os information
cccccccccccccccccc
cccccccccccccccccc... (2 Replies)
Hi, All
I have a huge file, 450 GB in size. Its tab-delimited format is as below
x1 A 50020 1
x1 B 50021 8
x1 C 50022 9
x1 A 50023 10
x2 D 50024 5
x2 C 50025 7
x2 F 50026 8
x2 N 50027 1
:
:
Now, I want to extract a subset from this file. In this subset, column 1 is x10, column 2 is... (3 Replies)
Hi, all:
I've got two folders, say, "folder1" and "folder2".
Under each, there are thousands of files.
It's quite obvious that there are some files missing in each. I just would like to find them. I believe this can be done by "diff" command.
However, if I change the above question a... (1 Reply)
Hello Everyone,
I have a perl script that reads two types of data files (txt and XML). These data files are huge and large in number. I am using something like this :
foreach my $t (@text)
{
open TEXT, $t or die "Cannot open $t for reading: $!\n";
while(my $line=<TEXT>){
... (4 Replies)
Hi,
I'm trying to search for a particular phrase in a large number of PDFs in a particular directory.
What I've done so far only prints out the line, but I haven't been able to display in which file the phrase appears.
find . -name '*.pdf' -exec pdftotext {} - \; | grep "search phrase"
... (2 Replies)
I'm trying to remove duplicate records from an unsorted input file that is larger than 50 GB and write the unique records to a new file.
I have already tried a variety of options posted in similar threads and forums, but no luck so far.
Any suggestions please ?
Thanks !! (9 Replies)
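A minimal sketch of one way to approach this, assuming one record per line and assuming that a 16-byte digest per unique record fits in memory (neither is stated in the post). The function name `dedupe_lines` is made up for illustration. It streams the input once, keeps the first occurrence of each record in its original order, and remembers only a hash of every record instead of the record itself.

```python
import hashlib


def dedupe_lines(src_path, dst_path):
    """Copy src to dst, dropping repeated lines; return the number kept."""
    seen = set()
    kept = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for line in src:
            # Hash the record without its line ending, so a final line
            # that lacks a newline still matches its earlier copies.
            record = line.rstrip(b"\r\n")
            digest = hashlib.blake2b(record, digest_size=16).digest()
            if digest not in seen:
                seen.add(digest)
                dst.write(line)
                kept += 1
    return kept
```

If the set of digests itself is too large for memory, an external sort (for example `sort -u` with enough temporary disk space) is the usual fallback, at the cost of losing the original record order.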
I have a huge list of files (about 300,000) which have a pattern like this.
.I 1
.U
87049087
.S
Am J Emerg
.M
Allied Health Personnel/*; Electric Countershock/*;
.T
Refibrillation managed by EMT-Ds:
.P
ARTICLE.
.W
Some patients converted from ventricular fibrillation to organized... (1 Reply)
I need a bash script that monitors folders for new PDF files and creates an XML file for an RSS feed with the newest files on the list. I have a script, but it reports errors.
#!/bin/bash
SYSDIR="/var/www/html/Intranet"
HTTPLINK="http://TYPE.IP.ADDRESS.HERE/pdfs"
FEEDTITLE="Najnoviji dokumenti na... (20 Replies)
Discussion started by: markus1981
unrar-free
UNRAR-FREE(1) General Commands Manual UNRAR-FREE(1)
NAME
unrar-free -- extract files from rar archives
SYNOPSIS
unrar-free [-xtfp?V] [--extract] [--list] [--force] [--extract-newer] [--extract-no-paths] [--password] [--help] [--usage] [--version]
ARCHIVE [FILE ...] [DESTINATION]
DESCRIPTION
unrar-free is a program for extracting files from rar archives.
OPTIONS
This program follows the usual GNU command line syntax, with long options starting with two dashes (`--'). A summary of options is
included below.
-x, --extract
Extract files from archive (default).
-t, --list
List files in archive.
-f, --force
Overwrite files when extracting.
--extract-newer
Only extract newer files from the archive.
--extract-no-paths
Don't create directories while extracting.
-p, --password
Decrypt archive using a password.
-?, --help
Show program help.
--usage
Show short program usage message.
-V, --version
Show version of program.
NON-FREE UNRAR COMPATIBLE SYNOPSIS
unrar-free [elvx] [-ep] [-o+] [-o-] [-ppassword] [-u] [--] ARCHIVE [FILE ...] [DESTINATION]
This syntax should only be used in front-end programs which use non-free unrar as a back-end. It is recommended to use this program
with the GNU command line syntax.
e Extract files from archive without full path.
l List files in archive.
v Verbose list files in archive.
x Extract files from archive with full path.
-ep Don't create directories while extracting.
-o+ Overwrite files when extracting.
-o- Don't overwrite files when extracting.
-p Decrypt archive using a password.
-u Only extract newer files from the archive.
-- Disable further switch processing. Any switches after the -- are treated as filenames and destination.
BUGS
Advanced features of version 3.0 archives are not currently supported.
AUTHORS
unrar-free was written by Ben Asselstine based on UniquE RAR File Library by Christian Scheurer and Johannes Winkelmann.
This manual page was written by Niklas Vainio <niklas.vainio@iki.fi> for the Debian system (but may be used by others). Permission is
granted to copy, distribute and/or modify this document under the terms of the GNU General Public License, Version 2 or any later version
published by the Free Software Foundation.
On Debian systems, the complete text of the GNU General Public License can be found in /usr/share/common-licenses/GPL.