Problem extracting PDFs from huge files. Post: 303045990

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

How to extract data from a huge file?

Hi, I have a huge file of bibliographic records in a standard format. I need a script to do a repeatable task as follows:
1. Create folders for the strings that start with "item_*" in the input file
2. Create a file "contents" in each folder having "license.txt(tab...

2. Shell Programming and Scripting

How to extract a piece of information from a huge file

Hello All, I need some assistance to extract a piece of information from a huge file. The file is like this one:

database information
ccccccccccccccccc
ccccccccccccccccc
ccccccccccccccccc
ccccccccccccccccc
os information
cccccccccccccccccc
cccccccccccccccccc...

3. Shell Programming and Scripting

How to extract a subset from a huge dataset

Hi all, I have a huge file (450G). Its tab-delimited format is as below:

x1	A	50020	1
x1	B	50021	8
x1	C	50022	9
x1	A	50023	10
x2	D	50024	5
x2	C	50025	7
x2	F	50026	8
x2	N	50027	1
:
:

Now, I want to extract a subset from this file. In this subset, column 1 is x10, column 2 is...

4. Shell Programming and Scripting

Compare 2 folders to find several missing files among huge amounts of files.

Hi, all: I've got two folders, say, "folder1" and "folder2". Under each, there are thousands of files. It's quite obvious that there are some files missing in each. I just would like to find them. I believe this can be done by "diff" command. However, if I change the above question a...

5. Shell Programming and Scripting

Problem running Perl Script with huge data files

Hello Everyone, I have a Perl script that reads two types of data files (txt and XML). These data files are huge and large in number. I am using something like this:

foreach my $t (@text) {
    open TEXT, $t or die "Cannot open $t for reading: $!\n";
    while (my $line = <TEXT>) {
        ...

6. Shell Programming and Scripting

Three Difference File Huge Data Comparison Problem.

I have three different files:

Part of File 1
ARTPHDFGAA
.
.

Part of File 2
ARTGHHYESA
.
.

Part of File 3
ARTPOLYWEA
.
.

7. Shell Programming and Scripting

Search pdfs in command line

Hi, I'm trying to search for a particular phrase in a large number of PDFs in a particular directory. What I've done so far only prints out the line, but I haven't been able to display in which file the phrase appears.

find . -name '*.pdf' -exec pdftotext {} - \; | grep "search phrase" ...

8. UNIX for Advanced & Expert Users

Performance problem with removing duplicates in a huge file (50+ GB)

I'm trying to remove duplicate data from an unsorted input file larger than 50 GB and write the unique records to a new file. I have already tried a variety of options posted in similar threads and forums, but with no luck so far. Any suggestions, please? Thanks!

9. Shell Programming and Scripting

Extract few content from a huge list of files

I have a huge list of files (about 300,000) which have a pattern like this:

.I 1
.U 87049087
.S Am J Emerg
.M Allied Health Personnel/*; Electric Countershock/*;
.T Refibrillation managed by EMT-Ds:
.P ARTICLE.
.W Some patients converted from ventricular fibrillation to organized...

10. Shell Programming and Scripting

Bash script monitor directory and subdirectories for new pdfs

I need a bash script that monitors folders for new PDF files and creates an XML file for an RSS feed with the newest files on the list. I have a script, but it reports errors.

#!/bin/bash
SYSDIR="/var/www/html/Intranet"
HTTPLINK="http://TYPE.IP.ADDRESS.HERE/pdfs"
FEEDTITLE="Najnoviji dokumenti na...

Archive::Any(3pm)					User Contributed Perl Documentation					 Archive::Any(3pm)

NAME

       Archive::Any - Single interface to deal with file archives.

SYNOPSIS

	 use Archive::Any;

	 my $archive = Archive::Any->new($archive_file);

	 my @files = $archive->files;

	 $archive->extract;

	 my $type = $archive->type;

	 $archive->is_impolite;
	 $archive->is_naughty;

DESCRIPTION

       This module is a single interface for manipulating different archive formats.  Tarballs, zip files, etc.

       new
	     my $archive = Archive::Any->new($archive_file);
	     my $archive = Archive::Any->new($archive_file, $type);

	   $type is optional.  It lets you force the file type in case Archive::Any can't figure it out.

       extract
	     $archive->extract;
	     $archive->extract($directory);

	   Extracts the files in the archive to the given $directory.  If no $directory is given, it will go into the current working directory.

       files
	     my @file = $archive->files;

	   A list of files in the archive.

       mime_type
	    my $mime_type = $archive->mime_type();

	   Returns the mime type of the archive.

       is_impolite
	     my $is_impolite = $archive->is_impolite;

	   Checks to see if this archive is going to unpack into the current directory rather than create its own.

       is_naughty
	     my $is_naughty = $archive->is_naughty;

	   Checks to see if this archive is going to unpack outside the current directory.
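
The idea behind is_impolite and is_naughty can be approximated by hand from the shell, which may help when deciding whether an unknown tarball is safe to unpack. The sketch below is an illustration only: classify_listing and classify_tar are hypothetical helper names, the rules are one reading of the descriptions above, and nothing here is taken from how Archive::Any implements its checks.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; they are NOT part of Archive::Any.

# classify_listing: read archive member names on stdin, one per line, and print
#   naughty   if any member is absolute or contains a ".." path component
#   impolite  if the members do not all sit under one top-level directory
#   ok        otherwise
classify_listing() {
    awk -F/ '
        {
            sub(/^(\.\/)+/, "")                 # drop any leading "./"
            if ($0 == "") next                  # skip the "./" entry itself
            if (substr($0, 1, 1) == "/") naughty = 1
            for (i = 1; i <= NF; i++)
                if ($i == "..") naughty = 1
            if (NF == 1) loose = 1              # a file directly at the top level
            tops[$1] = 1                        # remember each top-level name
        }
        END {
            n = 0
            for (t in tops) n++
            if (naughty)             print "naughty"
            else if (loose || n > 1) print "impolite"
            else                     print "ok"
        }
    '
}

# classify_tar: apply the same rules to the member list of a tar archive.
classify_tar() {
    tar -tf "$1" | classify_listing
}

# Small demonstration on hand-written member lists.
printf 'proj/\nproj/a.txt\n' | classify_listing   # prints "ok"
printf 'a.txt\nb.txt\n'      | classify_listing   # prints "impolite"
printf '../etc/passwd\n'     | classify_listing   # prints "naughty"
```

The check works on the member list rather than on a real archive because some tar implementations strip a leading "/" or "../" when creating archives, which makes a naughty tarball awkward to build for testing. For a real file, classify_tar some_archive.tar runs the same rules on the output of tar -tf.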

DEPRECATED

       type
	     my $type = $archive->type;

	   Returns the type of archive.  This method is provided for backwards compatibility in the Tar and Zip plugins and will be going away
	   soon in favor of "mime_type".

PLUGINS

       For detailed information on writing plugins to work with Archive::Any, please see the pod documentation for Archive::Any::Plugin.

AUTHOR

       Clint Moore <cmoore@cpan.org>

AUTHOR EMERITUS

       Michael G Schwern

SEE ALSO

       Archive::Any::Plugin

SUPPORT

       You can find documentation for this module with the perldoc command.

	perldoc Archive::Any

       You can also look for information at:

       o   AnnoCPAN: Annotated CPAN documentation

	   <http://annocpan.org/dist/Archive-Any>

       o   CPAN Ratings

	   <http://cpanratings.perl.org/d/Archive-Any>

       o   RT: CPAN's request tracker

	   <http://rt.cpan.org/NoAuth/Bugs.html?Dist=Archive-Any>

       o   Search CPAN

	   <http://search.cpan.org/dist/Archive-Any>

LICENSE

       This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

       See <http://www.perl.com/perl/misc/Artistic.html>

perl v5.10.0							    2008-06-25							 Archive::Any(3pm)
