Problem with extracting PDFs from huge files. Post: 303045996

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

How to extract data from a huge file?

Hi, I have a huge file of bibliographic records in some standard format. I need a script to do a repeatable task as follows: 1. Create a folder for each string in the input file that starts with "item_*" 2. Create a file "contents" in each folder having "license.txt(tab...

2. Shell Programming and Scripting

How to extract a piece of information from a huge file

Hello All, I need some assistance to extract a piece of information from a huge file. The file is like this one : database information ccccccccccccccccc ccccccccccccccccc ccccccccccccccccc ccccccccccccccccc os information cccccccccccccccccc cccccccccccccccccc...

3. Shell Programming and Scripting

How to extract a subset from a huge dataset

Hi all, I have a huge file of 450G. Its tab-delimited format is as below:

x1    A    50020    1
x1    B    50021    8
x1    C    50022    9
x1    A    50023    10
x2    D    50024    5
x2    C    50025    7
x2    F    50026    8
x2    N    50027    1
:     :

Now, I want to extract a subset from this file. In this subset, column 1 is x10, column 2 is...

4. Shell Programming and Scripting

Compare 2 folders to find several missing files among huge amounts of files.

Hi, all: I've got two folders, say, "folder1" and "folder2". Under each, there are thousands of files. It's quite obvious that there are some files missing in each. I just would like to find them. I believe this can be done by "diff" command. However, if I change the above question a...

5. Shell Programming and Scripting

Problem running Perl Script with huge data files

Hello Everyone, I have a perl script that reads two types of data files (txt and XML). These data files are huge and large in number. I am using something like this : foreach my $t (@text) { open TEXT, $t or die "Cannot open $t for reading: $!\n"; while(my $line=<TEXT>){ ...

6. Shell Programming and Scripting

Three Difference File Huge Data Comparison Problem.

I have three different files: Part of File 1 ARTPHDFGAA . . Part of File 2 ARTGHHYESA . . Part of File 3 ARTPOLYWEA . .

7. Shell Programming and Scripting

Search pdfs in command line

Hi, I'm trying to search for a particular phrase in a large number of PDFs in a particular directory. What I've done so far only prints out the line, but I haven't been able to display in which file the phrase appears. find . -name '*.pdf' -exec pdftotext {} - \; | grep "search phrase" ...

8. UNIX for Advanced & Expert Users

Performance problem with removing duplicates in a huge file (50+ GB)

I'm trying to remove duplicate data from an input file with unsorted data which is of size >50GB and write the unique records to a new file. I'm trying and already tried out a variety of options posted in similar threads/forums. But no luck so far.. Any suggestions please ? Thanks !!

9. Shell Programming and Scripting

Extract few content from a huge list of files

I have a huge list of files (about 300,000) which have a pattern like this. .I 1 .U 87049087 .S Am J Emerg .M Allied Health Personnel/*; Electric Countershock/*; .T Refibrillation managed by EMT-Ds: .P ARTICLE. .W Some patients converted from ventricular fibrillation to organized...

10. Shell Programming and Scripting

Bash script monitor directory and subdirectories for new pdfs

I need bash script that monitor folders for new pdf files and create xml file for rss feed with newest files on the list. I have some script, but it reports errors. #!/bin/bash SYSDIR="/var/www/html/Intranet" HTTPLINK="http://TYPE.IP.ADDRESS.HERE/pdfs" FEEDTITLE="Najnoviji dokumenti na...

LEARN ABOUT SUSE

cabextract

CABEXTRACT(1)						      General Commands Manual						     CABEXTRACT(1)

NAME

       cabextract - program to extract files from Microsoft cabinet (.cab) archives

SYNOPSIS

       cabextract [-d dir] [-f] [-F pattern] [-h] [-l] [-L] [-p] [-q] [-s] [-t] [-v]  cabinet files ...

DESCRIPTION

       cabextract  is  a  program that un-archives files in the Microsoft cabinet file format (.cab) or any binary file which contains an embedded
       cabinet file (frequently found in .exe files).

       cabextract will extract all files from all cabinet files specified on the command line.

       To extract a multi-part cabinet consisting of several files, only the first cabinet file needs to be given as an argument to cabextract, as
       it will automatically look for the remaining files. To prevent cabextract from extracting cabinet files you did not specify, use the -s
       option.
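
The difference between the default behaviour and -s can be sketched as a small shell helper. This is only an illustration: the function name extract_set, its arguments and the "only" keyword are hypothetical, and the only cabextract options used are -d and -s as described on this page.

```shell
# Hypothetical helper: extract a cabinet set into a directory.
# Default: cabextract follows the chain from the first cabinet to the rest.
# With "only" as the third argument, -s restricts extraction to the
# cabinet file(s) actually named on the command line.
extract_set() {
    first_cab=$1          # first file of the (possibly multi-part) cabinet
    outdir=$2             # destination directory
    mode=${3:-all}        # "all" (default) or "only"

    mkdir -p "$outdir" || return 1
    if [ "$mode" = "only" ]; then
        cabextract -s -d "$outdir" "$first_cab"
    else
        cabextract -d "$outdir" "$first_cab"
    fi
}
```

For example, `extract_set disk1.cab ./out` would let cabextract look for the remaining parts, while `extract_set disk1.cab ./out only` would stop at disk1.cab (the file names here are placeholders).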

OPTIONS

       A summary of options is included below.

       -d dir Extracts all files into the directory dir.

       -f     When testing or extracting cabinet files, corrupted MSZIP blocks will be ignored. A warning will be printed  if  a  corrupted  MSZIP
	      block is encountered.

       -F pattern
	      Only  files with names that match the shell pattern pattern shall be listed, tested or extracted. On non-GNU systems, this match may
	      be case-sensitive.

       -h     Prints a page of help and exits.

       -l     Lists the contents of the given cabinet files, rather than extracting them.

       -L     When extracting cabinet files, makes each extracted file's name lowercase.

       -p     Files shall be extracted to standard output.

       -q     When extracting cabinet files, suppresses all messages except errors and warnings.

       -s     When testing, listing or extracting cabinets which span multiple files, only cabinet files given on the command line shall be used.

       -t     Tests the integrity of the cabinet. Files are decompressed, but not written to disk or standard output. If the file successfully
	      decompresses, the MD5 checksum of the file is printed.

       -v     If given alone on the command line, prints the version of cabextract and exits. Given with a list of cabinet files, it will list the
	      contents of the cabinet files.
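
As a rough sketch of how these options combine, the helper below first lists the members matching a shell pattern and then extracts only those into a target directory. The function name extract_matching and its arguments are assumptions made for illustration; the options -l, -q, -d and -F are the ones documented above.

```shell
# Hypothetical helper: list, then extract, only the cabinet members whose
# names match a shell pattern.
extract_matching() {
    cab=$1        # path to the .cab (or a binary with an embedded cabinet)
    pattern=$2    # shell pattern, quoted by the caller, e.g. '*.pdf'
    outdir=$3     # destination directory

    mkdir -p "$outdir" || return 1
    # -l lists instead of extracting, so this acts as a dry run.
    cabextract -l -F "$pattern" "$cab" || return 1
    # -q suppresses all messages except errors and warnings.
    cabextract -q -d "$outdir" -F "$pattern" "$cab"
}
```

A call such as `extract_matching archive.cab '*.pdf' ./pdfs` (placeholder names) would print the matching entries and then write them under ./pdfs. Quoting the pattern matters, since otherwise the shell may expand it before cabextract sees it.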

AUTHOR

       This manual page was written by Stuart Caie <kyzer@4u.net>, based on the one written by Eric Sharkey <sharkey@debian.org>, for  the  Debian
       GNU/Linux system.

								 October 30, 2005						     CABEXTRACT(1)
