You may wish to see if your system has these (and others like them), or Google for them:
I have used pdftk for a few things, so I know that it works for some tasks.
If that doesn't work but you can still get the text out, you may be able to use other tools to convert it to HTML. You didn't give us much to go on, so you will need to decide how much the result is worth your investment of time.
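As a rough sketch of that text-to-HTML step, assuming the text has already been pulled out with something like pdftotext -layout, a small awk filter can turn column-aligned text into HTML table rows. The function name text_to_html_table and the rule that columns are separated by two or more spaces are assumptions made for illustration; real PDFs may not lay out that neatly.

```shell
#!/bin/sh
# Hypothetical helper: convert column-aligned plain text to an HTML table.
# Assumption: columns are separated by runs of two or more spaces.
text_to_html_table() {
    awk -F '  +' '
        BEGIN { print "<table>" }
        {
            # Escape HTML special characters, then trim the line.
            gsub(/&/, "\\&amp;"); gsub(/</, "\\&lt;"); gsub(/>/, "\\&gt;")
            sub(/^ +/, ""); sub(/ +$/, "")
        }
        NF == 0 { next }    # skip blank lines
        {
            row = ""
            for (i = 1; i <= NF; i++) row = row "<td>" $i "</td>"
            print "<tr>" row "</tr>"
        }
        END { print "</table>" }
    ' "$@"
}
```

It might be used as: pdftotext -layout report.pdf - | text_to_html_table > report.html, where report.pdf is a placeholder name. Tables whose cells contain wide gaps, or whose columns are separated by a single space, would need a different splitting rule.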
Hi,
I need to extract the table name from an Oracle control file; it appears as the last word on the third line.
Ex:
LOAD DATA
INFILE '/home/user/files/scott.dat'
INTO TABLE SCOTT.EMP_SAL
FIELDS TERMINATED BY..........
What I want to do is write the table name SCOTT.EMP_SAL to a... (2 Replies)
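One possible approach to the control-file question, assuming the table name really is always the last whitespace-separated word on line 3 as the poster describes, is a one-line awk filter. The function name ctl_table_name is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the last word of the third line.
# Reads the named file, or standard input when no file is given.
ctl_table_name() {
    awk 'NR == 3 { print $NF; exit }' "$@"
}

# Demo on the first three lines of the example from the post.
ctl_table_name <<'EOF'
LOAD DATA
INFILE '/home/user/files/scott.dat'
INTO TABLE SCOTT.EMP_SAL
EOF
```

Redirecting the output (for example ctl_table_name scott.ctl > table_name.txt, with both file names being placeholders) would write the name to a file.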
Suppose there is a table like the following. Is there a command that returns the record or name of each person who joined before 2005?
Sl Name des y.o.joining
1 Ram Engineer 2001
2 Hari Doctor 2004
3 David Plumber 2005
4 Rahim painter 2007
5 gurmeet... (1 Reply)
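A minimal sketch for this kind of filter, assuming the year of joining is the last field, the name is a single word in the second field, and the first line is a header. The function name joined_before is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print names of people who joined before a cutoff year.
# $1 = cutoff year; remaining arguments are files (standard input if none).
joined_before() {
    year=$1
    shift
    awk -v cutoff="$year" '
        # Skip the header and any row whose last field is not a number.
        NR > 1 && $NF ~ /^[0-9]+$/ && $NF + 0 < cutoff + 0 { print $2 }
    ' "$@"
}

# Demo on the complete rows from the post.
joined_before 2005 <<'EOF'
Sl Name des y.o.joining
1 Ram Engineer 2001
2 Hari Doctor 2004
3 David Plumber 2005
4 Rahim painter 2007
EOF
```

Printing $0 instead of $2 would return the whole record rather than just the name.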
I have an Employee table in SQL with EID, ENAME and ESTATUS as columns.
I want to extract the status of an employee and update the details if the status is 'A'.
Can anyone help me write the shell script? (1 Reply)
How can I extract the table name from different DDL statements, such as
ALTER TABLE
CREATE TABLE etc.
Basically, I have to parse any of the DDL statements and verify whether that statement has been implemented by the DBA or not.
How can I do this efficiently in Korn shell scripting? (2 Replies)
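For the parsing half of that question, a hedged sketch: the awk filter below prints the word that directly follows CREATE TABLE or ALTER TABLE, case-insensitively. The function name ddl_table_names is hypothetical, and the sketch assumes the name immediately follows the keyword pair on the same line; it does not handle quoted names, comments, or clauses such as IF NOT EXISTS, and it does not check anything against the database.

```shell
#!/bin/sh
# Hypothetical helper: list table names named in CREATE/ALTER TABLE statements.
# Works in any POSIX shell, including ksh.
ddl_table_names() {
    awk '
    {
        # Split on blanks, tabs, "(" and ";" so "emp(id" yields "emp".
        n = split($0, w, /[ \t(;]+/)
        for (i = 1; i + 2 <= n; i++) {
            kw = toupper(w[i])
            if ((kw == "CREATE" || kw == "ALTER") && toupper(w[i + 1]) == "TABLE")
                print w[i + 2]
        }
    }' "$@"
}
```

The resulting names could then be compared with whatever list of existing tables the DBA's data dictionary provides.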
I need to compare the tables of two MySQL databases. There are around 50 tables in each DB.
My idea is:
in DB1
extract the result of select * from table1; to a log file, file1
in DB2
extract the result of select * from table1; to a log file, file2
now compare log file1 and file2
Please help me out ...
Thanks in advance (5 Replies)
I want to extract a table from an HTML file. the table starts with
<table class="tableinfo"
and ends with next closing table tag
</table>
how can I do this with awk/sed...
---------- Post updated at 04:34 PM ---------- Previous update was at 04:28 PM ----------
also I want to... (4 Replies)
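A minimal sed sketch for the table extraction, assuming the opening tag and the closing tag sit on different lines and the table contains no nested table. The function name extract_tableinfo is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print everything from the line containing
# <table class="tableinfo" through the next line containing </table>.
extract_tableinfo() {
    sed -n '/<table class="tableinfo"/,/<\/table>/p' "$@"
}

# Demo with a small inline document.
extract_tableinfo <<'EOF'
<p>before</p>
<table class="tableinfo" border="1">
<tr><td>cell</td></tr>
</table>
<p>after</p>
EOF
```

Because sed works line by line, any other markup on the same lines as the two tags is printed too; for HTML that is not laid out this way, a real HTML parser is the safer choice.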
Here we have a script that extracts all PDF links from a single page. Any ideas on how to make it read a list of pages instead of one page, and extract all the PDF links?
#!/bin/bash
# NAME: pdflinkextractor
# AUTHOR: Glutanimate (http://askubuntu.com/users/81372/), 2013
#... (1 Reply)
I need to sort out table in a file to a format below:
Input:
this is a test
example
Cat Bee Dat
1 2 3
more Example
date
data
Bet Cla Blaa Dat
A 6 T
data..
Output:
this is a test (10 Replies)
Hi,
I need to extract only the create table structure with columns alone.
for eg
hive_table
show create table hive_table:
create table hive_table(id number,age number)
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION 'hdfs:/path/'
I need only below
... (5 Replies)
Discussion started by: rohit_shinez
pdftotext
pdftotext(1) General Commands Manual pdftotext(1)
NAME
pdftotext - Portable Document Format (PDF) to text converter (version 3.03)
SYNOPSIS
pdftotext [options] [PDF-file [text-file]]
DESCRIPTION
Pdftotext converts Portable Document Format (PDF) files to plain text.
Pdftotext reads the PDF file, PDF-file, and writes a text file, text-file. If text-file is not specified, pdftotext converts file.pdf to
file.txt. If text-file is '-', the text is sent to stdout.
OPTIONS
-f number
Specifies the first page to convert.
-l number
Specifies the last page to convert.
-r number
Specifies the resolution, in DPI. The default is 72 DPI.
-x number
Specifies the x-coordinate of the crop area top left corner.
-y number
Specifies the y-coordinate of the crop area top left corner.
-W number
Specifies the width of crop area in pixels (default is 0).
-H number
Specifies the height of crop area in pixels (default is 0).
-layout
Maintain (as best as possible) the original physical layout of the text. The default is to 'undo' physical layout (columns,
hyphenation, etc.) and output the text in reading order.
-fixed number
Assume fixed-pitch (or tabular) text, with the specified character width (in points). This forces physical layout mode.
-raw Keep the text in content stream order. This is a hack which often "undoes" column formatting, etc. Use of raw mode is no longer
recommended.
-htmlmeta
Generate a simple HTML file, including the meta information. This simply wraps the text in <pre> and </pre> and prepends the meta
headers.
-bbox Generate an XHTML file containing bounding box information for each word in the file.
-enc encoding-name
Sets the encoding to use for text output. This defaults to "UTF-8".
-listenc
Lists the available encodings.
-eol unix | dos | mac
Sets the end-of-line convention to use for text output.
-nopgbrk
Don't insert page breaks (form feed characters) between pages.
-opw password
Specify the owner password for the PDF file. Providing this will bypass all security restrictions.
-upw password
Specify the user password for the PDF file.
-q Don't print any messages or errors.
-v Print copyright and version information.
-h Print usage information. (-help and --help are equivalent.)
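As an illustration of how a few of these options combine, the sketch below wraps a call that converts a page range while keeping the physical layout and sends the text to stdout. It uses only -f, -l, -layout and the '-' text-file argument described above; the function name pdf_pages_to_text and its argument order are assumptions.

```shell
#!/bin/sh
# Hypothetical wrapper around pdftotext.
# $1 = PDF file, $2 = first page, $3 = last page; text goes to stdout.
pdf_pages_to_text() {
    pdftotext -f "$2" -l "$3" -layout "$1" -
}
```

For example, pdf_pages_to_text report.pdf 2 3 > pages.txt would keep column alignment for pages 2 to 3 of a placeholder file report.pdf, which is usually the better starting point when the goal is to recover a table.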
BUGS
Some PDF files contain fonts whose encodings have been mangled beyond recognition. There is no way (short of OCR) to extract text from
these files.
EXIT CODES
The Xpdf tools use the following exit codes:
0 No error.
1 Error opening a PDF file.
2 Error opening an output file.
3 Error related to PDF permissions.
99 Other error.
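In a script, these codes can be checked after each call. The sketch below maps a code to the message listed above; the function name pdftotext_exit_message is hypothetical, and the fallback branch for unlisted codes is an addition of this sketch.

```shell
#!/bin/sh
# Hypothetical helper: translate a pdftotext exit code into its documented meaning.
pdftotext_exit_message() {
    case "$1" in
        0)  echo "No error." ;;
        1)  echo "Error opening a PDF file." ;;
        2)  echo "Error opening an output file." ;;
        3)  echo "Error related to PDF permissions." ;;
        99) echo "Other error." ;;
        *)  echo "Unknown exit code: $1" ;;
    esac
}

# Usage sketch (file names are placeholders):
#   pdftotext -layout input.pdf output.txt
#   status=$?
#   [ "$status" -eq 0 ] || pdftotext_exit_message "$status" >&2
```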
AUTHOR
The pdftotext software and documentation are copyright 1996-2011 Glyph & Cog, LLC.
SEE ALSO
pdfdetach(1), pdffonts(1), pdfimages(1), pdfinfo(1), pdftocairo(1), pdftohtml(1), pdftoppm(1), pdftops(1)
15 August 2011 pdftotext(1)