11-04-2014
Quote:
Originally Posted by Chubler_XL
I'm unsure if an uncompressed PDF document is plain text.
I think you are right. The unpacked PDF looks like a plain text file, but it probably still contains binary data. However, the documentation for pdftk, the tool I used to unpack it, says the uncompressed file can be edited in an ordinary text editor:
Quote:
Uncompress PDF page streams for editing the PDF in a text editor (e.g., vim, emacs)
pdftk doc.pdf output doc.unc.pdf uncompress
When I run your script, it prints some binary characters and then crashes. I also thought of using awk, but as far as I know it only works reliably when the file is plain text.
Could you have a look at the PDF I attached to my first post?
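A note on the plain-text worry above: a substitution does not strictly need a text file if the tool is told to work on raw bytes. Below is a minimal sketch, not a tested fix for the attached PDF. `replace_bytes` is a hypothetical helper name, the file names in the usage note are placeholders, and the sketch assumes GNU sed and search/replacement strings that are plain literals (no `/`, `&`, `\` or regex metacharacters). It also assumes the text appears literally in the uncompressed page stream, which is often not the case when fonts use custom encodings.

```shell
#!/bin/sh
# replace_bytes OLD NEW INFILE OUTFILE
# Hypothetical helper (not part of pdftk): replaces every occurrence of OLD
# with NEW in INFILE and writes the result to OUTFILE.
# Assumption: OLD and NEW are plain literals without "/", "&", "\" or
# regex metacharacters, since they are pasted into a sed expression.
replace_bytes() {
    old=$1 new=$2 infile=$3 outfile=$4
    # LC_ALL=C makes sed treat the input as single bytes instead of trying
    # to decode multibyte characters, so stray binary bytes pass through.
    LC_ALL=C sed "s/$old/$new/g" "$infile" > "$outfile"
}
```

It could be used as `replace_bytes 'Old Title' 'New Title' doc.unc.pdf doc.edited.pdf`, followed by `pdftk doc.edited.pdf output doc.new.pdf compress` to pack the result again. Keeping the replacement the same length as the original avoids changing stream lengths and byte offsets inside the PDF. The pdftk documentation describes passing a file through pdftk to repair a corrupted XREF table and stream lengths "if possible", so a replacement of a different length may or may not survive.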