Sponsored Content
Top Forums Shell Programming and Scripting Help in writing a KSH script to filter the latest record? Post 302392641 by Scrutinizer on Friday 5th of February 2010 04:00:57 AM
Old 02-05-2010
Since the records appear to be time sorted, you could use something like this:
Code:
awk -F '|' 'END{for(i in A)print A[i]}{A[$NF]=$0}' infile | sort -t'|' -k5,5

 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

filter and get the latest order number

hello, how can I filter and get the latest order number (last five digits) below: input file: johnmm00001 maryyy00121 johnm100222 johnmm00003 maryyy00122 output file: johnmm00003 maryyy00122 johnm100222 (6 Replies)
Discussion started by: happyv
6 Replies

2. Shell Programming and Scripting

filter the uniq record problem

Anyone can help for filter the uniq record for below example? Thank you very much Input file 20090503011111|test|abc 20090503011112|tet1|abc|def 20090503011112|test1|bcd|def 20090503011131|abc|abc 20090503011131|bbc|bcd 20090503011152|bcd|abc 20090503011151|abc|abc... (8 Replies)
Discussion started by: bleach8578
8 Replies

3. Shell Programming and Scripting

Filter record from a file

Reposting since I didnt not get any reply. I have a problem while filtering records from a file. Can somebody help please? For eg: Consider the below files Record file: 0003@00000000000190@20100401@201004012010040120100401@003@... (1 Reply)
Discussion started by: gpaulose
1 Replies

4. UNIX for Dummies Questions & Answers

Select the latest dated file-ksh script

Hello all!! I am new here and new in scripting! I want to write a ksh script to select the most recent file from a dir and use it in a variable. I have a directory with files named like: YYYMMDD A basic idea of the script I want to write is #!/usr/bin/ksh latest= latest_dated_file at... (2 Replies)
Discussion started by: chris_euop
2 Replies

5. Shell Programming and Scripting

Need a ksh script for adding the space at the end of record in a flat file

Hi, I need a ksh script for the below requirement: i have a Delimited flat file with 200 records delimiter is '|~|' i need a script to insert space at the end if the record is ending with delimiter '|~|' if it didnt end with delimiter it should not append space. Example: ram|~|2|~| ... (16 Replies)
Discussion started by: srikanth_sagi
16 Replies

6. Shell Programming and Scripting

awk-filter record by another file

I have file1 3049 3138 4672 22631 45324 112382 121240 125470 130289 186128 193996 194002 202776 228002 253221 273523 284601 284605 641858 (8 Replies)
Discussion started by: biomed
8 Replies

7. Shell Programming and Scripting

Delete record filter by column

Dear friend, I have a file 2 files with column wise FILE_A ------------------------------ x,1,@ y,3,$ x,5,% FILE_B -------------------- x,1,@ i like to delete the all lines in FILE_A ,if first column available in FILE_B. output (in FILE_A) y,3,$ x,5,% (10 Replies)
Discussion started by: Jewel
10 Replies

8. Shell Programming and Scripting

Finding the Latest record

Dear All, I have getting data as follows, the second field signifies table name and last one is time stamp. I have return always latest record based on time stamp. Could you please help me ? I/P ==== ... (1 Reply)
Discussion started by: srikanth38
1 Replies

9. Shell Programming and Scripting

Shell Script (ksh) - SQLPlus query filter using a string variable

Using ksh, I am using SQLPlus to execute a query with a filter using a string variable. REPO_DB=DEV1 FOLDER_NM='U_nmalencia' FOLDER_CHECK=$(sqlplus -s /nolog <<EOF CONNECT user/pswd_select@${REPO_DB} set echo off heading off feedback off select subj_name from subject where... (5 Replies)
Discussion started by: nkm0brm
5 Replies

10. UNIX for Dummies Questions & Answers

Display latest record from file based on multiple columns combination

I have requirement to print latest record from file based on multiple columns combination. EWAPE EW1SLE0000 EW1SOMU01 ABORTED 03/16/2015 100004 03/16/2015 100005 001 EWAPE EW1SLE0000 EW1SOMU01 ABORTED 03/18/2015 140003 03/18/2015 140004 001 EWAPE EW1SLE0000 EW1SOMU01 ABORTED 03/18/2015 220006... (1 Reply)
Discussion started by: tmalik79
1 Replies
PDF2TXT(1)							  PDFMiner Manual							PDF2TXT(1)

NAME
pdf2txt - extracts text contents of PDF files SYNOPSIS
pdf2txt [option...] file... DESCRIPTION
pdf2txt extracts text contents from a PDF file. It extracts all the text that is to be rendered programmatically, i.e. text represented as ASCII or Unicode strings. It cannot recognize text drawn as images that would require optical character recognition. It also extracts the corresponding locations, font names, font sizes, writing direction (horizontal or vertical) for each text portion. You need to provide a password for protected PDF documents when its access is restricted. You cannot extract any text from a PDF document which does not have extraction permission. OPTIONS
-o file Specifies the output file name. The default is to print the extracted contents to standand output in text format. -p pageno[,pageno,...] Specifies the comma-separated list of the page numbers to be extracted. Page numbers start at one. By default, it extracts text from all the pages. -c codec Specifies the output codec. -t type Specifies the output format. The following formats are currently supported: text Text format. This is the default. html HTML format. It is not recommended. xml XML format. It provides the most information. tag "Tagged PDF" format. A tagged PDF has its own contents annotated with HTML-like tags. pdf2txt tries to extract its content streams rather than inferring its text locations. Tags used here are defined in the PDF Reference, Sixth Edition[1] (S10.7 "Tagged PDF"). -D writing-mode Specifies the writing mode of text outputs: lr-tb Left-to-right, top-to-bottom. tb-rl Top-to-bottom, right-to-left. auto Determine writing mode automatically -M char-margin, -L line-margin, -W word-margin These are the parameters used for layout analysis. In an actual PDF file, text portions might be split into several chunks in the middle of its running, depending on the authoring software. Therefore, text extraction needs to splice text chunks. In the figure below, two text chunks whose distance is closer than the char-margin is considered continuous and get grouped into one. Also, two lines whose distance is closer than the line-margin is grouped as a text box, which is a rectangular area that contains a "cluster" of text portions. Furthermore, it may be required to insert blank characters (spaces) as necessary if the distance between two words is greater than the word-margin, as a blank between words might not be represented as a space, but indicated by the positioning of each word. Each value is specified not as an actual length, but as a proportion of the length to the size of each character in question. The default values are char-margin = 1.0, line-margin = 0.3, and W = 0.2, respectively. -n Suppress layout analysis. -A Force layout analysis for all the text strings, including text contained in figures. -V Enable detection of vertical writing. -s scale Specifies the output scale. This option can be used in HTML format only. -m n Specifies the maximum number of pages to extract. By default, all the pages in a document are extracted. -P password Provides the user password to access PDF contents. -d Increase the debug level. EXAMPLES
Extract text as an HTML file whose filename is output.html: $ pdf2txt -o output.html samples/naacl06-shinyama.pdf Extract a Japanese HTML file in vertical writing: $ pdf2txt -c euc-jp -D tb-rl -o output.html samples/jo.pdf Extract text from an encrypted PDF file: $ pdf2txt -P mypassword -o output.txt secret.pdf SEE ALSO
dumppdf(1) AUTHORS
Jakub Wilk <jwilk@debian.org> Wrote this manual page for the Debian system. Yusuke Shinyama <yusuke@cs.nyu.edu> Author of PDFMiner and its original HTML documentation. NOTES
1. PDF Reference, Sixth Edition http://www.adobe.com/devnet/acrobat/pdfs/pdf_reference_1-7.pdf pdf2txt 08/24/2011 PDF2TXT(1)
All times are GMT -4. The time now is 11:17 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy