Full Discussion: OCR text that needs cleaning
Forum: Shell Programming and Scripting. Post 302981981 by safran on Thursday 22nd of September 2016 04:51:32 AM
OCR text that needs cleaning - reply

Hi,

Thanks for the quick response, but your AWK one-liners just uppercase everything before the POS.
I'm already doing that uppercasing when I run doup.sed.
The part I'm stuck on is lowercasing everything within the parentheses before the POS.

Thanks
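
For illustration only, here is a minimal sketch of the lowercasing step described above. The input layout is an assumption, since the thread's sample data is not shown here: it supposes lines such as "ABACUS (FROM LATIN) n. a counting frame", where the parenthesized group to lowercase is the first one on the line and the POS tag follows it. The function name lower_parens is hypothetical. It uses only POSIX awk features (match, RSTART, RLENGTH, substr, tolower).

```shell
#!/bin/sh
# Lowercase the text inside the first "(...)" group on each input line.
# Assumption: the line layout is hypothetical, e.g.
#   ABACUS (FROM LATIN) n. a counting frame
# Lines without parentheses are passed through unchanged.
lower_parens() {
  awk '{
    if (match($0, /\([^)]*\)/)) {
      # RSTART and RLENGTH mark the "(...)" span found by match()
      $0 = substr($0, 1, RSTART - 1) \
           tolower(substr($0, RSTART, RLENGTH)) \
           substr($0, RSTART + RLENGTH)
    }
    print
  }'
}

# Example run on one made-up line
printf '%s\n' 'ABACUS (FROM LATIN) n. a counting frame' | lower_parens
```

Text outside the parentheses is left as it is, so this could run after the existing uppercasing pass. If a line can carry several parenthesized groups before the POS, the match would need to be repeated in a loop.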