09-22-2016
OCR text that needs cleaning - reply
Hi,
Thanks for the quick response, but your AWK one-liners just uppercase everything before the POS.
I'm already doing that uppercasing when I run doup.sed.
The part I'm stuck on is lowercasing everything within the parentheses before the POS.
Thanks
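For anyone following along, here is a minimal sketch of the step described above. It assumes each line carries a single parenthesised group somewhere ahead of the POS tag; the function name `lower_parens` and the sample line in the usage note are made up for illustration, and the real OCR layout may well differ.

```shell
# Sketch only: lowercase the first "(...)" group on each input line and
# leave the rest of the line untouched. Assumes at most one group of
# interest per line; lines without parentheses pass through unchanged.
lower_parens() {
  awk '{
    # match() sets RSTART and RLENGTH to the position of the first "(...)".
    if (match($0, /\([^)]*\)/)) {
      head = substr($0, 1, RSTART - 1)      # text before the parentheses
      mid  = substr($0, RSTART, RLENGTH)    # the parenthesised group itself
      tail = substr($0, RSTART + RLENGTH)   # text after the parentheses
      $0 = head tolower(mid) tail
    }
    print
  }'
}
```

With a hypothetical line such as `ABACUS (FROM THE LATIN) N. A COUNTING FRAME`, running `lower_parens < input.txt` would leave the headword and the text after the parentheses in upper case and only lowercase `(FROM THE LATIN)`. It sticks to POSIX awk (`match`, `substr`, `tolower`) rather than GNU sed's `\L`, so it should not depend on GNU extensions.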
IPC::SysV(3pm) Perl Programmers Reference Guide IPC::SysV(3pm)
NAME
IPC::SysV - System V IPC constants and system calls
SYNOPSIS
use IPC::SysV qw(IPC_STAT IPC_PRIVATE);
DESCRIPTION
"IPC::SysV" defines and conditionally exports all the constants defined in your system include files which are needed by the SysV IPC
calls. Common ones include
IPC_CREAT IPC_EXCL IPC_NOWAIT IPC_PRIVATE IPC_RMID IPC_SET IPC_STAT
GETVAL SETVAL GETPID GETNCNT GETZCNT GETALL SETALL
SEM_A SEM_R SEM_UNDO
SHM_RDONLY SHM_RND SHMLBA
and auxiliary ones
S_IRUSR S_IWUSR S_IRWXU
S_IRGRP S_IWGRP S_IRWXG
S_IROTH S_IWOTH S_IRWXO
but your system might have more.
ftok( PATH )
ftok( PATH, ID )
Return a key based on PATH and ID, which can be used as a key for "msgget", "semget" and "shmget". See ftok(3).
If ID is omitted, it defaults to 1. If a single character is given for ID, the numeric value of that character is used.
shmat( ID, ADDR, FLAG )
Attach the shared memory segment identified by ID to the address space of the calling process. See shmat(2).
ADDR should be "undef" unless you really know what you're doing.
shmdt( ADDR )
Detach the shared memory segment located at the address specified by ADDR from the address space of the calling process. See shmdt(2).
memread( ADDR, VAR, POS, SIZE )
Reads SIZE bytes from a memory segment at ADDR starting at position POS. VAR must be a variable that will hold the data read. Returns
true if successful, or false if there is an error. memread() taints the variable.
memwrite( ADDR, STRING, POS, SIZE )
Writes SIZE bytes from STRING to a memory segment at ADDR starting at position POS. If STRING is too long, only SIZE bytes are used; if
STRING is too short, nulls are written to fill out SIZE bytes. Returns true if successful, or false if there is an error.
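The calls described above can be exercised end to end. What follows is a minimal sketch, not taken from the module's own documentation: it wraps a short Perl program in a shell function (the name `shm_roundtrip` is hypothetical) and assumes a perl that ships IPC::SysV and a kernel that permits System V shared memory. The segment is created and removed with Perl's built-in `shmget` and `shmctl`, which this page does not cover.

```shell
# Sketch only: attach a private shared memory segment, write to it,
# read the bytes back, then detach and remove the segment.
shm_roundtrip() {
  perl - <<'PERL'
use strict;
use warnings;
use IPC::SysV qw(IPC_PRIVATE IPC_CREAT IPC_RMID S_IRUSR S_IWUSR
                 ftok shmat shmdt memread memwrite);

# ftok: a one-character ID is taken as that character's numeric value,
# so "A" and 65 should yield the same key for the same path.
my $k1 = ftok("/", "A");
my $k2 = ftok("/", ord "A");
print "ftok keys match\n" if defined $k1 && defined $k2 && $k1 == $k2;

# Create a 64-byte private segment, readable and writable by the owner.
my $id = shmget(IPC_PRIVATE, 64, IPC_CREAT | S_IRUSR | S_IWUSR);
defined $id or die "shmget failed: $!\n";

# ADDR is undef so the system picks the attach address.
my $addr = shmat($id, undef, 0);
unless (defined $addr) {
    my $err = $!;
    shmctl($id, IPC_RMID, 0);    # do not leak the segment on failure
    die "shmat failed: $err\n";
}

# "hello" is shorter than SIZE (8), so the remainder is null-filled.
memwrite($addr, "hello", 0, 8) or warn "memwrite failed: $!\n";

# Read the first 5 bytes back into $buf.
my $buf = '';
memread($addr, $buf, 0, 5) or warn "memread failed: $!\n";
print "read back: $buf\n";

shmdt($addr);                    # detach from this process
shmctl($id, IPC_RMID, 0);        # mark the segment for removal
PERL
}
```

Calling `shm_roundtrip` from a shell runs the whole sequence once. IPC_PRIVATE is used here so the sketch does not collide with any existing segment; a key from `ftok` would be the usual choice when unrelated processes need to find the same segment.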
SEE ALSO
IPC::Msg, IPC::Semaphore, IPC::SharedMem, ftok(3), shmat(2), shmdt(2)
AUTHORS
Graham Barr <gbarr@pobox.com>, Jarkko Hietaniemi <jhi@iki.fi>, Marcus Holland-Moritz <mhx@cpan.org>
COPYRIGHT
Version 2.x, Copyright (C) 2007-2010, Marcus Holland-Moritz.
Version 1.x, Copyright (c) 1997, Graham Barr.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
perl v5.16.3 2013-03-04 IPC::SysV(3pm)