Hi
I have only ever used awk and sed for basic requirements up until now.
I have had to break a log down for multiple purposes.
Using awk, sed and a date script, I am left with this:
(message id, time of msg attempt, message id, domain name, time of msg completion)
... (4 Replies)
I am trying to figure out how to extract the text of interest from a file
example.txt
blah blah
blah a: child1
blah a: child2
blah b: parent1
blah blah
blah ....
blah a: child21
blah a: child22
blah a: child23
blah b: parent2
this kinda text repeats .. number of children is... (6 Replies)
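A minimal sketch of one way to read the sample above, assuming every "a:" line is a child that belongs to the next "b:" line below it. The post is cut off before the desired output is shown, so the function name group_children and the dictionary result are assumptions made for illustration only.

```python
import re
from typing import Dict, List


def group_children(lines: List[str]) -> Dict[str, List[str]]:
    """Attach each 'a:' value to the next 'b:' value that follows it."""
    groups: Dict[str, List[str]] = {}
    pending: List[str] = []  # children seen since the last parent line
    for line in lines:
        child = re.search(r"\ba:\s*(\S+)", line)
        parent = re.search(r"\bb:\s*(\S+)", line)
        if child:
            pending.append(child.group(1))
        elif parent:
            groups[parent.group(1)] = pending
            pending = []
    return groups


if __name__ == "__main__":
    sample = [
        "blah blah",
        "blah a: child1",
        "blah a: child2",
        "blah b: parent1",
        "blah blah",
        "blah ....",
        "blah a: child21",
        "blah a: child22",
        "blah a: child23",
        "blah b: parent2",
    ]
    for parent, children in group_children(sample).items():
        print(parent, " ".join(children))
```

Lines that carry neither marker are skipped, so the filler "blah" lines do not affect the grouping.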
Hi there,
I have some text files in Unix format that were processed by a program on Windows. When I open them with less or vi on Linux, I get a warning about opening a binary file, and vi shows a "^@" inserted between every two characters. How can I fix this? Plus, there are over... (2 Replies)
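A "^@" in vi is a NUL byte, and a NUL after every character usually means the Windows program saved the text as UTF-16. The sketch below assumes that is what happened. The helper name utf16_to_utf8, the little-endian fallback when no byte order mark is present, and the CRLF to LF conversion are all assumptions, not something the post confirms.

```python
import codecs


def utf16_to_utf8(data: bytes) -> bytes:
    """Decode UTF-16 bytes and return UTF-8 bytes with Unix line endings."""
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        text = data.decode("utf-16")  # the BOM selects the byte order
    else:
        text = data.decode("utf-16-le")  # assumption: Windows default order
    # Assumption: Windows line endings should become Unix ones.
    return text.replace("\r\n", "\n").encode("utf-8")


def convert_file(src: str, dst: str) -> None:
    """Read src as UTF-16 and write dst as UTF-8."""
    with open(src, "rb") as fin:
        converted = utf16_to_utf8(fin.read())
    with open(dst, "wb") as fout:
        fout.write(converted)
```

For many files, convert_file can be called in a loop over the file names. Writing to a new file instead of overwriting the original makes it easy to check the result first.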
Greetings. I'm a biologist and I don't have much knowledge of Unix/Linux, but I need to use Cygwin to convert some documents from GenBank format to FASTA format. GenBank format goes something like this:
LOCUS NM_013964 2568 bp mRNA linear PRI 26-APR-2009... (2 Replies)
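A rough sketch of such a conversion, relying on the standard GenBank flat-file layout: each record starts with a LOCUS line, the sequence follows an ORIGIN line as numbered blocks of letters, and "//" ends the record. The function name genbank_to_fasta, the use of the LOCUS name as the FASTA header, and the 60-column line width are assumptions for illustration. The post is truncated, so the exact header the author wants is not known.

```python
from typing import List


def genbank_to_fasta(text: str, width: int = 60) -> str:
    """Convert GenBank flat-file text to FASTA, one entry per record."""
    out: List[str] = []
    name = ""
    seq: List[str] = []
    in_origin = False
    for line in text.splitlines():
        if line.startswith("LOCUS"):
            parts = line.split()
            name = parts[1] if len(parts) > 1 else "unknown"
        elif line.startswith("ORIGIN"):
            in_origin = True
        elif line.startswith("//"):
            # End of record: emit the header and the wrapped sequence.
            residues = "".join(seq)
            out.append(f">{name}")
            out.extend(
                residues[i:i + width] for i in range(0, len(residues), width)
            )
            name, seq, in_origin = "", [], False
        elif in_origin:
            # Drop the position numbers and spaces, keep only the letters.
            seq.append("".join(ch for ch in line if ch.isalpha()))
    return "\n".join(out) + ("\n" if out else "")
```

A record without the closing "//" is not emitted, which is a deliberate simplification of this sketch.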
Hello again unix.com
How can I extract from a large file in format:
steve@aol.com steve hawkins Location of this member is bla bla bla
sun@hotmail.com Sun Ying This member is using browser bla bla bla
to another text in format:
steve@aol.com steve hawkins
sun@hotmail.com sun ying
... (5 Replies)
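One possible reading of the example above: keep the first three whitespace-separated fields of each line and lowercase the two name fields, since "Sun Ying" becomes "sun ying" in the wanted output. The sketch assumes every name is exactly two words. The function name keep_identity is made up for illustration.

```python
def keep_identity(line: str) -> str:
    """Return 'email first last' with the names lowercased, or '' if too short."""
    fields = line.split()
    if len(fields) < 3:
        return ""  # not enough fields to hold an email and two names
    email, first, last = fields[:3]
    return f"{email} {first.lower()} {last.lower()}"


if __name__ == "__main__":
    sample = [
        "steve@aol.com steve hawkins Location of this member is bla bla bla",
        "sun@hotmail.com Sun Ying This member is using browser bla bla bla",
    ]
    for row in sample:
        print(keep_identity(row))
```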
Hello Unix.com,
I have a text in format:
john
sara
lee
How can I make it:
john:john
john:john1
john:john12
john:john123
sara:sara
sara:sara12
sara:sara123 and so on (2 Replies)
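A small sketch of the pattern in the example, assuming each name is paired with itself followed by the growing prefixes of "123" (nothing, "1", "12", "123"). The "sara:sara1" line is missing from the example, and the sketch assumes that was an oversight. The function name expand and the digits parameter are hypothetical.

```python
from typing import List


def expand(name: str, digits: str = "123") -> List[str]:
    """Pair a name with itself plus each leading slice of digits."""
    return [f"{name}:{name}{digits[:i]}" for i in range(len(digits) + 1)]


if __name__ == "__main__":
    for person in ["john", "sara", "lee"]:
        print("\n".join(expand(person)))
```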
Hello unix.com users,
I have an IP file (one address per line). How can I delete the IPs that keep repeating under the mask XXX.XXX.XXX.* ... I want to erase only the lines that repeat more than 2 times.
Example:
1.2.3.1
1.2.3.2
1.2.3.3
I want to erase all IP blocks that are repeating by C... (1 Reply)
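A sketch of one interpretation of the request: count how many addresses share the same first three octets, and drop every address whose block appears more than 2 times. The function names and the limit parameter are assumptions. The post is truncated, so it is not certain whether the author wants to drop the whole block or keep the first two addresses of it.

```python
from collections import Counter
from typing import List


def block_of(ip: str) -> str:
    """Return the first three octets, e.g. '1.2.3' for '1.2.3.1'."""
    return ip.rsplit(".", 1)[0]


def drop_crowded_blocks(ips: List[str], limit: int = 2) -> List[str]:
    """Keep only addresses whose block occurs at most `limit` times."""
    counts = Counter(block_of(ip) for ip in ips)
    return [ip for ip in ips if counts[block_of(ip)] <= limit]


if __name__ == "__main__":
    addresses = ["1.2.3.1", "1.2.3.2", "1.2.3.3", "5.6.7.8"]
    print("\n".join(drop_crowded_blocks(addresses)))
```

The input order is preserved, and the counting pass runs before the filtering pass so that the first addresses of a crowded block are removed as well.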
I want to generate a list, one entry per line, of strings made up of ordinary letters, for example:
dnds
gnos
mgod
pets
jnfp
etc...
I want to use all letters in all possible combinations.
Is there a script that can do this? (3 Replies)
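A minimal sketch using itertools.product, assuming lowercase a to z and a fixed length of 4 as in the examples (the post does not state the length). The function name words is made up. With 26 letters and length 4 there are 26**4 = 456976 strings, so the generator yields them one at a time instead of building a list.

```python
from itertools import islice, product
from string import ascii_lowercase
from typing import Iterator


def words(length: int = 4, alphabet: str = ascii_lowercase) -> Iterator[str]:
    """Yield every string of the given length over the alphabet, in order."""
    for letters in product(alphabet, repeat=length):
        yield "".join(letters)


if __name__ == "__main__":
    # Show only the first five entries; drop islice to print the full list.
    for word in islice(words(), 5):
        print(word)
```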
Hello again,
I have a problem manipulating a large text document and there is no way I could edit this document by hand.
Format is:
Address : XXXX N 37 Ave, Hollywood, FL, 33021
Phone: XXX3190XXX
Player: XXXXXX
Character: Jaramillo
DOB: June-14-1995
-----
Name: Alexandra
Ticket... (3 Replies)
Hello Forum ,
I need help with text manipulation. I have a text file that I have to manipulate; let's call it source.txt.
source.txt
UNB+UNOC:3+O0013000005MAN MN RVS:91+0098006688:92+190304:2313+F004169241'
UNH+8146848+DELJIT:D:96A:UN'
BGM+307:::JIS_SYNCRO_FIRM+2019030423234101+9'... (8 Replies)
Discussion started by: cemokam65
hocr2djvused
HOCR2DJVUSED(1) hocr2djvused manual HOCR2DJVUSED(1)
NAME
hocr2djvused - hOCR to djvused script converter
SYNOPSIS
hocr2djvused [option...]
DESCRIPTION
hocr2djvused reads a hOCR[1] file (as produced by OCRopus[2] or Cuneiform[3] or Tesseract[4]) from the standard input and converts it to a
djvused script.
OPTIONS
Text segmentation options
-t lines, --details=lines
Record location of every line. Don't record locations of particular words or characters.
-t words, --details=words
Record location of every line and every word. Don't record locations of particular characters.
This is the default.
-t chars, --details=chars
Record location of every line, every word and every character.
--word-segmentation=simple
Consider each non-empty sequence of non-whitespace characters a single word.
This is the default, despite being linguistically incorrect.
--word-segmentation=uax29
Use the Unicode Text Segmentation[5] algorithm to break lines into words.
This option breaks the assumption of some DjVu tools that words are separated by spaces, and therefore it is not recommended.
Other options
--rotation=n
Assume that DjVu pages are rotated by n degrees.
--page-size=widthxheight
Specifies that page size is width pixels x height pixels.
This option is required for hOCR generated by Cuneiform (< 0.8) and superfluous otherwise.
--html5
Use an HTML5 parser[6], which is more robust but slower than the default parser.
--version
Output version information and exit.
-h, --help
Display help and exit.
SEE ALSO
ocrodjvu(1), djvused(1)
AUTHOR
Jakub Wilk <jwilk@jwilk.net>
Author.
NOTES
1. hOCR
http://docs.google.com/View?docid=dfxcv4vc_67g844kf
2. OCRopus
http://ocropus.googlecode.com/
3. Cuneiform
http://launchpad.net/cuneiform-linux
4. Tesseract
http://tesseract-ocr.googlecode.com/
5. Unicode Text Segmentation
http://unicode.org/reports/tr29/
6. HTML5 parser
http://www.whatwg.org/specs/web-apps/current-work/#html-parser
hocr2djvused 0.7.9 03/10/2012 HOCR2DJVUSED(1)