1- I tried to turn the text into a single array and replace the words, with something like this:
But this code does not preserve the newlines: it echoes the whole text on one line. My text is also very large (about 15 GB), so it cannot be held in one array.
2- Choose randomly from among all occurrences of a word or token in the whole text.
The number of occurrences can be obtained with something like ( grep -o "a" | wc -l ); without -o, grep counts matching lines rather than individual occurrences.
3- I can do this in Python, but my text is huge and I would like to use bash, since I expect it to be faster than Python. In Python I keep a set containing (a, b), and the replace function uses a random function to choose a or b from that set.
4- I use Ubuntu 18.04, sed (GNU sed) 4.4, GNU Awk 4.1.4, API: 1.1 (GNU MPFR 4.0.1, GNU MP 6.1.2)
5- I have simplified the problem. The real task is this: I want to normalize a text corpus for training a trigram language model, in which the sequence of words matters. I normalize numbers to words, so for example I convert every 30 to thirty, but we often say half instead of thirty when reporting the time (e.g. 8:30). I want to replace cases like this randomly, not all of them.
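For reference, here is a minimal sketch of the Python approach from point 3, written so that it streams the text one line at a time instead of loading it into an array. That keeps the newlines and keeps memory use constant however large the file is. The names `process` and `make_replacer` and the sample mapping are assumptions made for illustration; they are not taken from my real script.

```python
import io
import random
import re


def make_replacer(choices, rng):
    """Build a function that rewrites one line of text.

    `choices` maps a token to the list of candidate replacements;
    one candidate is drawn at random for every single occurrence.
    """
    # Longest keys first, so a longer token wins over a shorter prefix.
    keys = sorted(choices, key=len, reverse=True)
    pattern = re.compile(r"\b(?:" + "|".join(re.escape(k) for k in keys) + r")\b")

    def replace_line(line):
        return pattern.sub(lambda m: rng.choice(choices[m.group(0)]), line)

    return replace_line


def process(src, dst, choices, seed=None):
    """Copy `src` to `dst` line by line, replacing tokens at random."""
    rng = random.Random(seed)
    replace_line = make_replacer(choices, rng)
    for line in src:  # only one line is held in memory at a time
        dst.write(replace_line(line))  # the line ending is kept as-is


# Small demonstration on an in-memory sample (hypothetical data).
sample = io.StringIO("the train leaves at 8:30\nwe counted 30 cars\nroom 300\n")
result = io.StringIO()
process(sample, result, {"30": ["thirty", "half"]})
print(result.getvalue(), end="")
```

For the real corpus the same `process` function could be given two open file objects in place of the `io.StringIO` buffers. The `\b` word boundaries stop the 30 inside 300 from being touched.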
HTML::FormatRTF(3) User Contributed Perl Documentation HTML::FormatRTF(3)
NAME
HTML::FormatRTF - Format HTML as RTF
SYNOPSIS
use HTML::FormatRTF;
my $out_file = "test.rtf";
open(RTF, ">$out_file")
 or die "Can't write-open $out_file: $!\nAborting";
print RTF HTML::FormatRTF->format_file(
'test.html',
'fontname_headings' => "Verdana",
);
close(RTF);
DESCRIPTION
HTML::FormatRTF is a class for objects that you use to convert HTML to RTF. There is currently no proper support for tables or forms.
This is a subclass of HTML::Formatter, whose documentation you should consult for more information on the new, format, and format_file methods.
You can specify any of the following parameters in the call to "new", "format_file", or "format_string":
lm Amount of extra indenting to apply to the left margin, in twips (twentieths of a point). Default is 0.
So if you wanted the left margin to be an additional half inch larger, you'd set "lm => 720" (since there are 1440 twips in an inch). If
you wanted it to be about 1.5cm larger, you'd set "lm => 850" (since there are about 567 twips in a centimeter).
rm Amount of extra indenting to apply to the right margin, in twips (twentieths of a point). Default is 0.
normal_halfpoint_size
This is the size of normal text in the document, in half-points. The default value is 22, meaning that normal text is in 11 point.
header_halfpoint_size
This is the size of text used in the document's page-header, in half-points. The default value is 17, meaning that header text is in
8.5 point. Currently, the header consists just of "p.pagenumber" in the upper-right-hand corner, and cannot be disabled.
head1_halfpoint_size ... head6_halfpoint_size
These control the font size of each heading level, in half-points. For example, the default for head3_halfpoint_size is 25, meaning
that HTML "<h3>...</h3>" text will be in 12.5 point text (in addition to being underlined and in the heading font).
codeblock_halfpoint_size
This controls the font size (in half-points) of the text used for "<pre>...</pre>" text. By default, it is 18, meaning 9 point.
fontname_body
This option controls what font is to be used for the body of the text -- that is, everything other than heading text and text in
pre/code/tt elements. The default value is currently "Times". Other handy values I can suggest using are "Georgia" or "Bookman Old
Style".
fontname_code
This option controls what font is to be used for text in pre/code/tt elements. The default value is currently "Courier New".
fontname_headings
This option controls what font name is to be used for headings. You can use the same font as fontname_body, but I prefer a sans-serif
font, so the default value is currently "Arial". Also consider "Tahoma" and "Verdana".
document_language
This option controls what Microsoft language number will be specified as the language for this document. The current default value is
1033, for US English. Consult an RTF reference for other language numbers.
hr_width
This option controls how many underline characters will be used for rendering a "<hr>" tag. Its default value is currently 50. You can
usually leave this alone, but under some circumstances you might want to use a smaller or larger number.
no_prolog
If this option is set to a true value, HTML::FormatRTF will make a point of not emitting the RTF prolog before the document. By
default, this is off, meaning that HTML::FormatRTF will emit the prolog. This option is of interest only to advanced users.
no_trailer
If this option is set to a true value, HTML::FormatRTF will make a point of not emitting the RTF trailer at the end of the document.
By default, this is off, meaning that HTML::FormatRTF will emit the bit of RTF that ends the document. This option is of interest only
to advanced users.
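The unit arithmetic used by the parameters above can be checked with a short sketch. This is only an illustration in Python, not part of HTML::FormatRTF; the helper names are assumptions, and the constants are the ones quoted in the parameter descriptions (1440 twips per inch, about 567 twips per centimeter, two half-points per point).

```python
TWIPS_PER_INCH = 1440  # exact: a twip is a twentieth of a point, 72 points per inch
TWIPS_PER_CM = 567     # approximate figure quoted in the lm description


def inches_to_twips(inches):
    """Convert a length in inches to whole twips, for lm or rm."""
    return round(inches * TWIPS_PER_INCH)


def cm_to_twips(cm):
    """Convert a length in centimeters to whole twips (approximate)."""
    return round(cm * TWIPS_PER_CM)


def halfpoints_to_points(halfpoints):
    """Convert a *_halfpoint_size value to a font size in points."""
    return halfpoints / 2


print(inches_to_twips(0.5))      # half an inch of extra margin
print(halfpoints_to_points(22))  # default normal_halfpoint_size
print(halfpoints_to_points(25))  # default head3_halfpoint_size
```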
SEE ALSO
HTML::Formatter, RTF::Writer
COPYRIGHT
Copyright (c) 2002 Sean M. Burke. All rights reserved.
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of
merchantability or fitness for a particular purpose.
AUTHOR
Sean M. Burke "<sburke@cpan.org>"
perl v5.12.1 2004-06-02 HTML::FormatRTF(3)