Replace randomly occurrences bash
Post 303044397 by allabiba on Thursday 20th of February 2020 10:24:45 AM
1- I tried to load the text into one array and replace items at random, something like this:
Code:
# Sample text: 20 tokens, all "a"
text=(
    a a a a a a a a a a
    a a a a a a a a a a
)

# Get the number of items
number=${#text[@]}

# Get half of it
half=$((number/2))

# Loop "half" times: pick a random index and replace that item in the array.
# The same index can come up twice, so fewer than half may actually change.
for ((i = 0; i < half; i++)); do
    rnd=$((RANDOM % number))    # valid indices are 0 .. number-1
    text[rnd]=b
done

# Echo the result
echo "${text[@]}"

But this code doesn't keep the newlines: it echoes all the text on one line. Also, my text is very big, about 15 GB, so I can't put it in one array.
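One way around both limits (lost newlines and the array not fitting in memory) is to stream the text through awk one line at a time. The following is only a minimal sketch under some assumptions: `stream_replace` is a hypothetical helper name, the match is a literal substring (not a regex and not a whole word), and each occurrence is replaced independently with probability P, so the result is about half rather than exactly half.

```shell
#!/usr/bin/env bash
# stream_replace FROM TO [P] [SEED]   (hypothetical helper, not from the post)
# Filters stdin to stdout one line at a time, so line breaks are preserved
# and memory use stays at roughly one line regardless of the input size.
# Note: awk -v interprets backslash escapes in FROM and TO.
stream_replace() {
    awk -v from="$1" -v to="$2" -v p="${3:-0.5}" -v seed="${4:-$RANDOM}" '
        BEGIN { srand(seed); flen = length(from) }
        {
            out = ""; rest = $0
            # Walk through every literal occurrence of "from" in this line.
            while (flen > 0 && (pos = index(rest, from)) > 0) {
                # Keep the text before the match, then emit "to" with
                # probability p, otherwise the original "from".
                out = out substr(rest, 1, pos - 1) (rand() < p ? to : from)
                rest = substr(rest, pos + flen)
            }
            print out rest
        }'
}

# Example: replace roughly half of the "a" tokens in a two-line sample.
printf 'a a a a a\na a a a a\n' | stream_replace a b
```

For a real file this would be used as a filter, e.g. `stream_replace a b < input.txt > output.txt` (file names are placeholders).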

2- Choose randomly from all occurrences of one word or token in the whole text.
The number of occurrences can be obtained with something like this ( grep -o "a" | wc -l )
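If exactly half of the occurrences should change (rather than about half), the count can be used to pick the occurrence numbers up front. Below is a rough sketch of that idea, assuming GNU `grep` and `shuf` are available; `replace_exact_half` is a made-up name. Note that `grep -o` prints every match on its own line, so `wc -l` counts occurrences, whereas plain `grep "a" | wc -l` would count matching lines.

```shell
#!/usr/bin/env bash
# replace_exact_half FILE FROM TO   (hypothetical helper, not from the post)
# Pass 1 counts the literal occurrences of FROM; pass 2 replaces exactly
# half of them, chosen at random, and writes the result to stdout.
replace_exact_half() {
    file=$1; from=$2; to=$3
    # -o: one match per output line; -F: FROM is a literal string.
    total=$(( $(grep -oF -- "$from" "$file" | wc -l) ))
    # With fewer than 2 occurrences, half of them is 0: nothing to replace.
    if [ "$total" -lt 2 ]; then
        cat -- "$file"
        return
    fi
    # shuf picks total/2 distinct occurrence numbers from 1..total.
    shuf -i "1-$total" -n "$((total / 2))" |
    awk -v from="$from" -v to="$to" '
        NR == FNR { pick[$1] = 1; next }   # first input (stdin): chosen numbers
        {
            out = ""; rest = $0
            while ((pos = index(rest, from)) > 0) {
                n++                        # running occurrence counter
                out = out substr(rest, 1, pos - 1) ((n in pick) ? to : from)
                rest = substr(rest, pos + length(from))
            }
            print out rest
        }' - "$file"
}
```

This reads the file twice and keeps the chosen occurrence numbers in an awk array, so for a corpus with a very large number of matches the memory cost of that array should be checked first.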

3- I can use Python to do this, but my text is huge and I want to use bash since I expect it to be faster than Python. In Python I use a set containing (a, b), and in the replace function I use a random function to choose a or b from that set.

4- I use Ubuntu 18.04, sed (GNU sed) 4.4, GNU Awk 4.1.4, API: 1.1 (GNU MPFR 4.0.1, GNU MP 6.1.2)

5- I simplified the problem. The main problem is this: I want to normalize a text corpus for training a tri-gram language model, and in a language model the sequence of words is important. I normalize numbers to words, so for example I convert every 30 to thirty, but we often use half instead of thirty when reporting the hour (e.g. 8:30). I want to replace things like this randomly, not all of them.
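For the thirty/half case, a whole-word version of the same streaming idea might look like the sketch below. The assumptions are mine: the corpus is already tokenized on whitespace, `randomize_token` is a hypothetical name, and the replacement rate P is a free parameter (0.3 in the example is arbitrary). The code does not try to detect which occurrences are actually times of day.

```shell
#!/usr/bin/env bash
# randomize_token FROM TO [P] [SEED]   (hypothetical helper, not from the post)
# Replaces whole whitespace-separated tokens equal to FROM with TO, each with
# probability P (default 0.5). Lines that get modified are rebuilt by awk
# with single spaces between tokens; unmodified lines pass through unchanged.
randomize_token() {
    awk -v from="$1" -v to="$2" -v p="${3:-0.5}" -v seed="${4:-$RANDOM}" '
        BEGIN { srand(seed) }
        {
            # Compare whole tokens, so "thirty" matches but "thirtyfold" does not.
            for (i = 1; i <= NF; i++)
                if ($i == from && rand() < p)
                    $i = to
            print
        }'
}

# Example: turn roughly 30% of the "thirty" tokens into "half".
printf 'it is eight thirty now\nthirty people came at nine thirty\n' |
    randomize_token thirty half 0.3
```

Passing a fixed SEED as the fourth argument makes a run repeatable, which can help when comparing language models trained on differently normalized corpora.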

best regards
 
