Help in modifying existing Perl Script to produce report of dupes
Hello,
I have a large amount of data with the following structure:
Word=Transliterated word
I have written a Perl script (reproduced below) that goes through the full file and identifies all dupes on the right-hand side. It successfully creates a new file with two headers: Singletons and Dupes.
I have tried to modify the script so that it also produces a record listing the frequency count of every dupe. Thus, in the sample provided, I would like to know how many times the dupe Albert has been transliterated in different ways. I am providing pseudo-data since the original data is in a foreign script.
The final output would thus consist of two files:
The output file listing Singletons and Dupes.
The report listing each dupe along with its frequency.
I am not very good at generating reports in Perl and hence the request:
Perl script follows.
Many thanks for excellent help and advice given.
Last edited by Franklin52; 04-26-2012 at 03:48 AM..
Reason: Corrected code tags
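A minimal sketch of the requested frequency report, offered as one possible approach rather than a modification of the poster's script (which is not reproduced here). It assumes every line has the form Word=Transliterated word and that a "dupe" is any right-hand-side value occurring more than once. The sub names count_rhs and dupe_report and the sample data are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count how often each right-hand-side value occurs in "Word=Transliteration" lines.
# Returns a hash reference: transliteration => number of occurrences.
sub count_rhs {
    my @lines = @_;
    my %count;
    for my $line (@lines) {
        chomp(my $copy = $line);
        # Split on the first "=" only, so the right-hand side may itself contain "=".
        my ($word, $translit) = split /=/, $copy, 2;
        next unless defined $translit && length $translit;
        $count{$translit}++;
    }
    return \%count;
}

# Build the report lines: only values seen more than once, most frequent first,
# ties broken alphabetically. Each line is "value<TAB>count".
sub dupe_report {
    my ($count) = @_;
    my @dupes = grep { $count->{$_} > 1 } keys %$count;
    return map  { "$_\t$count->{$_}" }
           sort { $count->{$b} <=> $count->{$a} or $a cmp $b } @dupes;
}

# Hypothetical pseudo-data standing in for the real file.
my @sample = (
    "Albrt=Albert\n",
    "Albirt=Albert\n",
    "Elbert=Albert\n",
    "Jon=John\n",
    "Jhon=John\n",
    "Meri=Mary\n",
);

# Prints "Albert<TAB>3" and then "John<TAB>2"; Mary is a singleton and is skipped.
print "$_\n" for dupe_report(count_rhs(@sample));
```

For real data, the sample array would be replaced by the lines read from the input file, and the print would go to a second filehandle opened for the report. Since the original data is in a foreign script, both handles would probably need an explicit encoding layer such as :encoding(UTF-8).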
Hi,
I have a file called essay which contains a few pages with many paragraphs. Now I want to use Perl to produce another file called Essaylist that contains a sorted list of the words that appear in the file essay.
the format for Essaylist:
$word found $times times on page a b c....
where $word... (3 Replies)
Hi friends,
I have a small problem I would like your help with. I have a syslog server and configured it to send me email automatically. I found a small Perl script to help with this and tested it by sending alerts to root, and it worked without any problems.
Now I want to send it outside, I... (4 Replies)
Hi all,
I have a snmpd.conf file as below. After the "SECTION: Trap Destinations" line I want to add "trap2dest <IP>:162 <com_str>" on a new line. For this I wrote the following code:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
FILE *fp;
ssize_t read_char_count = 0;
... (2 Replies)
Hi, I am new to shell scripting. There is a requirement to write a shell script to meet the following needs. A prompt reply would be highly appreciated.
A script that will compare two config files and produce 2 outputs: the actual config file and a report indicating the changes made.
OS: SUSE Linux ver 10.3.
... (4 Replies)
Hello,
I have two files. File1 or the master file contains two columns separated by a delimiter:
a=b
b=d
e=f
g=h
File 2 which is the file to be processed has only a single column
a
h
c
b
What I need is an awk script to identify unique names from file 2 which are not found in the... (6 Replies)
Hi,
I have a requirement to produce a report on processes with high CPU utilization and on processes that stay on the CPU for a long time (long-running queries). The report should be appended to the files every 3 minutes. I use prstat to pull the top 5 and found the following result.
... (3 Replies)
I am compiling a synonym dictionary which has the following structure
Headword=Synonym1,Synonym2 and so on, with each synonym separated by a comma.
As is usual in such cases manual preparation of synonyms results in repeating the synonym which results in dupes as in the example below:... (3 Replies)
Hi,
I have the following command in place
nawk -F, '!a++' file > file.uniq
It has been working perfectly as required, removing duplicates by taking only the first 3 fields into consideration. Recently it has started giving the error below:
bash-3.2$ nawk -F, '!a++'... (17 Replies)
Dear all,
I have a large dictionary database which has the following structure
source word=target word
e.g.
book=livre
Since the database is very large in spite of all the care taken, it so happens that at times the source word is repeated
e.g.
book=livre
book=tome
Since I want to... (7 Replies)
I have a large database which has the following structure
a=b
where a is one language and b is the other and = is the delimiter
Since the data treats of language, homographs occur i.e. the same word on the left hand side can map in two different entries to two different glosses on the right... (3 Replies)
Discussion started by: gimley
3 Replies
Mail::DKIM::TextWrap(3) User Contributed Perl Documentation Mail::DKIM::TextWrap(3)
NAME
Mail::DKIM::TextWrap - text wrapping module written for use with DKIM
SYNOPSIS (FOR MAIL::DKIM USERS)
use Mail::DKIM::TextWrap;
Just add the above line to any program that uses Mail::DKIM::Signer and your signatures will automatically be wrapped to 72 characters.
SYNOPSIS (FOR OTHER USERS)
my $output = "";
my $tw = Mail::DKIM::TextWrap->new(
Margin => 10,
Output => \$output,
);
$tw->add("Mary had a little lamb, whose fleece was white as snow.\n");
$tw->finish;
print $output;
DESCRIPTION
This is a general-purpose text-wrapping module that I wrote because I had some specific needs with Mail::DKIM that none of the contemporary
text-wrapping modules offered.
Specifically, it offers the ability to change wrapping options in the middle of a paragraph. For instance, with a DKIM signature:
DKIM-Signature: a=rsa; c=simple; h=first:second:third:fourth;
b=Xr2mo2wmb1LZBwmEJElIPezal7wQQkRQ8WZtxpofkNmXTjXf8y2f0
the line-breaks can be inserted next to any of the colons of the h= tag, or any character of the b= tag. The way I implemented this was to
serialize the signature one element at a time, changing the text-wrapping options at the start and end of each tag.
TEXT WRAPPING OPTIONS
Text wrapping options can be specified when calling new(), or by simply changing the property as needed. For example, to change the number
of characters allowed per line:
$tw->{Margin} = 20;
Break
a regular expression matching characters where a line break can be inserted. Line breaks are inserted AFTER a matching substring. The
default is "/\s/".
BreakBefore
a regular expression matching characters where a line break can be inserted. Line breaks are inserted BEFORE a matching substring.
Usually, you want to use Break, rather than BreakBefore. The default is "undef".
Margin
specifies how many characters to allow per line. The default is 72. If no place to line-break is found on a line, the line will extend
beyond this margin.
Separator
the text to insert when a linebreak is needed. The default is "\n". If you want to set a following-line indent (e.g. all lines but the
first begin with four spaces), use something like "\n    ".
Swallow
a regular expression matching characters that can be omitted when a line break occurs. For example, if you insert a line break between
two words, then you are replacing a "space" with the line break, so you are omitting the space. On the other hand, if you insert a line
break between two parts of a hyphenated word, then you are breaking at the hyphen, but you still want to display the hyphen. The
default is "/\s/".
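To make the interplay of Margin, Separator and whitespace swallowing concrete, here is a minimal, self-contained sketch in plain Perl. It is not the module's implementation and does not use Mail::DKIM::TextWrap: simple_wrap is a hypothetical helper that always breaks at whitespace, always drops that whitespace, and does not count a Separator indent towards the margin.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of Mail::DKIM::TextWrap: greedy wrapping
# that breaks at whitespace, drops the whitespace at the break, and lets
# an unbreakable word run past the margin.
sub simple_wrap {
    my ($text, %opt) = @_;
    my $margin    = $opt{Margin}    // 72;     # same default as documented above
    my $separator = $opt{Separator} // "\n";

    my @lines;
    my $current = '';
    for my $word (split ' ', $text) {
        if ($current eq '') {
            $current = $word;                  # first word on the line
        }
        elsif (length($current) + 1 + length($word) <= $margin) {
            $current .= " $word";              # still fits within the margin
        }
        else {
            push @lines, $current;             # break here, swallowing the space
            $current = $word;
        }
    }
    push @lines, $current if $current ne '';
    return join $separator, @lines;
}

# Prints three lines: "Mary had a", "little", "lamb".
print simple_wrap("Mary had a little lamb", Margin => 10), "\n";
```

Passing Separator => "\n    " to this helper gives every line after the first a four-space indent, which mirrors the following-line indent described for the Separator option.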
CONSTRUCTOR
new() - create a new text-wrapping object
my $tw = Mail::DKIM::TextWrap->new(
Output => \$output,
%wrapping_options,
);
The text-wrapping object encapsulates the current options and the current state of the text stream. In addition to specifying text wrapping
options as described in the section above, the following options are recognized:
Output
a scalar reference, or a glob reference, to specify where the "wrapped" text gets output to. If not specified, the default of STDOUT is
used.
METHODS
add() - process some text that can be wrapped
$tw->add("Mary had a little lamb.\n");
You can add() all the text at once, or add() the text in parts by calling add() multiple times.
finish() - call when no more text is to be added
$tw->finish;
Call this when finished adding text, so that any remaining text in TextWrap's buffers will be output.
flush() - output the current partial word, if any
$tw->flush;
Call this whenever changing TextWrap's parameters in the middle of a string of words. It explicitly allows a line-break at the current
position in the string, regardless of whether it matches the current break pattern.
perl v5.16.3 2010-02-28 Mail::DKIM::TextWrap(3)