Help modifying an existing Perl script to produce a report of dupes
Hello,
I have a large amount of data with the following structure:
Word=Transliterated word
I have written a Perl script (reproduced below) which goes through the whole file and identifies all dupes on the right-hand side. It successfully creates a new file with two headers: Singletons and Dupes.
I have tried to modify the script so that it additionally produces a report listing the frequency count of each dupe. Thus, in the sample provided, I would like to know how many times the dupe Albert has been transliterated in different ways. I am providing pseudo-data since the original data is in a foreign script.
The final output would thus consist of two files:
The output file listing Singletons and Dupes
The report listing the dupes along with their frequencies.
I am not very good at generating reports in Perl, hence this request.
Perl script follows.
Many thanks for the excellent help and advice given.
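A minimal sketch of the requested frequency report, not the poster's own script (which is not reproduced here). It assumes one `Word=Transliteration` entry per line and treats a dupe as any right-hand side that occurs more than once. The sub names `group_by_rhs` and `dupe_report` and the sample data are hypothetical.

```perl
use strict;
use warnings;

# Group source words by transliteration (the right-hand side of "Word=Transliteration").
# Returns a hash ref: transliteration => array ref of source words, in input order.
sub group_by_rhs {
    my @lines = @_;
    my %group;
    for my $line (@lines) {
        chomp $line;
        next unless $line =~ /=/;                      # skip lines without the delimiter
        my ($word, $translit) = split /=/, $line, 2;   # split on the first "=" only
        push @{ $group{$translit} }, $word;
    }
    return \%group;
}

# Build one report line per transliteration seen more than once:
# "transliteration <TAB> count <TAB> comma-separated source words".
# Most frequent first; ties are broken alphabetically.
sub dupe_report {
    my ($group) = @_;
    my @dupes = grep { @{ $group->{$_} } > 1 } keys %$group;
    @dupes = sort { @{ $group->{$b} } <=> @{ $group->{$a} } or $a cmp $b } @dupes;
    return map {
        sprintf "%s\t%d\t%s", $_, scalar @{ $group->{$_} }, join(",", @{ $group->{$_} })
    } @dupes;
}

# Hypothetical pseudo-data standing in for the real file.
my @sample = (
    "Albrt=Albert",
    "Alburt=Albert",
    "Elbert=Albert",
    "Jon=John",
    "Jhon=John",
    "Mary=Mary",
);

my @report = dupe_report(group_by_rhs(@sample));
print "$_\n" for @report;
```

Against a real file, the lines could be read from a file handle instead of `@sample`, and the report printed to a second output file alongside the existing Singletons/Dupes file.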
Last edited by Franklin52; 04-26-2012 at 03:48 AM..
Reason: Corrected code tags
Hi,
I have a file called essay which contains a few pages with many paragraphs. Now I want to use Perl to produce another file called Essaylist that contains a sorted list of the words that appear in the file essay.
The format for Essaylist:
$word found $times times on page a b c....
where $word... (3 Replies)
Hi friends,
I have a small problem I would like your help with. I have a syslog server configured to send me email automatically. I got a small Perl script to help with this and tested it by sending alerts to root, which worked without any problems.
Now I want to send it outside, I... (4 Replies)
Hi all,
I have an snmpd.conf file as below. At the "SECTION: Trap Destinations" line I want to add "trap2dest <IP>:162 <com_str>" on a new line. For this I wrote the following code:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
FILE *fp;
ssize_t read_char_count = 0;
... (2 Replies)
Hi, I am new to shell scripting. I need to write a shell script that meets the following needs. A prompt reply would be highly appreciated.
A script that will compare two config files and produce 2 outputs: the actual config file and a report indicating the changes made.
OS: SUSE Linux ver 10.3.
... (4 Replies)
Hello,
I have two files. File1 or the master file contains two columns separated by a delimiter:
a=b
b=d
e=f
g=h
File 2 which is the file to be processed has only a single column
a
h
c
b
What I need is an awk script to identify unique names from file 2 which are not found in the... (6 Replies)
Hi,
I have a requirement to produce a report on processes with high CPU utilization and on processes that stay on the CPU for a long time (long-running queries). The report should be appended to the files every 3 minutes. I used prstat to pull the top 5 and found the following result.
... (3 Replies)
I am compiling a synonym dictionary which has the following structure
Headword=Synonym1,Synonym2 and so on, with each synonym separated by a comma.
As is usual in such cases, manual preparation of synonyms leads to repeated synonyms, which results in dupes as in the example below:... (3 Replies)
Hi,
I have the following command in place
nawk -F, '!a++' file > file.uniq
It has been working perfectly as required, removing duplicates by considering only the first 3 fields. Recently it has started giving the error below:
bash-3.2$ nawk -F, '!a++'... (17 Replies)
Dear all,
I have a large dictionary database which has the following structure
source word=target word
e.g.
book=livre
Since the database is very large, in spite of all the care taken it sometimes happens that the source word is repeated
e.g.
book=livre
book=tome
Since I want to... (7 Replies)
I have a large database which has the following structure
a=b
where a is one language and b is the other and = is the delimiter
Since the data treats of language, homographs occur i.e. the same word on the left hand side can map in two different entries to two different glosses on the right... (3 Replies)
Discussion started by: gimley
3 Replies
Gedcom::WebServices(3pm) User Contributed Perl Documentation Gedcom::WebServices(3pm)
NAME
Gedcom::WebServices - Basic web service routines for Gedcom.pm
Version 1.16 - 24th April 2009
SYNOPSIS
wget -qO - http://www.example.com/ws/plain/my_family/i9/name
DESCRIPTION
This module provides web service access to a GEDCOM file in conjunction with mod_perl. Using it, a request for information can be made in
the form of a URL specifying the GEDCOM file to be used, which information is required and the format in which the information is to be
delivered. This information is then returned in the specified format.
There are currently three supported formats:
o plain - no markup
o XML
o JSON
URLs
The URLs used to access the web services take the following forms:
$BASEURL/$FORMAT/$GEDCOM/$XREF/requested/information
$BASEURL/$FORMAT/$GEDCOM?search=search_criteria
BASEURL
The base URL to access the web services.
FORMAT
The format in which to return the results.
GEDCOM
The name of the GEDCOM file to use (the extension .ged is assumed).
XREF
The xref of the record about which information is required. XREFs can be obtained initially from a search, and subsequently from
certain queries.
requested/information
The information requested. This is in the same format as that taken by the get_value method.
search_criteria
An individual to search for. This is in the same format as that taken by the get_individual method.
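The two URL forms above can be assembled from their parts as in the following sketch. The helper names `record_url` and `search_url` are hypothetical and are not part of Gedcom::WebServices; the base URL is the one from the SYNOPSIS, and no URL escaping is attempted.

```perl
use strict;
use warnings;

# Build a record URL: $BASEURL/$FORMAT/$GEDCOM/$XREF/requested/information
# Any trailing arguments are the path components of the requested information.
sub record_url {
    my ($base, $format, $gedcom, $xref, @info) = @_;
    return join "/", $base, $format, $gedcom, $xref, @info;
}

# Build a search URL: $BASEURL/$FORMAT/$GEDCOM?search=search_criteria
sub search_url {
    my ($base, $format, $gedcom, $criteria) = @_;
    return "$base/$format/$gedcom?search=$criteria";
}

my $base = "http://www.example.com/ws";   # assumed base URL, as in the SYNOPSIS

my $record = record_url($base, "plain", "my_family", "i9", "name");
my $search = search_url($base, "json", "royal92", "elizabeth_ii");
print "$record\n$search\n";
```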
EXAMPLES
$ wget -qO - 'http://pjcj.sytes.net:8585/ws/plain/royal92?search=elizabeth_ii'
/ws/plain/royal92/I52
$ wget -qO - http://pjcj.sytes.net:8585/ws/plain/royal92/I52
0 @I52@ INDI
1 NAME Elizabeth_II Alexandra Mary/Windsor/
1 TITL Queen of England
1 SEX F
1 BIRT
2 DATE 21 APR 1926
2 PLAC 17 Bruton St.,London,W1,England
1 FAMS @F14@
1 FAMC @F12@
$ wget -qO - http://pjcj.sytes.net:8585/ws/plain/royal92/I52/name
Elizabeth_II Alexandra Mary /Windsor/
$ wget -qO - http://pjcj.sytes.net:8585/ws/plain/royal92/I52/birth/date
21 APR 1926
$ wget -qO - http://pjcj.sytes.net:8585/ws/plain/royal92/I52/children
/ws/plain/royal92/I58
/ws/plain/royal92/I59
/ws/plain/royal92/I60
/ws/plain/royal92/I61
$ wget -qO - http://pjcj.sytes.net:8585/ws/json/royal92/I52/name
{"name":"Elizabeth_II Alexandra Mary /Windsor/"}
$ wget -qO - http://pjcj.sytes.net:8585/ws/xml/royal92/I52/name
<NAME>Elizabeth_II Alexandra Mary /Windsor/</NAME>
$ wget -qO - http://pjcj.sytes.net:8585/ws/xml/royal92/I52
<INDI ID="I52">
<NAME>Elizabeth_II Alexandra Mary/Windsor/</NAME>
<TITL>Queen of England</TITL>
<SEX>F</SEX>
<BIRT>
<DATE>21 APR 1926</DATE>
<PLAC>17 Bruton St.,London,W1,England</PLAC>
</BIRT>
<FAMS REF="F14"/>
<FAMC REF="F12"/>
</INDI>
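A client might consume the JSON form shown above as in this minimal sketch. It uses the core JSON::PP module and hard-codes the response body from the example instead of fetching it, so the variable names and the absence of any HTTP code are assumptions made for illustration.

```perl
use strict;
use warnings;
use JSON::PP;   # core module since Perl 5.14; exports decode_json

# Response body copied from the JSON example; a real client would fetch it over HTTP.
my $body = '{"name":"Elizabeth_II Alexandra Mary /Windsor/"}';

my $data = decode_json($body);   # hash ref with a single "name" key
my $name = $data->{name};
print "$name\n";
```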
CONFIGURATION
Add a section similar to the following to your mod_perl config:
PerlWarn On
PerlTaintCheck On
PerlPassEnv GEDCOM_TEST
<IfDefine GEDCOM_TEST>
<Perl>
$Gedcom::TEST = 1;
</Perl>
</IfDefine>
<Perl>
use Apache::Status;
$ENV{PATH} = "/bin:/usr/bin";
delete @ENV{"IFS", "CDPATH", "ENV", "BASH_ENV"};
$Gedcom::DATA = $Gedcom::ROOT; # location of data stored on server
use lib "$Gedcom::ROOT/blib/lib";
use Gedcom::WebServices;
my $handlers =
[ qw
(
plain
xml
json
)
];
eval Gedcom::WebServices::_set_handlers($handlers);
# use Apache::PerlSections; print STDERR Apache::PerlSections->dump;
</Perl>
PerlTransHandler Gedcom::WebServices::_parse_uri
BUGS
Very probably.
See the BUGS file. And the TODO file.
VERSION
Version 1.16 - 24th April 2009
LICENCE
Copyright 2005-2009, Paul Johnson (paul@pjcj.net)
This software is free. It is licensed under the same terms as Perl itself.
The latest version of this software should be available from my homepage: http://www.pjcj.net
perl v5.14.2 2012-04-12 Gedcom::WebServices(3pm)