Sponsored Content
Top Forums Shell Programming and Scripting CREATING A SYLLABLE CONCORDANCE WITH POSITIONAL VARIANTS Post 302543707 by gimley on Monday 1st of August 2011 10:46:36 PM
Old 08-01-2011
Hello,
With a little help from colleagues, I finally managed to get the concordance going. Here is the code in case someone else would like to use it:
Code:
#! /usr/bin/perl

use strict;  # These two lines save you endless trouble
use warnings; # without them typos and such errors get missed

open (my $corpus_file, '<', 'Corpus'); # Created a test corpus with just the contained lines
# $/="\r\n"; # Again with the DOS files
chomp(my @corpus = (<$corpus_file>)); # Load the corpus file into an array for faster access
open (my $syllables_file, '<', 'Syllables');
while(<$syllables_file>){
    chomp(my $syllable = $_);
    my $count = 0;
    my $init = my $med = my $fin = my $stdalone = "NONE";
    for my $word (@corpus) {
        if ( $word =~ /^$syllable.+/) {
            if ($init eq "NONE") {
                $init = $word;
                $count++;
            }
        }
        elsif ($word =~ /.+$syllable.+/) {
            if ($med eq "NONE") {
                $med = $word;
                $count++;
            }
        }
        elsif ($word =~ /.+$syllable$/) {
            if ($fin eq "NONE") {
                $fin = $word;
                $count++;
            }
        }
        elsif ($word =~ /^$syllable$/) {
            if ($stdalone eq "NONE") {
                $stdalone = $word;
                $count++;
            }
        }
        last if $count == 4;
    }
    print "$syllable\nInitial $init\nMedial $med\nFinal $fin\nStandalone $stdalone\n";
    #print "$init\t$med\t$fin\t$stdalone\n";
}

Many thanks for the information re. Regex.

Last edited by Scott; 08-02-2011 at 01:14 AM.. Reason: Code tags, please...
 

5 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Creating a syllable concordance

Hello, I have two files. The first file contains specific syllables of a language (Hindi) and the second file contains a large database from which these syllables have been culled. The syllable file which has syllables in Hindi has one syllable per line and the corpus file has a data... (8 Replies)
Discussion started by: gimley
8 Replies

2. Shell Programming and Scripting

[All variants] remove first pair of parentheses

How to remove first pair of parentheses and content in them from the beginning of the line? Here's the list: (ok)-test (ok)-test-(ing) (some)-test-(ing)-test test-(ing) Desired result: test test-(ing) test-(ing)-test test-(ing) Here's what I already tried with GNU sed: sed -e... (6 Replies)
Discussion started by: useretail
6 Replies

3. Shell Programming and Scripting

Writing a clustering concordance for a Perso-Arabic script

I am working on a database of a language using Arabic Script. One of the major issues is that the shape of the characters changes according to their initial, medial or final positioning. Another major issue is that of the clustering of vowels within the word: the clustering changes totally the... (9 Replies)
Discussion started by: gimley
9 Replies

4. Shell Programming and Scripting

[All variants] Change settings

Hi, I have a big settings confg (file attached). There are a few separate tasks that I have to accomplish. All scripting/programming languages are appreciated. 1. I need to parse all values and output to stdout. Sample output (truncated): VALUEA 2017-01-01 Lores ipsum Lorem ipsum dolor sit... (11 Replies)
Discussion started by: useretail
11 Replies

5. UNIX for Beginners Questions & Answers

Merge 4 bim files by keeping only the overlapping variants (unique rs values )

Dear community, I am facing a problem and I kindly ask your help: I have 4 different data sets consisted from 3 different types of array. On each file, column 1 is chromosome position, column 2 is SNP id etc... Lets say I have the following (bim) datasets: x2014: 1 rs3094315... (4 Replies)
Discussion started by: fondan
4 Replies
TM(1)							      General Commands Manual							     TM(1)

NAME
tm - meditate SYNOPSIS
tm [-number] [time] DESCRIPTION
Tm causes UNIX to go into a state in which all current activities are suspended for time minutes (default is 20). At the beginning of this period, tm generates a set of number (default 3) transcendental numbers. Then it prints a two- to six-character nonsense syllable (mantra) on every logged-in terminal (a different syllable on each terminal). For the remainder of the time interval, it repeats these numbers to itself, in random order, binary digit by binary digit (memory permitting), while simultaneously contemplating its kernel. It is suggested that users utilize the time thus provided to do some meditating themselves. One possibility is to close one's eyes, attempt to shut out one's surroundings, and concentrate on the mantra supplied by tm. At the end of the time interval, UNIX returns to the suspended activities, refreshed and reinvigorated. Hopefully, so do the users. FILES
Tm does not use any files, in an attempt to isolate itself from external influences and distractions. DIAGNOSTICS
If disturbed for any reason during the interval of meditation, tm locks the keyboard on every terminal, prints an unprintable expletive, and unlocks the keyboard. Subsequent UNIX operation may be marked by an unusual number of lost or scrambled files and dropped lines. BUGS
If number is greater than 32,767 (decimal), tm appears to generate rational numbers for the entire time interval, after which the behavior of the system may be completely irrational (i.e., transcendental). WARNING
Attempts to use flog(1) on tm are invariably counterproductive. TM(1)
All times are GMT -4. The time now is 01:30 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy