Sponsored Content
Top Forums Shell Programming and Scripting CREATING A SYLLABLE CONCORDANCE WITH POSITIONAL VARIANTS Post 302543707 by gimley on Monday 1st of August 2011 10:46:36 PM
Old 08-01-2011
Hello,
With a little help from colleagues, I finally managed to get the concordance going. Here is the code in case someone else would like to use it:
Code:
#! /usr/bin/perl

use strict;  # These two lines save you endless trouble
use warnings; # without them typos and such errors get missed

open (my $corpus_file, '<', 'Corpus'); # Created a test corpus with just the contained lines
# $/="\r\n"; # Again with the DOS files
chomp(my @corpus = (<$corpus_file>)); # Load the corpus file into an array for faster access
open (my $syllables_file, '<', 'Syllables');
while(<$syllables_file>){
    chomp(my $syllable = $_);
    my $count = 0;
    my $init = my $med = my $fin = my $stdalone = "NONE";
    for my $word (@corpus) {
        if ( $word =~ /^$syllable.+/) {
            if ($init eq "NONE") {
                $init = $word;
                $count++;
            }
        }
        elsif ($word =~ /.+$syllable.+/) {
            if ($med eq "NONE") {
                $med = $word;
                $count++;
            }
        }
        elsif ($word =~ /.+$syllable$/) {
            if ($fin eq "NONE") {
                $fin = $word;
                $count++;
            }
        }
        elsif ($word =~ /^$syllable$/) {
            if ($stdalone eq "NONE") {
                $stdalone = $word;
                $count++;
            }
        }
        last if $count == 4;
    }
    print "$syllable\nInitial $init\nMedial $med\nFinal $fin\nStandalone $stdalone\n";
    #print "$init\t$med\t$fin\t$stdalone\n";
}

Many thanks for the information re. Regex.

Last edited by Scott; 08-02-2011 at 01:14 AM.. Reason: Code tags, please...
 

5 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Creating a syllable concordance

Hello, I have two files. The first file contains specific syllables of a language (Hindi) and the second file contains a large database from which these syllables have been culled. The syllable file which has syllables in Hindi has one syllable per line and the corpus file has a data... (8 Replies)
Discussion started by: gimley
8 Replies

2. Shell Programming and Scripting

[All variants] remove first pair of parentheses

How to remove first pair of parentheses and content in them from the beginning of the line? Here's the list: (ok)-test (ok)-test-(ing) (some)-test-(ing)-test test-(ing) Desired result: test test-(ing) test-(ing)-test test-(ing) Here's what I already tried with GNU sed: sed -e... (6 Replies)
Discussion started by: useretail
6 Replies

3. Shell Programming and Scripting

Writing a clustering concordance for a Perso-Arabic script

I am working on a database of a language using Arabic Script. One of the major issues is that the shape of the characters changes according to their initial, medial or final positioning. Another major issue is that of the clustering of vowels within the word: the clustering changes totally the... (9 Replies)
Discussion started by: gimley
9 Replies

4. Shell Programming and Scripting

[All variants] Change settings

Hi, I have a big settings confg (file attached). There are a few separate tasks that I have to accomplish. All scripting/programming languages are appreciated. 1. I need to parse all values and output to stdout. Sample output (truncated): VALUEA 2017-01-01 Lores ipsum Lorem ipsum dolor sit... (11 Replies)
Discussion started by: useretail
11 Replies

5. UNIX for Beginners Questions & Answers

Merge 4 bim files by keeping only the overlapping variants (unique rs values )

Dear community, I am facing a problem and I kindly ask your help: I have 4 different data sets consisted from 3 different types of array. On each file, column 1 is chromosome position, column 2 is SNP id etc... Lets say I have the following (bim) datasets: x2014: 1 rs3094315... (4 Replies)
Discussion started by: fondan
4 Replies
shells(4)							   File Formats 							 shells(4)

NAME
shells - shell database SYNOPSIS
/etc/shells DESCRIPTION
The shells file contains a list of the shells on the system. Applications use this file to determine whether a shell is valid. See getuser- shell(3C). For each shell a single line should be present, consisting of the shell's path, relative to root. A hash mark (#) indicates the beginning of a comment; subsequent characters up to the end of the line are not interpreted by the routines which search the file. Blank lines are also ignored. The following default shells are used by utilities: /bin/bash, /bin/csh, /bin/jsh, /bin/ksh, /bin/pfcsh, /bin/pfksh, /bin/pfsh, /bin/sh, /bin/tcsh, /bin/zsh, /sbin/jsh, /sbin/sh, /usr/bin/bash, /usr/bin/csh, /usr/bin/jsh, /usr/bin/ksh, /usr/bin/pfcsh, /usr/bin/pfksh, /usr/bin/pfsh, and /usr/bin/sh, /usr/bin/tcsh, /usr/bin/zsh. Note that /etc/shells overrides the default list. Invalid shells in /etc/shells may cause unexpected behavior (such as being unable to log in by way of ftp(1)). FILES
/etc/shells lists shells on system SEE ALSO
vipw(1B), ftpd(1M), sendmail(1M), getusershell(3C), aliases(4) SunOS 5.10 4 Jun 2001 shells(4)
All times are GMT -4. The time now is 02:08 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy