CREATING A SYLLABLE CONCORDANCE WITH POSITIONAL VARIANTS
Post 302543658 by DGPickett on Monday 1st of August 2011 03:47:11 PM
Well, regex syntax for word boundaries varies; see Regex Tutorial - \b Word Boundaries.

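I used to write \< and \> for word boundaries, but Perl-style regexes use a single \b for both the start and the end of a word, and after decades that is what many tools expect. GNU grep and sed accept both forms, so check which one your tool supports.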

So, you need to check for
  • standalone \<a\>
  • initial \<a[a-z]
  • final [a-z]a\>
  • medial [a-z]a[a-z]
but since the [a-z] checks are more expensive, you may be able to test in this order instead: first \<a\> for standalone; if that fails, \<a means initial and a\> means final; anything that matches none of these is medial.
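The ordering above can be sketched in Python as a rough illustration. Python's re module has no \< or \>, so \b stands in for both here. The function name classify_occurrences and the sample text are made up for this example, and the Latin letter a is used as in the post; other scripts may need a different notion of word boundary.

```python
import re
from collections import Counter

def classify_occurrences(text, syllable="a"):
    """Count standalone, initial, final and medial occurrences of syllable.

    Hypothetical helper: tests are tried in the order suggested in the post,
    so no explicit [a-z] neighbour check is needed.
    """
    s = re.escape(syllable)
    ordered_tests = [
        ("standalone", re.compile(rf"\b{s}\b")),  # plays the role of \<a\>
        ("initial", re.compile(rf"\b{s}")),       # plays the role of \<a
        ("final", re.compile(rf"{s}\b")),         # plays the role of a\>
    ]
    counts = Counter()
    for hit in re.finditer(s, text):
        for label, pattern in ordered_tests:
            # match() with a start position still sees the preceding
            # character, so \b is evaluated in the context of the full text.
            if pattern.match(text, hit.start()):
                counts[label] += 1
                break
        else:
            # None of the boundary tests matched: letters on both sides.
            counts["medial"] += 1
    return counts

if __name__ == "__main__":
    print(classify_occurrences("a banana at la"))
```

Because the first matching test wins, a standalone a is never also counted as initial or final.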
 

uri(n)              Tcl Uniform Resource Identifier Management              uri(n)

NAME
       uri - URI utilities

SYNOPSIS
       package require Tcl 8.2
       package require uri ?1.1.1?

       uri::split url
       uri::join ?key value?...
       uri::resolve base url
       uri::isrelative url
       uri::geturl url ?options...?
       uri::canonicalize uri
       uri::register schemeList script

DESCRIPTION
       This package contains two parts. First it provides regular expressions for a number of url/uri schemes. Second it provides a number of commands for manipulating urls/uris and fetching data specified by them. For the latter this package analyses the requested url/uri and then dispatches it to the appropriate package (http, ftp, ...) for actual fetching.

COMMANDS
       uri::split url
              uri::split takes a single url, decodes it and then returns a list of key/value pairs suitable for array set containing the constituents of the url. If the scheme is missing from the url it defaults to http. Currently only the schemes http, ftp, mailto, urn and file are supported. See section EXTENDING on how to expand that range.

       uri::join ?key value?...
              uri::join takes a list of key/value pairs (generated by uri::split, for example) and returns the canonical url they represent. Currently only the schemes http, ftp, mailto, urn and file are supported. See section EXTENDING on how to expand that range.

       uri::resolve base url
              uri::resolve resolves the specified url relative to base. In other words: A non-relative url is returned unchanged, whereas for a relative url the missing parts are taken from base and prepended to it. The result of this operation is returned. For an empty url the result is base.

       uri::isrelative url
              uri::isrelative determines whether the specified url is absolute or relative.

       uri::geturl url ?options...?
              uri::geturl decodes the specified url and then dispatches the request to the package appropriate for the scheme found in the url. The command assumes that the package to handle the given scheme either has the same name as the scheme itself (including possible capitalization) followed by ::geturl, or, in case of this failing, has the same name as the scheme itself (including possible capitalization). It further assumes that whatever package was loaded provides a geturl-command in the namespace of the same name as the package itself. This command is called with the given url and all given options. Currently geturl does not handle any options itself.

              Note: file-urls are an exception to the rule described above. They are handled internally.

              It is not possible to specify results of the command. They depend on the geturl-command for the scheme the request was dispatched to.

       uri::canonicalize uri
              uri::canonicalize returns the canonical form of a URI. The canonical form of a URI is one where relative path specifications, ie. . and .., have been resolved.

       uri::register schemeList script
              uri::register registers the first element of schemeList as a new scheme and the remaining elements as aliases for this scheme. It creates the namespace for the scheme and executes the script in the new namespace. The script has to declare variables containing the regular expressions relevant to the scheme. At least the variable schemepart has to be declared as that one is used to extend the variables keeping track of the registered schemes.

SCHEMES
       In addition to the commands mentioned above this package provides regular expression to recognize urls for a number of url schemes. For each supported scheme a namespace of the same name as the scheme itself is provided inside of the namespace uri containing the variable url whose contents are a regular expression to recognize urls of that scheme. Additional variables may contain regular expressions for parts of urls for that scheme. The variable uri::schemes contains a list of all supported schemes. Currently these are ftp, file, http, gopher, mailto, news, wais and prospero.

EXTENDING
       Extending the range of schemes supported by uri::split and uri::join is easy because both commands do not handle the request by themselves but dispatch it to another command in the uri namespace using the scheme of the url as criterion. uri::split and uri::join call Split[string totitle <scheme>] and Join[string totitle <scheme>] respectively.

CREDITS
       Original code by Andreas Kupries. Modularisation by Steve Ball.

KEYWORDS
       uri, url, fetching information, www, http, ftp, mailto, gopher, wais, prospero, file

uri                                   1.1.1                                   uri(n)
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.