Use Regex to identify / format a complex string Post: 302651011

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

Gathering data from complex/large dataspreads .txt format

Hi, I'm working on gathering information stored in .txt files. The format of the data within the .txt files is shown in the picture uploaded with this post. Sections like the one pictured are repeated (with different data, same format) many times within each .txt file but each section is of data...

2. Shell Programming and Scripting

How to identify whether the script is in Unix format or not ?

Hi all, I have the following scenario in my environment: developers copy files from Windows to a Linux box, and sometimes they forget to run the dos2unix utility on the copied file. Because of this, the script fails during execution. Most of the failures are due to the dos2unix format...

3. Shell Programming and Scripting

Complex Regex Perl

Hi, the Perl snippet below replaces any three-letter string at the beginning with a specified two-letter string, but what if I want to modify only certain characters? For example: ABC - AB CAB - AB AAA - No Modifications 1AB - AB AB8 - AB Whatever coming before or after of AB only have...

4. Shell Programming and Scripting

Regex to identify a full-stop as a sentence delimiter

Hello, splitting a sentence on the full stop, question mark, or exclamation mark is a common device. The question mark and exclamation mark do not pose much of a problem, but the full stop as a sentence delimiter raises certain issues because of its varied use: just to name a few. Standard parsers...

5. Shell Programming and Scripting

Regex to identify word in second position on a line

I am interested in finding a regex to find a word in second position on a line. The word in question is या I tried the following PERL EXPRESSION but it did not work: ] या or ^\W या But both gave Null results I am giving below a Sample file: देना या सौंपना=delegate तह जमना या...

6. Shell Programming and Scripting

Identify lines with wrong format in a file and fix

Gurus, I have a data file with a fixed number of columns, say 101. One description column contains foreign characters, and sometimes those special characters are translated to a newline character, which causes the process to fail. I am using the following awk...

7. Shell Programming and Scripting

Regex to identify unique words in a dictionary database

Hello, I have a dictionary which I am building for the Open Source Community. The data structure is as under HEADWORD=PARTOFSPEECH=ENGLISH MEANING as shown in the example below अ=m=Prefix signifying negation. अँहँ=ind=Interjection expressing disapprobation. अं=int=An interjection...

8. Shell Programming and Scripting

Regex to identify illegal characters in a perso-arabic database

I am working on Sindhi: a perso-Arabic script and since it shares the Unicode-block with over 400 other languages, quite often the database contains characters which are not wanted: illegal characters. I have identified the character set of Sindhi which is given below: For clarity's sake, each...

9. UNIX for Beginners Questions & Answers

Regex to identify pattern

Hi, in a file I have strings on multiple lines, like below: <?=test.getObjectName("L", "testTBL","D") ?> <?=test.getObjectName("L", "testTBL","testDB", "D") ?> I want to use a regex to search for the pattern "<?=test.getObjectName...?>". If the parentheses contain 3 parameters, then return the 2nd...

10. UNIX for Beginners Questions & Answers

Help with understanding this regex in a Perl script parsing a 'complex' string

Hi, I need some guidance with understanding the Perl script below. I am not the author of the script, and the author has not left any documentation. I suppose it is meant to be 'easy' if you're a Perl or regex guru. I am having a problem understanding what regex to use. The script does...

LEARN ABOUT DEBIAN

locale::script

Locale::Script(3perl)					 Perl Programmers Reference Guide				     Locale::Script(3perl)

NAME

       Locale::Script - standard codes for script identification

SYNOPSIS

	  use Locale::Script;

	  $script  = code2script('phnx');		      # 'Phoenician'
	  $code    = script2code('Phoenician'); 	      # 'Phnx'
	  $code    = script2code('Phoenician',
				 LOCALE_SCRIPT_NUMERIC);      # 115

	  @codes   = all_script_codes();
	  @scripts = all_script_names();

DESCRIPTION

       The "Locale::Script" module provides access to standard codes used for identifying scripts, such as those defined in ISO 15924.

       Most of the routines take an optional additional argument which specifies the code set to use. If not specified, the default ISO 15924
       four-letter codes will be used.

SUPPORTED CODE SETS

       There are several different code sets you can use for identifying scripts. The ones currently supported are:

       alpha
	   This is a set of four-letter (capitalized) codes from ISO 15924 such as 'Phnx' for Phoenician.

	   This code set is identified with the symbol "LOCALE_SCRIPT_ALPHA".

	   The Zxxx, Zyyy, and Zzzz codes are not used.

	   This is the default code set.

       numeric
	   This is a set of three-digit numeric codes from ISO 15924 such as 115 for Phoenician.

	   This code set is identified with the symbol "LOCALE_SCRIPT_NUMERIC".
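
	   As a minimal sketch of selecting a code set, the lookup below passes the numeric symbol as the optional second argument to code2script. The helper name name_for_numeric is hypothetical and not part of the module. The module is loaded at runtime because Locale::Codes is no longer bundled with perl from 5.28 onward, so the script should still run (and report no result) where it is not installed.

```perl
use strict;
use warnings;

# Load at runtime so a missing module does not abort compilation.
my $have_module = eval { require Locale::Script; Locale::Script->import; 1 };

# Hypothetical helper: script name for an ISO 15924 three-digit code.
# Returns undef when the module is unavailable or the code is unknown.
sub name_for_numeric {
    my ($number) = @_;
    return unless $have_module;
    return scalar code2script($number, LOCALE_SCRIPT_NUMERIC());
}

my $name = name_for_numeric(115);
print defined $name
    ? "115 => $name\n"
    : "no result (module missing or code unknown)\n";
```

	   Leaving out the second argument would make the same call interpret its input as a four-letter alpha code, since that is the default code set.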

ROUTINES

       code2script ( CODE [,CODESET] )
       script2code ( NAME [,CODESET] )
       script_code2code ( CODE ,CODESET ,CODESET2 )
       all_script_codes ( [CODESET] )
       all_script_names ( [CODESET] )
       Locale::Script::rename_script  ( CODE ,NEW_NAME [,CODESET] )
       Locale::Script::add_script  ( CODE ,NAME [,CODESET] )
       Locale::Script::delete_script  ( CODE [,CODESET] )
       Locale::Script::add_script_alias  ( NAME ,NEW_NAME )
       Locale::Script::delete_script_alias  ( NAME )
       Locale::Script::rename_script_code  ( CODE ,NEW_CODE [,CODESET] )
       Locale::Script::add_script_code_alias  ( CODE ,NEW_CODE [,CODESET] )
       Locale::Script::delete_script_code_alias  ( CODE [,CODESET] )
	   These routines are all documented in the Locale::Codes man page.
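
	   The synopsis does not show script_code2code, so here is a hedged sketch of converting a code from one set to the other, following the argument order listed above ( CODE ,CODESET ,CODESET2 ). The helper name alpha_to_numeric is hypothetical, and the sketch assumes that script_code2code and both code set symbols are exported by default; consult the Locale::Codes man page for the authoritative behaviour.

```perl
use strict;
use warnings;

# Load at runtime so a missing module does not abort compilation.
my $have_module = eval { require Locale::Script; Locale::Script->import; 1 };

# Hypothetical helper: convert a four-letter code to its three-digit form.
# Returns undef when the module is unavailable or the code is unknown.
sub alpha_to_numeric {
    my ($alpha) = @_;
    return unless $have_module;
    return scalar script_code2code(
        $alpha,
        LOCALE_SCRIPT_ALPHA(),      # code set of the input
        LOCALE_SCRIPT_NUMERIC(),    # code set of the result
    );
}

my $numeric = alpha_to_numeric('Phnx');
print defined $numeric
    ? "Phnx => $numeric\n"
    : "no result (module missing or code unknown)\n";
```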

SEE ALSO

       Locale::Codes
       Locale::Constants
       http://www.unicode.org/iso15924/
	   Home page for ISO 15924.

AUTHOR

       See Locale::Codes for full author history.

       Currently maintained by Sullivan Beck (sbeck@cpan.org).

COPYRIGHT

	  Copyright (c) 1997-2001 Canon Research Centre Europe (CRE).
	  Copyright (c) 2001-2010 Neil Bowers
	  Copyright (c) 2010-2011 Sullivan Beck

       This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

perl v5.14.2							    2011-09-26						     Locale::Script(3perl)
