I am not an expert with Linux, but following various posts on this forum, I have been trying to write a script to match a pattern of characters occurring together in a file.
My file has approximately 200 million characters (upper and lower case), with about 50 characters per line. I have merged all the lines together to make it one line using
I now have all the characters in my file on a single line, without spaces.
I am trying to count the number of times the specific characters occur together. For example, in the file below
I am trying to look for the pattern 'tr' that occurs in the sentence. The script I have now is
The above script works perfectly fine for a small file, but when I try to run it on my actual file with more than 200 million characters, it takes ages to finish the task (I lost patience and did not check the total time taken).
Is there a way I can optimize the code?
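One common way to speed up this kind of count is to let `grep` do the scanning instead of looping in the shell. The following is only a sketch, not the script from the question (which is not shown above). It assumes a `grep` that supports `-o`, and the function name `count_pattern` and the file name in the comment are made up for illustration.

```shell
# count_pattern PATTERN FILE
# Hypothetical helper: counts non-overlapping occurrences of a fixed string.
# Newlines are stripped on the fly, so matches that span two input lines
# are still found and no separate "merge into one line" step is needed.
count_pattern() {
    tr -d '\n' < "$2" | grep -oF -- "$1" | wc -l | tr -d ' '
}

# Example call (assumed file name):
#   count_pattern tr myfile.txt
```

The match is case-sensitive; adding `-i` to the `grep` call would count upper- and lower-case variants together. Because `grep -o` reports non-overlapping matches, a pattern that can overlap itself (for example `aa` in `aaa`) would be undercounted.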
Next, I have been trying to get the position of each match. For example, in the above example file, 'tr' starts at the 4th and 27th positions. I just want the numbers as output.
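For the positions, one possible approach is `grep -o` combined with `-b`, which (in GNU grep) prints the 0-based byte offset of each match. This is a minimal sketch under a few assumptions: GNU grep is available, the characters are single-byte, and the file has already been merged into one line. The name `match_positions` is hypothetical.

```shell
# match_positions PATTERN FILE
# Hypothetical helper: prints the 1-based start position of every match,
# one number per line. grep -b counts from 0, so 1 is added to each offset.
match_positions() {
    grep -obF -- "$1" "$2" | awk -F: '{ print $1 + 1 }'
}

# Example call (assumed file name):
#   match_positions tr myfile.txt
```

On a one-line file in which 'tr' begins at the 4th and 27th characters, this prints 4 and 27 on separate lines. If the file still contains line breaks, each newline counts as one byte in the reported offsets.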
Hi All,
I am pretty new to pattern matching and extraction using shell scripting. Could anyone please help me extract the word matching a pattern from a line in bash?
Input Sample (can vary between any of the 3 samples below):
1) Adaptec SCSI RAID 5445
2) Adaptec SCSI 5445S RAID
3)... (8 Replies)
I have a file with the below format,
GS*8*****
ST*1********
A*
B*
E*
RMR*123455(This is the unique number to locate this row)
F*
SE*1***
GE**
GS*9*****
ST*2
H*
J*
RMR*567889(This is the unique number to locate this row)
L*
SE*
GE***** (16 Replies)
I have a file with entries like
@ram@sham@sita
@krishan@kumar
@deep@kumar@hello@sham
In this file, each line has a different number of '@' characters.
I need to fetch the substring after the last '@',
like
sita
kumar
sham
thanks in advance (3 Replies)
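A minimal sketch of one way to do this with `sed`: the greedy `.*@` matches everything up to and including the last '@', and the substitution deletes it. The function name `after_last_at` is made up for illustration.

```shell
# after_last_at FILE
# Hypothetical helper: prints what follows the last '@' on each line.
# The greedy ".*" ensures the match extends to the final '@'.
after_last_at() {
    sed 's/.*@//' "$1"
}
```

An equivalent alternative would be `awk -F@ '{ print $NF }'`, which prints the last '@'-separated field.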
Hi,
I have a string that looks like the following:
USERS 32767.9844 UNDOTBS1 32767.9844 SYSAUX 32767.9844 SYSTEM 32767.9844 EMS 8192 EMS 8192 EMS_INDEXES 4096 EMS_INDEXES 4096 8 rows selected.
How do I extract a substring to get the expected output as follows:
EMS 8192
EMS_INDEXES 4096
... (3 Replies)
I am facing a problem and I would be grateful if you could help me.
I have a list of words like
And I have a datafile like
the box of
the box of tissues out of
of tissues out of
the book, the
the book, the pen and the
the pen and the
I want to find Patterns of “x.*x” where... (2 Replies)
Here is what I want to achieve. Consider a file with the contents below. The file is large, about 60 MB.
cat dump.sql
INSERT INTO `table1` (`id`, `action`, `date`, `descrip`, `lastModified`) VALUES (1,'Change','2011-05-05 00:00:00','Account Updated','2012-02-10... (10 Replies)
The sample file:
dept1: user1,user2,user3
dept2: user4,user5,user6
dept3: user7,user8,user9
I want to match by '/^dept2.*/' but don't want to have substring 'dept2:' in output. How to compose such regex? (8 Replies)
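One way to get this effect is to match the prefix and remove it in the same step rather than with a single regex. The sketch below uses `sed -n` with the `p` flag so that only lines where the substitution succeeded are printed. The function name `dept_users` is hypothetical, and it assumes the department name contains no regex metacharacters.

```shell
# dept_users DEPT FILE
# Hypothetical helper: prints the part after "DEPT:" (and any following
# spaces) for lines that start with DEPT; all other lines are suppressed.
dept_users() {
    sed -n "s/^$1: *//p" "$2"
}
```

With the sample file above, `dept_users dept2 file` prints `user4,user5,user6`.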
Hi All,
My problem is as follows.
I have a file which contains just one row, with data like
PO_CREATE12457888888888889SK1234567878744551111111111SK89456321145789955455555SK8888888815788852222
I want to extract SK12345678
SK89456321
SK88888888
So basically SK and next 8... (4 Replies)
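A possible sketch with `grep -o`, which prints each match on its own line. It assumes that the eight characters following `SK` are always digits (the post is cut off before it says so), and the function name `extract_sk` is made up for illustration.

```shell
# extract_sk FILE
# Hypothetical helper: prints every "SK" followed by exactly eight digits.
# The digits-only pattern is an assumption based on the sample data.
extract_sk() {
    grep -oE 'SK[0-9]{8}' "$1"
}
```

On the sample row above this prints SK12345678, SK89456321 and SK88888888, one per line.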
Hi,
I know how to replace a string with another in a file.
But I wish to replace the string pattern below
EncryptedPassword="{gafgfa}]\asffafsf312a" i.e. EncryptedPassword="<any random string>"
to
EncryptedPassword=""
i.e. replace the random password with an empty string.
Can you... (3 Replies)
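A minimal `sed` sketch for this kind of replacement: `[^"]*` matches any run of characters up to the closing quote, whatever the password contains. It assumes the password itself never contains a double quote, and the function name `blank_password` is hypothetical. The result goes to standard output, and the input file is left unchanged.

```shell
# blank_password FILE
# Hypothetical helper: empties the value of every EncryptedPassword="..."
# attribute and writes the result to stdout.
# Assumes the password never contains a double quote.
blank_password() {
    sed 's/EncryptedPassword="[^"]*"/EncryptedPassword=""/g' "$1"
}
```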
Locale::Script(3perl) Perl Programmers Reference Guide Locale::Script(3perl)
NAME
Locale::Script - standard codes for script identification
SYNOPSIS
use Locale::Script;
$script = code2script('phnx'); # 'Phoenician'
$code = script2code('Phoenician'); # 'Phnx'
$code = script2code('Phoenician',
LOCALE_SCRIPT_NUMERIC); # 115
@codes = all_script_codes();
@scripts = all_script_names();
DESCRIPTION
The "Locale::Script" module provides access to standard codes used for identifying scripts, such as those defined in ISO 15924.
Most of the routines take an optional additional argument which specifies the code set to use. If not specified, the default ISO 15924
four-letter codes will be used.
SUPPORTED CODE SETS
There are several different code sets you can use for identifying scripts. The ones currently supported are:
alpha
This is a set of four-letter (capitalized) codes from ISO 15924 such as 'Phnx' for Phoenician.
This code set is identified with the symbol "LOCALE_SCRIPT_ALPHA".
The Zxxx, Zyyy, and Zzzz codes are not used.
This is the default code set.
numeric
This is a set of three-digit numeric codes from ISO 15924 such as 115 for Phoenician.
This code set is identified with the symbol "LOCALE_SCRIPT_NUMERIC".
ROUTINES
code2script ( CODE [,CODESET] )
script2code ( NAME [,CODESET] )
script_code2code ( CODE ,CODESET ,CODESET2 )
all_script_codes ( [CODESET] )
all_script_names ( [CODESET] )
Locale::Script::rename_script ( CODE ,NEW_NAME [,CODESET] )
Locale::Script::add_script ( CODE ,NAME [,CODESET] )
Locale::Script::delete_script ( CODE [,CODESET] )
Locale::Script::add_script_alias ( NAME ,NEW_NAME )
Locale::Script::delete_script_alias ( NAME )
Locale::Script::rename_script_code ( CODE ,NEW_CODE [,CODESET] )
Locale::Script::add_script_code_alias ( CODE ,NEW_CODE [,CODESET] )
Locale::Script::delete_script_code_alias ( CODE [,CODESET] )
These routines are all documented in the Locale::Codes man page.
SEE ALSO
Locale::Codes
Locale::Constants
http://www.unicode.org/iso15924/
Home page for ISO 15924.
AUTHOR
See Locale::Codes for full author history.
Currently maintained by Sullivan Beck (sbeck@cpan.org).
COPYRIGHT
Copyright (c) 1997-2001 Canon Research Centre Europe (CRE).
Copyright (c) 2001-2010 Neil Bowers
Copyright (c) 2010-2011 Sullivan Beck
This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
perl v5.14.2 2011-09-26 Locale::Script(3perl)