This is my personal feeling: it is well known that any dedicated compiled program, be it C, C++, Pascal, or other, usually benefits from increased execution speed compared to, e.g., scripts. But there is a tradeoff in flexibility versus e.g. awk or perl, especially when it comes to text analysis and processing and to adapting or modelling algorithms, which is what those tools were specifically designed for.
I'd be very interested in any results comparing execution times of your C++ with an equivalent awk script, as both should be using the same regex library routines.
I hadn't seen this post, else I'd have responded sooner...
I'm not sure how to set a timer within the awk program to time only the replacement itself, so the results are based on each program opening the file, searching for the key, replacing the text, and closing.
The word 'ad' was used as it has a long definition and is close to the beginning of the file (less search time).
The timer program:
The results:
Awk (regex):
1000x @ 2m08.22s real 2m04.66s user 0m05.01s system
C++ (regex):
1000x @ 0m15.24s real 0m13.44s user 0m01.53s system (8.4x faster than Awk)
C++ (custom search and replace - I'll post this code on the forum):
1000x @ 0m03.37s real 0m02.54s user 0m00.62s system (4.5x faster than C++ regex)
Timed within the programs themselves, timing only the search and replace process:
C++ (regex):
1000x @ 0m12.26s real 0m05.82s user 0m00.00s system
RAM used: 3 MB
C++ (custom):
1000x @ 0m01.66s real 0m01.66s user 0m00.00s system
RAM used: > 1 MB
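For anyone wanting to repeat whole-process timings like the ones above, a harness along these lines might do. This is only a sketch and not the timer program used for the figures in this post; the helper name `run_n` and the file names in the usage comment are made up for illustration.

```shell
# run_n N CMD [ARGS...]: run CMD N times, discarding its standard output.
# Hypothetical helper; NOT the timer program from the post.
run_n() {
    n=$1
    shift
    i=0
    while [ "$i" -lt "$n" ]; do
        "$@" >/dev/null || return 1   # stop early if the command fails
        i=$((i + 1))
    done
}

# Example (bash or ksh, where `time` can wrap a shell function);
# replace.awk and dictionary.txt are placeholder names:
#   time run_n 1000 awk -f replace.awk dictionary.txt
```

Because each iteration starts a new process, this measures open/search/replace/close together, just like the first set of results.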
Hi all,
This problem has cost me half a day, and I still do not know how to solve it.
Any help will be appreciated. Thanks in advance.
I want to use a variable as the first parameter of awk's gsub function.
Example:
{
...
arr[i]=gsub(i,tolower(i),$1)
(the first argument would normally be enclosed in //)
...
} (1 Reply)
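One way this is commonly handled: when the first argument of gsub is a string variable rather than a /.../ literal, awk treats its contents as a dynamic regex. A minimal sketch, assuming the word arrives as a shell argument; the function name `lower_word` and the variable `pat` are invented for the example.

```shell
# lower_word WORD TEXT: lowercase every occurrence of WORD in the first
# field of TEXT. WORD is passed in as the awk variable `pat` and used as a
# dynamic regex, so regex metacharacters in it would need escaping.
lower_word() {
    printf '%s\n' "$2" | awk -v pat="$1" '{ gsub(pat, tolower(pat), $1); print }'
}

# Example:
#   lower_word FOO "FOO BAR"
```

Only the first field is touched, as in the question; the rest of the line is printed unchanged.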
Hello,
I have a variable that displays the following results from a JVM....
1602100K->1578435K
I would like to collect the value of 1578435 which is the value after a garbage collection. I've tried the following command but it looks like I can't get the > to work. Any suggestions as... (4 Replies)
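A possible approach, sketched under the assumption that the value always has the form shown (two numbers with a K suffix separated by ->): let awk split on the literal -> and strip the trailing K. The function name `after_gc` is made up for the example.

```shell
# after_gc VALUE: print the number that follows "->" in VALUE,
# without its trailing K. Assumes the format 1602100K->1578435K.
after_gc() {
    printf '%s\n' "$1" | awk -F'->' '{ sub(/K$/, "", $2); print $2 }'
}

# Example:
#   after_gc '1602100K->1578435K'
```

Keeping the whole awk program in single quotes stops the shell from treating > as a redirection.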
Hi all
I want to do a simple substitution in awk but I am getting unexpected output. My function accepts a time and then prints out a validation message if the time is valid. However, some times may include a : and I want to strip this out if it exists before I get to the validation. I have shown... (4 Replies)
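The poster's own code is cut off above, so the following is only a generic sketch of removing a colon before validation, not a fix for that code. The function name `strip_colon` is invented.

```shell
# strip_colon TIME: print TIME with every ":" removed (12:30 becomes 1230).
strip_colon() {
    printf '%s\n' "$1" | awk '{ gsub(/:/, ""); print }'
}

# Example:
#   strip_colon 12:30
```

With no third argument, gsub works on the whole record, so a value without a colon passes through unchanged.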
Hi,
Can someone please explain the following line? Please throw some light on the parts marked in red.
awk '{print $9}' ${FTP_LOG} | awk -v start=${START_DATE} 'BEGIN { FS = "." } { old_line1=$0; gsub(/\-/,""); if ( $3 >= start ) print old_line1 }' | awk -v end=${END_DATE} 'BEGIN { FS="." } {... (3 Replies)
I want to replace comma with space and "*646#" with space.
I am using the following code:
nawk -F"|" '{gsub(","," ",$3); gsub(/\*646\#/"," ",$3);print}' OFS="|" file
I am getting the following error:
Help is appreciated (5 Replies)
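The error text is not shown above, but one likely culprit is the stray double quote right after the second regex literal. A sketch of how the command might look with that removed, written with plain awk rather than nawk and wrapped in a made-up function name `clean_field3` so it can read from a pipe:

```shell
# clean_field3: read |-separated lines on stdin and, in field 3 only,
# replace each comma and each literal "*646#" with a space.
clean_field3() {
    awk -F'|' -v OFS='|' '{ gsub(/,/, " ", $3); gsub(/\*646#/, " ", $3); print }'
}

# Example:
#   printf 'a|b|x,y*646#z|d\n' | clean_field3
```

Only the * needs a backslash inside the regex; # has no special meaning there.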
Hey,
I would like to replace a string by a new one. The problem is that both strings should be variables to be flexible, because I have a lot of files (with the same structure, but in different folders).
for i in daysim_*
do
cd $i/5/
folder=`pwd |awk '{print $1}'`
awk '{ if... (3 Replies)
Hi, I want to print the first column with original value and without any double quotes
The output should look like
<original column>|<column without quotes>
$ cat a.txt
"20121023","19301229712","100397"
"20121023","19361629712","100778"
"20121030A","19361630412","100838"... (3 Replies)
Hello,
I'm trying to substitute a string with leading zero for all the records except the trailer record using awk command and with variables. The input file test_med1.txt has data like below
1234ABC...........................9200............LF... (2 Replies)
Hi ALL,
I want to replace string occurrences in my file "Config" using an external file named "Mapping" using awk.
$cat Config
! Configuration file for RAVI
! Configuration file for RACHANA
! Configuration file for BALLU
$cat Mapping
ravi:ram
rachana:shyam
ballu:hameed
The... (5 Replies)
Discussion started by: useless79
regex
regex(1F) FMLI Commands regex(1F)
NAME
regex - match patterns against a string
SYNOPSIS
regex [-e] [ -v "string"] [ pattern template] ... pattern [template]
DESCRIPTION
The regex command takes a string from the standard input, and a list of pattern / template pairs, and runs regex() to compare the string
against each pattern until there is a match. When a match occurs, regex writes the corresponding template to the standard output and
returns TRUE. The last (or only) pattern does not need a template. If that is the pattern that matches the string, the function simply
returns TRUE. If no match is found, regex returns FALSE.
The argument pattern is a regular expression of the form described in regex(). In most cases, pattern should be enclosed in single quotes
to turn off special meanings of characters. Note that only the final pattern in the list may lack a template.
The argument template may contain the strings $m0 through $m9, which will be expanded to the part of pattern enclosed in ( ... )$0 through
( ... )$9 constructs (see examples below). Note that if you use this feature, you must be sure to enclose template in single quotes so
that FMLI does not expand $m0 through $m9 at parse time. This feature gives regex much of the power of cut(1), paste(1), and grep(1), and
some of the capabilities of sed(1). If there is no template, the default is $m0$m1$m2$m3$m4$m5$m6$m7$m8$m9.
OPTIONS
The following options are supported:
-e Evaluates the corresponding template and writes the result to the standard output.
-v "string" Uses string instead of the standard input to match against patterns.
EXAMPLES
Example 1: Cutting letters out of a string
To cut the 4th through 8th letters out of a string (this example will output strin and return TRUE):
`regex -v "my string is nice" '^.{3}(.{5})$0' '$m0'`
Example 2: Validating input in a form
In a form, to validate input to field 5 as an integer:
valid=`regex -v "$F5" '^[0-9]+$'`
Example 3: Translating an environment variable in a form
In a form, to translate an environment variable which contains one of the numbers 1, 2, 3, 4, 5 to the letters a, b, c, d, e:
value=`regex -v "$VAR1" 1 a 2 b 3 c 4 d 5 e '.*' 'Error'`
Note the use of the pattern '.*' to mean "anything else".
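The pattern/template pairing used in Example 3 can be approximated outside FMLI. The sketch below is not the regex command itself: it is a hypothetical shell function, first_match, that assumes grep -E semantics (unanchored extended regular expressions) rather than regcmp(3C) syntax and does not expand $m0 through $m9.

```shell
# first_match STRING PATTERN [TEMPLATE] ...: print the TEMPLATE paired with
# the first PATTERN that matches STRING and return 0; return 1 if none match.
# The last pattern may lack a template. Hypothetical emulation only.
first_match() {
    str=$1
    shift
    while [ "$#" -gt 0 ]; do
        pat=$1
        tmpl=${2-}
        if printf '%s\n' "$str" | grep -Eq -- "$pat"; then
            [ -n "$tmpl" ] && printf '%s\n' "$tmpl"
            return 0
        fi
        if [ "$#" -ge 2 ]; then shift 2; else shift; fi
    done
    return 1
}

# Example, mirroring Example 3:
#   first_match "$VAR1" 1 a 2 b 3 c 4 d 5 e '.*' 'Error'
```

As in the real command, the pairs are tried in order, so the catch-all '.*' must come last.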
Example 4: Using backquoted expressions
In the example below, all three lines constitute a single backquoted expression. This expression, by itself, could be put in a menu
definition file. Since backquoted expressions are expanded as they are parsed, and output from a backquoted expression (the cat command,
in this example) becomes part of the definition file being parsed, this expression would read /etc/passwd and make a dynamic menu of all
the login ids on the system.
`cat /etc/passwd | regex '^([^:]*)$0.*$' '
name=$m0
action=`message "$m0 is a user"`'`
DIAGNOSTICS
If none of the patterns match, regex returns FALSE, otherwise TRUE.
NOTES
Patterns and templates must often be enclosed in single quotes to turn off the special meanings of characters, especially if you use the
$m0 through $m9 variables in the template, since FMLI will expand the variables (usually to "") before regex even sees them.
Single characters in character classes (inside []) must be listed before character ranges, otherwise they will not be recognized. For
example, [a-zA-Z_/] will not find underscores (_) or slashes (/), but [_/a-zA-Z] will.
The regular expressions accepted by regcmp differ slightly from other utilities (that is, sed, grep, awk, ed, and so forth).
regex with the -e option forces subsequent commands to be ignored. In other words, if a backquoted statement appears as follows:
`regex -e ...; command1; command2`
command1 and command2 would never be executed. However, dividing the expression into two:
`regex -e ...``command1; command2`
would yield the desired result.
ATTRIBUTES
See attributes(5) for descriptions of the following attributes:
+-----------------------------+-----------------------------+
| ATTRIBUTE TYPE | ATTRIBUTE VALUE |
+-----------------------------+-----------------------------+
|Availability |SUNWcsu |
+-----------------------------+-----------------------------+
SEE ALSO
awk(1), cut(1), grep(1), paste(1), sed(1), regcmp(3C), attributes(5)
SunOS 5.10 12 Jul 1999 regex(1F)