This is a problem I've worked on for a while and can't figure out.
There is a file.txt
The Awk program is trying to extract the year portion of the birth and death fields ("98" and "2nd C.") using the technique below.
There are other ways to do it from the command line, but I need it to work inside a function in a script that uses readfile().
The above code returns the correct birth year, but the death year is mangled because the regex is grabbing both the birth and death strings.
It works when I use getline instead of the readfile() function, with this regex
The ".*" grabs everything to the end of the line, and since readfile() makes the entire file a single line, it grabs to the end of the "line" (the file). That's why I'm using a word boundary ("\y"), which works, but not if there is a space in the data, as is the case here with the death string ("2nd C."). I tried adding "[:space:]" but that didn't work. I think this is solvable with the right regex, but I'm out of ideas.
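One possible way around the greedy ".*" is to match "everything up to the next newline" with "[^\n]*" instead, since "." in awk also matches newline characters once the whole file sits in one string. The sketch below is only an illustration under assumptions: the real file.txt is not shown in the post, so the "birth=..." / "death=..." layout and the function name extract_field are hypothetical, and the slurping is done portably by concatenating lines rather than with gawk's readfile().

```shell
# Hypothetical layout: one "key=value" pair per line (the real file.txt is not shown).
extract_field() {
  # $1 = field name, stdin = file contents
  awk -v key="$1" '
    { buf = buf $0 "\n" }          # slurp the whole input into a single string
    END {
      # [^\n]* stops at the end of the line, so spaces in the value are kept
      if (match(buf, key "=[^\n]*"))
        print substr(buf, RSTART + length(key) + 1, RLENGTH - length(key) - 1)
    }'
}

printf 'birth=98\ndeath=2nd C.\n' | extract_field death
```

Because the character class excludes only the newline, a value such as "2nd C." survives intact, which a word-boundary match cannot guarantee.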
Last edited by Scrutinizer; 10-26-2014 at 03:36 AM..
Reason: extra code tags
Hi guys,
Does anyone know how to test whether a string is a valid regular expression? I want to include the check in a script to make sure a variable holds a regexp.
cheers (1 Reply)
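A rough sketch of one way to do this in a shell script: let grep try to compile the pattern and look at its exit status. grep exits with 0 on a match, 1 on no match, and a value greater than 1 on an error such as a malformed pattern. The function name is_valid_regex is made up for illustration, and the check only covers POSIX extended regular expressions as grep -E understands them, not awk or Perl syntax.

```shell
is_valid_regex() {
  # $1 = candidate pattern; empty input means grep can only return 1 or an error
  grep -E -e "$1" </dev/null >/dev/null 2>&1
  [ $? -le 1 ]
}

if is_valid_regex 'a[bc]+'; then echo "valid"; else echo "invalid"; fi
if is_valid_regex 'a[bc';   then echo "valid"; else echo "invalid"; fi
```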
please help:
I want to add 1 space between string and numbers:
input file:
abcd12345
output file:
abcd 12345
The following sed command does not work:
sed 's/\(+\)\(+\)/\1 \2/' file
Any ideas, please
Andy (2 Replies)
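A possible fix, as a sketch: in a basic regular expression (sed's default) a bare "+" is an ordinary character, not a quantifier, so the grouping above cannot work as intended. Matching a single letter followed by a single digit is enough to find the boundary. The function name add_space is hypothetical, and the example assumes the letters come first, as in the sample input.

```shell
add_space() {
  # Insert one space at the first letter/digit boundary on each line
  sed 's/\([[:alpha:]]\)\([[:digit:]]\)/\1 \2/'
}

printf 'abcd12345\n' | add_space
```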
I have 2 files called stuff-egress-filter and stuff-ingress-filter. There are also files called something like stuff-egress-F/0
I want to match the first two... I tried (I realize there is no filename; I'm piping this from the ls command)
grep stuff-*-filter
Finds nothing. If I... (18 Replies)
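A likely explanation, with a sketch of a fix: grep takes a regular expression, not a shell glob. In a regex, "-*" means "zero or more hyphens", so the pattern above looks for "stuff" followed by hyphens and then "-filter", and an unquoted "*" may also be expanded by the shell before grep sees it. Quoting the pattern and using ".*" for "anything" should match both names. The function name list_filters and the third sample name are made up for illustration.

```shell
list_filters() {
  # stdin: one file name per line, for example the output of ls
  grep -E '^stuff-.*-filter$'
}

printf 'stuff-egress-filter\nstuff-ingress-filter\nstuff-egress-F\n' | list_filters
```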
I'd like to know if there is a catchall line for renaming the following patterns:
s01e03 -> 01x03
s4e9 -> 04x09
s10e08 ->10x08
and possibly even:
318 -> 03x18
1002 ->10x02
if it's the first 3- or 4-digit number in the string.
thanks! (0 Replies)
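For the sNNeNN patterns, one possible approach is a two-step sed substitution: first rewrite "sXeY" as "0Xx0Y", then trim each side back to two digits. This is only a sketch; the function name normalize_episode is hypothetical, it assumes a sed that supports -E, and it does not attempt the bare-number cases (318, 1002), which are ambiguous without more context.

```shell
normalize_episode() {
  # Step 1: s4e9 -> 04x09, s10e08 -> 010x008 (always prepend a zero)
  # Step 2: drop surplus leading zeros so each side keeps exactly two digits
  sed -E 's/[sS]([0-9]{1,2})[eE]([0-9]{1,2})/0\1x0\2/; s/0*([0-9]{2})x0*([0-9]{2})/\1x\2/'
}

printf 's01e03\ns4e9\ns10e08\n' | normalize_episode
```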
Good Day,
I'm new to scripting, especially awk and sed. I would like to ask for help with a sed command that prints the line immediately after a regexp, but not the line containing the regexp.
sed -n '/regexp/{n;p;}' filename
What if my regexp is three words or a sentence? I'm... (3 Replies)
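As far as the quoted command goes, a multi-word pattern should work the same way, because a space is an ordinary character in a sed regex. A minimal sketch, with a hypothetical wrapper name line_after and made-up sample text; it assumes the pattern contains no "/" characters, which would need escaping or a different address delimiter.

```shell
line_after() {
  # $1 = pattern (basic regex, may contain spaces), stdin = text to search
  sed -n "/$1/{n;p;}"
}

printf 'intro\nthe quick brown fox\nline after\nend\n' | line_after 'the quick brown fox'
```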
My input file looks like this:
13154|X,the deer hunter
13154|Y,the good life
1316|,american idol
1316|,bowling
1316|,chuck
etc...
The X, Y, or any other character (besides a comma) after the pipe is a "Device Type". I want to strip out lines that do not have a device type.
I have... (2 Replies)
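One way this could be done, as a sketch: keep only the lines where the character right after the first pipe is not a comma. In grep's basic regular expressions the "|" is an ordinary character, so no escaping is needed. The function name keep_with_device_type is hypothetical.

```shell
keep_with_device_type() {
  # [^|]*| consumes everything up to the first pipe; [^,] requires a device type after it
  grep '^[^|]*|[^,]'
}

printf '13154|X,the deer hunter\n13154|Y,the good life\n1316|,american idol\n' | keep_with_device_type
```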
I would like to extract "1333 Fairlane" given the below text.
The word "Building:" is always present. The wording between Building and the beginning of the address can be almost anything. It appears that the hyphen is there most of the time.
Campus: Fairlane Business Park
Building:... (9 Replies)
PERLREQUICK(1) Perl Programmers Reference Guide PERLREQUICK(1)
NAME
perlrequick - Perl regular expressions quick start
DESCRIPTION
This page covers the very basics of understanding, creating and using regular expressions ('regexes') in Perl.
The Guide
Simple word matching
The simplest regex is simply a word, or more generally, a string of characters. A regex consisting of a word matches any string that
contains that word:
"Hello World" =~ /World/; # matches
In this statement, "World" is a regex and the "//" enclosing "/World/" tells perl to search a string for a match. The operator "=~"
associates the string with the regex match and produces a true value if the regex matched, or false if the regex did not match. In our
case, "World" matches the second word in "Hello World", so the expression is true. This idea has several variations.
Expressions like this are useful in conditionals:
print "It matches\n" if "Hello World" =~ /World/;
The sense of the match can be reversed by using the "!~" operator:
print "It doesn't match\n" if "Hello World" !~ /World/;
The literal string in the regex can be replaced by a variable:
$greeting = "World";
print "It matches\n" if "Hello World" =~ /$greeting/;
If you're matching against $_, the "$_ =~" part can be omitted:
$_ = "Hello World";
print "It matches\n" if /World/;
Finally, the "//" default delimiters for a match can be changed to arbitrary delimiters by putting an 'm' out front:
"Hello World" =~ m!World!; # matches, delimited by '!'
"Hello World" =~ m{World}; # matches, note the matching '{}'
"/usr/bin/perl" =~ m"/perl"; # matches after '/usr/bin',
# '/' becomes an ordinary char
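The same idea of swapping delimiters exists outside Perl. As a loosely related sketch, sed lets the "s" command use another delimiter so that "/" in paths needs no escaping. The function name swap_prefix and the replacement path /opt/bin are made up for illustration.

```shell
swap_prefix() {
  # "|" is used as the delimiter, so "/" is an ordinary character in the pattern
  sed 's|/usr/bin|/opt/bin|'
}

printf '/usr/bin/perl\n' | swap_prefix
```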
Regexes must match a part of the string exactly in order for the statement to be true:
"Hello World" =~ /world/; # doesn't match, case sensitive
"Hello World" =~ /o W/; # matches, ' ' is an ordinary char
"Hello World" =~ /World /; # doesn't match, no ' ' at end
perl will always match at the earliest possible point in the string:
"Hello World" =~ /o/; # matches 'o' in 'Hello'
"That hat is red" =~ /hat/; # matches 'hat' in 'That'
Not all characters can be used 'as is' in a match. Some characters, called metacharacters, are reserved for use in regex notation. The
metacharacters are
{}[]()^$.|*+?\
A metacharacter can be matched by putting a backslash before it:
"2+2=4" =~ /2+2/; # doesn't match, + is a metacharacter
"2+2=4" =~ /2\+2/; # matches, \+ is treated like an ordinary +
'C:\WIN32' =~ /C:\\WIN/; # matches
"/usr/bin/perl" =~ /\/usr\/bin\/perl/; # matches
In the last regex, the forward slash '/' is also backslashed, because it is used to delimit the regex.
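The backslash rule can be tried from a shell as well. The sketch below uses grep -E (POSIX extended regular expressions) as a stand-in, which is an assumption on my part: it is a different regex flavor from Perl's, but "+" and "\" behave the same way in these particular patterns. The helper name matches_ere is hypothetical.

```shell
matches_ere() {
  # $1 = extended regex, $2 = string to test; exit status 0 means "matches"
  printf '%s\n' "$2" | grep -Eq -e "$1"
}

matches_ere '2+2'     '2+2=4'    && echo "unescaped + matches" || echo "unescaped + does not match"
matches_ere '2\+2'    '2+2=4'    && echo "escaped + matches"
matches_ere 'C:\\WIN' 'C:\WIN32' && echo "escaped backslash matches"
```

In the first call the unescaped "+" is a quantifier meaning "one or more 2s", so the pattern looks for "22" and fails on "2+2=4".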
Non-printable ASCII characters are represented by escape sequences. Common examples are "\t" for a tab, "\n" for a newline, and "\r" for a
carriage return. Arbitrary bytes are represented by octal escape sequences, e.g., "