Time for a little introduction to regular expressions:
Regular expressions (or "regexps") are a tool for matching patterns in text. They consist of two kinds of characters: ordinary characters and metacharacters.
Ordinary characters just stand for themselves - "a" means "search for an a".
Metacharacters change the way ordinary characters are interpreted. I will describe a few, but you are encouraged to research the rest; the web is full of articles about them.
[...]
defines a "character class". Exactly one of the characters listed inside will be matched. Example:
    a[bc]d
will match "abd" and "acd", but not "abcd" or "abbd". It is possible to reverse the meaning by using the caret "^" as the first character. The class then matches every character NOT listed in it:
    a[^bc]d
This will match "axd" and "avd": an "a" and a "d" with any one character between them except "b" or "c", so neither "abd" nor "acd". It is also possible to specify a range of consecutive characters with a "-": "[a-z]" means any lowercase letter from a to z, "[0-9]" means any digit from 0 to 9. Notice that I used "[ <t>]" (left bracket, space, tab, right bracket) in my expression above! That defines a character class which matches one whitespace character, either a blank OR a tab.
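The character classes described above can be tried out with grep, which uses the same basic regexps as sed. This is only a small sketch: the helper name "matches" is made up for illustration, "-x" makes grep accept a line only if the pattern covers it completely, and LC_ALL=C is set on the assumption that ranges like "[a-z]" should cover plain ASCII letters only.

```shell
# matches PATTERN STRING (hypothetical helper): succeeds if STRING as a
# whole matches the basic regexp PATTERN.
# LC_ALL=C keeps ranges such as [a-z] limited to plain ASCII letters.
matches() {
    printf '%s\n' "$2" | LC_ALL=C grep -q -x -e "$1"
}

tab=$(printf '\t')   # a literal tab character for the whitespace class

# One character out of the class: b or c.
for s in abd acd abcd abbd; do
    if matches 'a[bc]d' "$s"; then echo "a[bc]d matches $s"; fi
done

# Negated class: any one character except b or c.
for s in axd avd abd acd; do
    if matches 'a[^bc]d' "$s"; then echo "a[^bc]d matches $s"; fi
done

# Ranges: one lowercase letter followed by one digit.
if matches '[a-z][0-9]' 'k7'; then echo "[a-z][0-9] matches k7"; fi

# Blank OR tab between x and y.
if matches "x[ $tab]y" "x${tab}y"; then echo "the whitespace class matches a tab"; fi
```

Only the strings accepted by each class are printed; "abcd" and "abbd" are skipped because a class stands for exactly one character.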
*
This acts as a multiplier for the expression before it, meaning "zero or more of whatever that expression is". For example:
    ab*c
means an a, followed by zero or more b's, followed by a c. The strings "abc", "abbbbc" and even "ac" (zero or more!) are matched, but not "axc". If you combine this with the character classes you may get:
    x[abc]*y
An x, followed by any number of a's, b's or c's, followed by a y. It matches "xabcaababbbabby" and "xay", "xby" and "xcy" and even "xy" (again: zero or more), but not "xdy".
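To see the asterisk at work, the strings mentioned above can be fed through grep. Again this is just a sketch under assumptions: "matches" is a made-up helper name, and "-x" is used so that a string only counts when the pattern covers all of it.

```shell
# matches PATTERN STRING (hypothetical helper): succeeds if STRING as a
# whole matches the basic regexp PATTERN.
matches() {
    printf '%s\n' "$2" | LC_ALL=C grep -q -x -e "$1"
}

# An a, zero or more b's, then a c.
for s in abc abbbbc ac axc; do
    if matches 'ab*c' "$s"; then
        echo "ab*c matches $s"
    else
        echo "ab*c rejects $s"
    fi
done

# An x, any run of a's, b's and c's (possibly empty), then a y.
for s in xabcaababbbabby xay xby xcy xy xdy; do
    if matches 'x[abc]*y' "$s"; then
        echo "x[abc]*y matches $s"
    else
        echo "x[abc]*y rejects $s"
    fi
done
```

The asterisk always applies to the single expression directly before it, here the letter b in the first pattern and the whole class in the second.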
\( ... \)
This is a device for grouping and acts much like brackets in math: it matches nothing itself, but it groups together what is inside, for instance so that the asterisk can act on the whole group. Example:
    \(abc\)*d
means any number of repetitions of the string "abc", followed by a "d". This matches "abcd" and "abcabcd", also "d", but not "abd" as a whole, because there is no complete "abc" before the "d" (an unanchored search would still find the lone "d" at the end of "abd"). Groupings can be nested.
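The difference between matching a whole string and finding a match somewhere inside it matters for the grouping example, so here is a small sketch that shows both. The helper names "whole" and "anywhere" are invented for this illustration, and the nested pattern at the end is my own example, not one from the script discussed above.

```shell
# whole PATTERN STRING (hypothetical helper): succeeds if the entire
# STRING matches the basic regexp (grep -x).
whole() {
    printf '%s\n' "$2" | LC_ALL=C grep -q -x -e "$1"
}

# anywhere PATTERN STRING (hypothetical helper): succeeds if any part of
# STRING matches.
anywhere() {
    printf '%s\n' "$2" | LC_ALL=C grep -q -e "$1"
}

# Zero or more copies of "abc", then a d.
for s in abcd abcabcd d abd; do
    if whole '\(abc\)*d' "$s"; then
        echo "whole-string match: $s"
    else
        echo "no whole-string match: $s"
    fi
done

# Without -x the lone "d" at the end of "abd" is enough for a match.
if anywhere '\(abc\)*d' 'abd'; then
    echo 'an unanchored search finds a match inside "abd"'
fi

# Nested groups: any number of (an a followed by any number of "bc"), then a d.
if whole '\(a\(bc\)*\)*d' 'abcbcad'; then
    echo 'the nested pattern matches "abcbcad"'
fi
```

sed and plain grep search unanchored, so a pattern that may match "zero times" usually needs some fixed text around it (or the anchors "^" and "$") to be useful.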
\{n\} , \{m,n\}
This is also a multiplier for the expression before it, like the asterisk, but limited by one or two numbers. The one-number variant means exactly that many repetitions, the two-number variant means between m and n repetitions. Example:
    ab\{2\}c
is the same as "abbc". This one:
    ab\{2,4\}c
matches "abbc", "abbbc" and "abbbbc" and nothing else. Again, this can be combined with other expressions, like classes. I used:
    [0-9]\{14\}
to match exactly 14 digits. I could have written
    [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]
for the same effect.
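The counted multipliers can be checked the same way. A minimal sketch, assuming whole-string matching with "grep -x"; the helper names "matches" and "report" and the digit strings are made up for the demonstration.

```shell
# matches PATTERN STRING (hypothetical helper): succeeds if STRING as a
# whole matches the basic regexp PATTERN.
matches() {
    printf '%s\n' "$2" | LC_ALL=C grep -q -x -e "$1"
}

# report PATTERN STRING (hypothetical helper): print the verdict.
# printf with %s is used so the backslashes in PATTERN are shown unchanged.
report() {
    if matches "$1" "$2"; then
        printf '%s matches %s\n' "$1" "$2"
    else
        printf '%s rejects %s\n' "$1" "$2"
    fi
}

# Exactly two b's versus two to four b's.
for s in abc abbc abbbc abbbbc abbbbbc; do
    report 'ab\{2\}c' "$s"
    report 'ab\{2,4\}c' "$s"
done

# Exactly 14 digits: 13 are too few, 15 are too many.
for s in 1234567890123 12345678901234 123456789012345; do
    report '[0-9]\{14\}' "$s"
done
```

Note that the "too many" case is only rejected because the whole string has to match; a string of 15 digits still contains 14 digits in a row.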
\
The backslash is part of some control constructs, as you have seen above. Apart from that it is the "escape character". You have seen that several characters - the metacharacters - do not stand for themselves but mean something different. The backslash strips this special meaning from them and makes them ordinary characters again. Like here:
    ab\*c
Because the asterisk is escaped, it is meant literally. This regexp matches the string "ab*c". Or this:
    [0-9]\.[0-9]\{14\}
Usually the full stop is a metacharacter and means any one character. But here I want to use it literally and therefore escape it. The expression means a digit, followed by a full stop (decimal point), followed by exactly 14 other digits.
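What escaping changes is easiest to see by running the escaped and the unescaped full stop against the same input. This is a sketch only: "matches" and "report" are hypothetical helper names, and the sample number is made up (one digit, a full stop, 14 more digits).

```shell
# matches PATTERN STRING (hypothetical helper): succeeds if STRING as a
# whole matches the basic regexp PATTERN.
matches() {
    printf '%s\n' "$2" | LC_ALL=C grep -q -x -e "$1"
}

# report PATTERN STRING (hypothetical helper): print the verdict.
report() {
    if matches "$1" "$2"; then
        printf '%s matches %s\n' "$1" "$2"
    else
        printf '%s rejects %s\n' "$1" "$2"
    fi
}

# Escaped asterisk: only the literal string "ab*c" is accepted.
for s in 'ab*c' abbc ac; do
    report 'ab\*c' "$s"
done

# Escaped full stop: a real decimal point is required.
# Unescaped full stop: any one character is accepted in that position.
for s in '3.14159265358979' '3x14159265358979'; do
    report '[0-9]\.[0-9]\{14\}' "$s"
    report '[0-9].[0-9]\{14\}' "$s"
done
```

The patterns are kept in single quotes so that the shell passes the backslashes through to grep untouched.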
With this you should be able to decipher the sed script yourself.