Regular Expression Confusion


 
# 1  
Old 07-08-2008

Hi All,

I am new to shell scripting and am currently learning regular expressions. I am using Bash on Red Hat Linux. I have a couple of doubts regarding regular expressions.

1) +
This matches one or more occurrences of the preceding pattern.
So if the file contains the following data
file.txt
a
aa
aba
bbb
ccc

Code:
grep -i 'a+' file.txt

should print the following
a
aa
aba

But to print the above I need to use
Code:
grep -i 'a\+' file.txt

2) ?
should match 0 or 1 occurrences of the preceding pattern

But when I use
Code:
 grep -i 'a?' file.txt

it prints nothing. And when I use
Code:
grep -i 'a\?' file.txt

it prints all the data in the file, i.e. the output was as below
a
aa
aba
bbb
ccc

I was expecting the output below
a
bbb
ccc

Now I have a doubt: does "0 or 1" mean exactly 0 or 1, or does it mean 0 or more? If it means 0 or more, then what is the difference between * (asterisk) and ? (question mark)?


3)
Code:
grep -i 'a\{1,\}' file.txt

and
Code:
grep -i 'a\{1\}' file.txt

print the same data as below

a
aa
aba

I was expecting this output only when I use
Code:
grep -i 'a\{1,\}' file.txt

and expected only the line below when I use
Code:
grep -i 'a\{1\}' file.txt

a

Please help me.
Thanks in advance.
# 2  
Old 07-08-2008
The problem is that grep understands only a very limited set of regular expressions unless you add the "-E" switch or use "egrep". And of course you have to escape the + so it isn't interpreted as "just another character":

Code:
root@isau02:/data/tmp/testfeld> cat infile
a
aa
aba
bbb
ccc
root@isau02:/data/tmp/testfeld> grep -E a\+ infile
a
aa
aba
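
To make the difference between the two grep modes concrete, here is a small sketch. It assumes GNU grep (as shipped with Red Hat Linux); the temporary file and the variable names are only illustrative and not part of the original posts.

```shell
# Sample data from the first post, written to a temporary file.
tmp=$(mktemp)
printf '%s\n' a aa aba bbb ccc > "$tmp"

# Basic regular expressions (plain grep): an unescaped + is a literal plus
# sign, so no line of this file matches and the count is 0.
bre_plain=$(grep -c 'a+' "$tmp" || true)

# Assumption: GNU grep. In basic mode it accepts \+ as "one or more".
bre_escaped=$(grep 'a\+' "$tmp" || true)

# Extended regular expressions (grep -E): + is an operator without a backslash.
ere=$(grep -E 'a+' "$tmp")

echo "plain grep 'a+' matched $bre_plain lines"
echo "$ere"
rm -f "$tmp"
```

The \+ spelling in plain grep is a GNU extension, so grep -E 'a+' with the pattern in single quotes is probably the safer habit.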

For 2):
A single character is a dot, not a question mark.

For 3):
{1} also stands for minimum. It's the same as {1,}. And even {1,4} will print out every line that contains at least one "a". There is a very good pocket guide on awk and sed from O'Reilly. They have a good overview of regular expressions in the first few pages, very helpful.
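
A possible refinement of the point about {1}: strictly speaking, {1} means exactly one repetition, but grep prints every line that contains a match anywhere, so an unanchored a{1} still selects every line with at least one "a". The sketch below uses grep -E and the anchors ^ and $ to force the whole line to match; the file and variable names are made up for illustration.

```shell
tmp=$(mktemp)
printf '%s\n' a aa aba bbb ccc > "$tmp"

# Unanchored: any line with an "a" somewhere contains a substring matching a{1}.
unanchored=$(grep -E 'a{1}' "$tmp")

# Anchored: the whole line must consist of exactly one "a".
exact_one=$(grep -E '^a{1}$' "$tmp")

# Anchored, one or more: the whole line must consist only of "a" characters.
one_or_more=$(grep -E '^a{1,}$' "$tmp")

printf 'unanchored:\n%s\nexactly one:\n%s\none or more:\n%s\n' \
    "$unanchored" "$exact_one" "$one_or_more"
rm -f "$tmp"
```

With the anchors in place, the two interval forms give different results on this data: only "a" for {1}, and "a" plus "aa" for {1,}.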
# 3  
Old 07-08-2008
Quote:

For 2):
A single character is a dot, not a question mark.

For 3):
{1} also stands for minimum. It's the same as {1,}. And even {1,4} will print out every line that contains at least one "a". There is a very good pocket guide on awk and sed from O'Reilly. They have a good overview of regular expressions in the first few pages, very helpful.
Hi,
Thanks for your reply.

1) But the book and some online articles I refer to say that "?" matches 0 or 1 occurrences of the preceding character. (I was understanding it as exactly 0 or exactly 1, my mistake.)

But now it seems that "*" and "?" are the same. Or is there any difference?

2) And what should I do if I need to match an exact number of occurrences in the string?

i.e. I need to find the strings that contain only 2 "a" (not less than 1 and not more than 2).
# 4  
Old 07-08-2008
Quote:
Originally Posted by Gaurang033
2) And what should I do if I need to match an exact number of occurrences in the string?

i.e. I need to find the strings that contain only 2 "a" (not less than 1 and not more than 2).
Try:
Code:
grep -i "a.*a[^a]*" file.txt

# 5  
Old 07-08-2008
Quote:
Originally Posted by Klashxx
Try:
Code:
grep -i "a.*a[^a]*" file.txt

Hey, thanks,

But the above code prints all the strings that contain 2 or more "a" characters.

I want strings that contain exactly 2 "a" characters.

For example, if the file contains
a
aa
aaa
abab
ababab

I want only the following strings:
aa ( only 2 "a" )
abab ( only 2 "a" )

But your code prints the following

aa
aaa ( I don't want it as it contains the 3 "a" )
abab
ababab ( I don't want it as it contains the 3 "a" )
# 6  
Old 07-08-2008
I understood what you want, but to explain the ?, I read about it and fooled around, which might explain it a bit:

Code:
echo -e "November" | egrep Nov\(ember\)\?
November
echo -e "Nov" | egrep Nov\(ember\)\?
Nov
echo -e "Noveember" | egrep Nov\(ember\)\?
Noveember
echo -e "No" | egrep Nov\(ember\)\?
echo -e "Novrebm" | egrep Nov\(ember\)\?
Novrebm

So it is as the description says, none or one occurrence. Here it is a group of characters, but they seem to be interpreted as single characters, not as a complete string.
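
Another way to read these results, offered as a sketch rather than a definitive explanation: the group (ember) is treated as one unit, and lines such as "Noveember" or "Novrebm" are printed only because the pattern is unanchored and "Nov" on its own already matches. Anchoring with ^ and $ shows the "zero or one" behaviour of ? and how it differs from *. The file and variable names below are illustrative.

```shell
tmp=$(mktemp)
printf '%s\n' November Nov Noveember No Novrebm > "$tmp"

# Unanchored, as in the post: any line containing "Nov" matches,
# because the optional group is allowed to be absent.
unanchored=$(grep -E 'Nov(ember)?' "$tmp")

# Anchored: the whole line must be "Nov" followed by zero or one "ember".
anchored=$(grep -E '^Nov(ember)?$' "$tmp")

# ? versus *: only * lets the group repeat more than once.
printf '%s\n' Nov November Novemberember > "$tmp"
with_question=$(grep -E '^Nov(ember)?$' "$tmp")
with_star=$(grep -E '^Nov(ember)*$' "$tmp")

printf 'anchored:\n%s\nwith ?:\n%s\nwith *:\n%s\n' \
    "$anchored" "$with_question" "$with_star"
rm -f "$tmp"
```

So "?" stops at one occurrence while "*" accepts any number, and the difference only becomes visible once the rest of the line is pinned down.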

So much for that.
# 7  
Old 07-08-2008
A quick workaround:

Code:
> cat text
a
aa
aaa
abab
ababab
jjhk
jjsasa
jjsasasda
> awk 'NF==3' FS='a' text
aa
abab
jjsasa
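
If you would rather stay with grep, the same "exactly two" test can be sketched with an anchored pattern: any number of non-"a" characters, an "a", more non-"a" characters, a second "a", and then only non-"a" characters up to the end of the line. Like the awk version this is case sensitive; the temporary file and variable names are made up for the example.

```shell
tmp=$(mktemp)
printf '%s\n' a aa aaa abab ababab jjhk jjsasa jjsasasda > "$tmp"

# Exactly two "a" characters per line, written out in basic regex syntax.
exactly_two=$(grep '^[^a]*a[^a]*a[^a]*$' "$tmp")

# The same idea with an interval in extended syntax: two repetitions of
# "an a followed by non-a characters".
exactly_two_ere=$(grep -E '^[^a]*(a[^a]*){2}$' "$tmp")

echo "$exactly_two"
rm -f "$tmp"
```

Changing the {2} in the second form is a simple way to ask for a different exact count.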
