#1
How do we write an exception in a regex?
Hello, this is a follow-up to my earlier request about identifying sentence boundaries while generating snippets for a search engine. The basic regex I have written to delimit sentence boundaries handles numbers and acronyms, but I cannot get it to handle cases of Quote:
I tried the following syntax: Code:
!(Dr\.|Mr\.|Mrs\.|Ms\.|[A-Z]\.|i\.e\.|w\.r\.t\.|e\.g\.|etc\.|viz\.)
to make the regex ignore a full stop after the cases enumerated, but it does not work. In fact, the simple regex I had written has become murky and no longer performs at all. Any help in correcting the regex would be appreciated. Some sample sentences are given below: Quote:
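A note on the attempt above: in the common regex dialects a leading `!` is not a negation operator, so `!(...)` just matches a literal exclamation mark followed by one of the alternatives. One way to express "a full stop that does not end one of these abbreviations" is a negative lookbehind. The sketch below assumes Java (named later in the thread as the target); the class name `SentenceSplitter`, the method `split`, and the sample sentence are hypothetical and not from the original post.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class SentenceSplitter {
    // Abbreviations from the list in the post, without their final full stop.
    private static final String ABBREV =
        "Dr|Mr|Mrs|Ms|[A-Z]|i\\.e|w\\.r\\.t|e\\.g|etc|viz";

    // Split on whitespace that follows a full stop, unless that full stop
    // ends one of the abbreviations above (negative lookbehind).
    private static final Pattern BOUNDARY =
        Pattern.compile("(?<=\\.)(?<!\\b(?:" + ABBREV + ")\\.)\\s+");

    static List<String> split(String text) {
        return Arrays.asList(BOUNDARY.split(text.trim()));
    }

    public static void main(String[] args) {
        // Hypothetical sample sentence, for illustration only.
        String sample = "Dr. Smith met Mr. Jones. They spoke, e.g. about regex. Done.";
        for (String sentence : split(sample)) {
            System.out.println(sentence);
        }
    }
}
```

Because `[A-Z]\.` is in the list, a sentence that really does end in a single capital letter would not be split; that trade-off comes from the abbreviation list itself, not from the lookbehind.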
#2
Instead of writing things for a regex to not match, try getting something else in your regex to match them first. Regexes do greedy matching, so whatever matches first 'wins'. What language is this regex for? This works in grep: Code:
$ echo "Mr. Andrew visited me. fleeb narf stuff." | egrep -o "([a-zA-Z]|(Mr|Ms|Dr|Mrs)[.]| )*[.]"
Mr. Andrew visited me.
 fleeb narf stuff.
$
A simplified example, but hopefully it conveys the idea. Just a preference of mine, but I sometimes find it clearer to put special chars in [] than to escape them to make them literal.
The Following User Says Thank You to Corona688 For This Useful Post: gimley (08-03-2012)
#3
|
|||
|
|||
|
Many thanks. It works beautifully in egrep, but dies in Java. I wonder why. Does anybody know if Java demands a special regex syntax?
#4
|
|||
|
|||
|
Regex really isn't the same everywhere. It might have been a good idea to mention that you were using Java from the start.
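The thread does not say how the pattern failed in Java, so the following is only one plausible cause. egrep uses POSIX matching, which picks the longest match at a position, while `java.util.regex` tries alternatives left to right and keeps the first that succeeds. With `[a-zA-Z]` listed before the titles, Java consumes "Mr" as two plain letters and then treats the full stop as the end of a sentence. A minimal sketch, with the hypothetical class name `AlternationOrderDemo`, comparing the pattern as posted with a variant that lists the titles first:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AlternationOrderDemo {
    // The egrep pattern from this thread, unchanged.
    static final Pattern AS_POSTED =
        Pattern.compile("([a-zA-Z]|(Mr|Ms|Dr|Mrs)[.]| )*[.]");

    // Same alternatives, reordered so the titles are tried before single letters.
    static final Pattern TITLES_FIRST =
        Pattern.compile("((Mr|Ms|Dr|Mrs)[.]|[a-zA-Z]| )*[.]");

    // Collect every match of the pattern in the text, in order.
    static List<String> findAll(Pattern pattern, String text) {
        List<String> matches = new ArrayList<>();
        Matcher m = pattern.matcher(text);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String text = "Mr. Andrew visited me. fleeb narf stuff.";
        System.out.println(findAll(AS_POSTED, text));
        System.out.println(findAll(TITLES_FIRST, text));
    }
}
```

On the sample sentence, the pattern as posted yields three matches in Java, the first being just "Mr.", while the reordered pattern yields the same two matches that egrep printed. Also note that if you switch back to `\.` style escapes, each backslash has to be doubled inside a Java string literal; the `[.]` form avoids that.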