Hi Don,
Thank you for your detailed reply and examples of RS expressions.
You are correct, I would need to consider all the other abbreviations as well.
I'm using these sentences for training a neural net.
The training corpus would be classic books from Project Gutenberg, such as Pride and Prejudice, for example:
Pride and Prejudice by Jane Austen - Free Ebook
Looking at the example above:
-Not all sentences end on a newline.
-There aren't two spaces after the end of a sentence.
-The reason I'm considering : and ; is that, even though the text they end is not a complete sentence, it is in general a complete "thought", so I can approximate it as a complete sentence and reduce the complexity of the neural net.
I think the simplest solution for me is to massage the input file first and do a search and replace on all of the exceptions, as itkamaraj suggested.
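A minimal sketch of that search-and-replace idea in Python, under some assumptions: the abbreviation list (Mr, Mrs, Dr, St) is a hypothetical and incomplete placeholder, the NUL character is assumed never to occur in the corpus, and the function names are made up for illustration. It joins the wrapped lines, hides the period of each known abbreviation, splits after . ! ? : ; when followed by whitespace, and then restores the hidden periods.

```python
import re

# Hypothetical, incomplete list of abbreviations whose trailing period
# must not be treated as the end of a sentence.
ABBREVIATION_RE = re.compile(r"\b(Mr|Mrs|Dr|St)\.")

# Placeholder for a protected period; assumed not to occur in the corpus.
PLACEHOLDER = "\x00"


def protect_abbreviations(text):
    """Replace the period of each known abbreviation with the placeholder."""
    return ABBREVIATION_RE.sub(lambda m: m.group(1) + PLACEHOLDER, text)


def split_sentences(text):
    """Return a list of approximate sentences ("complete thoughts")."""
    # Sentences do not end on newlines, so collapse all whitespace runs
    # (including line breaks) into single spaces first.
    text = " ".join(text.split())
    text = protect_abbreviations(text)
    # Split at whitespace that follows one of . ! ? : ;
    parts = re.split(r"(?<=[.!?:;])\s+", text)
    # Restore the protected periods and drop empty pieces.
    return [p.replace(PLACEHOLDER, ".") for p in parts if p]


if __name__ == "__main__":
    sample = (
        "Mr. Bennet replied that he had not.\n"
        "She was vexed; he was\n"
        "quiet: it was over."
    )
    for sentence in split_sentences(sample):
        print(sentence)
```

This does not handle a sentence that ends inside quotation marks (for example ?" followed by more text), and any abbreviation missing from the list will still cause a false split, so the list would need to grow as exceptions turn up in the corpus.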
If, after looking at this text example, you have another suggestion, please let me know.
Thank you all!