The sed command has helped in extracting the data I want; however, I lose some of the formatting, i.e. the spaces between words. That appears to be what is causing the problem. If I convert the data using sed s'/[^a-zA-Z0-9]//g' I get the data I want. If I convert using s'/[^a-zA-Z0-9<>:]//g' I also get the data I want. However, when I convert the data using s'/[^a-zA-Z0-9<>: ]//g', that is when I hit the problem. So is there a way of substituting the space with some other character, extracting the data I am interested in, and then converting that character back to a space?
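
To make the idea concrete, here is a rough sketch of the round trip I have in mind. The sample input and the choice of "_" as the placeholder are assumptions made for illustration only; the placeholder would have to be a character that never occurs in the real data, and the later extraction step is left out.

```shell
#!/bin/sh
# Hypothetical sample input; the real data is not shown here.
input='<name>: John Smith!! (id #42)'

# 1. Replace every space with a placeholder ("_" is an assumption and
#    must not appear in the real data).
# 2. Delete every character outside the allowed set, which now includes
#    the placeholder instead of a literal space.
# 3. Convert the placeholder back into a space.
result=$(printf '%s\n' "$input" \
  | sed -e 's/ /_/g' \
        -e 's/[^a-zA-Z0-9<>:_]//g' \
        -e 's/_/ /g')

printf '%s\n' "$result"
```

In practice the extraction would run between steps 2 and 3, while the placeholder is still in place, so that whatever breaks on spaces never sees one.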