Top Forums UNIX for Dummies Questions & Answers Unexpected Behaviour from grepping Text File Post 302650435 by sudon't on Saturday 2nd of June 2012 11:56:37 PM
06-03-2012
Now THIS may be quite what you wanted in the first place:
Code:
egrep -w '...a.....n[^^M]' /usr/share/dict/2of12.txt | head -5
advancement
arraignment
arrangement
derangement
devastating

The regular expression ends with a bracket expression that means 'any character EXCEPT a carriage return'; you can type it as a caret followed by a carriage return, both enclosed in brackets (open bracket, caret, control-V, control-M, close bracket). Just one tiny side effect: I get 'self-advancement' somewhere down the list of results...[/QUOTE]

OK, that looks like it works. I need to figure out how to use that to try and strip that junk out of the file. I think we get 'self-advancement' because we didn't start with a caret. Again, why don't we need the -w flag in any other file? If you don't use it, even with the starting caret, you get longer words.
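
A small sketch of why the last character of the pattern matters, assuming the file has DOS line endings. It uses a made-up three-word sample built with printf (not the real 2of12.txt), and the variable names `sample`, `cr`, `loose` and `strict` are only for illustration. With a carriage return at the end of each line, the final dot can match that carriage return, so an anchored 11-character pattern picks up a 10-letter word instead.

```shell
# Hypothetical sample file: three words with DOS (CRLF) line endings.
sample=$(mktemp)
printf 'escalation\r\nadvancement\r\nself-advancement\r\n' > "$sample"

# A literal carriage return, built portably (no $'\r' needed).
cr=$(printf '\r')

# Anchored, 11 characters: the final dot matches the CR, so the
# 10-letter word is the one that matches.
loose=$(grep -E '^...a.....n.$' "$sample" | tr -d '\r')
echo "$loose"

# Same pattern, but the 11th character may not be a CR, and one
# optional CR is allowed before the end of the line.
strict=$(grep -E "^...a.....n[^${cr}]${cr}?\$" "$sample" | tr -d '\r')
echo "$strict"

rm -f "$sample"
```

Because the second pattern starts with a caret, 'self-advancement' does not show up, and no -w flag is needed.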

---------- Post updated at 11:43 PM ---------- Previous update was at 11:40 PM ----------

Quote:
Originally Posted by agama
This is not true. The dot does not match a newline. If it did, egrep "." file would match an empty line (^$) which it doesn't.

The problem is that there are probably ctl-M characters in the file just before the newline (DOS-style line endings, which are 0x0d 0x0a pairs rather than just a single 0x0a). Run the file through dos2unix, or some other conversion tool, which will delete the ctl-M characters from the file.

---------- Post updated at 23:36 ---------- Previous update was at 23:33 ----------

If you want an easy way to verify that CTL-M characters are present, use cat:

Code:
cat -v 2of12.txt |head

This will likely generate the output:

Code:
a^M
aardvark^M
abaci^M
aback^M
abacus^M
abaft^M
abalone^M
abandon^M
abandoned^M
abandonment^M
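
If a count is more useful than a listing, another option is to count the lines that contain a carriage return. This is only a sketch on a made-up sample file (two CRLF lines and one plain LF line); the names `sample`, `cr` and `crlf_lines` are hypothetical.

```shell
# Hypothetical sample: two lines end in CRLF, one in plain LF.
sample=$(mktemp)
printf 'a\r\naardvark\r\nabaci\n' > "$sample"

# A literal carriage return to search for.
cr=$(printf '\r')

# grep -c prints the number of matching lines.
crlf_lines=$(grep -c "$cr" "$sample")
echo "$crlf_lines"

rm -f "$sample"
```

A non-zero count means at least some lines carry a CTL-M.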

That was it, all right!

Code:
whom^M
whomever^M
whomsoever^M
whoop^M
whoopee^M
whooper^M
whoops^M
whoosh^M
whopper^M
whopping^M
whore^M

---------- Post updated at 11:56 PM ---------- Previous update was at 11:43 PM ----------

dos2unix did the trick.
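
For a system where dos2unix is not installed, a rough substitute is to delete the carriage returns with tr. This is a sketch on a made-up sample file, with hypothetical names `sample`, `clean`, `remaining` and `first`. Note that it deletes every carriage return in the file, not only the ones right before a newline.

```shell
# Hypothetical sample with DOS (CRLF) line endings.
sample=$(mktemp)
clean=$(mktemp)
printf 'whom\r\nwhomever\r\n' > "$sample"

# Delete every carriage return and write the result to a new file.
tr -d '\r' < "$sample" > "$clean"

# Check: count lines that still contain a CR. grep exits non-zero
# when nothing matches, hence the "|| true".
cr=$(printf '\r')
remaining=$(grep -c "$cr" "$clean" || true)
first=$(head -n 1 "$clean")
echo "$remaining"
echo "$first"

rm -f "$sample" "$clean"
```

Writing to a separate file and checking it before replacing the original keeps the word list safe if something goes wrong.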
 
