Unexpected Behaviour from grepping Text File Post: 302650435

Post 302650435 by sudon't on Saturday 2nd of June 2012 11:56:37 PM

06-03-2012

Quote:

Now THIS may be quite what you wanted in the first place:

Code:

egrep -w '...a.....n[^^M]' /usr/share/dict/2of12.txt | head -5
advancement
arraignment
arrangement
derangement
devastating

The regular expression ends with a bracket expression that means 'any character EXCEPT a carriage return'; you can type it as a caret followed by a carriage return, both enclosed in brackets (open bracket, caret, control-V, Return, close bracket). Just a single tiny side effect: I get 'self-advancement' somewhere down the list of results...
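For anyone who finds the control-V trick awkward to type, here is a rough sketch of the same bracket expression built from a shell variable instead. The file name sample_crlf.txt and the variable names are made up for illustration, and the sample words stand in for the real 2of12.txt list.

```shell
# Build a small CRLF sample (hypothetical file) rather than touching the real word list.
printf 'advancement\r\nabandonment\r\nself-advancement\r\ncat\r\n' > sample_crlf.txt

# Hold a literal carriage return in a variable; printf '\r' is portable across POSIX shells.
cr=$(printf '\r')

# Same idea as the quoted command: the last position is "anything except CR".
# tr strips the trailing CR so the result prints cleanly.
matches=$(grep -E -w "...a.....n[^${cr}]" sample_crlf.txt | tr -d '\r')
printf '%s\n' "$matches"
```

With this sample, 'self-advancement' still shows up next to 'advancement', because -w treats the hyphen as a word boundary.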

OK, that looks like it works. I need to figure out how to use that to try and strip that junk out of the file. I think we get 'self-advancement' because we didn't start with a caret. Again, why don't we need the -w flag in any other file? If you don't use it, even with the starting caret, you get longer words.
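A possible explanation for the -w question, offered as a sketch rather than anything confirmed in this thread: with a carriage return sitting before every newline, a trailing `$` cannot match right after the last letter, so the pattern has to be left open at the end and then also fits the start of longer words. Anchoring both ends while allowing an optional carriage return avoids that. The file name sample_anchor.txt and the sample words are assumptions for illustration.

```shell
# Hypothetical CRLF sample containing one longer word and one hyphenated word.
printf 'advancement\r\nself-advancement\r\nadvancements\r\n' > sample_anchor.txt

# Literal carriage return, built portably.
cr=$(printf '\r')

# ^ and $ pin the match to the whole line; "${cr}?" tolerates an optional CR before the newline.
anchored=$(grep -E "^...a.....n.${cr}?\$" sample_anchor.txt | tr -d '\r')
printf '%s\n' "$anchored"
```

Here only 'advancement' should come back; neither the hyphenated form nor the plural fits between the two anchors, so -w is not needed.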

---------- Post updated at 11:43 PM ---------- Previous update was at 11:40 PM ----------

Quote:

Originally Posted by agama

This is not true. The dot does not match a newline. If it did, egrep "." file would match an empty line (^$) which it doesn't.

The problem is that there are probably ctl-M characters in the file just before the newline (DOS-style line endings, which are 0x0d 0x0a pairs rather than a single 0x0a). Run the file through dos2unix, or some other conversion tool, which will delete the ctl-M characters from the file.

---------- Post updated at 23:36 ---------- Previous update was at 23:33 ----------

If you want an easy way to verify that CTL-M characters are present, use cat:

Code:

cat -v 2of12.txt |head

This will likely generate the output:

Code:

a^M
aardvark^M
abaci^M
aback^M
abacus^M
abaft^M
abalone^M
abandon^M
abandoned^M
abandonment^M
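If eyeballing cat -v output on a long file is not practical, a count may help. This is only a sketch: sample_check.txt is a made-up file with two DOS-style lines and one Unix-style line, standing in for the real word list.

```shell
# Hypothetical sample: two lines end in CR LF, one ends in a bare LF.
printf 'a\r\naardvark\r\nabaci\n' > sample_check.txt

# Literal carriage return, built portably.
cr=$(printf '\r')

# Count lines whose last character before the newline is a carriage return.
# "|| true" keeps the script going when the count is zero (grep exits 1 in that case).
crlf_lines=$(grep -c "${cr}\$" sample_check.txt || true)

# tr removes the padding some wc implementations add.
total_lines=$(wc -l < sample_check.txt | tr -d ' ')

echo "$crlf_lines of $total_lines lines end in a carriage return"
```

If the two numbers are equal, the whole file uses DOS-style line endings; a count of zero means there is nothing to convert.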

That was it, all right!

Code:

whom^M
whomever^M
whomsoever^M
whoop^M
whoopee^M
whooper^M
whoops^M
whoosh^M
whopper^M
whopping^M
whore^M

---------- Post updated at 11:56 PM ---------- Previous update was at 11:43 PM ----------

dos2unix did the trick.
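For a machine without dos2unix, something along these lines might serve as the "other conversion tool" mentioned above. It is a sketch using tr, not what was actually run in this thread, and sample_dos.txt and sample_unix.txt are invented names.

```shell
# Hypothetical DOS-style sample file.
printf 'whom\r\nwhomever\r\nwhoop\r\n' > sample_dos.txt

# Delete every carriage return, writing to a new file rather than editing in place.
tr -d '\r' < sample_dos.txt > sample_unix.txt

# Check the result: count lines that still contain a carriage return.
cr=$(printf '\r')
remaining=$(grep -c "$cr" sample_unix.txt || true)
echo "lines still containing a carriage return: $remaining"
```

Note that tr removes every carriage return in the file, not only those at line ends, which is fine for a one-word-per-line list like this one.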
