grep or awk looking for repeating text

09-09-2011

I am looking for a way to find the below pattern in text.

Code:

777777,111,08-20-2011 
111111,222,08-20-2011 
777777,111,07-24-2011
777777,222,07-24-2011 
111111,222,07-22-2011

I would like to print every line in a file where the first 6 digits match and there are different entries in the second column on the same date.

This would be the output I need.

Code:


777777,111,07-24-2011
777777,222,07-24-2011

Last edited by radoulov; 09-09-2011 at 05:36 PM.. Reason: Code tags!

ffdstanley

09-09-2011

OK... this is not very elegant, but should work:

Code:

# sort -u also drops non-adjacent duplicate patterns; ^ anchors the match at field 1
awk -F, 'a[$1,$3]++' input | sed 's/,[^,]*,/,.*,/' | sort -u | while IFS= read -r i ; do
   grep "^$i" input
done

Code:

awk -F, 'a[$1,$3]++' input

This basically looks for duplicates, but it doesn't print the first occurrence, only the lines whose first and third fields have already been encountered. Then sed replaces the second field with a wildcard, and the input file is grepped for each pattern (first and third fields, with the second being anything).
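For comparison, here is a minimal single-pass awk sketch of the same idea. It is not mirni's method: it groups lines by first and third field, counts the distinct second-field values in each group, and prints only groups with more than one. The function name `find_conflicts` is hypothetical, and stripping trailing whitespace is an assumption based on the stray spaces in the sample input.

```shell
# Hypothetical helper: print every line whose (field 1, field 3) pair
# occurs with more than one distinct value in field 2.
find_conflicts() {
    awk -F, '
    {
        sub(/[ \t\r]+$/, "")           # assumption: trailing whitespace is noise
        key = $1 FS $3                  # group by ID and date
        if (!((key, $2) in seen)) {     # first time this group has this value
            seen[key, $2] = 1
            distinct[key]++
        }
        line[NR] = $0                   # remember lines to keep input order
        group[NR] = key
    }
    END {
        for (i = 1; i <= NR; i++)
            if (distinct[group[i]] > 1)
                print line[i]
    }' "$@"
}
```

Called as `find_conflicts input`, it reads the file once and keeps the original line order. Unlike the duplicate-counting pipeline, it should not report a group whose lines all carry the same second field.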

mirni

09-09-2011

Hi,

A solution using Perl:

Code:

$ cat infile
777777,111,08-20-2011 
111111,222,08-20-2011 
777777,111,07-24-2011
777777,222,07-24-2011 
111111,222,07-22-2011
$ cat script.pl
use warnings;
use strict;

my (%reg);
while ( <> ) {
        next if /^\s*$/;
        s/\s*$//;
        my @f = split /,/;
        push @{ $reg{ join( ",,", @f[0,2] ) } }, $f[1];
}

foreach my $key ( keys %reg ) {
        next unless @{ $reg{ $key } } > 1;
        my %seen;
        my @values = grep { not $seen{ $_ }++ } @{ $reg{ $key } };
        foreach (@values) { 
                printf "%s%s%s\n", $key =~ /^([^,]+,)/, $_, $key =~ /.*(,.*)$/;
        }
}
$ perl script.pl infile
777777,111,07-24-2011
777777,222,07-24-2011

Regards,
Birei

birei

09-10-2011

Not for sale:

Code:

cat INPUTFILE | sed 's/ *$//' | awk -F, '{ print $0, $3, $2, $1 }' | 
sort | uniq -f2 | sed -rn 'G; /(.{6})\n\1/p; s/\n.*//; h' | cut -d' ' -f1
777777,222,07-24-2011
777777,111,07-24-2011

And I know everything about cats. )))

yazu

09-11-2011

Another cat scenario

How about if I want to see all users with multiple different entries in the second field? I would want all lines from those users.

input

Code:

777777,111,08-20-2011 
111111,222,08-20-2011
222222,222,08-11-2011 
777777,111,07-24-2011
777777,222,07-24-2011 
111111,222,07-22-2011
222222,111,07-29-2011

output

Code:

777777,111,08-20-2011 
222222,222,08-11-2011 
777777,111,07-24-2011
777777,222,07-24-2011 
222222,111,07-29-2011

I am learning a little at a time. This stuff is amazing!!!

Last edited by radoulov; 09-11-2011 at 01:16 PM.. Reason: Code tags!

ffdstanley

09-11-2011

If you don't care about the order, you can just remove my awful sed command from the filter.
===
Well, it gives the wrong result... (((

Last edited by yazu; 09-12-2011 at 02:01 AM..

yazu

09-12-2011

Quote:

Originally Posted by ffdstanley

How about if I want to see all users with multiple different entries in the second field? I would want all lines from those users.

...

Code:

$
$
$ cat input
777777,111,08-20-2011
111111,222,08-20-2011
222222,222,08-11-2011
777777,111,07-24-2011
777777,222,07-24-2011
111111,222,07-22-2011
222222,111,07-29-2011
$
$
$ perl -F, -lane '@c = grep {defined $$_{$F[0]}} @x;
                  if (! @c) {
                    push @x, { $F[0], [$F[1], "$.:$_"] };
                  } else {
                    $c[0]->{$F[0]}->[0] .= ",$F[1]" if $c[0]->{$F[0]}->[0] !~ /$F[1]/;
                    push @{$c[0]->{$F[0]}}, "$.:$_";
                  }
                  END {
                    foreach $i (@x) {
                      @v = values %$i;
                      if (@{$v[0]}[0] =~ /,/) {
                        foreach (@{$v[0]}[1..$#{$v[0]}]) {
                          ($key, $val) = split /:/;
                          $z{$key} = $val;
                        }
                      }
                    }
                    print $z{$_} foreach (sort { $a <=> $b } keys %z);
                  }
                 ' input
777777,111,08-20-2011
222222,222,08-11-2011
777777,111,07-24-2011
777777,222,07-24-2011
222222,111,07-29-2011
$
$
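The same selection can also be sketched in awk, as an alternative to the Perl one-liner rather than a restatement of it: count the distinct second-field values per user, then print every line belonging to a user with more than one. The function name `users_with_multiple_codes` is hypothetical, and the trailing-whitespace cleanup is an assumption.

```shell
# Hypothetical helper: print all lines of every user (field 1) that has
# more than one distinct value in field 2, in original input order.
users_with_multiple_codes() {
    awk -F, '
    {
        sub(/[ \t\r]+$/, "")           # assumption: trailing whitespace is noise
        if (!(($1, $2) in seen)) {      # new value of field 2 for this user
            seen[$1, $2] = 1
            distinct[$1]++
        }
        line[NR] = $0                   # buffer lines until all counts are known
        user[NR] = $1
    }
    END {
        for (i = 1; i <= NR; i++)
            if (distinct[user[i]] > 1)
                print line[i]
    }' "$@"
}
```

Usage would be `users_with_multiple_codes input`. The whole file is buffered in memory, which is fine for small files; for very large ones, reading the file twice would avoid the buffer.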

tyler_durden

durden_tyler
