grep or awk looking for repeating text


 
# 1  
Old 09-09-2011

I am looking for a way to find the pattern below in a text file.

Code:
777777,111,08-20-2011 
111111,222,08-20-2011 
777777,111,07-24-2011
777777,222,07-24-2011 
111111,222,07-22-2011


I would like to find a way to print every line in a file where the first six digits match and there are different entries in the second column on the same date.

This is the output I need:
Code:

777777,111,07-24-2011
777777,222,07-24-2011 


# 2  
Old 09-09-2011
OK... this is not very elegant, but should work:
Code:
awk -F, 'a[$1,$3]++' input | sed 's/,[^,]*,/,.*,/' | uniq  | while read i ; do
   grep "$i" input 
done

Code:
awk -F, 'a[$1,$3]++' input

basically looks for duplicates: it doesn't print the first occurrence, only the lines whose first and third fields have already been encountered. Then sed replaces the second field with a wildcard, and the input file is grepped for each resulting pattern (first and third fields, with the second being anything).
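The same idea can probably be done in one awk pass. The sketch below is not from the post above; the function name `same_day_diff_code` is made up for illustration. It assumes comma-separated lines with at least three fields, strips trailing whitespace so that dates compare equal, and counts distinct second-field values per (first field, third field) pair. Unlike the loop above, it should not report a pair that merely repeats the same second field.

```shell
#!/bin/sh
# Hypothetical helper: print every line whose (field 1, field 3) pair
# occurs with more than one distinct value in field 2.
same_day_diff_code() {
    awk -F, '
        { sub(/[ \t\r]+$/, "") }          # strip trailing whitespace; awk re-splits the fields
        NF < 3 { next }                   # ignore blank or short lines
        {
            key = $1 FS $3                # group on user + date
            if (!((key, $2) in seen)) {   # new field-2 value for this group
                seen[key, $2] = 1
                distinct[key]++
            }
            line[++n] = $0                # remember lines in input order
            group[n] = key
        }
        END {
            for (i = 1; i <= n; i++)
                if (distinct[group[i]] > 1)
                    print line[i]
        }
    ' "$@"
}

# Usage with the sample data from post #1 (trailing spaces included).
same_day_diff_code <<'EOF'
777777,111,08-20-2011 
111111,222,08-20-2011 
777777,111,07-24-2011
777777,222,07-24-2011 
111111,222,07-22-2011
EOF
```

On the sample data this prints the two 07-24-2011 lines for 777777, in input order. The whole file is held in memory, which is fine for small inputs; for very large files a two-pass approach would be a better fit.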
# 3  
Old 09-09-2011
Hi,

A solution using Perl:
Code:
$ cat infile
777777,111,08-20-2011 
111111,222,08-20-2011 
777777,111,07-24-2011
777777,222,07-24-2011 
111111,222,07-22-2011
$ cat script.pl
use warnings;
use strict;

my (%reg);
while ( <> ) {
        next if /^\s*$/;
        s/\s*$//;
        my @f = split /,/;
        push @{ $reg{ join( ",,", @f[0,2] ) } }, $f[1];
}

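foreach my $key ( keys %reg ) {
        # Collect the distinct second-field values for this (first,third) pair.
        my %seen;
        my @values = grep { not $seen{ $_ }++ } @{ $reg{ $key } };
        # Skip pairs that do not have at least two different values.
        next unless @values > 1;
        foreach (@values) { 
                printf "%s%s%s\n", $key =~ /^([^,]+,)/, $_, $key =~ /.*(,.*)$/;
        }
}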
$ perl script.pl infile
777777,111,07-24-2011
777777,222,07-24-2011

Regards,
Birei
# 4  
Old 09-10-2011
Not for sale:
Code:
cat INPUTFILE | sed 's/ *$//' | awk -F, '{ print $0, $3, $2, $1 }' | 
sort | uniq -f2 | sed -rn 'G; /(.{6})\n\1/p; s/\n.*//; h' | cut -d' ' -f1
777777,222,07-24-2011
777777,111,07-24-2011

And I know everything about cats. )))
# 5  
Old 09-11-2011
Another cat scenario

How about if I want to see all users with multiple different entries in the second field? I would want all lines from those users.

input
Code:
777777,111,08-20-2011 
111111,222,08-20-2011
222222,222,08-11-2011 
777777,111,07-24-2011
777777,222,07-24-2011 
111111,222,07-22-2011
222222,111,07-29-2011

output
Code:
777777,111,08-20-2011 
222222,222,08-11-2011 
777777,111,07-24-2011
777777,222,07-24-2011 
222222,111,07-29-2011

I am learning a little at a time. This stuff is amazing!!!

# 6  
Old 09-11-2011
If you don't care about the order, you can just remove my awful sed command from the filter.
===
Well, it gives the wrong result... (((
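
For this second request (all lines for users who have more than one distinct entry in the second field), a single-pass awk sketch might look like the following. It is not taken from any post in this thread; the function name `users_with_mixed_codes` is invented for illustration, and it assumes comma-separated input with at least three fields per line.

```shell
#!/bin/sh
# Hypothetical helper: print all lines for users (field 1) that appear
# with more than one distinct value in field 2, keeping input order.
users_with_mixed_codes() {
    awk -F, '
        { sub(/[ \t\r]+$/, "") }            # strip trailing whitespace
        NF < 3 { next }                     # ignore blank or short lines
        {
            if (!(($1, $2) in seen)) {      # new field-2 value for this user
                seen[$1, $2] = 1
                distinct[$1]++
            }
            line[++n] = $0                  # remember lines in input order
            user[n] = $1
        }
        END {
            for (i = 1; i <= n; i++)
                if (distinct[user[i]] > 1)
                    print line[i]
        }
    ' "$@"
}

# Usage with the sample data from post #5.
users_with_mixed_codes <<'EOF'
777777,111,08-20-2011 
111111,222,08-20-2011
222222,222,08-11-2011 
777777,111,07-24-2011
777777,222,07-24-2011 
111111,222,07-22-2011
222222,111,07-29-2011
EOF
```

On that sample it prints the five lines belonging to 777777 and 222222 and drops both 111111 lines, since 111111 only ever has 222 in the second field.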

# 7  
Old 09-12-2011
Quote:
Originally Posted by ffdstanley
How about if I want to see all users with multiple different entries in the second field? I would want all lines from those users.


Code:
$ cat input
777777,111,08-20-2011
111111,222,08-20-2011
222222,222,08-11-2011
777777,111,07-24-2011
777777,222,07-24-2011
111111,222,07-22-2011
222222,111,07-29-2011
$ perl -F, -lane '@c = grep {defined $$_{$F[0]}} @x;
                  if (! @c) {                     # first time this user is seen
                    push @x, { $F[0], [$F[1], "$.:$_"] };
                  } else {
                    $c[0]->{$F[0]}->[0] .= ",$F[1]" if $c[0]->{$F[0]}->[0] !~ /$F[1]/;
                    push @{$c[0]->{$F[0]}}, "$.:$_";
                  }
                  END {
                    foreach $i (@x) {
                      @v = values %$i;
                      if (@{$v[0]}[0] =~ /,/) {   # more than one distinct second field
                        foreach (@{$v[0]}[1..$#{$v[0]}]) {
                          ($key, $val) = split /:/, $_, 2;
                          $z{$key} = $val;
                        }
                      }
                    }
                    # Sort line numbers numerically so input order survives past line 9.
                    print $z{$_} foreach (sort { $a <=> $b } keys %z);
                  }
                 ' input
777777,111,08-20-2011
222222,222,08-11-2011
777777,111,07-24-2011
777777,222,07-24-2011
222222,111,07-29-2011

tyler_durden