Finding Unique strings which match pattern


 
# 1  
Old 02-16-2009

I need to grep for a pattern in a file. The files are huge and contain many repeated occurrences of the strings that match the pattern. I need only the strings that contain the pattern in the output, each listed once.


For example:

The contents of my file are as follows. The pattern I want to match is ABCD.

aaaaaaaaaa ABCD_EFGH_XYZ.Table_Name1 bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbb ABCD_EFGH_XYZ.Table_Name1 cccccccccccccccccccccccccccccccccccccccc
cccccccccccccccccccccccccccccccccc ABCD_EFGH_XYZ.Table_Name2 ddddddddddddddd
aaaaaaaaaa ABCD_XYZ_EFGH.Table_Name1 bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
aaaaaaaaaa ABCD_XYZ_1234.Table_Name1 bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbb ABCD_EFGH_XYZ.ORDER_SUMM cccccccccccccccccccccccccccccccccccccccc
bbbbbbbbbb ABCD_EFGH_XYZ.PAYMENT_DETAIL cccccccccccccccccccccccccccccccccccccccc
bbbbbbbbbb ABCD_EFGH_XYZ.ORDER_SUMM cccccccccccccccccccccccccccccccccccccccc

What I want to get out of the file is the unique strings that contain the pattern ABCD. Typically my output will be:

ABCD_EFGH_XYZ.Table_Name1
ABCD_EFGH_XYZ.Table_Name2
ABCD_XYZ_EFGH.Table_Name1
ABCD_XYZ_1234.Table_Name1
ABCD_EFGH_XYZ.ORDER_SUMM
ABCD_EFGH_XYZ.PAYMENT_DETAIL

In real life, what I am trying to do is find the tables that are referenced in a log file.

Any guidance will be appreciated.
# 2  
Old 02-16-2009
Try:

Code:
awk '!a[$2]++{print $2}' file
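
As a rough sketch of how this idiom works, the variant below adds an explicit test for the pattern, which the one-liner above does not do. The sample data is shortened, the OTHER_XYZ line is a made-up non-matching entry added for illustration, and the names `sample`, `seen` and `unique_fields` are arbitrary.

```shell
# Shortened sample data written to a temporary file.
# The OTHER_XYZ line is hypothetical, added to show the pattern filter.
sample=$(mktemp)
cat > "$sample" <<'EOF'
aaaaaaaaaa ABCD_EFGH_XYZ.Table_Name1 bbbb
bbbbbbbbbb ABCD_EFGH_XYZ.Table_Name1 cccc
cccccccccc ABCD_EFGH_XYZ.Table_Name2 dddd
aaaaaaaaaa OTHER_XYZ.Table_Name9 bbbb
bbbbbbbbbb ABCD_EFGH_XYZ.ORDER_SUMM cccc
bbbbbbbbbb ABCD_EFGH_XYZ.ORDER_SUMM cccc
EOF

# seen[$2]++ evaluates to 0 the first time a value appears, so
# !seen[$2]++ is true only once per distinct second field.
# The $2 ~ /^ABCD/ test keeps only fields that start with ABCD.
unique_fields=$(awk '$2 ~ /^ABCD/ && !seen[$2]++ { print $2 }' "$sample")
printf '%s\n' "$unique_fields"

rm -f "$sample"
```

Unlike sort -u, this keeps the values in the order they first appear and does not need to sort the whole file, at the cost of holding every distinct value in memory.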

Regards
# 3  
Old 02-16-2009
Perhaps this approach:

Code:
> cat file171 
aaaaaaaaaa ABCD_EFGH_XYZ.Table_Name1 bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbb ABCD_EFGH_XYZ.Table_Name1 cccccccccccccccccccccccccccccccccccccccc
cccccccccccccccccccccccccccccccccc ABCD_EFGH_XYZ.Table_Name2 ddddddddddddddd
aaaaaaaaaa ABCD_XYZ_EFGH.Table_Name1 bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
aaaaaaaaaa ABCD_XYZ_1234.Table_Name1 bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbb ABCD_EFGH_XYZ.ORDER_SUMM cccccccccccccccccccccccccccccccccccccccc
bbbbbbbbbb ABCD_EFGH_XYZ.PAYMENT_DETAIL cccccccccccccccccccccccccccccccccccccccc
bbbbbbbbbb ABCD_EFGH_XYZ.ORDER_SUMM cccccccccccccccccccccccccccccccccccccccc

> cut -d" " -f2 file171 | sort -u
ABCD_EFGH_XYZ.ORDER_SUMM
ABCD_EFGH_XYZ.PAYMENT_DETAIL
ABCD_EFGH_XYZ.Table_Name1
ABCD_EFGH_XYZ.Table_Name2
ABCD_XYZ_1234.Table_Name1
ABCD_XYZ_EFGH.Table_Name1

# 4  
Old 02-16-2009
Thanks for your input. But I think the example I gave conveyed my intent wrongly. The pattern I am searching for can be anywhere in the log file.

The following might be a better representation of the file I have. Basically, I want to get all the unique strings in the file that start with the pattern ABCD. These strings do not necessarily occur at the same column positions.

Is this possible by combining grep with any other commands?

aaaaaaaaaa ABCD_EFGH_XYZ.Table_Name1 bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb kkkkk ABCD_EFGH_XYZ.Table_Name1 hhhhhhcccccccc ABCD_EFGH_XYZ.Table_Name2 ddddddddddddd ABCD_XYZ_EFGH.Table_Name1 bbbbbbb ABCD_XYZ_1234.Table_Name1 bbbbbbbbbb ABCD_EFGH_XYZ.ORDER_SUMM ccccc ABCD_EFGH_XYZ.PAYMENT_DETAIL ccccccccccccccccccccccccccccccccccbbbbbbbbbb ABCD_EFGH_XYZ.ORDER_SUMM ccccccccccc


Result:
ABCD_EFGH_XYZ.ORDER_SUMM
ABCD_EFGH_XYZ.PAYMENT_DETAIL
ABCD_EFGH_XYZ.Table_Name1
ABCD_EFGH_XYZ.Table_Name2
ABCD_XYZ_1234.Table_Name1
ABCD_XYZ_EFGH.Table_Name1

Thanks in advance for your guidance


# 5  
Old 02-16-2009
Look into this post.
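
The linked post is not reproduced here. As a rough sketch of one common approach (not necessarily the one in that post), the matches can be pulled out one per line and then de-duplicated. This assumes a grep that supports -o (GNU and BSD grep do; it is not required by POSIX) and that table names consist of letters, digits, underscores and dots. The second command is a fallback using only awk and sort, and treats any whitespace-separated word starting with ABCD as a match. The file contents and the variable names are made up for illustration.

```shell
# Sample log with the strings at varying positions, in a temporary file.
logfile=$(mktemp)
cat > "$logfile" <<'EOF'
aaaa ABCD_EFGH_XYZ.Table_Name1 bbbb kkkkk ABCD_EFGH_XYZ.Table_Name1 hhhh
cccc ABCD_EFGH_XYZ.Table_Name2 dddd ABCD_XYZ_EFGH.Table_Name1 bbbbbbb
ABCD_XYZ_1234.Table_Name1 bbbb ABCD_EFGH_XYZ.ORDER_SUMM ccccc
ABCD_EFGH_XYZ.PAYMENT_DETAIL cccc ABCD_EFGH_XYZ.ORDER_SUMM ccccccccccc
EOF

# grep -o prints each match on its own line instead of the whole line;
# sort -u then drops the duplicates. LC_ALL=C makes the order predictable.
tables_grep=$(grep -o 'ABCD[A-Za-z0-9_.]*' "$logfile" | LC_ALL=C sort -u)

# Fallback without grep -o: test every whitespace-separated field.
tables_awk=$(awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^ABCD/) print $i }' "$logfile" | LC_ALL=C sort -u)

printf '%s\n' "$tables_grep"

rm -f "$logfile"
```

Note that the grep pattern also matches ABCD in the middle of a longer word, whereas the awk version only accepts words that begin with it.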
# 6  
Old 02-16-2009
Thanks for that pointer. That was helpful and is very similar to what I want to accomplish. I asked another question in that thread.
 