Find pattern in first field of file


 
# 1  
Old 10-17-2015
Hello all

I have two files.

1. Pattern.txt - It contains the patterns to be matched; there are a large number of them.
Code:
cat Pattern.txt

Ram
Shyam
Mohan
Jhon

2. Content.txt - It holds the actual data. The fields of each record are delimited by one or more spaces.
Code:
cat Content.txt
1@GU00450012@Ram   @@@@ bla1  lba2.
2@GU11950004@David @@@@ uss  Ram 
3@GU11950004@Shyam @@@@ uss   rupa
etc etc

Now I need to find the patterns in Content.txt, but only in the first field. I tried using
Code:
grep -F -f Pattern.txt Content.txt

It returns rows like
Code:
2@GU11950004@David @@@@ uss  Ram

because it contains the pattern 'Ram' somewhere else in the line.
It seems to work, but it looks for the pattern anywhere in the line. I need to restrict the search to the first field only.
I know I can store the patterns in an awk array using
Code:
NR==FNR

but I am not sure how to search for each of them in the first field of Content.txt only.

Looking for any help.

Thanks

Last edited by krsnadasa; 10-17-2015 at 01:43 AM..
# 2  
Old 10-17-2015
How about:

Code:
#!/bin/sh

#awkcode=`sed 's,\(.*\),$1 ~ /\1/ { print $0 },' <Pattern.txt`
awkcode=`sed 's,\(.*\),$1 ~ /@\1$/ { print $0 },' <Pattern.txt`

awk "
$awkcode
" <Content.txt

Note that the active awkcode line assumes the name is preceded by @ and runs to the end of the first field. Use the commented-out awkcode line instead if the name should match anywhere in the first field.

With the commented-out line, both of the following lines would match the pattern Masters:

Code:
8@XXXXXXXX@Masters @@@@ blah masters
8@XXXXXXXX@McMasters @@@@ blah mcmasters

Warning: my quick solution assumes that the patterns are simple and contain no characters that are special to awk, such as regex metacharacters or a slash.
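
One way to sidestep the special-character caveat, offered only as a sketch: keep the names as data and compare them literally with awk's index() instead of generating awk code from them. This mirrors grep -F restricted to the first field, so a name matches anywhere inside that field. The scratch directory, the sample rows, and the extra pattern a.c are made up for illustration.

```shell
#!/bin/sh
# Sketch only: hypothetical sample data in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' 'Ram' 'Shyam' 'a.c' > "$tmp/Pattern.txt"
printf '%s\n' \
  '1@GU00450012@Ram   @@@@ bla1  lba2.' \
  '2@GU11950004@David @@@@ uss  Ram' \
  '3@GU11950004@Shyam @@@@ uss   rupa' \
  '4@GU11950004@abc   @@@@ uss   a.c' > "$tmp/Content.txt"

result=$(awk '
NR == FNR { if ($0 != "") names[$0] = 1; next }  # first file: store each non-empty name
{
    for (n in names)
        if (index($1, n) > 0) { print; next }    # literal substring test on field 1 only
}' "$tmp/Pattern.txt" "$tmp/Content.txt")

printf '%s\n' "$result"
rm -rf "$tmp"
```

With this data the rows for Ram and Shyam are printed. Row 4 is not, because index() treats a.c as three literal characters, whereas a regex match would let the dot match the b in abc.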
This User Gave Thanks to cjcox For This Post:
# 3  
Old 10-17-2015
An Awk version
Code:
awk '
FNR==NR {                     # true only while reading the first file (Pattern.txt)
    s[$0]                     # load Pattern.txt file into array s
    next                      # move to process next line of Pattern.txt
}
{
    for (p in s){             # iterate each pattern
        if(match($1, p)){     # check pattern for match against first field
            print             # print record if match is found
            next              # stop pattern iteration for this record, match was found already
        }
    }
}' Pattern.txt Content.txt
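
Since Pattern.txt is said to be large, a variant that avoids looping over every pattern for every record may be worth trying. The sketch below assumes (this is not stated in the thread, only suggested by the sample rows) that the name is always the last @-separated piece of the first field and must match exactly. The scratch directory and sample rows are illustrative.

```shell
#!/bin/sh
# Sketch only: hypothetical sample data in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' 'Ram' 'Shyam' 'Mohan' 'Jhon' > "$tmp/Pattern.txt"
printf '%s\n' \
  '1@GU00450012@Ram   @@@@ bla1  lba2.' \
  '2@GU11950004@David @@@@ uss  Ram' \
  '3@GU11950004@Shyam @@@@ uss   rupa' > "$tmp/Content.txt"

matched=$(awk '
FNR == NR { wanted[$0] = 1; next }     # first file: one array key per name
{
    n = split($1, part, "@")           # break the first field on "@"
    if (n > 0 && (part[n] in wanted))  # exact lookup of the last piece
        print
}' "$tmp/Pattern.txt" "$tmp/Content.txt")

printf '%s\n' "$matched"
rm -rf "$tmp"
```

The lookup is a single array membership test per record, so the cost per line does not grow with the number of patterns. The trade-off is that partial matches (McMasters for Masters) are no longer found.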

Perl version
Save it as search.pl and run it as perl search.pl Pattern.txt Content.txt
Code:
#!/usr/bin/perl

# search.pl
# Perl facilities to help avoiding errors
use strict;
use warnings;

# files names to obtain from command line
my $pattern_file = shift or die;
my $context_file = shift or die;

# open pattern file for read
open my $fh, '<', $pattern_file or die;
# load pattern file into an array
my @patterns =<$fh>;
# dismiss patterns file handle
close $fh;
# remove the newline at end of record
chomp(@patterns);

# open context file for read
open $fh, '<', $context_file or die;
# iterate line by line through the context file
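LINE: while(<$fh>) {
    # obtain the first field
    my ($field) = split;
    # skip blank lines, which have no first field
    next LINE unless defined $field;
    # search field for pattern; move to next line if match found
    # (a bare next would only advance the inner pattern loop)
    for my $p (@patterns) {
        $field =~ /$p/ and print and next LINE;
    }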
}
# dismiss context file handle
close $fh;

This User Gave Thanks to Aia For This Post:
# 4  
Old 10-17-2015
I have been trying to solve this with sed, inline.
Code:
sed -n -e '/\@{sed -e '1p' pattern.txt}/p' content.txt

I also tried curly braces in many other combinations, but just can't get it working.
I like the idea of passing the result of one sed to another with this sub {sed} convention. This is the code I found that pointed me in this direction:
Code:
sed -e '/<TEXT1>/{r File1' -e 'd}' File2

That particular example is not exactly what I need, but I tried to modify it to fit this case.
No go!

---------- Post updated at 04:15 PM ---------- Previous update was at 04:01 PM ----------

Earlier I was able to do it with xargs, but I would still prefer sed only.
Code:
        sed -n '1p' < pattern.txt | xargs -I output sed -n '/\@output/p' content.txt

Been at it for a few hours.
thanks
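
A possible explanation, with a sketch: braces in sed only group sed commands (as in the r File1 example); they do not run another program. Feeding the output of one sed into the address of another is a job for the shell's command substitution. The version below keeps the same idea as the xargs line, using only the first pattern. The scratch directory and sample rows are made up, and it assumes the name contains no slash or regex metacharacters.

```shell
#!/bin/sh
# Sketch only: hypothetical sample data in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' 'Ram' 'Shyam' > "$tmp/pattern.txt"
printf '%s\n' \
  '1@GU00450012@Ram   @@@@ bla1  lba2.' \
  '2@GU11950004@David @@@@ uss  Ram' \
  '3@GU11950004@Shyam @@@@ uss   rupa' > "$tmp/content.txt"

# The inner sed runs first; the shell pastes its output into the outer script.
first=$(sed -n '1p' "$tmp/pattern.txt")
found=$(sed -n "/@$first/p" "$tmp/content.txt")   # double quotes so $first expands

printf '%s\n' "$found"
rm -rf "$tmp"
```

This handles one pattern per run, like the xargs attempt; covering every line of the pattern file would still need a loop or one of the awk or grep approaches in this thread.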
This User Gave Thanks to Klasform For This Post:
# 5  
Old 10-18-2015
How about
Code:
sed  's/^/^[^ ]*/;s/$/ /' pattern | grep -f- content

These 2 Users Gave Thanks to RudiC For This Post:
# 6  
Old 10-18-2015
Many thanks to all

I tried Aia's awk version and it worked for me. However, if RudiC could explain his sed version, that would be very helpful.

Thanks
# 7  
Old 10-18-2015
Actually, it's a grep version. The sed command just makes sure grep is working on the first field by adding the needed regex parts (no spaces up to the name, trailing space).
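
To make that concrete, here is a small sketch that prints what sed hands to grep and then runs the whole pipeline. The scratch directory and sample rows are illustrative, the file names pattern and content follow the one-liner, and reading the pattern list from standard input with -f - is assumed to be supported by the local grep (GNU grep does).

```shell
#!/bin/sh
# Sketch only: hypothetical sample data in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' 'Ram' 'Shyam' > "$tmp/pattern"
printf '%s\n' \
  '1@GU00450012@Ram   @@@@ bla1  lba2.' \
  '2@GU11950004@David @@@@ uss  Ram ' \
  '3@GU11950004@Shyam @@@@ uss   rupa' > "$tmp/content"

# Step 1: each name becomes "^[^ ]*NAME " (start of line, no spaces, name, one space).
regexes=$(sed 's/^/^[^ ]*/;s/$/ /' "$tmp/pattern")
printf '%s\n' "$regexes"

# Step 2: grep reads those regexes from standard input and filters the data.
hits=$(sed 's/^/^[^ ]*/;s/$/ /' "$tmp/pattern" | grep -f - "$tmp/content")
printf '%s\n' "$hits"
rm -rf "$tmp"
```

Row 2 is rejected even though it contains "Ram " later on, because [^ ]* cannot cross the space that ends the first field. Note that a field ending in McRam would still match the regex built for Ram.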
This User Gave Thanks to RudiC For This Post: