Sponsored Content
Full Discussion: Match 2 patterns together
Top Forums Shell Programming and Scripting Match 2 patterns together Post 302937126 by abh.kumar on Tuesday 3rd of March 2015 06:06:21 PM
Old 03-03-2015
Match 2 patterns together

How can I quickly print out lines in a datafile which has presence of both patterns in a row of another file. Maybe awk can do it much faster than bash.


Patternfile

Code:
ID1 PAT11 PAT12
ID1 PAT21 PAT22
ID2 PAT31 PAT32

datafile
Code:
headerline
rgthhrhhhhhtnjttntjjtjtjtjtjtjtjjjtPAT31rf3fffffPAT32efgreggeeeeggge
fgegegPAT11.ewdwd88weded((gfefggegrg!///*...PAT12uuhhggggg
rgthhrhhhhhtnjttntjjtjtjtjtjtjtjjjtPAT41rf3fffffPAT32efgreggeeeeggge
fgegegPAT21.ewdwd88weded((gfefggegrg!///*..ttttuuu.PAT22uuhhggggg
fgegegPAT11.ewdwd88weded((gfefggegrg!///*...PAT12uuhhggggg====

The outputs must be split by the ID (col1) that the patterns belong to.


Outputs
Code:
ID1

headerline
fgegegPAT11.ewdwd88weded((gfefggegrg!///*...PAT12uuhhggggg
fgegegPAT21.ewdwd88weded((gfefggegrg!///*..ttttuuu.PAT22uuhhggggg
fgegegPAT11.ewdwd88weded((gfefggegrg!///*...PAT12uuhhggggg====


ID2

headerline
rgthhrhhhhhtnjttntjjtjtjtjtjtjtjjjtPAT31rf3fffffPAT32efgreggeeeeggge

My attempt is very slow in bash,

Code:
while read pat
do
while read data
do
if  grep -q $pat[1] $data
if  grep -q $pat[2] $data
echo $data >> $pat[0]
fi
fi
done < datafile
done < patfile

 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Removing file lines that each match to a different patterns

I have a very large file (10,000,000 lines), that contains a sample id and a property of that sample. I have another file that contains around 1,000,000 lines with sample ids that I want to remove from the original file (create a new file without these lines). I know how to do this in Perl, but it... (9 Replies)
Discussion started by: Jo_puzzled
9 Replies

2. Shell Programming and Scripting

print lines which match multiple patterns

Hi, I have a text file as follows: 11:38:11.054 run1_rdseq avg_2-5 999988.0000 1024.0000 11:50:52.053 run3_rdrand 999988.0000 1135.0 128.0417 11:53:18.050 run4_wrrand avg_2-5 999988.0000 8180.5833 11:55:42.051 run4_wrrand avg_2-5 999988.0000 213.8333 11:55:06.053... (2 Replies)
Discussion started by: annazpereira
2 Replies

3. Shell Programming and Scripting

script to match patterns in 2 different files.

I am new to shell scripting and need some help. I googled, but couldn't find a similar scenario. Basically, I need to rename a datafile. This is the scenario - I have a file, readonly.txt that has 2 columns - file# and name. I have another file,missing_files.txt that has id and name. Both the... (3 Replies)
Discussion started by: mathews
3 Replies

4. Shell Programming and Scripting

Find files that do not match specific patterns

Hi all, I have been searching online to find the answer for getting a list of files that do not match certain criteria but have been unsuccessful. I have a directory that has many jpg files. What I need to do is get a list of the files that do not match both of the following patterns (I have... (21 Replies)
Discussion started by: nikos-koutax
21 Replies

5. Shell Programming and Scripting

Using AWK to match CSV files with duplicate patterns

Dear awk users, I am trying to use awk to match records across two moderately large CSV files. File1 is a pattern file with 173,200 lines, many of which are repeated. The order in which these lines are displayed is important, and I would like to preserve it. File2 is a data file with 456,000... (3 Replies)
Discussion started by: isuewing
3 Replies

6. Shell Programming and Scripting

Retrieve lines that match any occurence in a list of patterns

I have two files. The first containing a header and six columns of data. Example file 1: Number SNP ID dbSNP RS ID Chromosome Result_Call Physical Position 787066 SNP_A-8575395 RS6650104 1 NOCALL 564477 786872 SNP_A-8575125 RS10458597 1 AA ... (13 Replies)
Discussion started by: Selftaught
13 Replies

7. Shell Programming and Scripting

Match 2 different patterns and print the lines

Hi, i have been trying to extract multiple lines based on two different patterns as below:- file1 @jkm|kdo|aas012|192.2.3.1 blablbalablablkabblablabla sjfdsakfjladfjefhaghfagfkafagkjsghfalhfk fhajkhfadjkhfalhflaffajkgfajkghfajkhgfkf jahfjkhflkhalfdhfwearhahfl @jkm|sdf|wud08q|168.2.1.3... (8 Replies)
Discussion started by: redse171
8 Replies

8. UNIX for Dummies Questions & Answers

Match patterns from another file and tag

Hi all, I have a file , which has 6 tab delimited fields, with $3 and $4 subfielded with spaces. I wamt to match cols $2,$3,$4 of tmp1 with tmp2, ..and then flag the 5th col if found. tmp1 1756 Xerm XermA XermB XermC XermD AA TT AA GG A 1 1763 Xerm XermA XermB XermC... (3 Replies)
Discussion started by: senhia83
3 Replies

9. Shell Programming and Scripting

awk to print match or non-match and select fields/patterns for non-matches

In the awk below I am trying to output those lines that Match between file1 and file2, those Missing in file1, and those missing in file2. Using each $1,$2,$4,$5 value as a key to match on, that is if those 4 fields are found in both files the match, but if those 4 fields are not found then missing... (0 Replies)
Discussion started by: cmccabe
0 Replies

10. Shell Programming and Scripting

Match output fields agains two patterns

I need to print field and the next one if field matches 'patternA' and also print 'patternB' fields. echo "some output" | awk '{for(i=1;i<=NF;i++){if($i ~ /patternA/){print $i, $(i+1)}elif($i ~ /patternB/){print $i}}}' This code returnes me 'syntax error'. Pls advise how to do properly. (2 Replies)
Discussion started by: urello
2 Replies
case(n) 						       Tcl Built-In Commands							   case(n)

__________________________________________________________________________________________________________________________________________________

NAME
case - Evaluate one of several scripts, depending on a given value SYNOPSIS
case string ?in? patList body ?patList body ...? case string ?in? {patList body ?patList body ...?} _________________________________________________________________ DESCRIPTION
Note: the case command is obsolete and is supported only for backward compatibility. At some point in the future it may be removed entirely. You should use the switch command instead. The case command matches string against each of the patList arguments in order. Each patList argument is a list of one or more patterns. If any of these patterns matches string then case evaluates the following body argument by passing it recursively to the Tcl interpreter and returns the result of that evaluation. Each patList argument consists of a single pattern or list of patterns. Each pattern may con- tain any of the wild-cards described under string match. If a patList argument is default, the corresponding body will be evaluated if no patList matches string. If no patList argument matches string and no default is given, then the case command returns an empty string. Two syntaxes are provided for the patList and body arguments. The first uses a separate argument for each of the patterns and commands; this form is convenient if substitutions are desired on some of the patterns or commands. The second form places all of the patterns and commands together into a single argument; the argument must have proper list structure, with the elements of the list being the patterns and commands. The second form makes it easy to construct multi-line case commands, since the braces around the whole list make it unneces- sary to include a backslash at the end of each line. Since the patList arguments are in braces in the second form, no command or variable substitutions are performed on them; this makes the behavior of the second form different than the first form in some cases. SEE ALSO
switch(n) KEYWORDS
case, match, regular expression Tcl 7.0 case(n)
All times are GMT -4. The time now is 07:50 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy