Post 302947909 by my_Perl on Tuesday 23rd of June 2015 08:35:11 PM
Extract all the sentences that matched two patterns

Hi

I have two lists of patterns, A and B, with around 200 entries each. I want to extract all the sentences from a big text file that match at least one pattern from A and at least one pattern from B.

For example, pattern list A consists of:

ama
ani
ahum
mari

...
...

and pattern list B consists of:

kok
sam
lai
mit

....
....


The sample input text file looks like this:

This is first sentence with only one pattern ama.
This is second sentence with ahum and kok patterns.
This is the third sentence with only one pattern mit.
This is the fourth sentence consisting of samson and marigold.
This is the fifth sentence with more patterns such as ama, ani, lai and mit.
This is the sixth sentence with no patterns.
..
..

..
The expected output of the script for the given input text file is:

This is second sentence with ahum and kok patterns.
This is the fourth sentence consisting of samson and marigold.
This is the fifth sentence with more patterns such as ama, ani, lai and mit.

..
..

..

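I need help writing a script that extracts every sentence matching at least one pattern from each of A and B. That is, every sentence in the output contains at least one pattern from A and at least one from B (the original post highlighted these as GREEN and RED). Note that the patterns match as substrings: "samson" counts as a match for sam and "marigold" for mari. Thanks in advance.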
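
One possible approach, sketched in Python rather than shell, is to keep a sentence only if it contains some entry of A and some entry of B as a plain substring. This is only a sketch under assumptions that the post does not confirm: the input holds one sentence per line, the patterns are literal strings rather than regular expressions, and matching is case-sensitive. The function names and the `load_patterns` helper (for a pattern file with one entry per line) are hypothetical.

```python
def load_patterns(path):
    """Read one pattern per line, skipping blank lines (hypothetical file layout)."""
    with open(path, encoding="utf-8") as fh:
        return [line.strip() for line in fh if line.strip()]


def matches_both(sentence, patterns_a, patterns_b):
    """True if the sentence contains at least one pattern from each list.

    Matching is by plain substring, so "sam" matches "samson".
    """
    has_a = any(p in sentence for p in patterns_a)
    has_b = any(p in sentence for p in patterns_b)
    return has_a and has_b


def extract(sentences, patterns_a, patterns_b):
    """Return the sentences that match both lists, in their original order."""
    return [s for s in sentences if matches_both(s, patterns_a, patterns_b)]


# Sample data copied from the post (only the entries shown there).
list_a = ["ama", "ani", "ahum", "mari"]
list_b = ["kok", "sam", "lai", "mit"]
sample = [
    "This is first sentence with only one pattern ama.",
    "This is second sentence with ahum and kok patterns.",
    "This is the third sentence with only one pattern mit.",
    "This is the fourth sentence consisting of samson and marigold.",
    "This is the fifth sentence with more patterns such as ama, ani, lai and mit.",
    "This is the sixth sentence with no patterns.",
]

for sentence in extract(sample, list_a, list_b):
    print(sentence)
```

With the sample data above, this prints the second, fourth and fifth sentences, which agrees with the expected output in the post. With about 200 entries per list, the two `any(...)` scans cost at most around 400 substring checks per sentence, which is usually acceptable; for a very large file, a precompiled alternation regex per list would be one alternative.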
