How to find multiple strings on different lines in file?


 
# 1  
Old 10-09-2015

Hello,

I have spent a considerable amount of time racking my brain on this, so I am reaching out here. Here is the background.

OS - Solaris 10

There are two strings, '<Orin>sop' and '<Dup>two' (without the quotes), that I want to look for in a file. They appear on different lines. ONLY if both strings are present should I get a count or some other output; I just need to know whether both are present or not.

Can someone advise on how to do this?

FYI - I tried the command below, but it acts as an OR condition and gives output if at least one of the strings is found.

Code:
egrep -e '<Orin>sop|<Dup>two' /opt/SampleData.txt

Moderator's Comments:
Mod Comment Please use CODE tags when displaying sample input, sample output, and code segments.

Last edited by Don Cragun; 10-09-2015 at 09:00 PM.. Reason: Add CODE tags.
# 2  
Old 10-09-2015
Filters are your friend.

Assuming an "old world" fgrep.... then:

Code:
fgrep -l '<Orin>sop' list-of-files-goes-here... /dev/null | tr '\012' '\000' |
   xargs -0 fgrep -l '<Dup>two'

Some explanation: I wrote this to handle unusual file names in "list-of-files-goes-here". I wanted it to work for files with spaces or other strange characters in their names.

The first fgrep finds one string and returns a list of files. I then null-terminate the file names and pass them to a second fgrep using xargs -0, which should return the list of files where both strings were found.

Update: You don't need the /dev/null above... I was writing originally for really really old Unix.
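
The two-stage filter described above can also be sketched without xargs -0, for systems whose xargs does not accept that option. This is only a sketch under assumptions: both_strings is a hypothetical helper name, the shell is ksh or another shell whose read supports -r, and plain grep stands in for fgrep because neither search string contains regular-expression metacharacters. It copes with spaces in file names but not with newlines in them.

```shell
# both_strings (hypothetical name): print the names of the files
# that contain both strings, not necessarily on the same line.
both_strings() {
    # Stage 1: list the files that contain the first string.
    grep -l '<Orin>sop' "$@" /dev/null |
    while IFS= read -r f
    do
        # Stage 2: keep only the files that also contain the second string.
        grep -l '<Dup>two' "$f" /dev/null
    done
}
```

It would be called as both_strings list-of-files-goes-here...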

Last edited by cjcox; 10-09-2015 at 06:14 PM.. Reason: added note
# 3  
Old 10-09-2015
Thanks. I have tried this, but unfortunately it throws the following error:
xargs: illegal arguments --0

Any suggestions?
# 4  
Old 10-09-2015
I would try using a single awk (or on Solaris/SunOS systems /usr/xpg4/bin/awk or nawk) script instead of trying to compare outputs from two fgrep commands:
Code:
#!/bin/ksh
if [ $# -lt 1 ]
then	printf 'Usage: %s file...\n' "${0##*/}" >&2
	exit 1
fi
/usr/xpg4/bin/awk '
function check() {
	if(f1 && f2)
		print fn
	fn = FILENAME
	f1 = f2 = 0
}
FNR == 1 {
	check()
}
/<Orin>sop/ {
	f1 = 1
}
/<Dup>two/ {
	f2 = 1
}
END {	check()
}' "$@"

But, if you want to do it using fgrep, you could try something like:
Code:
#!/bin/ksh
if [ $# -lt 1 ]
then	printf 'Usage: %s file...\n' "${0##*/}" >&2
	exit 1
fi
fgrep -l '<Orin>sop' "$@" | sort -r > match1.$$
fgrep -l '<Dup>two' "$@" > match2.$$
fgrep -xf match1.$$ match2.$$
rm -rf match[12].$$

Note that I don't believe that the sort -r half of the pipeline with the 1st fgrep should be needed, but on some systems (including Apple OS X which uses a BSD based grep utility), the command:
Code:
fgrep -x -e pattern -e pattern2 file

will not print lines in file matching pattern2 if pattern is a string matching an initial substring of pattern2. But, if the order of those matching patterns is reversed, it correctly finds all of the matching lines. I don't know if the Solaris fgrep suffers from this issue or not, but with the sort -r, the problem should be avoided if it does exist.

Note also that although the 2nd script takes fewer lines (and is 35 characters shorter), running 3 fgrep commands and an rm command (not counting the sort, which might not be needed depending on how fgrep -x works on your system) will take longer to run than one invocation of awk unless you are processing LARGE files and the patterns you're looking for appear early in the files. On systems where awk includes the nextfile command, that difference could be removed, but awk, nawk, and /usr/xpg4/bin/awk on Solaris systems do not provide that extension.
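
For systems where awk does include nextfile, the early exit per file might look roughly like the sketch below. It assumes an awk with that extension (for example gawk), so it is not expected to work with the Solaris awks named above, and files_with_both is a hypothetical helper name.

```shell
# files_with_both (hypothetical name): print each file name as soon as
# both strings have been seen in that file, then skip the rest of it.
files_with_both() {
    awk '
    FNR == 1    { f1 = f2 = 0 }     # reset the flags at the start of each file
    /<Orin>sop/ { f1 = 1 }
    /<Dup>two/  { f2 = 1 }
    f1 && f2    { print FILENAME; nextfile }  # needs an awk with nextfile
    ' "$@"
}
```

Because reading stops at the line where the second string is found, large files with early matches are no longer read to the end.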
# 5  
Old 10-10-2015
Code:
awk '/<Orin>sop/ {f1=1} /<Dup>two/ {f2=1} f1+f2==2 {print; exit}' /opt/SampleData.txt


Last edited by MadeInGermany; 10-10-2015 at 02:38 AM.. Reason: need 2 vars
# 6  
Old 10-10-2015
I apologize. The scripts I provided in post #4 in this thread solve the problem cjcox mentioned for the case where you are trying to determine which files (in a list of one or more files) contain both strings (not necessarily on the same line). Both of those scripts print the names of files (presented as operands to the script) that contain both strings.

If you are just trying to determine if a single file contains both strings, MadeInGermany's suggestion (using /usr/xpg4/bin/awk or nawk instead of awk) would be more efficient.
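
If only a yes/no answer for one file is needed, a minimal sketch using exit status alone could be the following. It is an illustration, not something posted in this thread: has_both is a hypothetical helper name, and output is sent to /dev/null rather than using grep -q on the assumption that the default grep may not support -q.

```shell
# has_both (hypothetical name): succeed only if the file named by $1
# contains both strings, on the same line or on different lines.
has_both() {
    grep '<Orin>sop' "$1" > /dev/null &&
    grep '<Dup>two' "$1" > /dev/null
}

# Example use:
#   if has_both /opt/SampleData.txt
#   then echo "both strings present"
#   else echo "at least one string missing"
#   fi
```

The && makes this an AND condition: the second grep runs only when the first one has found a match.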