Search for string and display those NOT found Post: 302118166

Post 302118166 by aigles on Saturday 19th of May 2007 11:57:54 AM

The following awk program displays all the strings that were not found in any of the input files. The strings to search for are read from the first input file.

Code:

#!/usr/bin/awk -f
# Filename: not_found

#
# Strings file (first input file).
# Only the first whitespace-separated field of each line is used.
#

NR==FNR {
   string_found[$1] = 0; # 0 = No, >0 = Yes
   next;
}

#
# New data file.
# Build array with strings not yet found
#

FNR==1 {
   strings_count = 0; # Recount for each new data file
   for (str in string_found) {
      if (string_found[str] == 0) {
         strings[str] = 1;
         strings_count++;
      }
   }
   if (strings_count == 0) exit;
}

#
# Input data;
# Search data for strings not yet found
# (each string is used as a regular expression)
#

{
   for (str in strings) {
      if ($0 ~ str) {
         string_found[str]++;
         delete strings[str];
         if (--strings_count == 0) exit;
      }
   }
}

#
# No more files or all strings have been found
# Print strings not found
#

END {
   for (str in string_found)
      if (string_found[str] == 0) print str;
}
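
In the script above, the test `$0 ~ str` treats each entry of the strings file as a regular expression, so an entry such as `a.c` also matches `abc`. If plain substring matching is wanted, `index()` can be used instead. The following is only a sketch on made-up sample data (the file names and strings are hypothetical), with a shortened awk program standing in for the full script:

```shell
#!/bin/sh
# Hypothetical sample data in a throwaway directory.
tmp=$(mktemp -d)
printf '%s\n' 'a.c' 'foo' 'zzz' > "$tmp/strings"
printf '%s\n' 'abc foo' > "$tmp/data"

# Regex matching: "a.c" counts as found because "." matches any character.
regex_missing=$(awk 'NR==FNR { s[$1] = 0; next }
  { for (k in s) if ($0 ~ k) s[k]++ }
  END { for (k in s) if (s[k] == 0) print k }' "$tmp/strings" "$tmp/data" |
  sort | paste -s -d ' ' -)

# Literal matching: index() is non-zero only if the exact substring occurs.
literal_missing=$(awk 'NR==FNR { s[$1] = 0; next }
  { for (k in s) if (index($0, k) > 0) s[k]++ }
  END { for (k in s) if (s[k] == 0) print k }' "$tmp/strings" "$tmp/data" |
  sort | paste -s -d ' ' -)

echo "not found (regex):   $regex_missing"
echo "not found (literal): $literal_missing"
rm -rf "$tmp"
```

With this sample data the regex version reports only `zzz`, while the literal version reports both `a.c` and `zzz`.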

The file string_list contains the strings to search for.
The file strings_not_found will contain the strings that were not found in any of the files.
Don't put the files string_list and strings_not_found in any of the directories that you want to scan.

Code:

chmod +rx not_found
not_found string_list $(find . -type f) > strings_not_found
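
Note that `$(find . -type f)` is split on whitespace by the shell, so a file named `my notes.txt` would be passed as two arguments. One possible alternative, assuming a POSIX `find`, is to let `find` pass the names itself with `-exec ... {} +`. The sketch below uses made-up directory and file names, and a shortened inline awk program stands in for the not_found script:

```shell
#!/bin/sh
# Throwaway sandbox; all names here are hypothetical.
tmp=$(mktemp -d)
mkdir "$tmp/scan"
printf '%s\n' alpha beta gamma > "$tmp/string_list"
printf '%s\n' 'has alpha' > "$tmp/scan/my notes.txt"   # name contains a space
printf '%s\n' 'has beta'  > "$tmp/scan/plain.txt"

# Shortened stand-in for the not_found script (same idea, no early exit).
prog='NR==FNR { found[$1] = 0; next }
      { for (s in found) if ($0 ~ s) found[s]++ }
      END { for (s in found) if (found[s] == 0) print s }'

# find hands the file names to awk directly, so no word splitting occurs.
# Caveat: with a very long file list, find may run awk more than once,
# and each run would then report its own misses.
not_found=$(find "$tmp/scan" -type f -exec awk "$prog" "$tmp/string_list" {} +)

echo "$not_found"
rm -rf "$tmp"
```

Here only `gamma` is reported, because `alpha` and `beta` each occur in one of the scanned files.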

Jean-Pierre.
