The following awk program displays all the strings that have not been found in any of the input files. The strings to search for are read from the first input file.
Code:
#!/usr/bin/awk -f
# Filename: not_found
#
# Strings file (first input file).
#
NR==FNR {
    string_found[$1] = 0;   # 0 = No, >0 = Yes
    next;
}
#
# New data file.
# Rebuild the array of strings not yet found
#
FNR==1 {
    strings_count = 0;      # reset, otherwise every new file counts the strings again
    for (str in string_found) {
        if (string_found[str] == 0) {
            strings[str] = 1;
            strings_count++;
        }
    }
    if (strings_count == 0) exit;
}
#
# Input data.
# Search data for strings not yet found
# (str is used as a regular expression here)
#
{
    for (str in strings) {
        if ($0 ~ str) {
            string_found[str]++;
            delete strings[str];
            if (--strings_count == 0) exit;
        }
    }
}
#
# No more files or all strings have been found
# Print strings not found
#
END {
    for (str in string_found)
        if (string_found[str] == 0) print str;
}
The file string_list contains the strings to be searched for.
The file strings_not_found will contain the strings that have not been found in any of the files.
Don't put the files string_list and strings_not_found in one of the directories that you want to scan, otherwise they would be searched as data files too.
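If you want to try the idea on some sample data first, here is a minimal sketch. The directory scan, the files one.txt and two.txt and their contents are made up for illustration, and the inline awk is a shortened variant of the program above, not the program itself: it assumes the strings should be matched literally, so it uses index() instead of the ~ regex match. string_list and strings_not_found are kept outside the scanned directory.

```shell
#!/bin/sh
# Hypothetical sample data; none of these paths or contents come from the post.
workdir=$(mktemp -d)
mkdir "$workdir/scan"

# Strings to look for, one per line (only the first field is used).
printf '%s\n' 'apple' 'a.c' 'cherry' > "$workdir/string_list"

# Two data files inside the directory to scan.
printf '%s\n' 'an apple a day' 'abc def' > "$workdir/scan/one.txt"
printf '%s\n' 'nothing relevant here' > "$workdir/scan/two.txt"

# Shortened variant of the approach: index() is a literal substring test,
# so the string "a.c" does not match the text "abc" as the regex a.c would.
awk '
NR == FNR { found[$1] = 0; next }    # first file: the strings
{
    for (s in found)
        if (found[s] == 0 && index($0, s) > 0)
            found[s] = 1
}
END {
    for (s in found)
        if (found[s] == 0) print s
}
' "$workdir/string_list" "$workdir"/scan/*.txt | sort > "$workdir/strings_not_found"

cat "$workdir/strings_not_found"
```

The sort is only there because awk does not guarantee any order for "for (s in found)". With the regex match of the original program, a.c would count as found in this sample because it matches abc.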