With vgersh99's solution, if matching lines are less than 30 lines apart, some lines are printed multiple times (overlapping regions).
Try this modification:
Works great! Initially it was outputting 40 million rows, but that was my bad: a "." had made its way into the column of data in file2, and file1 had many rows for which $3 was a ".".
Last edited by Geneanalyst; 10-31-2018 at 07:22 AM..
I am trying to automate a process of searching through a set of files and replacing all occurrences of a formatted text with the next item in a list from a second file. Basically I need to replace all instances of T????CLK???? with an IP address from a list in a second file. The second file is one IP... (9 Replies)
Hi all,
I would appreciate your help with this.
I have 2 files,
File_1-->contains a column of N numbers
File_2-->contains many lines with other info and numbers from File_1 within it.
I would like to get from File_2 all the lines containing within the same line each of N numbers from File_1... (4 Replies)
File1 row is same as column 2 in file 2.
Also file 2 will either start with A, B or C.
And 3rd column in file 2 is always F2.
When column 2 of file 2 matches file1 column, print all those rows into a separate file.
Here is an example.
file 1:
100
103
104
108
file 2:
... (6 Replies)
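A minimal sketch of the usual two-file awk idiom for this kind of request. The file1 values are the ones quoted above; the file2 rows and the output name `matched` are hypothetical, since the original sample is cut off.

```shell
#!/bin/sh
# Sketch: keep the rows of file2 whose column 2 appears in file1.
# The file2 rows are made up for illustration (A/B/C in column 1, F2 in column 3).
tmp=$(mktemp -d)
printf '%s\n' 100 103 104 108 > "$tmp/file1"
printf '%s\n' 'A 100 F2' 'B 101 F2' 'C 104 F2' 'A 109 F2' > "$tmp/file2"

# While reading file1 (NR==FNR) store each value as an array key.
# While reading file2, print the row if its column 2 is one of those keys.
awk 'NR==FNR { want[$1]; next } $2 in want' "$tmp/file1" "$tmp/file2" > "$tmp/matched"

matched=$(cat "$tmp/matched")
printf '%s\n' "$matched"
rm -rf "$tmp"
```

Note that `NR==FNR` is also true for file2 when file1 is empty, so an empty key file needs separate handling.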
I have very limited coding skills but I'm wondering if someone could help me with this. There are many threads about matching strings in two files, but I have no idea how to add a column from one file to another based on a matching string.
I'm looking to match column1 in file1 to the number... (3 Replies)
Hello to all,
I hope somebody could help me with this:
I have this File1 (the real one has 5 million lines):
Number Category
--------------- --------------------------------------
8734060355 3
8734060356 ... (6 Replies)
I'm trying to use awk to do the following. I have file1 with many lines, each containing 5 fields describing an individual set. I have file2 which is a template config file with variable space holders to be replaced by the values in file1. I would like to substitute each set of values in file1 with... (6 Replies)
I have a list of IDs in file1 and a list of sequences in file2. I can print sequences from file2, but I'm asking for help in printing the sequences in the same order as the IDs appear in file1.
file1:
EN_comp12952_c0_seq3:367-1668
ES_comp17168_c1_seq6:1-864
EN_comp13395_c3_seq14:231-1088... (5 Replies)
I have 2 files, file1 and file2. file1 has some keys and file2 has keys plus some other data. I want to remove the lines from file2 if the key for that line exists in file1.
file1:
key1
key2
file2:
key1,moredata
key2,moredata
key3,moredata
Required output:
key3,moredata
Thanks
EDIT:... (6 Replies)
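One possible approach, sketched with the sample data quoted above and assuming the key is always the first comma-separated field of file2:

```shell
#!/bin/sh
# Sketch: drop the lines of file2 whose key (field 1) is listed in file1.
tmp=$(mktemp -d)
printf '%s\n' key1 key2 > "$tmp/file1"
printf '%s\n' 'key1,moredata' 'key2,moredata' 'key3,moredata' > "$tmp/file2"

# First file: remember every key. Second file: print only rows whose key was not seen.
kept=$(awk -F, 'NR==FNR { drop[$1]; next } !($1 in drop)' "$tmp/file1" "$tmp/file2")

printf '%s\n' "$kept"
rm -rf "$tmp"
```

The `in` test only checks for the key, so it does not create array entries as a side effect the way `drop[$1]` in a condition would.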
I want to print only the lines in file2 that match file1, in the same order as they appear in file 1
file1
file2
desired output:
I'm getting the lines to match
awk 'FNR==NR {a[$0]++}; FNR!=NR && a[$0]' file1 file2
but they are in sorted order, which is not what I want:
Can anyone... (4 Replies)
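A hedged sketch of one way to get file1's order: swap the roles of the two files, so file2 is loaded into memory and file1 is the one streamed. The sample lines are hypothetical, because the post's own data did not survive, and whole-line matching is assumed.

```shell
#!/bin/sh
# Sketch: print the lines common to both files, in the order they appear in file1.
# Sample contents are invented for illustration.
tmp=$(mktemp -d)
printf '%s\n' pear apple fig > "$tmp/file1"
printf '%s\n' apple banana pear > "$tmp/file2"

# file2 is read first and stored as array keys; file1 is then read in order
# and each of its lines is printed if it was present in file2.
ordered=$(awk 'NR==FNR { have[$0]; next } $0 in have' "$tmp/file2" "$tmp/file1")

printf '%s\n' "$ordered"
rm -rf "$tmp"
```

If file2 lines carry extra fields beyond the key, one variation is to store the whole line (`have[$1] = $0`) and print `have[$1]` while streaming file1.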
I am trying to use awk to find all the $2 values in file2 which is ~30MB and tab-delimited, that are between $2 and $3 in file1 which is ~2GB and tab-delimited.
I have just found out that I need to use $1 and $2 and $3 from file1 and $1 and $2 of file2 must match $1 of file1 and be in the range... (6 Replies)
Discussion started by: cmccabe
6 Replies
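A rough sketch of the range match described above, under these assumptions: both files are tab-delimited, $1 of file2 must equal $1 of file1, and $2 of file2 must lie between $2 and $3 of file1 (inclusive). The smaller file2 is held in memory and the larger file1 is streamed. The sample rows and the output layout are hypothetical. Non-numeric values such as "." are skipped, since a stray "." was reported to inflate the output.

```shell
#!/bin/sh
# Sketch: report file2 positions that fall inside file1 intervals with the same key.
tmp=$(mktemp -d)
printf 'chr1\t150\nchr1\t500\nchr2\t150\nchr1\t.\n' > "$tmp/file2"
printf 'chr1\t100\t200\nchr1\t400\t450\nchr2\t100\t.\nchr2\t120\t180\n' > "$tmp/file1"

hits=$(awk -F'\t' -v OFS='\t' '
NR == FNR {                       # first file (file2): store positions per key
    if ($2 ~ /^[0-9]+$/) {
        n[$1]++
        pos[$1, n[$1]] = $2
    }
    next
}
$2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ { next }   # skip file1 rows with "." bounds
($1 in n) {                       # second file (file1): test each stored position
    for (i = 1; i <= n[$1]; i++) {
        p = pos[$1, i]
        if (p + 0 >= $2 + 0 && p + 0 <= $3 + 0)
            print $1, p, $2, $3
    }
}' "$tmp/file2" "$tmp/file1")

printf '%s\n' "$hits"
rm -rf "$tmp"
```

Each file1 row is compared against every stored position for its key, so this can be slow when one key has very many positions; sorting and a merge-style scan would be the next step in that case.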
BM(PUBLIC) BM(PUBLIC)
NAME
bm - search a file for a string
SYNOPSIS
/usr/public/bm [ option ] ... [ strings ] [ file ]
DESCRIPTION
Bm searches the input files (standard input default) for lines matching a string. Normally, each line found is copied to the standard
output. It is blindingly fast. Bm strings are fixed sequences of characters: there are no wildcards, repetitions, or other features of
regular expressions. Bm is also case sensitive. The following options are recognized.
-x (Exact) only lines matched in their entirety are printed
-l The names of files with matching lines are listed (once) separated by newlines.
-c Only a count of the number of matches is printed
-e string
The string is the next argument after the -e flag. This allows strings beginning with '-'.
-h No filenames are printed, even if multiple files are searched.
-n Each line is preceded by the number of characters from the beginning of the file to the match.
-s Silent mode. Nothing is printed (except error messages). This is useful for checking the error status.
-f file
The string list is taken from the file.
Unless the -h option is specified the file name is shown if there is more than one input file. Care should be taken when using the
characters $ * [ ^ | ( ) and \ in the strings (listed on the command line) as they are also meaningful to the Shell. It is safest to enclose the
entire expression argument in single quotes ' '.
Bm searches for lines that contain one of the (newline-separated) strings, using the Boyer-Moore algorithm. It is far superior in terms of
speed to the grep (egrep, fgrep) family of pattern matchers for fixed-pattern searching, and its speed increases with pattern length.
SEE ALSO
grep(1)
DIAGNOSTICS
Exit status is 0 if any matches are found, 1 if none, 2 for syntax errors or inaccessible files.
AUTHOR
Peter Bain (pdbain@wateng), with modifications suggested by John Gilmore
BUGS
Only 100 patterns are allowed.
Patterns may not contain newlines.
If a line (delimited by newlines, and the beginning and end of the file) is longer than 8000 characters (e.g. in a core dump), it will not
be completely printed.
If multiple patterns are specified, the order of the output lines is not necessarily the same as the order of the input lines.
A line will be printed once for each different string on that line.
The algorithm cannot count lines.
The -n and -c work differently from fgrep.
The -v, -i, and -b are not available.
4th Berkeley Distribution 8 July 1985 BM(PUBLIC)