With vgersh99's solution, if matching lines are less than 30 lines apart, some lines are printed multiple times (overlapping regions).
Try this modification:
Works great! Initially it was outputting 40 million rows, but that was my bad because a "." had made its way into the column of data in file2 and file1 had many rows for which $3 was a "."
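The row explosion described above is what tends to happen when a placeholder such as "." ends up being used as a lookup key. Here is a minimal sketch of guarding against it, assuming tab-delimited input with the key in $3 of file1 (as stated above) and in $1 of file2 (an assumption; the thread does not say which column). The sample rows are made up.

```shell
#!/bin/sh
# Sample data is invented for illustration only.
tmp=$(mktemp -d)
printf 'r1\tx\tA\nr2\tx\t.\nr3\tx\t.\nr4\tx\tB\n' > "$tmp/file1"
printf 'A\n.\n' > "$tmp/file2"

# Load keys from file2 but skip the "." placeholder, then print
# only the file1 rows whose $3 is one of the remaining keys.
filtered=$(awk -F'\t' '
    NR == FNR { if ($1 != ".") keys[$1]; next }
    $3 in keys
' "$tmp/file2" "$tmp/file1")

printf '%s\n' "$filtered"
rm -rf "$tmp"
```

Without the `$1 != "."` guard, both "." rows of file1 would match as well, which is how a handful of stray placeholders can turn into millions of output rows.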
Last edited by Geneanalyst; 10-31-2018 at 07:22 AM..
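The modified code itself is not shown in this copy of the thread, so the following is not vgersh99's solution or the posted fix. It is only a sketch of one common way to avoid printing overlapping regions twice, assuming the "30 lines" means a fixed number of lines printed after each match (the thread does not confirm this). The pattern `MATCH` and the value `N=2` are placeholders.

```shell
#!/bin/sh
# N is the number of lines to print after each match (2 here, to keep the demo short).
context=$(printf 'a\nMATCH 1\nb\nMATCH 2\nc\nd\ne\n' |
    awk -v N=2 '
        /MATCH/  { left = N + 1 }     # restart the countdown on every match
        left > 0 { print; left-- }    # each input line is printed at most once
    ')

printf '%s\n' "$context"
```

Because a second match inside the window only resets the counter, the two regions merge instead of being printed separately.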
I am trying to automate a process of searching through a set of files and replacing all occurrences of a formatted text with the next item in a list from a second file. Basically I need to replace all instances of T????CLK???? with an IP address from a list in a second file. The second file is one IP... (9 Replies)
Hi all,
Please your help with this.
I have 2 files,
File_1-->contains a column of N numbers
File_2-->contains many lines with other info and numbers from File_1 within it.
I would like to get from File_2 all the lines containing within the same line each of N numbers from File_1... (4 Replies)
File1 row is same as column 2 in file 2.
Also file 2 will either start with A, B or C.
And 3rd column in file 2 is always F2.
When column 2 of file 2 matches file1 column, print all those rows into a separate file.
Here is an example.
file 1:
100
103
104
108
file 2:
... (6 Replies)
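A minimal sketch of the request above, assuming whitespace-separated fields. The file1 values are the ones from the post; the file2 rows are hypothetical, since the real ones are cut off, and the output file name `matched` is also just a placeholder.

```shell
#!/bin/sh
tmp=$(mktemp -d)
printf '100\n103\n104\n108\n' > "$tmp/file1"
# Hypothetical file2 rows: letter, number, F2.
printf 'A 100 F2\nB 101 F2\nC 104 F2\nA 109 F2\n' > "$tmp/file2"

# Remember every value from file1, then write the file2 rows
# whose column 2 is one of them into a separate file.
awk 'NR == FNR { want[$1]; next } $2 in want' "$tmp/file1" "$tmp/file2" > "$tmp/matched"

matched=$(cat "$tmp/matched")
printf '%s\n' "$matched"
rm -rf "$tmp"
```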
I have very limited coding skills but I'm wondering if someone could help me with this. There are many threads about matching strings in two files, but I have no idea how to add a column from one file to another based on a matching string.
I'm looking to match column1 in file1 to the number... (3 Replies)
Hello to all,
I hope somebody could help me with this:
I have this File1 (the real one has 5 million lines):
Number Category
--------------- --------------------------------------
8734060355 3
8734060356 ... (6 Replies)
I'm trying to use awk to do the following. I have file1 with many lines, each containing 5 fields describing an individual set. I have file2 which is a template config file with variable space holders to be replaced by the values in file1. I would like to substitute each set of values in file1 with... (6 Replies)
I have a list of IDs in file1 and a list of sequences in file2. I can print sequences from file2, but I'm asking for help in printing the sequences in the same order as the IDs appear in file1.
file1:
EN_comp12952_c0_seq3:367-1668
ES_comp17168_c1_seq6:1-864
EN_comp13395_c3_seq14:231-1088... (5 Replies)
I have 2 files, file1 and file2. file1 has some keys and file2 has keys plus some other data. I want to remove the lines from file2 if the key for that line exists in file1.
file1:
key1
key2
file2:
key1,moredata
key2,moredata
key3,moredata
Required output:
key3,moredata
Thanks
EDIT:... (6 Replies)
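One possible approach to the question above, using the sample data from the post and assuming the key is always the first comma-separated field of file2:

```shell
#!/bin/sh
tmp=$(mktemp -d)
printf 'key1\nkey2\n' > "$tmp/file1"
printf 'key1,moredata\nkey2,moredata\nkey3,moredata\n' > "$tmp/file2"

# Collect the keys from file1, then keep only the file2 lines
# whose first field is not among them.
kept=$(awk -F, 'NR == FNR { drop[$1]; next } !($1 in drop)' "$tmp/file1" "$tmp/file2")

printf '%s\n' "$kept"
rm -rf "$tmp"
```

This keeps all of file1's keys in memory, which should be fine unless the key list is very large.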
I want to print only the lines in file2 that match file1, in the same order as they appear in file 1
file1
file2
desired output:
I'm getting the lines to match
awk 'FNR==NR {a[$0]++}; FNR!=NR && a[$0]' file1 file2
but they are in sorted order, which is not what I want:
Can anyone... (4 Replies)
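A hedged sketch of one way to get file1's order: read file2 first, then let file1 drive the output. This assumes whole lines are compared and that file2 fits in memory; the sample lines are invented because the post's data is not shown.

```shell
#!/bin/sh
tmp=$(mktemp -d)
printf 'pear\napple\nfig\n' > "$tmp/file1"
printf 'apple\nbanana\nfig\npear\n' > "$tmp/file2"

# file2 is read first and only remembered; file1 is read second,
# so matching lines come out in file1 order.
ordered=$(awk 'NR == FNR { in2[$0]; next } $0 in in2' "$tmp/file2" "$tmp/file1")

printf '%s\n' "$ordered"
rm -rf "$tmp"
```

If the match is on a key field rather than the whole line, the same idea would need the file2 line stored as the array value and printed from there.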
I am trying to use awk to find all the $2 values in file2 which is ~30MB and tab-delimited, that are between $2 and $3 in file1 which is ~2GB and tab-delimited.
I have just found out that I need to use $1 and $2 and $3 from file1 and $1 and $2 of file2; $1 of file2 must match $1 of file1 and be in the range... (6 Replies)
Discussion started by: cmccabe
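A rough sketch of the range lookup described above, not the solution from that thread. It assumes tab-delimited files, inclusive bounds, and that the smaller file2 is held in memory while the larger file1 is streamed. The sample values and the output layout are made up.

```shell
#!/bin/sh
tmp=$(mktemp -d)
# file1: name, range start, range end (large file, streamed)
printf 'chr1\t100\t200\nchr1\t300\t400\nchr2\t100\t200\n' > "$tmp/file1"
# file2: name, value to test (small file, kept in memory)
printf 'chr1\t150\nchr1\t250\nchr2\t100\nchr3\t150\n' > "$tmp/file2"

hits=$(awk -F'\t' -v OFS='\t' '
    NR == FNR {                       # first argument: file2
        n[$1]++
        pos[$1, n[$1]] = $2 + 0       # store values per name, as numbers
        next
    }
    {                                 # second argument: file1
        for (i = 1; i <= n[$1]; i++) {
            p = pos[$1, i]
            if (p >= $2 + 0 && p <= $3 + 0)
                print $1, p, $2, $3
        }
    }
' "$tmp/file2" "$tmp/file1")

printf '%s\n' "$hits"
rm -rf "$tmp"
```

Each file1 row is checked only against the file2 values sharing its $1, which keeps the inner loop short when the names are well distributed.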
comm
COMM(1) BSD General Commands Manual COMM(1)
NAME
comm -- select or reject lines common to two files
SYNOPSIS
comm [-123i] file1 file2
DESCRIPTION
The comm utility reads file1 and file2, which should be sorted lexically, and produces three text columns as output: lines only in file1;
lines only in file2; and lines in both files.
The filename ``-'' means the standard input.
The following options are available:
-1 Suppress printing of column 1.
-2 Suppress printing of column 2.
-3 Suppress printing of column 3.
-i Case insensitive comparison of lines.
Each column will have a number of tab characters prepended to it equal to the number of lower numbered columns that are being printed. For
example, if column number two is being suppressed, lines printed in column number one will not have any tabs preceding them, and lines
printed in column number three will have one.
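As an illustration of the tab rule above, here is a small sketch with two made-up, already sorted files. With column two suppressed, lines unique to file1 get no leading tab and lines common to both files get exactly one.

```shell
#!/bin/sh
tmp=$(mktemp -d)
printf 'a\nb\nc\n' > "$tmp/file1"
printf 'b\nc\nd\n' > "$tmp/file2"

# -2 suppresses column two (lines only in file2).
cols=$(comm -2 "$tmp/file1" "$tmp/file2")

# Show the tabs explicitly as "<TAB>" so the indentation is visible.
printf '%s\n' "$cols" | sed "s/$(printf '\t')/<TAB>/g"
rm -rf "$tmp"
```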
The comm utility assumes that the files are lexically sorted; all characters participate in line comparisons.
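Because of the sorting assumption above, unsorted input is usually passed through sort(1) first. A minimal sketch with made-up data, using LC_ALL=C for both commands so that sort and comm agree on the collation order:

```shell
#!/bin/sh
tmp=$(mktemp -d)
printf 'pear\napple\nfig\n' > "$tmp/file1"
printf 'fig\nbanana\napple\n' > "$tmp/file2"

# Sort both inputs with the same collation that comm will use.
LC_ALL=C sort "$tmp/file1" > "$tmp/file1.sorted"
LC_ALL=C sort "$tmp/file2" > "$tmp/file2.sorted"

# -12 suppresses columns one and two, leaving only lines in both files.
common=$(LC_ALL=C comm -12 "$tmp/file1.sorted" "$tmp/file2.sorted")

printf '%s\n' "$common"
rm -rf "$tmp"
```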
ENVIRONMENT
The LANG, LC_ALL, LC_COLLATE, and LC_CTYPE environment variables affect the execution of comm as described in environ(7).
EXIT STATUS
The comm utility exits 0 on success, and >0 if an error occurs.
SEE ALSO
cmp(1), diff(1), sort(1), uniq(1)
STANDARDS
The comm utility conforms to IEEE Std 1003.2-1992 (``POSIX.2'').
The -i option is an extension to the POSIX standard.
HISTORY
A comm command appeared in Version 4 AT&T UNIX.
BUGS
Input lines are limited to LINE_MAX (2048) characters in length.
BSD January 26, 2005 BSD