Retrieve lines that match any occurrence in a list of patterns
I have two files. The first contains a header and six columns of data.
Example file 1:
Number SNP ID dbSNP RS ID Chromosome Result_Call Physical Position
787066 SNP_A-8575395 RS6650104 1 NOCALL 564477
786872 SNP_A-8575125 RS10458597 1 AA 564621
787077 SNP_A-8575389 RS8179414 1 NOCALL 565400
787080 SNP_A-8575376 RS9645428 1 NOCALL 566810
920528 SNP_A-8709646 RS12565286 1 AA 721290
710267 SNP_A-8497791 RS12082473 1 AA 740857
I wish to retrieve those lines where the third column (dbSNP RS ID) matches an ID listed in file 2.
Example file 2:
rs10458597
rs12565286
rs12082473
rs3094315
rs2286139
rs11240776
In this example the first, second, and third lines from file 2 match the second, fifth, and sixth data rows in file 1.
The required output would be:
Number SNP ID dbSNP RS ID Chromosome Result_Call Physical Position
786872 SNP_A-8575125 RS10458597 1 AA 564621
920528 SNP_A-8709646 RS12565286 1 AA 721290
710267 SNP_A-8497791 RS12082473 1 AA 740857
I have found this code:
But the output contains all lines from file 1. My knowledge of awk is insufficient to see where it is going wrong. My trial-and-error attempts at altering bits of the code have been unsuccessful.
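For reference, here is a minimal sketch of one common awk idiom for this kind of lookup. It is not the code referred to above. It assumes the two files are named file1.txt and file2.txt (hypothetical names), that fields are whitespace-separated, and that the comparison should ignore case, since the samples show RS... in file 1 but rs... in file 2. The sample data is copied from the examples above and written to a temporary directory.

```shell
#!/bin/sh
# Work in a scratch directory; the file names below are assumptions.
tmp=$(mktemp -d)

cat > "$tmp/file1.txt" <<'EOF'
Number SNP ID dbSNP RS ID Chromosome Result_Call Physical Position
787066 SNP_A-8575395 RS6650104 1 NOCALL 564477
786872 SNP_A-8575125 RS10458597 1 AA 564621
787077 SNP_A-8575389 RS8179414 1 NOCALL 565400
787080 SNP_A-8575376 RS9645428 1 NOCALL 566810
920528 SNP_A-8709646 RS12565286 1 AA 721290
710267 SNP_A-8497791 RS12082473 1 AA 740857
EOF

cat > "$tmp/file2.txt" <<'EOF'
rs10458597
rs12565286
rs12082473
rs3094315
rs2286139
rs11240776
EOF

# Read the ID list first (NR == FNR is true only while reading the first
# file named on the command line), then filter the data file against it.
awk '
    NR == FNR { ids[toupper($1)]; next }     # ID list: remember each ID, upper-cased
    FNR == 1 || (toupper($3) in ids)         # data file: keep the header and matching rows
' "$tmp/file2.txt" "$tmp/file1.txt" > "$tmp/matched.txt"

cat "$tmp/matched.txt"
```

With the sample data this prints the header followed by the rows for RS10458597, RS12565286, and RS12082473. The header is kept by the explicit FNR == 1 test because its third whitespace-separated field is not an ID. One caveat: the NR == FNR test misbehaves if the ID list is empty, because the data file is then treated as the first file.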