awk remove/grab lines from file with pattern from other file
Sorry for the weird title, but I have the following problem.
We have several files of between 10,000 and about 500,000 lines each. From these files we want to remove every line that contains a pattern listed in another file (around 20,000 lines, all EAN codes). We also want the removed lines written to a separate file, so we can check whether any lines get removed that shouldn't be (this has nothing to do with the matching).
pattern file:
main file
With both of the above I should get one file that contains only "0018208944262;A 562381;VNA750E1;50;4999.14;Nikon" and one file that contains the rest.
I tried the following awk code:
awk -F ';' 'NR==FNR {id[$1]; next} $1 in id' filter.csv main.csv
but it does not remove the line or put it in another file. I also tried grep, but that only works when the filter file has around 100 or so lines.
Does anyone know how I can get those two results?
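One way to get both files in a single pass is sketched below. It assumes the EAN is the first semicolon-separated field of main.csv (as in the sample line) and that filter.csv holds one EAN per line. The output names matched.csv and unmatched.csv, and the two extra data rows, are made up for illustration. The command in the question only prints the matching lines to standard output, so nothing is ever written to a file; here each line is redirected to one of two files instead.

```shell
# Hypothetical sample data; only the Nikon line comes from the question.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' '0018208944262' > filter.csv
printf '%s\n' \
  '0018208944262;A 562381;VNA750E1;50;4999.14;Nikon' \
  '0000000000001;B 000001;HYPO1;1;1.00;Example' \
  '0000000000002;B 000002;HYPO2;2;2.00;Example' > main.csv

awk -F';' '
    # Strip a trailing carriage return in case a file has Windows line endings
    # (this also removes it from the lines that are written out).
    { sub(/\r$/, "") }
    # First file: remember every EAN as an array key.
    FILENAME == ARGV[1] { ean[$1]; next }
    # Second file: lines whose first field is a known EAN go to one file...
    ($1 in ean) { print > "matched.csv"; next }
    # ...and every other line goes to the other.
    { print > "unmatched.csv" }
' filter.csv main.csv
```

Which of the two outputs counts as the "cleaned" file depends on whether the listed EANs are the ones to keep or the ones to drop; swap the file names if needed. The test `FILENAME == ARGV[1]` is used instead of `NR==FNR` so that an empty filter.csv is not mistaken for the main file. Because the lookup is a hash on the whole first field, it should cope with 20,000 filter lines, but it only finds exact field matches, not EANs embedded elsewhere in a line.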
I want to search a file for a string and then if the string is found I need the line that the string is on - but also the previous two lines from the file (that the pattern will not be found in)
This is on solaris
Can you help? (2 Replies)
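A minimal sketch of one approach, assuming an awk that supports -v (on Solaris that is usually nawk or /usr/xpg4/bin/awk rather than the default awk, which is worth checking on the machine in question). The pattern MATCH and the sample lines are placeholders.

```shell
# Keep the two most recent lines in prev1/prev2 and print them before a match.
result=$(printf '%s\n' one two three MATCH five |
awk -v pat='MATCH' '
    $0 ~ pat {
        if (NR > 2) print prev2   # line two before the match, if it exists
        if (NR > 1) print prev1   # line directly before the match, if it exists
        print                     # the matching line itself
    }
    { prev2 = prev1; prev1 = $0 } # slide the two-line window forward
')
printf '%s\n' "$result"
```

If matches occur on neighbouring lines, the context lines are printed again for each match.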
Hi,
I have a file with many lines. I need to remove all lines before a line beginning with a specific pattern, because these lines are not required.
Can you help me out with either a Perl script or a shell script?
example:-
if file initially contains lines:
a
b
c
d
.1.2
d
e
f... (2 Replies)
Hi,
I need to get specific parts in a large file.
I need to:
Get a line containing an IP address, and read from there to another line saying ***SNMP-END***
So, I have the start and the end well defined, but the problem is that apparently the awk command using the -F option doesn't work... (17 Replies)
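A start-to-end extraction like this can be sketched with a flag variable in awk. The address 192.0.2.10 and the sample lines are hypothetical; only the ***SNMP-END*** marker comes from the post. index() is used so that the dots and asterisks are compared literally rather than as regular-expression characters.

```shell
result=$(printf '%s\n' \
    'header' \
    'host 192.0.2.10' 'sysName x' '***SNMP-END***' \
    'host 192.0.2.11' 'sysName y' '***SNMP-END***' |
awk -v ip='192.0.2.10' '
    index($0, ip)               { inblock = 1 }  # start line: contains the address
    inblock                     { print }        # print everything while inside the block
    index($0, "***SNMP-END***") { inblock = 0 }  # end line: printed above, then stop
')
printf '%s\n' "$result"
```

Because index() is a substring test, an address such as 192.0.2.1 would also match lines containing 192.0.2.10; a stricter comparison on a specific field may be needed for real data.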
I need to send an email to the recipient in each block of data in a file, which has the sender address under TO, and send just that block of data, which ends at COMPANY.
I tried to work this out by getting line numbers of the string HELLO but unable to grab the next block of data to send the next... (5 Replies)
I have a file like this. I want to remove the 2015 (or any four-digit number) from column $4 so I can get:
Nov 05 1997 /ifs/inventory2/
for example. I'm not sure how. Should I use an if statement with awk?
Jan 16 2015 23:45 /ifs/sql_file
Jan 16 2015 23:45 /ifs/sql_file
Nov 05 2015 1997... (4 Replies)
I am trying to remove lines in the target.txt file if $5 before the - in that file matches sorted_list. I have tried grep and awk. Thank you :).
grep
grep -v -F -f targets.bed sort_list
grep -vFf sort_list targets
awk
awk -F, '
> FILENAME == ARGV {to_remove=1; next}
> ! ($5 in... (2 Replies)
I am trying to remove each line in which $2 is FP or RFP. I believe the below will remove one instance but not both. Thank you :).
file
12
123 FP
11
10 RFP
awk
awk -F'\t' '
$2 != "FP"' file
desired output
12
11 (6 Replies)
Hi,
I'd be grateful for your help with the following. I have a file (file.txt) with 10 columns and about half a million lines, which in simplified form looks like this:
ID Col1 Col2 Col3....
a 4 2 8
b 5 6 1
c 8 4 1
d... (4 Replies)
In the awk below I am trying to remove all lines above and including the pattern Test or Test2. Each block is separated by a newline, and Test2 also appears in the lines to keep, but it will always have additional text after it. The Test to remove will not. The awk executed until the || was added... (2 Replies)
In the awk piped to sed below I am trying to format file by removing the odd xxxx_digits and whitespace after, then move the even xxxx_digit to the line above it and add a space between them. There may be multiple lines in file but they are in the same format. The Filename_ID line is the last line... (4 Replies)
pcregrep
PCREGREP(1) General Commands Manual PCREGREP(1)
NAME
pcregrep - a grep with Perl-compatible regular expressions.
SYNOPSIS
pcregrep [-Vcfhilnrsvx] pattern [file] ...
DESCRIPTION
pcregrep searches files for character patterns, in the same way as other grep commands do, but it uses the PCRE regular expression library
to support patterns that are compatible with the regular expressions of Perl 5. See pcre(3) for a full description of syntax and semantics.
If no files are specified, pcregrep reads the standard input. By default, each line that matches the pattern is copied to the standard output,
and if there is more than one file, the file name is printed before each line of output. However, there are options that can change
how pcregrep behaves.
Lines are limited to BUFSIZ characters. BUFSIZ is defined in <stdio.h>. The newline character is removed from the end of each line before
it is matched against the pattern.
OPTIONS
-V Write the version number of the PCRE library being used to the standard error stream.
-c Do not print individual lines; instead just print a count of the number of lines that would otherwise have been printed. If several
files are given, a count is printed for each of them.
-ffilename
Read patterns from the file, one per line, and match all patterns against each line. There is a maximum of 100 patterns. Trailing
white space is removed, and blank lines are ignored. An empty file contains no patterns and therefore matches nothing.
-h Suppress printing of filenames when searching multiple files.
-i Ignore upper/lower case distinctions during comparisons.
-l Instead of printing lines from the files, just print the names of the files containing lines that would have been printed. Each
file name is printed once, on a separate line.
-n Precede each line by its line number in the file.
-r If any file is a directory, recursively scan the files it contains. Without -r a directory is scanned as a normal file.
-s Work silently, that is, display nothing except error messages. The exit status indicates whether any matches were found.
-v Invert the sense of the match, so that lines which do not match the pattern are now the ones that are found.
-x Force the pattern to be anchored (it must start matching at the beginning of the line) and in addition, require it to match the
entire line. This is equivalent to having ^ and $ characters at the start and end of each alternative branch in the regular
expression.
SEE ALSO
pcre(3), Perl 5 documentation
DIAGNOSTICS
Exit status is 0 if any matches were found, 1 if no matches were found, and 2 for syntax errors or inaccessible files (even if matches were
found).
AUTHOR
Philip Hazel <ph10@cam.ac.uk>
Last updated: 15 August 2001
Copyright (c) 1997-2001 University of Cambridge.
PCREGREP(1)