Sponsored Content
Top Forums UNIX for Dummies Questions & Answers Grep -v lines starting with pattern 1 and not matching pattern 2 Post 302949759 by drl on Wednesday 15th of July 2015 07:15:17 PM
Old 07-15-2015
Hi.

An alternative to grep, a Ruby code, glark, can do this:
Code:
#!/usr/bin/env bash

# @(#) s1	Demonstrate multiple matches, inverted, glark.
# For glark, see repository (Debian, Mac OS X "port"), otherwise:
# https://github.com/jpace/glark

# Utility functions: print-as-echo, print-line-with-visual-space, debug.
# export PATH="/usr/local/bin:/usr/bin:/bin"
LC_ALL=C ; LANG=C ; export LC_ALL LANG
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
db() { : ; }
C=$HOME/bin/context && [ -f $C ] && $C ruby glark

FILE=${1-data1}

pl " Input data file $FILE:"
cat $FILE

pl " Expected output:"
cat expected-output.txt

# Remove lines that start with Eat  and does not contain apple
pl " Results (no color, no line numbers):"
glark -U -N -v --and 0 '/^Eat/' '!/apple/' $FILE

exit 0

producing:
Code:
$ ./s1

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
Distribution        : Debian 5.0.8 (lenny, workstation) 
bash GNU bash 3.2.39
ruby 1.8.7 (2008-08-11 patchlevel 72) [x86_64-linux]
glark version 1.8.0

-----
 Input data file data1:
Drink a soda
Eat a banana
Eat multiple bananas
Drink an apple juice
Eat an apple
Eat multiple apples
Fat fish

-----
 Expected output:
Drink a soda
Drink an apple juice
Eat an apple
Eat multiple apples
Fat fish

-----
 Results (no color, no line numbers):
Drink a soda
Drink an apple juice
Eat an apple
Eat multiple apples
Fat fish

Best wishes ... cheers, drl
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

comment/delete a particular pattern starting from second line of the matching pattern

Hi, I have file 1.txt with following entries as shown: 0152364|134444|10.20.30.40|015236433 0233654|122555|10.20.30.50|023365433 ** ** ** In file 2.txt I have the following entries as shown: 0152364|134444|10.20.30.40|015236433 0233654|122555|10.20.30.50|023365433... (4 Replies)
Discussion started by: imas
4 Replies

2. Shell Programming and Scripting

counting the lines matching a pattern, in between two pattern, and generate a tab

Hi all, I'm looking for some help. I have a file (very long) that is organized like below: >Cluster 0 0 283nt, >01_FRYJ6ZM12HMXZS... at +/99% 1 279nt, >01_FRYJ6ZM12HN12A... at +/99% 2 281nt, >01_FRYJ6ZM12HM4TS... at +/99% 3 283nt, >01_FRYJ6ZM12HM946... at +/99% 4 279nt,... (4 Replies)
Discussion started by: d.chauliac
4 Replies

3. Shell Programming and Scripting

Concatenate lines between lines starting with a specific pattern

Hi, I have a file such as: --- >contig00001 length=35524 numreads=2944 gACGCCGCGCGCCGCGGCCAGGGCTGGCCCA CAGGCCGCGCGGCGTCGGCTGGCTGAG >contig00002 length=4242 numreads=43423 ATGCCGAAGGTCCGCCTGGGGCTGG CGCCGGGAGCATGTAGCG --- I would like to concatenate the lines not starting with ">"... (9 Replies)
Discussion started by: s052866
9 Replies

4. Shell Programming and Scripting

delete lines starting with a pattern

i have a file sample.txt containing i want to delete lines starting with 123 neglecting spaces and tabs. but not lines containing 123. i.e. i want files sample.txt as help me thanxx (4 Replies)
Discussion started by: yashwantkumar
4 Replies

5. Shell Programming and Scripting

Grep the word from pattern line and update in subsequent lines till next pattern line reached

Hi, I have got the below requirement. please suggest. I have a file like, Processing Item is: /data/ing/cfg2/abc.txt /data/ing/cfg3/bgc.txt Processing Item is: /data/cmd/for2/ght.txt /data/kernal/config.klgt.txt I want to process the above file to get the output file like, ... (5 Replies)
Discussion started by: rbalaj16
5 Replies

6. Shell Programming and Scripting

Sed: printing lines AFTER pattern matching EXCLUDING the line containing the pattern

'Hi I'm using the following code to extract the lines(and redirect them to a txt file) after the pattern match. But the output is inclusive of the line with pattern match. Which option is to be used to exclude the line containing the pattern? sed -n '/Conn.*User/,$p' > consumers.txt (11 Replies)
Discussion started by: essem
11 Replies

7. Shell Programming and Scripting

sed and awk usage to grep a pattern 1 and with reference to this grep a pattern 2 and pattern 3

Hi , I have a file where i have modifed certain things compared to original file . The difference of the original file and modified file is as follows. # diff mir_lex.c.modified mir_lex.c.orig 3209c3209 < if(yy_current_buffer -> yy_is_our_buffer == 0) { --- >... (5 Replies)
Discussion started by: breezevinay
5 Replies

8. Shell Programming and Scripting

Grep lines before a pattern having some other pattern

Hi All, I am trying to fetch lines before a pattern, I got to know about -B flag in grep but we have to pass the number to get those lines before some pattern say (X), now what if I want to get line/s with some other pattern say (Y) before X pattern? How to get about it? please help. Input:... (5 Replies)
Discussion started by: dips_ag
5 Replies

9. Shell Programming and Scripting

sed -- Find pattern -- print remainder -- plus lines up to pattern -- Minus pattern

The intended result should be : PDF converters 'empty line' gpdftext and pdftotext?xml version="1.0"?> xml:space="preserve"><note-content version="0.1" xmlns:/tomboy/link" xmlns:size="http://beatniksoftware.com/tomboy/size">PDF converters gpdftext and pdftotext</note-content>... (9 Replies)
Discussion started by: Klasform
9 Replies

10. UNIX for Beginners Questions & Answers

Grep file starting from pattern matching line

I have a file with a list of references towards the end and want to apply a grep for some string. text .... @unnumbered References @sp 1 @paragraphindent 0 2017. @strong{Chalenski, D.A.}; Wang, K.; Tatanova, Maria; Lopez, Jorge L.; Hatchell, P.; Dutta, P.; @strong{Small airgun... (1 Reply)
Discussion started by: kristinu
1 Replies
GLARK(1)							    glark 1.8.0 							  GLARK(1)

NAME
glark - Search text files for complex regular expressions SYNOPSIS
glark [options] expression file ... DESCRIPTION
Similar to "grep", "glark" offers: Perl-compatible regular expressions, color highlighting of matches, context around matches, complex expressions ("and" and "or"), grep output emulation, and automatic exclusion of non-text files. Its regular expressions should be familiar to persons experienced in Perl, Python, or Ruby. File may also be a list of files in the form of a path. OPTIONS
Input -0[nnn] Use nn (octal) as the input record separator. If nnn is omitted, use ' ' as the record separator, which treats paragraphs as lines. -d ACTION, --directories=ACTION Directories are processed according to the given ACTION, which by default is "read". If ACTION is "recurse", each file in the directory is read and each subdirectory is recursed into (equivalent to the "-r" option). If ACTION is "skip", directories are not read, and no message is produced. --binary-files=TYPE Specify how to handle binary files, thus overriding the default behavior, which is to denote the binary files that match the expression, without displaying the match. TYPE may be one of: "binary", the default; "without-match", which results in binary files being skipped; and "text", which results in the binary file being treated as text, the display of which may have bad side effects with the terminal. Note that the default behavior has changed; this previously was to skip binary files. The same effect may be achieved by setting binary-files to "without-match" in the ~/.glarkrc file. --[with-]basename EXPR, --[with-]name EXPR Search only files whose names match the given regular expression. As in find(1), this works on the basename of the file. This expression can be negated and modified with "!" and "i", such as '!/io.[hc]$/i'. --[with-]fullname EXPR, --[with-]path EXPR Search only files whose names, including path, match the given regular expression. As in find(1), this works on the path of the file. This expression can be negated and modified with "!" and "i", such as '!/Dialog.*.java$/i'. --without-basename EXPR, --without-name EXPR Do not search files with base names matching the given regular expression. --without-fullname EXPR, --without-path EXPR Do not search files with full names matching the given regular expression. -M, --exclude-matching Do not search files whose names match the given expression. This can be useful for finding external references to a file, or to a class (assuming that class names match file names). -r, --recurse Recurse through directories. Equivalent to --directories=read. --split-as-path(=VALUE), --no-split-as-path Sets whether, if a command line argument includes the path separator (such as ":"), the argument should be split by the path sepa- rator. This functionality is useful for using environment variables as input, such as $PATH and $CLASSPATH, which are automatically split and processed as a list of files and directories. The default value of this option is "true". "--no-split-as-path" is equiv- alent to "--split-as-path=false". --size-limit=SIZE If provided, files no larger than SIZE bytes will be searched. This is useful when running the "--recurse" option on directories that may contain large files. Matching -a NUM expr1 expr2 --and NUM expr1 expr2 --and=NUM expr1 expr2 ( expr1 --and=NUM expr2 ) Match both of the two expressions, within NUM lines of each other. See the EXPRESSIONS section for more information. -b NUM[%], --before NUM[%] Restrict the search to before the given location, which represents either the number of the last line within the valid range, or the percentage of lines to be searched. --after NUM[%] Restrict the search to after the given section, which represents either the number of the first line within the valid range, or the percentage of lines to be skipped. -f FILE, --file=FILE Use the lines in the given file as expressions. Each line consists of a regular expression. -i, --ignore-case Match regular expressions without regard to case. The default is case sensitive. -m NUM, --match-limit NUM Find only the first NUM matches in each file. -o expr1 expr2 --or expr1 expr2 ( expr1 --or expr2 ) Match either of the two expressions. See the EXPRESSIONS section for more information. -R, --range NUM[%],NUM[%] Restrict the search to the given range of lines, as either line numbers or a percentage of the length of the file. -v, --invert-match Show lines that do not match the expression. -w, --word, --word-regexp Put word boundaries around each pattern, thus matching only where the full word(s) occur in the text. Thus, "glark -w Foo" is the same as "glark '/Foo/'". -x, --line-regexp Select only where the entire line matches the pattern(s). --xor expr1 expr2 ( expr1 --xor expr2 ) Match either of the two expressions, but not both. See the EXPRESSIONS section for more information. Output -A NUM, --after-context=NUM Print NUM lines after a matched expression. -B NUM, --before-context=NUM Print NUM lines before a matched expression. -C [NUM], -NUM, --context[=NUM] Output NUM lines of context around a matched expression. The default is no context. If no NUM is given for this option, the number of lines of context is 2. -c, --count Instead of normal output, display only the number of matches in each file. -F, --file-color COLOR Specify the highlight color for file names. See the HIGHLIGHTING section for the values that can be used. --no-filter Display the entire file(s), presumably with matches highlighted. -g, --grep Produce output like the grep default: file names, no line numbers, and a single line of the match, which will be the first line for matches that span multiple lines. If the EMACS environment variable is set, this value is set to true. Thus, running glark under Emacs results in the output format expected by Emacs. -h, --no-filename Do not display the names of the files that matched. -H, --with-filename Display the names of the files that matched. This is the default behavior. -l, --files-with-matches Print only the names of the file that matched the expression. -L, --files-without-match Print only the names of the file that did not match the expression. --label=NAME Use NAME as output file name. This is useful when reading from standard input. -n, --line-number Display the line numbers. This is the default behavior. -N, --no-line-number Do not display the line numbers. --line-number-color Specify the highlight color for line numbers. This defaults to none (no highlighting). See the HIGHLIGHTING section for more infor- mation. -T, --text-color COLOR Specify the highlight color for text. See the HIGHLIGHTING section for more information. --text-color-NUM COLOR Specify the highlight color for the regular expression capture NUM. Colors are used by regular expressions in the order they are created (that is, with the "--and" and "--or" option), or with captures within a regular expression (such as '/(this)|(that)/'). is See the HIGHLIGHTING section for more information. -u, --highlight=[FORMAT] Enable highlighting. This is the default behavior. Format is "single" (one color) or "multi" (different color per regular expres- sion). See the HIGHLIGHTING section for more information. -U, --no-highlight Disable highlighting. -y, --extract-matches Display only the region that matched, not the entire line. If the expression contains "backreferences" (i.e., expressions bounded by "( ... )"), then only the portion captured will be displayed, not the entire line. This option is useful with "-g", which elimi- nates the default highlighting and display of file names. -Z, --null When in -l mode, write file names followed by the ASCII NUL character ('') instead of ' '. Debugging/Errors -?, --help Display the help message. --config Display the settings glark is using, and exit. Since this is run after configuration files are read, this may be useful for deter- mining values of configuration parameters. --explain Write the expression in a more legible format, useful for debugging. -q, -s, --quiet, --no-messages Suppress warnings. -Q, --no-quiet Enable warnings. This is the default. -V, --version Display version information. --verbose Display normally suppressed output, for debugging purposes. EXPRESSIONS
Regular Expressions Regular expressions are expected to be in the Perl/Ruby format. "perldoc perlre" has more general information. The expression may be of either form: something /something/ There is no difference between the two forms, except that with the latter, one can provide the "ignore case" modifier, thus matching "some- Thing" and "SoMeThInG": % glark /something/i Note that this is redundant with the "-i" ("--ignore-case") option. All regular expression characters and options are available, such as "w", ".*?" and "[^9]". For example: % glark '[a-z][^d]d{1,3}.*s*>>s*d+s*.*& +d{3}' If the and and or options are not used, the last non-option is considered to be the expression to be matched. In the following, "printf" is used as the expression. % glark -w printf *.c POSIX character classes (e.g., [[:alpha:]]) are also supported. Complex Expressions Complex expressions combine regular expressions (and complex expressions themselves) with logical AND, OR, and XOR operators. Both prefix and infix notation is supported. -o expr1 expr2 --or expr1 expr2 --end-of-or ( expr1 --or expr2 ) Match either of the two expressions. The results of the two forms are equivalent. In the latter syntax, the --end-of-or is optional. -a number expr1 expr2 --and=number expr1 expr2 --end-of-and ( expr1 --and number expr2 ) Match both of the two expressions, within <number> lines of each other. As with the "or" option, the results of the two forms are equivalent, and the "--end-of-and" is optional. The forms "-aNUM" and "--and=NUM" are also supported. If the number provided is -1 (negative one), the distance is considered to be "infinite", and thus, the condition is satisfied if both expressions match within the same file. If the number provided is 0 (zero), the condition is satisfied if both expressions match on the same line. If the --and option is used, and the follow argument is not numeric, then the value defaults to zero. A warning will be issued if the value given in the number position does not appear to be numeric. --xor expr1 expr2 --end-of-xor ( expr1 --xor expr2 ) Match either of the two expressions, but not both. "--end-of-xor" is optional. Negated Regular Expressions Regular expressions can be negated, by being prefixed with '!', and using the '/' quote characters around the expression, such as: !/expr/ This has the effect of "match anything other than this". For a single expression, this is no different than the -v/--invert-match option, but it can be useful in complex expressions, such as: --and 0 this '!/that/' which means "match and line that has "this", but not "that". HIGHLIGHTING
Matching patterns and file names can be highlighted using ANSI escape sequences. Both the foreground and the background colors may be specified, from the following: black blue cyan green magenta red white yellow The foreground may have any number of the following modifiers applied: blink bold concealed reverse underline underscore The format is "MODIFIERS FOREGROUND on BACKGROUND". For example: red black on yellow (the default for patterns) reverse bold (the default for file names) green on white bold underline red on cyan By default text is highlighted as black on yellow. File names are written in reversed bold text. EXAMPLES
Basic Usage % glark format *.h Searches for "format" in the local .h files. % glark --ignore-case format *.h Searches for "format" without regard to case. Short form: % glark -i format *.h % glark --context=6 format *.h Produces 6 lines of context around any match for "format". Short forms: % glark -C 6 format *.h % glark -6 format *.h % glark --exclude-matching Object *.java Find references to "Object", excluding the files whose names match "Object". Thus, SessionBean.java would be searched; EJBOb- ject.java would not. Short form: % glark -M Object *.java % glark --grep --extract-matches 'w+.printStackTrace(.*)' *.java Show where exceptions are dumped. Note that the "--grep" option is used, thus turning off highlighting and display of file names. If the "--no-filename" option is used, the output will consist of only the matching portions. The short form of this command is: % glark -gy 'w+.printStackTrace(.*)' *.java % glark --grep --extract-matches '(w+).printStackTrace(.*)' *.java Show only the variable name of exceptions that are dumped. Short form: % glark -gy '(w+).printStackTrace(.*)' *.java % who | glark -gy '^(S+)s+S+s*May 15' Display only the names of users who logged in today. % glark -l 'w{25,}' *.txt Display (only) the names of the text files that contain "words" at least 25 characters long. % glark --files-without-match '"w+"' Display (only) the names of the files that do not contain strings consisting of a single word. Short form: % glark -L '"w+"' % for i in *.jar; do jar tvf $i | glark --LABEL=$i Exception; done Search the files for 'Exception', displaying the jar file name instead of the standard input marker ('-'). Highlighting % glark --text-color "red on white" '[[:digit:]]{5}' *.c Display (in red text on a white background) occurrences of exactly 5 digits. Short form: % glark -T "red on white" 'd{5}' *.c See the HIGHLIGHTING section for valid colors and modifiers. Complex Expressions % glark --or format print *.h Searches for either "printf" or "format". Short form: % glark -o format print *.h % glark --and 4 printf format *.c *.h Searches for both "printf" or "format" within 4 lines of each other. Short form: % glark -a 4 printf format *.c *.h % glark --context=3 --and 0 printf format *.c Searches for both "printf" or "format" on the same line ("within 0 lines of each other"). Three lines of context are displayed around any matches. Short form: % glark -3 -a 0 printf format *.c % glark -8 -i -a 15 -a 2 pthx '...' -o 'va_w+t' die *.c (In order of the options:) Produces 8 lines of context around case insensitive matches of ("phtx" within 2 lines of '...' (lit- eral)) within 15 lines of (either "va_w+t" or "die"). % glark --and -1 '#defines+YIELD' '#defines+dTHR' *.h Looks for "#defines+YIELD" within the same file (-1 == "infinite distance") of "#defines+dTHR". Short form: % glark -a -1 '#defines+YIELD' '#defines+dTHR' *.h Range Limiting % glark --before 50% cout *.cpp Find references to "cout", within the first half of the file. Short form: % glark -b 50% cout *.cpp % glark --after 20 cout *.cpp Find references to "cout", starting at the 20th line in the file. Short form: % glark -b 50% cout *.cpp % glark --range 20 50% cout *.cpp Find references to "cout", in the first half of the file, after the 20th line. Short form: % glark -R 20 50% cout *.cpp File Processing % glark -r print . Search for "print" in all files at and below the current directory. % glark --fullname='/.java$/' -r println org Search for "println" in all Java files at and below the "org" directory. % glark --basename='!/CVS/' -r 'dd:dd:dd' . Search for a time pattern in all files without "CVS" in their basenames. % glark --size-limit=1024 -r main -r . Search for "main" in files no larger than 1024 bytes. ENVIRONMENT
GLARKOPTS A string of whitespace-delimited options. Due to parsing constraints, should probably not contain complex regular expressions. $HOME/.glarkrc A resource file, containing name/value pairs, separated by either ':' or '='. The valid fields of a .glarkrc file are as follows, with example values: after-context: 1 before-context: 6 context: 5 file-color: blue on yellow highlight: off ignore-case: false quiet: yes text-color: bold reverse line-number-color: bold verbose: false grep: true "yes" and "on" are synonymnous with "true". "no" and "off" signify "false". My ~/.glarkrc file is the following: file-color: bold reverse text-color: bold black on yellow context: 2 highlight: on verbose: false ignore-case: false quiet: yes word: false binary-files: without-match local .glarkrc See the local-config-files field below: Fields after-context See the "--after-context" option. For example, for 3 lines of context after the match: after-context: 3 basename See the "--basename" option. For example, to omit Subversion directories: basename: !/.svn/ before-context See the "--before-context" option. For example, for 7 lines of context before the match: before-context: 7 binary-files See the "--binary-files" option. For example, to skip binary files: binary-files: without-match context See the "--context" option. For example, for 2 lines before and after matches: context: 2 expression See the EXPRESSION section. Example: expression: --or '^s*publics+classs+w+' '^s*w+( file-color See the "--file-color" option. For example, for white on black: file-color: white on black filter See the "--filter" option. For example, to show the entire file: filter: false fullname See the "--fullname" and "--basename" options. For example, to omit CVS files: fullname: !/CVS/ grep See the "--grep" option. For example, to always run in grep mode: grep: true highlight See the "--highlight" option. To turn off highlighting: highlight: false ignore-case See the "--ignore-case" option. To make matching case-insensitive: ignore-case: true known-nontext-files The extensions of files that should be considered to always be nontext (binary). If a file extension is not known, the file contents are examined for nontext characters. Thus, setting this field can result in faster searches. Example: known-nontext-files: class exe dll com See the Exclusion of Non-Text Files section in NOTES for the default settings. known-text-files The extensions of files that should be considered to always be text. See above for more. Example: known-text-files: ini bat xsl xml See the Exclusion of Non-Text Files section in NOTES for the default settings. local-config-files By default, glark uses only the configuration file ~/.glarkrc. Enabling this makes glark search upward from the current directory for the first .glarkrc file. This can be used, for example, in a Java project, where .class files are binary, versus a PHP project, where .class files are text: /home/me/.glarkrc local-config-files: true /home/me/projects/java/.glarkrc known-nontext-files: class /home/me/projects/php/.glarkrc known-text-files: class With this configuration, .class files will automatically be treated as binary file in Java projects, and .class files will be treated as text. This can speed up searches. Note that the configuration file ~/.glarkrc is read first, so the local configuration file can override those settings. quiet See the "--quiet" option. show-break Whether to display breaks between sections, when displaying context. Example: show-break: true By default, this is false. text-color See the "--text-color" option. Example: text-color: bold blue on white verbose See the "--verbose" option. Example: verbose: true verbosity See the "--verbosity" option. Example: verbosity: 4 NOTES
Exclusion of Non-Text Files Non-text files are automatically skipped, by taking a sample of the file and checking for an excessive number of non-ASCII characters. For speed purposes, this test is skipped for files whose suffixes are associated with text files: c cpp css h f for fpp hpp html java mk php pl pm rb rbw txt Similarly, this test is also skipped for files whose suffixes are associated with non-text (binary) files: Z a bz2 elc gif gz jar jpeg jpg o obj pdf png ps tar zip See the "known-text-files" and "known-nontext-files" fields for denoting file name suffixes to associate as text or nontext. Exit Status The exit status is 0 if matches were found; 1 if no matches were found, and 2 if there was an error. An inverted match (the -v/--invert-match option) will result in 1 for matches found, 0 for none found. SEE ALSO
For regular expressions, the "perlre" man page. Mastering Regular Expressions, by Jeffrey Friedl, published by O'Reilly. CAVEATS
"Unbalanced" leading and trailing slashes will result in those slashes being included as characters in the regular expression. Thus, the following pairs are equivalent: /foo "/foo" /foo/ "/foo/" /foo/i "/foo/i" foo/ "foo/" foo/ "foo/" The code to detect nontext files assumes ASCII, not Unicode. AUTHOR
Jeff Pace <jpace at incava dot org> COPYRIGHT
Copyright (c) 2006, Jeff Pace. All Rights Reserved. This module is free software. It may be used, redistributed and/or modified under the terms of the Lesser GNU Public License. See http://www.gnu.org/licenses/lgpl.html for more information. glark 1.8.0 2006-12-30 GLARK(1)
All times are GMT -4. The time now is 08:29 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy