Sponsored Content
Top Forums UNIX for Dummies Questions & Answers delete lines matching a regular expression Post 302604693 by pathunkathunk on Monday 5th of March 2012 05:11:53 PM
Old 03-05-2012
delete lines matching a regular expression

I have a very large file (over 700 million lines) that has some lines that I need to delete. An example of 5 lines of the file:

HS4_80:8:2303:19153:193032 153 k80:138891 [...]
HS4_80:8:2105:5544:43174 89 k88:81949 [...]
165 k88:81949 323 0 * = 323 0 [...]
HS4_80:8:2105:5544:19502 73 k64:351700 [...]
HS4_80:8:2303:19154:108202 137 k72:245019 [...]

Two questions:
1. How can I confirm that the first character in line 3 above (beginning with 165) is a tab, so that I might use that as a regular expression, i.e. delete all lines that begin with tab. (Not all the lines I want to keep begin with HS4, nor do all the lines I want to delete begin with 165).

1. Assuming I find the right regular expression, how do I create a new file that is exactly the same, minus the lines I want to delete (like line 3 above)?

I'm a biologist who is new to programming. I know a lot more than I did a few weeks ago but it's slow going...so thanks for any help!
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Regular expression matching a new line

I have written a script to test some isdn links in my network and I am trying to format the output to be more readable. Each line of the output has a different number of digits as follows... Sitename , spid1 12345678901234 1234567890 1234567 , spid2 1234567890 1234567890 1234567 Sitename , ... (1 Reply)
Discussion started by: drheams
1 Replies

2. Programming

Regular Expression matching in PERL

I am trying to read a file and capture particular lines into different strings: LENGTH: Some Content here TEXT: Some Content Here COMMENT: Some Content Here I want to be able to get (LENGTH: .... ) into one array and so on... I'm trying to use PERL in slurp mode but for some reason... (8 Replies)
Discussion started by: Legend986
8 Replies

3. Shell Programming and Scripting

Help: Regular Expression for Negate Matching String

Hi guys, as per subject I am having problem with regular expressions. Example, if i got a string "javax.servlet.http.HttpServlet.service" that may occurred anywhere within a text file. How can I used the negate pattern matching of regular expression? I tried the below pattern but it... (4 Replies)
Discussion started by: DrivesMeCrazy
4 Replies

4. Shell Programming and Scripting

Regular expression matching in BASH (equivalent of =~ in Perl)

In Perl I can write a condition that evaluates a match expression like this: if ($foo =~ /^bar/) { do blah blah blah } How do I write this in shell? What I need to know is what operator do I use? The '=~' doesn't seem to fit. I've tried different operators, I browsed the man page for... (3 Replies)
Discussion started by: indiana_tas
3 Replies

5. Shell Programming and Scripting

Regular expression matching

Hi, I have a variable in my script that gets its value from a procstack output. It could be a number of any length, or it could just be a '1' with 0 or more white spaces around it. I would like to detect when this variable is just a 1 and not a 1234, for example. This is as far as I got: ... (3 Replies)
Discussion started by: tmf33uk
3 Replies

6. Shell Programming and Scripting

How to delete the word after a regular expression

Example: Lucas RUNCYCLE Rule1 Astigmatism Robot RUNCYCLE Rule2 Jack RUNCYCLE Calendar1 June Lucy RUNCYCLE Exception4 Fear RUNCYCLE Calendar5 August In this example, how can I delete the next after the expression RUNCYCLE? (i.e. Rule1, Rule2, Calendar1, Exception1, Calendar5) I'm... (3 Replies)
Discussion started by: The Gamemaster
3 Replies

7. Shell Programming and Scripting

Matching single quote in a regular expression

I trying to match the begining of the following line in a perl script with a regular expression. $ENV{'ORACLE_HOME'} I tried this regluar expession: /\$ENV\{\'ORACLE_HOME\'\}/ Instead of match, I got a blank prompt > It seems to be a problem with the single quote. If I take it... (11 Replies)
Discussion started by: JC9672
11 Replies

8. Programming

Perl: How to read from a file, do regular expression and then replace the found regular expression

Hi all, How am I read a file, find the match regular expression and overwrite to the same files. open DESTINATION_FILE, "<tmptravl.dat" or die "tmptravl.dat"; open NEW_DESTINATION_FILE, ">new_tmptravl.dat" or die "new_tmptravl.dat"; while (<DESTINATION_FILE>) { # print... (1 Reply)
Discussion started by: jessy83
1 Replies

9. UNIX for Dummies Questions & Answers

Finding lines with a regular expression, replacing them with blank lines

So the tag for this forum says all newbies welcome... All I want to do is go through my file and find lines which contain a given string of characters then replace these with a blank line. I really tried to find a simple command to do this but failed. Here's what I did come up with though: ... (2 Replies)
Discussion started by: Golpette
2 Replies

10. Shell Programming and Scripting

regular expression matching whole words

Hi Consider the file this is a good line when running grep '\b(good|great|excellent)\b' file5 I expect it to match the line but it doesn't... what am i doing wrong?? (ultimately this regex will be in a awk script- just using grep to test it) Thanks, Storms (5 Replies)
Discussion started by: Storms
5 Replies
GREP(1) 						      General Commands Manual							   GREP(1)

NAME
grep, g - search a file for a pattern SYNOPSIS
grep [ option ... ] pattern [ file ... ] g [ option ... ] pattern [ file ... ] DESCRIPTION
Grep searches the input files (standard input default) for lines that match the pattern, a regular expression as defined in regexp(7) with the addition of a newline character as an alternative (substitute for |) with lowest precedence. Normally, each line matching the pattern is `selected', and each selected line is copied to the standard output. The options are -c Print only a count of matching lines. -h Do not print file name tags (headers) with output lines. -e The following argument is taken as a pattern. This option makes it easy to specify patterns that might confuse argument parsing, such as -n. -i Ignore alphabetic case distinctions. The implementation folds into lower case all letters in the pattern and input before interpre- tation. Matched lines are printed in their original form. -l (ell) Print the names of files with selected lines; don't print the lines. -L Print the names of files with no selected lines; the converse of -l. -n Mark each printed line with its line number counted in its file. -s Produce no output, but return status. -v Reverse: print lines that do not match the pattern. -f The pattern argument is the name of a file containing regular expressions one per line. -b Don't buffer the output: write each output line as soon as it is discovered. Output lines are tagged by file name when there is more than one input file. (To force this tagging, include /dev/null as a file name argument.) Care should be taken when using the shell metacharacters $*[^|()= and newline in pattern; it is safest to enclose the entire expression in single quotes '...'. An expression starting with '*' will treat the rest of the expression as literal characters. G invokes grep with -n and forces tagging of output lines by file name. If no files are listed, it searches all files matching *.C *.b *.c *.h *.m *.cc *.java *.cgi *.pl *.py *.tex *.ms SOURCE
/src/cmd/grep /bin/g SEE ALSO
ed(1), awk(1), sed(1), sam(1), regexp(7) DIAGNOSTICS
Exit status is null if any lines are selected, or non-null when no lines are selected or an error occurs. GREP(1)
All times are GMT -4. The time now is 12:33 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy