awk removing data before or after a pattern (Shell Programming and Scripting), post 302443394 by agama on Sunday 8th of August 2010 03:42:12 PM
This is one way to accomplish what you need; it may not be the most efficient, but it is easy to understand. It handles both the 'drop before' and 'drop after' cases, and the lines containing the two strings are themselves kept. I've set it up to match literal strings, not patterns; if you truly need to match patterns, use awk's match() function rather than index().

Code:
#!/usr/bin/env ksh

# parms:        $1 -- before string; all records before the first one containing this string are dropped
#                       If this is "none" (or missing), records are kept from the start of the input
#               $2 -- after string; all records after the first kept one containing this string are dropped.
#                       If this is empty, records are kept through the end of the input

awk -v toss_before="${1:-none}" -v toss_after="$2" '
        BEGIN {
                if( toss_before == "none" )     # keep everything from the start
                        snarf = 1;
                else
                        snarf = 0;              # must wait until we see toss_before to start keeping data
        }

        {
                if( snarf ) 
                {
                        printf( "%s\n", $0 );           # print if snarfing 

                        if( toss_after && index( $0, toss_after ) )     # check to see if this has the end string
                                exit( 0 );
                }
                else                                    # not snarfing, see if this is the start string
                {
                        if( index( $0, toss_before ) )
                        {
                                printf( "%s\n", $0 );
                                snarf = 1;
                        }
                }
        }
'
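
If you do need patterns, here is a rough sketch of the same logic with match() swapped in for index(). The function name snarf_re is made up for illustration, and I've wrapped the awk in a shell function only so it can be tried at the prompt. Both arguments are treated as extended regular expressions. One thing to watch: awk processes backslash escapes in -v assignments, so a backslash in a pattern may need to be doubled.

```shell
#!/bin/sh

# snarf_re (hypothetical name): regular-expression variant of the script above.
#   $1 -- before pattern; "none" (or missing) keeps records from the start
#   $2 -- after pattern; empty keeps records through the end of the input
snarf_re() {
    awk -v toss_before="${1:-none}" -v toss_after="$2" '
        BEGIN {
            snarf = ( toss_before == "none" );      # 1 = keep from the start
        }

        {
            if( snarf )
            {
                print;                              # print if snarfing

                if( toss_after != "" && match( $0, toss_after ) )   # end pattern seen
                    exit( 0 );
            }
            else if( match( $0, toss_before ) )     # start pattern seen
            {
                print;
                snarf = 1;
            }
        }
    '
}
```

For example, `snarf_re '^start [0-9]' '^end' <datafile` would keep everything from the first line beginning with "start" and a digit through the next line beginning with "end" (datafile is a placeholder name).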

 
