Need help with sed and regexp


 
# 1  
Old 08-23-2012

Hi everyone, I would really appreciate any help I could get on the following topic.
I am not very familiar with regular expressions or with sed; I only know the basic uses. What I am trying to do is the following: I have a huge text file in which I would like to replace all occurrences of a certain pattern with another one. Here is an example:

Code:
  "name1" = "12;15";
  "name5" = "7";
  "abc1" = "3";
  "5" = "";
  "-7" = "";
  "hgf" = "12;15";
  "e1" = "8";
  "-5" = "";

Should change to:

Code:
  "name1" = "12;15";
  "name5" = "7";
  "abc1" = "3;5;-7";
  "hgf" = "12;15";
  "e1" = "8;-5";

The rule is: any assignment to a variable starting with a letter ( like "name1" ) is preserved, while any assignment to a numerical variable ( like "1" or "-1" ) should have its number appended to the value on the previous preserved line.

I couldn't get it to work because the condition spans more than one line. I know I am supposed to use the N command but don't know how. Any help on how to do the above with sed, or any other way, is very much appreciated.

Thanks.
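
For reference, the multi-line rule can also be written in sed by keeping the pending line in the hold space instead of joining lines with N. This is only a sketch under stated assumptions: the wrapper name `merge_with_sed` is made up for illustration, every line ends in `";` as in the sample, and the first line of the file is a named (non-numerical) assignment.

```shell
# merge_with_sed is a hypothetical wrapper name, not something from this thread.
# Numerical line: reduce it to its number, append that to the held line (H),
#   swap, move the number inside the closing quote, swap back.
# Any other line: swap it into the hold space and print the line that was held.
# Last line: print whatever is still held.
merge_with_sed() {
    sed -n '
/^ *"-*[0-9][0-9]*" *=/{
s/^ *"\(-*[0-9][0-9]*\)".*/\1/
H
x
s/";\n\(.*\)$/;\1";/
x
b end
}
x
1!p
:end
${
x
p
}
'
}

# Sample input from the question.
merge_with_sed <<'EOF'
  "name1" = "12;15";
  "name5" = "7";
  "abc1" = "3";
  "5" = "";
  "-7" = "";
  "hgf" = "12;15";
  "e1" = "8";
  "-5" = "";
EOF
```

If the file could start with a numerical assignment, the script would need an extra guard; the sketch does not handle that case.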
# 2  
Old 08-23-2012
I think it's easier to code, read, and maintain in awk:

Code:
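# Needs gawk: the three-argument form of match() is a gawk extension.
awk -F = '
    # Numerical name: fold its number into the pending line, print nothing yet.
    match( $1, /"(-*[0-9]+)"/, hit ) {
        gsub( "\";", ";" hit[1] "\";", p );
        next;
    }
    # Any other line: flush the pending line, then hold this one.
    {
        if( p )
            print p;
        p = $0;
    }

    # Flush the last pending line.
    END {
        if( p )
            print p;
    }
' input-file >output-file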

These 2 Users Gave Thanks to agama For This Post:
# 3  
Old 08-24-2012
Please help me out: which awk version supports match(s, r, X)? mawk does not. And what is the resulting action? I couldn't find it in this forum's man pages.
# 4  
Old 08-24-2012
From man page(s) for gawk:
Code:
       match(s, r [, a])       Return the position in s where the regular expression r occurs, or 0 if r is not present, and set
                               the values of RSTART and RLENGTH. Note that the argument order is the same as for the ~ operator:
                               str ~ re. If array a is provided, a is cleared and then elements 1 through n are filled with the
                               portions of s that match the corresponding parenthesized subexpression in r. The 0'th element of a
                               contains the portion of s matched by the entire regular expression r. Subscripts a[n, "start"],
                               and a[n, "length"] provide the starting index in the string and length respectively, of each
                               matching substring.
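
The two-argument form described in that excerpt is also available in non-GNU awks, and RSTART and RLENGTH alone are enough to pull out the matched number. A minimal sketch, assuming the quoted-name layout from post #1; the function name `numeric_name` is invented for illustration:

```shell
# numeric_name is a hypothetical helper name, not part of any awk.
# It prints the number inside a quoted numerical name and nothing for other lines.
numeric_name() {
    awk -F = '{
        # match() returns 0 when $1 holds no quoted number; otherwise
        # RSTART and RLENGTH describe the match, quotes included.
        if (match($1, /"-*[0-9]+"/))
            print substr($1, RSTART + 1, RLENGTH - 2)
    }'
}

printf '%s\n' '  "-7" = "";' | numeric_name    # prints -7
```

The `+ 1` and `- 2` simply strip the surrounding double quotes from the matched text.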

# 5  
Old 08-24-2012
Ah, gawk. Got it, thank you!
# 6  
Old 08-24-2012
agama, thanks a lot! Your awk example works like a charm.
# 7  
Old 08-25-2012
Note that if you're on a system that doesn't have gawk and awk's match() function only takes two arguments (such as on OS X), the following should also work:
Code:
awk -F = '
$1 ~ /"(-*[0-9]+)"/ {
        split($1, f, "\"")
        sub( "\";", ";" f[2] "\";", p )
        next
}
{
        if(p) print p
        p = $0
}
END {
        if(p) print p
}' input_file > output_file
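
To try this without creating files, the same program can be fed the sample from post #1 through a here-document. This is just a sketch: the wrapper name `merge_numeric` is invented here, and the awk body is the two-argument version above reading standard input.

```shell
# merge_numeric is a hypothetical wrapper name used only for this demonstration.
merge_numeric() {
    awk -F = '
    $1 ~ /"(-*[0-9]+)"/ {
        split($1, f, "\"")                 # f[2] is the text between the quotes
        sub("\";", ";" f[2] "\";", p)      # move it inside the closing quote of p
        next
    }
    {
        if (p) print p                     # flush the pending line
        p = $0
    }
    END {
        if (p) print p
    }'
}

merge_numeric <<'EOF'
  "name1" = "12;15";
  "name5" = "7";
  "abc1" = "3";
  "5" = "";
  "-7" = "";
  "hgf" = "12;15";
  "e1" = "8";
  "-5" = "";
EOF
```

On that input the five lines requested in post #1 come back, with `"abc1" = "3;5;-7";` and `"e1" = "8;-5";` carrying the merged numbers.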

This User Gave Thanks to Don Cragun For This Post: