Script to find & replace a multiple lines string across multiple php files and subdirectories

03-04-2012

Registered User

1,466, 512

Join Date: Jul 2010

Last Activity: 7 April 2014, 3:02 PM EDT

Location: earth>US>UTC-5

Posts: 1,466

Thanks Given: 110

Thanked 512 Times in 491 Posts

Grumble!! I added comments as I posted it and my correct spelling introduced the bug. Change the comment

Code:

# magic string found, drop if we're in a php block

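so that it doesn't contain the single quote ('): change "we're" to "we are" or some such. The awk program is wrapped in single quotes, so the shell treats a stray quote inside it as the end of the program.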
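To make the quoting issue concrete, here is a minimal sketch (the function name and the MAGIC marker are made up for illustration). The whole awk program is a single-quoted shell string, so the comments inside it are written without apostrophes:

```shell
# The awk program below is one single-quoted shell string, so nothing inside
# it, comments included, may contain an apostrophe.
drop_marked_lines() {
    awk '
        /MAGIC/ { next }    # magic string found, so we are dropping this line
        { print }           # anything else passes through unchanged
    '
}

printf 'keep me\nMAGIC goes away\n' | drop_marked_lines
```

If an apostrophe were added to either awk comment, the shell would end the program string at that point and awk would receive only part of the program.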

As for the cd command...
Does the directory "/home/username/" exist, or should username be the name of the user running the script, or did you change it to 'username' so as not to post the real name here? If you really have /home/username, then change username to $USER, or the real name.

There are two schools of thought on the "#!" line. My school of thought is to use #!/usr/bin/env with the parameter ksh, bash, etc. This allows the shell/interpreter to be found using my PATH, rather than a hard-coded /usr/bin/ksh or whatever is written in the script. The advantage is that when I have a group of scripts that need to be tested with a particular version of the shell/interpreter, I need only set up my PATH correctly and the proper version of the interpreter will be invoked for every one of them. I don't need to modify any script to point to the version I am testing under, nor do I need to install the new/beta/old version of the interpreter in /usr/bin or wherever.

The other side of the coin is to hard-code the path to the interpreter, as you have pointed out. It works, but it is limited in my opinion.
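As a rough illustration of the PATH lookup described above (resolve_interp is a made-up helper name, and the directory in the comment is only an example), `command -v` reports the same interpreter that env would find:

```shell
# Print the interpreter that "#!/usr/bin/env NAME" would start under the
# current PATH. resolve_interp is a made-up helper name for this sketch.
resolve_interp() {
    command -v "$1"
}

resolve_interp sh
# Putting another directory first changes the answer without touching any
# script, e.g. (the directory is only an example path):
#   PATH="$HOME/ksh-beta/bin:$PATH" resolve_interp ksh
```

With a hard-coded "#!/usr/bin/ksh" line the same switch would mean editing every script.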

agama

03-05-2012

Registered User

10, 0

Join Date: Mar 2012

Last Activity: 25 April 2012, 3:27 PM EDT

Posts: 10

Thanks Given: 0

Thanked 0 Times in 0 Posts

I wrote "username" just to hide the real username; I'm using the real one in the script, don't worry.

OK, here's the script I used:

Code:

#!/usr/bin/env ksh
cd /home/username/public_html/tests/
find . -name "*.php" | while read file
do
    echo "munging: $file"             # nice to see progress as it works
    mv "$file" "$file-"      # back it up
    awk '     # read the file and delete the block of php code
    /<?php/ { drop = idx = 0; snarf = 1; }  # start of a block start buffering

    /?>/ {                  # end of a block
        if( ! drop )        # magic string not found -- show this block
        {
            for( i = 0; i < idx; i++ )
                printf( "%s\n", buffer[i] );
            printf( "%s\n", $0 );
        }

        snarf = 0;          # turn off buffering
        next;
    }

    ### change the string between the slants to be something unique to the block you wish to delete.
    /PHP_UNIQUE_CODE/ { drop = 1; }    # magic string found drop if we are in a php block

    snarf {                 # if buffering hold the record until end of block reached.
        buffer[idx++] = $0;
        next;
    }

    { print; }              # not buffering just print the record.
    '  "$file-" >"$file"
    if (( $? > 0 ))            # handle failure by putting the file back in place
    then
        echo "edit of $file failed" >&2
        mv "$file-" "$file"             # restore original
    else
        rm "$file-"               # worked delete backup
    fi
done
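The back-up, edit, restore-on-failure steps in the loop above could also be factored into a small function. This is only a sketch under my own naming (edit_with_backup is hypothetical), using the same "file-" backup suffix as the script:

```shell
# Hypothetical helper: run a filter command over a file, keeping a backup
# until the filter has succeeded.
# Usage: edit_with_backup FILE COMMAND [ARGS...]
edit_with_backup() {
    file=$1
    shift
    mv "$file" "$file-" || return 1         # back it up
    if "$@" <"$file-" >"$file"; then
        rm -f "$file-"                      # worked, delete the backup
    else
        echo "edit of $file failed" >&2
        mv "$file-" "$file"                 # restore the original
        return 1
    fi
}

# Example call (the sed expression is just a placeholder filter):
#   edit_with_backup ./index.php sed 's/foo/bar/'
```

As in the script, the filter reads the backup on stdin and writes the new file on stdout, so the awk program can be passed as the command.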

(My script's filename is "newdelete2.ksh".) Still, when I execute it just by typing

Code:

newdelete2.ksh

it does not work (same problem as before), but I tried this

Code:

/newdelete2.ksh

and this

Code:

./newdelete2.ksh

and they both return the following:

Code:

[/]# /newdelete2.ksh
munging: ./footer.php
munging: ./home.php
munging: ./index.php

I've made a "test" directory (as you can see in the "cd /path/" line) with those 3 PHP files in it: footer, home and index.

Unfortunately nothing happens: no PHP code was removed from those files.

Any more ideas? Thank you! You've already helped a lot.

---------- Post updated 03-05-12 at 11:57 AM ---------- Previous update was 03-04-12 at 01:32 PM ----------

Just found out something: when the PHP block I want to delete is positioned like the one below (notice the "<?php get_header(); ?>" that comes before, and on the same line as, the "<?php"):

Code:

<?php get_header(); ?><?php
  $sql = "SELECT * FROM articles WHERE id = '".$_GET['article']."'";
  $do->doQuery($sql);
  $article = $do->getRows();
  if(isset($_POST['add'])) {
    if(trim($_POST['nick']) != '') {
      $nick = trim($_POST['nick']);
    } else {
      $errorX['nick'] = 'Please enter your nickname.';
    }
    if(trim($_POST['comment']) != '') {
      $comment = trim($_POST['comment']);
    } else {
      $errorX['comment'] = 'Please enter a comment.';
    }
    if(empty($errorX)) {
      $sql = "INSERT INTO comments (website, article_id, nickname, message, email) VALUES ('".$_POST['website']."','".$_GET['article']."','".$nick."','".$comment."','".$email."')";
      $do->doQuery($sql);
      header('Location: '.$_SERVER['HTTP_REFERER']);
    }
  }
  ?>

it will only delete the "?>" at the end of the code, and leave the rest of the code intact.

And there are a lot of instances in which the PHP block appears in that position (with the "<?php" directly after some other piece of code, with no space between them).

So I edited the PHP test files and placed the PHP block exactly like the one below:

Code:

<?php
  $sql = "SELECT * FROM articles WHERE id = '".$_GET['article']."'";
  $do->doQuery($sql);
  $article = $do->getRows();
  if(isset($_POST['add'])) {
    if(trim($_POST['nick']) != '') {
      $nick = trim($_POST['nick']);
    } else {
      $errorX['nick'] = 'Please enter your nickname.';
    }
    if(trim($_POST['comment']) != '') {
      $comment = trim($_POST['comment']);
    } else {
      $errorX['comment'] = 'Please enter a comment.';
    }
    if(empty($errorX)) {
      $sql = "INSERT INTO comments (website, article_id, nickname, message, email) VALUES ('".$_POST['website']."','".$_GET['article']."','".$nick."','".$comment."','".$email."')";
      $do->doQuery($sql);
      header('Location: '.$_SERVER['HTTP_REFERER']);
    }
  }
 ?>

and it worked: the whole PHP block was removed.

But now, how do I make it work when the "<?php" is on the same line as, and right next to, some other piece of code?

Thank you!

spfc_dmt

03-05-2012

I figured you dummied in 'username', but I've also learned not to assume!

That's an interesting twist; here is some revised code that should do the trick. You'll need to supply the unique string in the BEGIN block, as it's needed twice. One side effect: if the unique string happens to appear inside an open/close pair that sits entirely on one line, that pair will be removed as well.

Code:

awk '
    BEGIN { unique_str = "unique thing"; }          # stick in the unique string here

    {
        partial = 0;
        while( match( $0, "<\\?php [^\\?]*\\?>" ) > 0 )         # complete beginning/ending on the line
        {
            if( index( substr( $0, RSTART, RLENGTH ), unique_str ) )        # if it contains the magic string
                printf( "%s", substr( $0, 1, RSTART-1 ) );      # print everything before it, and skip it
            else
                printf( "%s", substr( $0, 1, RSTART + (RLENGTH-1) ) );      # print everything including the begin and end  (edited)

            $0 = substr( $0, RSTART + RLENGTH  );

            partial = 1;
        }

    }

    /<?php$/ { drop = idx = 0; snarf = 1; } # start of a block; start buffering

    /?>/ {                  # end of a block
        if( ! drop )        # magic string not found -- show this block
        {
            for( i = 0; i < idx; i++ )
                printf( "%s\n", buffer[i] );
            printf( "%s\n", $0 );
        }

        snarf = 0;          # turn off buffering
        next;
    }

    match( $0, unique_str ) { drop = 1; }   # magic string found, drop if we are in a php block

    snarf {                 # if buffering, hold the record until end of block reached.
        buffer[idx++] = $0;
        if( partial )
            printf( "\n" );
        next;
    }

    { print; }              # not buffering, just print the record.

'

Hope this works better for you.

As for needing ./scriptname to execute your script, that implies that the current directory is not in PATH. You can add '.' to your PATH or just type the additional './' at the front.
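A quick way to check whether '.' is already in PATH, as a sketch (path_has_dot is a made-up name, and it only looks for an explicit '.' entry):

```shell
# Succeeds if the colon-separated list in $1 has an explicit "." entry.
path_has_dot() {
    case ":$1:" in
        *:.:*) return 0 ;;
        *)     return 1 ;;
    esac
}

if path_has_dot "$PATH"; then
    echo "current directory is searched: a bare script name will work"
else
    echo "current directory is not searched: use ./scriptname"
fi
```

To add it for the current session only, something like PATH="$PATH:." would do; putting it at the end keeps system commands ahead of anything in the current directory.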

---------- Post updated at 22:33 ---------- Previous update was at 22:10 ----------

Small revision. I realised that if something like this occurs

Code:

some text before block opening tag<?php

and the block is dropped, the text before the opening tag is also dropped. This code fixes that bug:

Code:

awk '
    BEGIN { unique_str = "unique thing"; }          # stick in the unique string here

    {
        while( match( $0, "<\\?php [^\\?]*\\?>" ) > 0 )     # complete beginning/ending on the line
        {
            if( index( substr( $0, RSTART, RLENGTH ), unique_str ) )        # if it contains the magic string
                printf( "%s", substr( $0, 1, RSTART - 1 ) );        # print everything before it, and skip it
            else
                printf( "%s", substr( $0, 1, RSTART + (RLENGTH-1) ) );      # print everything including the begin and end

            $0 = substr( $0, RSTART + RLENGTH  );  
        }

    }

    /<?php$/ { drop = idx = 0; snarf = 1; } # start of a block; start buffering

    /?>/ {                  # end of a block
        if( ! drop )        # magic string not found -- show this block
        {
            for( i = 0; i < idx; i++ )
                printf( "%s\n", buffer[i] );
            printf( "%s\n", $0 );
        }
        else
        {
            if( (i = index( buffer[0], "<?php" )) > 0 )    # if something before <?php, and we dropped the block, print the leading text
                printf( "%s\n", substr( buffer[0], 1, i-1 ) );
        }

        snarf = 0;          # turn off buffering
        next;
    }

    match( $0, unique_str ) { drop = 1; }   # magic string found, drop if we are in a php block

    snarf {                 # if buffering, hold the record until end of block reached.
        buffer[idx++] = $0;
        if( partial )
            printf( "\n" );
        next;
    }

    { print; }              # not buffering, just print the record.
'
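For reference, the buffering idea on its own can be sketched more compactly. This is not the code above but a simplified variant under stated assumptions: the opening tag is the last thing on its line, the closing tag is on a later line, and any text before the opening tag on that line is dropped along with the block. The function name is made up. It escapes the question mark in the regex and uses index() so that the marker is matched literally rather than as a regular expression:

```shell
# Hypothetical helper, not the script from this thread: drop every PHP block
# that contains the marker text given as $1.
# Assumption: the marker contains no backslashes (awk -v would interpret
# them as escape sequences).
strip_marked_blocks() {
    awk -v marker="$1" '
        # Opening tag at end of line (trailing blanks or a CR tolerated).
        !buffering && /<\?php[ \t\r]*$/ {
            buffering = 1; drop = 0; n = 0
            buf[n++] = $0
            next
        }
        buffering {
            buf[n++] = $0
            if (index($0, marker)) drop = 1     # literal match, not a regex
            if (index($0, "?>")) {              # closing tag ends the block
                if (!drop)
                    for (i = 0; i < n; i++) print buf[i]
                buffering = 0
            }
            next
        }
        { print }                               # outside any block
    '
}

printf 'before\n<?php\n// MAGIC\n?>\nafter\n' | strip_marked_blocks MAGIC
```

The example call prints only the "before" and "after" lines; a block without the marker is printed unchanged.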

Last edited by agama; 03-05-2012 at 11:25 PM.. Reason: Small typo in the awk marked with edit

agama

03-06-2012

Wait, is this last script you sent the full thing? Or do I need to replace that part (from "awk" to the closing ') in the original script I'm using?

I'll try it now, but please let me know.

Thanks!

EDIT: I replaced the awk code in the script with the new version; unfortunately it didn't work. The same thing happened: only the string that closes the PHP block ( ?> ) was removed, and the rest of the PHP block was left intact.

Even for a PHP block that is completely isolated from other pieces of code, it only removed the ?>

The script I used was:

Code:

#!/usr/bin/env ksh
cd /home/username/public_html/tests/
find . -name "*.php" | while read file
do
    echo "munging: $file"             # nice to see progress as it works
    mv "$file" "$file-"      # back it up
    awk '
    BEGIN { unique_str = "UNIQUE CODE"; }          # stick in the unique string here

    {
        while( match( $0, "<\\?php [^\\?]*\\?>" ) > 0 )     # complete beginning/ending on the line
        {
            if( index( substr( $0, RSTART, RLENGTH ), unique_str ) )        # if it contains the magic string
                printf( "%s", substr( $0, 1, RSTART - 1 ) );        # print everything before it, and skip it
            else
                printf( "%s", substr( $0, 1, RSTART + (RLENGTH-1) ) );      # print everything including the begin and end

            $0 = substr( $0, RSTART + RLENGTH  );
        }

    }

    /<?php$/ { drop = idx = 0; snarf = 1; } # start of a block; start buffering

    /?>/ {                  # end of a block
        if( ! drop )        # magic string not found -- show this block
        {
            for( i = 0; i < idx; i++ )
                printf( "%s\n", buffer[i] );
            printf( "%s\n", $0 );
        }
        else
        {
            if( (i = index( buffer[0], "<?php" )) > 0 )    # if something before <?php, and we dropped the block, print the leading text
                printf( "%s\n", substr( buffer[0], 1, i-1 ) );
        }

        snarf = 0;          # turn off buffering
        next;
    }

    match( $0, unique_str ) { drop = 1; }   # magic string found, drop if we are in a php block

    snarf {                 # if buffering, hold the record until end of block reached.
        buffer[idx++] = $0;
        if( partial )
            printf( "\n" );
        next;
    }

    { print; }              # not buffering, just print the record.
    '  "$file-" >"$file"
    if (( $? > 0 ))            # handle failure by putting the file back in place
    then
        echo "edit of $file failed" >&2
        mv "$file-" "$file"             # restore original
    else
        rm "$file-"               # worked delete backup
    fi
done

Last edited by spfc_dmt; 03-06-2012 at 11:16 AM..

spfc_dmt

03-06-2012

Sorry for the confusion, yes I only pasted the awk portion figuring you could insert that into your script body.

Very strange. I took this to a different machine (FreeBSD) just to see if a different flavour of awk might barf, and its only complaint was that the question mark in the following line needed escaping (the backslash is the new character):

Code:

/\?>/ {                  # end of a block

The GNU awk on my Linux host wasn't complaining about that. What version of awk do you have installed?

Code:

awk --version

should give that to you. I've been testing this with GNU Awk 3.1.6.
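Not every awk accepts --version. As far as I know, the BSD awk also answers to -version and mawk to -W version, so a small wrapper can try each spelling in turn. Treat this as a sketch: the function name is made up, and the flag list is my assumption about those implementations rather than something tested on every system:

```shell
# Print the first line of whatever version banner the installed awk offers.
awk_flavour() {
    for flag in --version -version '-W version'; do
        # $flag is deliberately unquoted so that "-W version" splits in two.
        if banner=$(awk $flag </dev/null 2>/dev/null) && [ -n "$banner" ]; then
            printf '%s\n' "$banner" | head -n 1
            return 0
        fi
    done
    echo "unknown awk"
    return 1
}

awk_flavour
```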

To test a bit further....
I've cut/pasted the test file I'm using below, and it doesn't have any issues. I took the awk straight from your post just to be sure, and used the dummy "UNIQUE CODE" as well. The result, when I execute it, is that the middle section is dropped.

Code:

 <?php
  $sql = "SELECT * FROM articles WHERE id = '".$_GET['article']."'";
  $do->doQuery($sql);
  $article = $do->getRows();
  if(isset($_POST['add'])) {
    if(trim($_POST['nick']) != '') {
      $nick = trim($_POST['nick']);
    } else {
      $errorX['nick'] = 'Please enter your nickname.';
    }
    if(trim($_POST['comment']) != '') {
      $comment = trim($_POST['comment']);
    } else {
      $errorX['comment'] = 'Please enter a comment.';
    }
    if(empty($errorX)) {
      $sql = "INSERT INTO comments (website, article_id, nickname, message, email) VALUES ('".$_POST['website']."','".$_GET['article']."','".$nick."','".$comment."','".$email."')";
      $do->doQuery($sql);
      header('Location: '.$_SERVER['HTTP_REFERER']);
    }
  }
  ?>
<?php get_header(); ?><?php
  $sql = "SELECT * FROM articles WHERE id = '".$_GET['article']."'";
  $do->doQuery($sql);
  $article = $do->getRows();
  if(isset($_POST['add'])) {
    if(trim($_POST['nick']) != '') {
      $nick = trim($_POST['nick']);
    } else {
      $errorX['nick'] = 'Please enter your nickname.';
    }
    if(trim($_POST['comment']) != '') {
      $comment = trim($_POST['comment']);
    } else {  //"UNIQUE CODE"
      $errorX['comment'] = 'Please enter a comment.';
    }
    if(empty($errorX)) {
      $sql = "INSERT INTO comments (website, article_id, nickname, message, email) VALUES ('".$_POST['website']."','".$_GET['article']."','".$nick."','".$comment."','".$email."')";
      $do->doQuery($sql);
      header('Location: '.$_SERVER['HTTP_REFERER']);
    }
  }
  ?>
  <?php
  $sql = "SELECT * FROM articles WHERE id = '".$_GET['article']."'";
  $do->doQuery($sql);
  $article = $do->getRows();
  if(isset($_POST['add'])) {
    if(trim($_POST['nick']) != '') {
      $nick = trim($_POST['nick']);
    } else {
      $errorX['nick'] = 'Please enter their nickname.';
    }
    if(trim($_POST['comment']) != '') {
      $comment = trim($_POST['comment']);
    } else {
      $errorX['comment'] = 'Please enter a comment.';
    }
    if(empty($errorX)) {
      $sql = "INSERT INTO comments (website, article_id, nickname, message, email) VALUES ('".$_POST['website']."','".$_GET['article']."','".$nick."','".$comment."','".$email."')";
      $do->doQuery($sql);
      header('Location: '.$_SERVER['HTTP_REFERER']);
    }
  }
  ?>

What happens if you save just the awk in a file (let's say test_awk), the data in test_data, and try this:

Code:

ksh test_awk <test_data
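If a clean test file works but the real files do not, one thing that may be worth ruling out (a guess on my part, not something established in this thread) is invisible characters after the opening tag. The pattern /<?php$/ only matches when "php" is the very last thing on the line, so a trailing space, tab or DOS carriage return would stop the block from being buffered while the ?> rule still fires. A sketch of a check, with a made-up function name:

```shell
# List lines where the opening tag is followed only by whitespace
# (spaces, tabs or a carriage return) before the end of the line.
# In grep's basic regex syntax the question mark is a literal character.
find_untidy_open_tags() {
    grep -n '<?php[[:space:]]\{1,\}$' "$@"
}

# Example call over the files in a directory:
#   find_untidy_open_tags ./*.php
```

Any line it reports would not start a block under the /<?php$/ rule.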

agama

03-07-2012

Hi, the awk version is:

GNU Awk 3.1.5

But here's the thing: I tested with the same 3 PHP blocks you were testing, and it worked. So the problem is probably with the PHP block I'm trying to remove.

I will PM you this PHP block so you can test with it, OK?

Please check your PM box.

---------- Post updated at 09:05 AM ---------- Previous update was at 08:35 AM ----------

Just found out you have PMs disabled. Could you enable them for a moment?

I really can't post this PHP code in here. Let me know!

spfc_dmt
