Perl syntax for sed searches

08-01-2008

Registered User

21, 0

Join Date: Aug 2008

Last Activity: 21 March 2011, 6:57 PM EDT

Location: Concord, NH

Posts: 21

Thanks Given: 0

Thanked 0 Times in 0 Posts

Perl syntax for sed searches

I am aware that Perl has a lot of features that originally came from sed and awk. I have a pattern that I am using like this:

sed -n '/|Y|/p'

I want to do the same thing in Perl and be able to either save that value in some kind of variable or array or potentially write it out to a file.

For the simple case, writing out to a file, I think the syntax is very close to the sed syntax. I would like to get a few recommendations, first, on a few alternative ways to write a similar expression in Perl, then how to do I/O properly.

My second question is that I have two files, both with pipe-separated data. In the first file, I want to do a large data reduction first, taking the pattern above and retaining only records containing |Y|. In the second file, I have a field containing an employee number with an A as its first character. The other, larger file contains the same data, but with a lowercase a.

The complete exercise, then, is to first reduce the first file to records containing Y in a field, surrounded by pipe symbols, and then to compare those matching records against the second file on the employee ID field, after making sure the ID is lowercased in both files.

Can anyone give me a few key snippets on this so I don't keep struggling with it? I will then apply them to my modest-sized application, which I am writing in Perl for speed and portability. I have used Perl before, but I never became an expert, and it has been years since I last used it. I am confusing myself with pieces of different syntax and making a lot of silly mistakes, so I would appreciate some sound advice to set me back on course. I am reading up using a few classics, the Perl Cookbook and Programming Perl, but both are large books and daunting to get through. Until I can digest them, I'd appreciate some pointers to accelerate my learning and, more importantly, to get a script into at least a minimally usable form ASAP. I'll get better at it once I have digested the classic resources and done more coding to regain the experience.

masinick


08-04-2008


Need to compare two files

I got the first part of my question answered. Here is a code snip for that part:

Code:

while (<MINPUT>) {
    if ( /\|Y\|/ ) {
        print MOUTPUT;
        my $line = <SINPUT>;
    }
}

On the second part, the part I am having trouble with, now that I have matched records in the first file containing |Y|, I want to compare records in the first file with the second file, taking the second field in the first file and the fourth field in the second file. When I find a match, I want to output the first, second, and fourth fields in the second file to a third output file, which will be the resulting output of the comparisons.

Any suggestions on how to do this please?

masinick


08-04-2008

Registered User

1,009, 2

Join Date: May 2008

Last Activity: 28 October 2009, 7:03 PM EDT

Location: Sydney, Australia

Posts: 1,009

Thanks Given: 0

Thanked 2 Times in 2 Posts

Have a look at the perlrun man page, specifically the -n option. This would allow you to write the above script as a shorter one-liner since perl looks after the while (<>) part for you. Probably not relevant since you intend to write a more complex script anyway, but useful to know.
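From the shell, `perl -ne 'print if /\|Y\|/' file` should behave like the sed command. For comparison, here is a rough long-hand sketch of the same filter with explicit file handling, which also shows one way to keep the matches in an array or write them to a file. The helper names `grep_flagged` and `write_lines` are made up for illustration and are not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: return every line of $path that contains the literal text |Y|.
sub grep_flagged {
    my ($path) = @_;
    open(my $in, '<', $path) or die "Cannot open $path: $!";
    my @matches = grep { /\|Y\|/ } <$in>;   # same test as sed -n '/|Y|/p'
    close($in);
    return @matches;
}

# Hypothetical helper: write a list of lines to $path, replacing any existing file.
sub write_lines {
    my ($path, @lines) = @_;
    open(my $out, '>', $path) or die "Cannot write $path: $!";
    print {$out} @lines;
    close($out) or die "Cannot close $path: $!";
    return scalar @lines;
}
```

A call such as `my @kept = grep_flagged('master.txt'); write_lines('reduced.txt', @kept);` would then do the reduction; both file names are placeholders.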

For the second part, I would load the first file into a hash indexed by the comparison field, then read through the second file, and for any record where the fourth field exists in your previously populated hash, output the wanted fields.
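The hash idea can be sketched roughly as below, working on lists of lines rather than files to keep it short. The helper names and the choice to lowercase the key (so that A123 and a123 compare equal) are assumptions for illustration; field positions are zero-based.

```perl
use strict;
use warnings;

# Hypothetical helper: index records by one pipe-separated field (0-based),
# lowercasing the key so 'A123' and 'a123' compare equal.
sub index_by_field {
    my ($field, @lines) = @_;
    my %seen;
    for my $line (@lines) {
        chomp(my $copy = $line);
        my @fields = split /\|/, $copy, -1;   # -1 keeps trailing empty fields
        next unless defined $fields[$field];
        $seen{ lc $fields[$field] } = $copy;
    }
    return %seen;
}

# Hypothetical helper: return the lines whose chosen field is a key of the index.
sub matching_lines {
    my ($index, $field, @lines) = @_;
    my @hits;
    for my $line (@lines) {
        chomp(my $copy = $line);
        my @fields = split /\|/, $copy, -1;
        next unless defined $fields[$field];
        push @hits, $copy if exists $index->{ lc $fields[$field] };
    }
    return @hits;
}
```

Each file is read once, and every lookup is a single hash access, so the cost grows with the sum of the two file sizes rather than their product.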

Annihilannic


08-04-2008


The line my $line = <SINPUT>; is not needed. I was originally going to include it to work on the second file.

I still need to work on that part.

I am thinking of using split to pick off the second field in the first file and the fourth field in the second file. I have not gotten the syntax right yet though.

Quote:

Originally Posted by masinick

I got the first part of my question answered. Here is a code snip for that part:

Code:

while (<MINPUT>) {
    if ( /\|Y\|/ ) {
        print MOUTPUT;
    }
}


masinick


08-04-2008


If you use this:

Code:

@fields=split "[|]";

It will split the current record (i.e. $_) into fields and assign them to the @fields array, whose elements you can access using $fields[0], $fields[1], etc.
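A small sketch of how split behaves on a pipe-separated record, using a made-up sample line and a hypothetical helper name. The pipe has to be escaped or put in a character class because split treats its first argument as a pattern, where a bare | means alternation.

```perl
use strict;
use warnings;

# Hypothetical helper: split one pipe-separated record into its fields.
sub record_fields {
    my ($record) = @_;
    chomp $record;                     # drop the newline so it does not end up in the last field
    return split /\|/, $record, -1;    # a limit of -1 keeps trailing empty fields
}

my @fields = record_fields("1001|a4521|Y|Concord\n");   # made-up sample record
print "second field: $fields[1]\n";    # arrays are zero-based, so index 1 is field two
print "field count: ", scalar(@fields), "\n";
```

Without the chomp, the last field would still carry the trailing newline, which is an easy way to get comparisons that never match.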

Annihilannic


08-04-2008


OK, good. Zero offset arrays. I can deal with that.

Would the comparisons then be something like this?

Assuming that I do:

Code:

@fields1=split "[|]";
@fields2=split "[|]";

Code:

if (@fields1[1] eq @fields[3])  {
   print @fields2;
}

or is the syntax something different? Is there a convenient way to take the input and put it straight into those fields, something like:

Code:

while (<MOUTPUT> =@fields1=split "[|]") {
   while (<SINPUT> = @fields2=split "[|]") {
      if (@fields1[1] eq @fields[3])  {
         print @fields2;
      }
   }
}

or does it work a different way?

masinick


08-04-2008


Use $ to refer to an individual element of an array; @ refers to the entire array.
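To illustrate the difference, here is a short sketch with made-up records. Writing @fields1[1] asks for a one-element slice; it usually still works, but perl warns about it when warnings are enabled.

```perl
use strict;
use warnings;

my @fields1 = split /\|/, "x|a4521|Y";            # made-up records
my @fields2 = split /\|/, "Smith|Jo|NH|a4521";

# $name[i] is one element (a scalar); @name is the whole array.
my $same_id = $fields1[1] eq $fields2[3] ? 1 : 0;

# @name[i, j] is a slice: a list of several elements at once.
my @picked = @fields2[0, 1, 3];

print "match\n" if $same_id;
print join('|', @picked), "\n";
```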

Those nested while loops would be quite inefficient because you would be reading the entire SINPUT file for every iteration of the outside loop. I would read in the MOUTPUT file first and load it into a hash, using something like:

Code:

while (<MOUTPUT>) { @fields = split "[|]"; $moutput{$fields[1]} = $_; }

Then process the second file checking against that %moutput hash:

Code:

while (<SINPUT>) { @fields = split "[|]"; if (exists $moutput{$fields[3]}) { print $moutput{$fields[3]}; } }

(all code untested!)

Obviously you will have a little more work to do to only print out the fields you are interested in, but you get my drift.
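As a rough idea of that extra work, the sketch below prints only the first, second, and fourth fields of each matching record from the second file. It assumes the hash was keyed by lowercased ID when it was built; the helper names and the handle arguments are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: keep fields 1, 2 and 4 (indexes 0, 1, 3) of a record.
sub wanted_fields {
    my ($record) = @_;
    chomp $record;
    my @fields = split /\|/, $record, -1;
    return join('|', @fields[0, 1, 3]);
}

# Sketch of the second pass; the hash is assumed to be keyed by lowercased ID.
sub report_matches {
    my ($in, $out, $moutput) = @_;   # open input handle, open output handle, hash ref
    my $count = 0;
    while (my $line = <$in>) {
        my @fields = split /\|/, $line;
        next unless defined $fields[3];          # skip records with fewer than four fields
        chomp(my $id = lc $fields[3]);           # lowercase, and strip a newline if field four is last
        next unless exists $moutput->{$id};
        print {$out} wanted_fields($line), "\n";
        $count++;
    }
    return $count;
}
```

Passing the handles in as arguments keeps the loop independent of how the files were opened, so the same sub can write to a file or to STDOUT.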

Annihilannic

