[Perl] Split lines into array - variable line items - variable no of lines.


 
# 1  
Old 09-27-2011

Hi,

I have the following lines that I would like to see in an array for easy comparisons and printing:

Example 1:

Code:
field1,field2,field3,field4,field5
value1,value2,value3,value4,value5

Example 2:

Code:
field1,field3,field4,field2,field5,field6,field7
value1,value3,value4,value2,value5,value6,value7
value1,value3,value4,value2,value5,value6,value7
value1,value3,value4,value2,value5,value6,value7

So, the number of lines, the number of fields and the field order can differ.

As output I would like to see:

Code:
field2,field3,field5
value2,value3,value5

Code:
field2,field3,field5
value2,value3,value5
value2,value3,value5
value2,value3,value5

The fields and values to be printed are always present, regardless of the number of fields and the field order.
field1 and value1 always come first.
The background is that this has to run on different systems, and those systems deliver different numbers of fields in a different order.

I started with something like this, but got stuck due to a lack of Perl knowledge.
With a more static input it would have been a bit easier.

Code:
my @LineItems;
my %FieldValue;
my $Line;
my $i;
my $NumItems;
my $LineCount;

open(GETLINES, "<", "/tmp/lines.txt") or die "Can't open /tmp/lines.txt: $!";
$LineCount = 0;
while ( $Line = <GETLINES> ) {
  if ( $Line =~ /^Field1,/ ) {
    $LineCount++;
    @LineItems = split (/,/, $Line);
    $NumItems = @LineItems;
    for ( $i = 1; $i < $NumItems; $i++ ) {
       $FieldValue{$i} = $LineItems[$i];
    }
  }
  if ( $Line =~ /^Value1,/ ) {
    $LineCount++;
    @LineItems = split (/,/, $Line);
    $NumItems = @LineItems;
    for ( $i = 1; $i < $NumItems; $i++ ) {
      $FieldValue{$i} = $LineItems[$i];
    }
  }
}

I would appreciate any kind of assistance.

ejdv

Last edited by Scott; 09-27-2011 at 10:54 AM.. Reason: Please use code tags for code
# 2  
Old 09-27-2011
Quote:
if ( $Line =~ /^Field1,/ ) {
Regexes are case sensitive and your field headers are lower case; try the i
modifier if you are unsure.

I would be inclined to read the next line within the first block and then assign it to a hash keyed on field name.
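
A minimal sketch of what "a hash keyed on field name" might look like, assuming one header line followed by one value line. The helper name line_to_hash is made up for illustration and is not part of the original post:

```perl
use strict;
use warnings;

# Hypothetical helper: pair a header line with one value line and
# return a hash keyed on field name.
sub line_to_hash {
    my ($header_line, $value_line) = @_;
    chomp($header_line, $value_line);
    my @fields = split /,/, $header_line;
    my @values = split /,/, $value_line;
    my %row;
    @row{@fields} = @values;    # hash slice: field name => value
    return %row;
}

my %row = line_to_hash("field1,field3,field4,field2,field5\n",
                       "value1,value3,value4,value2,value5\n");

# The field order in the input no longer matters; ask for fields by name.
print join(",", @row{qw(field2 field3 field5)}), "\n";   # value2,value3,value5
```

Because the values are looked up by name, the same code should cope with a different number of fields or a different field order.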
# 3  
Old 09-27-2011
@Skrynesaver,

Thanks for the quick reply.
Point taken about the i modifier.
In the example it should of course have been field1 instead of Field1.

By "a hash keyed on fieldname", do you mean something like this?

$FieldValue{ 'field2' } = 'value2'

Reading the first line builds the hash and then the next lines fill it.
For example 1 I can imagine where this is going, but not for example 2.
I am not into hashes of hashes yet :-)

Got some more hints ?
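
For example 2, one possible approach is to build one hash per value line and print only the wanted fields. This is only a sketch: the helper name select_fields and the in-memory @lines array are assumptions for illustration, and the real input would come from the file.

```perl
use strict;
use warnings;

# Hypothetical helper: given all lines (header first) and a list of wanted
# field names, return the output lines with only those fields, in that order.
sub select_fields {
    my ($lines, $wanted) = @_;
    my ($header, @data) = @$lines;
    my @fields = split /,/, $header;

    my @out = (join ",", @$wanted);
    for my $line (@data) {
        my %row;
        @row{@fields} = split /,/, $line;   # one hash per value line
        push @out, join ",", @row{@$wanted};
    }
    return @out;
}

my @lines = (
    "field1,field3,field4,field2,field5,field6,field7",
    "value1,value3,value4,value2,value5,value6,value7",
    "value1,value3,value4,value2,value5,value6,value7",
);
print "$_\n" for select_fields(\@lines, [qw(field2 field3 field5)]);
```

The number of value lines does not matter here, since each line is handled on its own.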
# 4  
Old 09-27-2011
Since the data in your sample file is repetitive, I've used a different sample data file for this problem.

Let's say the data file looks like this:

Code:
$ cat lines.txt
Microsoft,IBM,Oracle,Apple
Windows 95,DB2,Oracle,MacBook
MS Excel,Fortran,Siebel,iPod
XBox,ATM,MySQL,iPad
Zune,Deep Blue,PeopleSoft,Pixar

The first line holds the keys; in this case, the company names.
From the second line onwards, the values are laid out in columns.
A single key (Company) may have multiple values (Products).

For example, the first column has the key "Microsoft" and the values as the list ("Windows 95", "MS Excel", "XBox", "Zune"). The case for column 2 is similar, and so on.

We could create a nested data structure to store all this information.
At the top level, we have an array, say, @all_comp_products. Each element of this array is an array reference. This array reference has the Company Name as the first element, and the second element is yet another array reference to the list of products of that company.

Thus, the first element of @all_comp_products looks like this:

Code:
$all_comp_products[0] = [ "Microsoft", [ "Windows 95", "MS Excel", "XBox", "Zune" ] ];

The second element looks like this:

Code:
$all_comp_products[1] = [ "IBM", [ "DB2", "Fortran", "ATM", "Deep Blue" ] ];

and so on.
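
To make the nesting concrete, here is a small sketch of how individual pieces of such a structure can be read back. Only the first two elements are filled in by hand; the variable names $name, @products and $third are illustrative.

```perl
use strict;
use warnings;

my @all_comp_products = (
    [ "Microsoft", [ "Windows 95", "MS Excel", "XBox", "Zune" ] ],
    [ "IBM",       [ "DB2", "Fortran", "ATM", "Deep Blue" ] ],
);

my $name     = $all_comp_products[1][0];        # company name of element 1
my @products = @{ $all_comp_products[1][1] };   # dereference the inner list
my $third    = $all_comp_products[1][1][2];     # a single product

print "$name: ", join(", ", @products), "\n";   # IBM: DB2, Fortran, ATM, Deep Blue
print "Third product: $third\n";                # Third product: ATM
```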

The Perl program looks like this:

Code:
$ cat lines.pl
#!/usr/bin/perl
use strict;
use warnings;

# ##################################################################################################
#
#  For the data file that looks like this:
#
#  Microsoft,IBM,Oracle,Apple
#  Windows 95,DB2,Oracle,MacBook
#  MS Excel,Fortran,Siebel,iPod
#  XBox,ATM,MySQL,iPad
#  Zune,Deep Blue,PeopleSoft,Pixar
#
#  this Perl program creates a nested data structure @all_comp_products that looks like this:
#
#  $all_comp_products[0] = [ "Microsoft", [ "Windows 95", "MS Excel", "XBox",  "Zune"       ] ];
#  $all_comp_products[1] = [ "IBM",       [ "DB2",        "Fortran",  "ATM",   "Deep Blue"  ] ];
#  $all_comp_products[2] = [ "Oracle",    [ "Oracle",     "Siebel",   "MySQL", "PeopleSoft" ] ];
#  $all_comp_products[3] = [ "Apple",     [ "MacBook",    "iPod",     "iPad",  "Pixar"      ] ];
#
# ##################################################################################################

my $file = "lines.txt";
my $company;
my $product;
my @all_comp_products;
my $idx = 0;

open (FH, "<", $file) or die "Can't open $file for reading: $!";
while (<FH>) {
  chomp;
  if ($. == 1) {
    # The first line holds the keys (company names)
    foreach $company (split /,/) {
       push @all_comp_products, [ $company ];
    }
  } else {
    # Every other line holds one product per column
    foreach $product (split /,/) {
       push @{${$all_comp_products[$idx]}[1]}, $product;
       $idx++;
    }
    $idx = 0;
  }
}
close (FH) or die "Can't close $file: $!";

# Now, we'll iterate through the nested data structure and display the data
foreach my $item (@all_comp_products) {
  $company = $$item[0];
  print "Company  : $company\n";
  print "Products :\n";
  foreach my $prod (@{$$item[1]}) {
    print "           $prod\n";
  }
  print "=" x 40,"\n";
}

And here's a test run:

Code:
$ perl lines.pl
Company  : Microsoft
Products :
           Windows 95
           MS Excel
           XBox
           Zune
========================================
Company  : IBM
Products :
           DB2
           Fortran
           ATM
           Deep Blue
========================================
Company  : Oracle
Products :
           Oracle
           Siebel
           MySQL
           PeopleSoft
========================================
Company  : Apple
Products :
           MacBook
           iPod
           iPad
           Pixar
========================================

tyler_durden
# 5  
Old 09-28-2011
@tyler_durden,

Thanks a lot for this great example.
It should enable me to solve my 'problem'.

---------- Post updated at 11:07 AM ---------- Previous update was at 08:52 AM ----------

@tyler_durden,

I have an additional question.

What if I only want to print Apple and IBM and in that exact order ?
Or for example Oracle, Apple and IBM in this exact order.

I tried to add this:

Quote:
my %OrderedList = (
    'Apple' => "1",
    'IBM' => "1");

:
:

foreach my $item (@all_comp_products) {
    $company = $$item[0];
    if ( $OrderedList{$company} ) {
        print "Company : $company\n";
        print "Products :\n";
        foreach my $prod (@{$$item[1]}) {
            print " $prod\n";
        }
    }
}
That results in this:

Quote:
Company : IBM
Products :
DB2
Fortran
ATM
Deep Blue
Company : Apple
Products :
MacBook
iPod
iPad
Pixar
But I need it to be in the order as specified by %OrderedList.

Could you please assist me with this final step ?
# 6  
Old 09-28-2011
A hash has no defined order; it is not an array. However, if you use ordinal values when defining your list, you could do the following (remember that any non-zero number is true).
Code:
my %OrderedList = (
    'Apple' => "1",
    'IBM' => "2",
    'Unwanted' => "0",
);
for my $company (sort { $OrderedList{$a} <=> $OrderedList{$b} } keys %OrderedList) {
    if ($OrderedList{$company}) {
        print "$company is in position $OrderedList{$company}\n";
    }
}


Hope that helps
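
The same idea can be wrapped in a small helper that returns just the wanted names in rank order. This is a sketch only: the name ordered_names is hypothetical, and it assumes that a rank of 0 means "do not print".

```perl
use strict;
use warnings;

# Hypothetical helper: return the names whose rank is greater than zero,
# sorted by ascending rank.
sub ordered_names {
    my ($rank) = @_;
    return sort { $rank->{$a} <=> $rank->{$b} }
           grep { $rank->{$_} > 0 } keys %$rank;
}

my %OrderedList = ( Oracle => 1, Apple => 2, IBM => 3, Microsoft => 0 );
print join(",", ordered_names(\%OrderedList)), "\n";   # Oracle,Apple,IBM
```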
# 7  
Old 09-28-2011
Thanks, but I fail to see how to use that in printing the desired output. :-(
This is the @all_comp_products:

Code:
  DB<2> x @all_comp_products
0  ARRAY(0x22904)
   0  'Microsoft'
   1  ARRAY(0x2912cc)
      0  'Windows 95'
      1  'MS Excel'
      2  'XBox'
      3  'Zune'
1  ARRAY(0x204a94)
   0  'IBM'
   1  ARRAY(0x27f494)
      0  'DB2'
      1  'Fortran'
      2  'ATM'
      3  'Deep Blue'
2  ARRAY(0x204e30)
   0  'Oracle'
   1  ARRAY(0x27f4b8)
      0  'Oracle'
      1  'Siebel'
      2  'MySQL'
      3  'PeopleSoft'
3  ARRAY(0x291248)
   0  'Apple'
   1  ARRAY(0x299bcc)
      0  'MacBook'
      1  'iPod'
      2  'iPad'
      3  'Pixar'

I do not know how to access the array items once I have the desired $company.
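
One way this could be approached is to index the structure by company name first and then walk a plain array holding the desired order. This is a sketch, not something posted in the thread: the names %products_for, @wanted and @report are made up here, and the data is filled in by hand instead of being read from lines.txt.

```perl
use strict;
use warnings;

my @all_comp_products = (
    [ "Microsoft", [ "Windows 95", "MS Excel", "XBox", "Zune" ] ],
    [ "IBM",       [ "DB2", "Fortran", "ATM", "Deep Blue" ] ],
    [ "Oracle",    [ "Oracle", "Siebel", "MySQL", "PeopleSoft" ] ],
    [ "Apple",     [ "MacBook", "iPod", "iPad", "Pixar" ] ],
);

# Index the structure by company name: name => reference to product list.
my %products_for;
$products_for{ $_->[0] } = $_->[1] for @all_comp_products;

# An array, unlike a hash, keeps the order it was written in.
my @wanted = ("Apple", "IBM");

my @report;
for my $company (@wanted) {
    next unless exists $products_for{$company};   # skip unknown names
    push @report, "Company  : $company", "Products :";
    push @report, "           $_" for @{ $products_for{$company} };
}
print "$_\n" for @report;
```

Changing @wanted to ("Oracle", "Apple", "IBM") should print those three companies in that order.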