

Help with parsing file with combination of pattern


 
# 1  
Old 10-01-2014

I have a file1 like

Code:
    prt1|als28.1 prt3|als53.1 prt2|als550.1 prt1|bls9.2 prt2|als7.2 prt2|bls0.2
    prt2|als872.1 prt1|bls871.1    prt2|als6.2    prt4|als22.1 prt2|bls43.2

I want to create a file2 from this file by listing all possible pairings of patterns (prt) on each line, taking prt1 as the reference pattern. The number of patterns can differ from line to line in file1. For the first line of file1 there are several pairs, taking each prt1 in turn as the reference, for example

Code:
prt1|als28.1 prt3|als53.1; prt1|als28.1 prt2|als550.1; prt1|als28.1 prt2|als7.2; prt1|als28.1 prt2|bls0.2; prt1|bls9.2 prt3|als53.1; prt1|bls9.2 prt2|als550.1; prt1|bls9.2 prt2|als7.2; prt1|bls9.2 prt2|bls0.2

A combination of two prt1 entries, such as

Code:
prt1|als28.1 prt1|bls9.2

should be ignored. So the output for the first line in file2 (the result) will be

Code:
    prt1|als28.1 prt3|als53.1
    prt1|als28.1 prt2|als550.1
    prt1|als28.1 prt2|als7.2
    prt1|als28.1 prt2|bls0.2
    prt1|bls9.2 prt3|als53.1
    prt1|bls9.2 prt2|als550.1
    prt1|bls9.2 prt2|als7.2
    prt1|bls9.2 prt2|bls0.2

Likewise, the output for the second line will be

Code:
    prt1|bls871.1 prt2|als872.1
    prt1|bls871.1 prt2|als6.2
    prt1|bls871.1 prt4|als22.1
    prt1|bls871.1 prt2|bls43.2

I can't figure out exactly how to do this. Any suggestions or programs would be helpful. This is the one I wrote:

Code:
    #!/usr/bin/perl
    use strict;
    use warnings;
    # Usage: script.pl file1 prt1 prt2
    open my $fh, '<', $ARGV[0] or die "\n can not open file $ARGV[0]\n";
    my $pattern1 = $ARGV[1];
    my $otherpattern = $ARGV[2];
    while (my $line = <$fh>)
    {
        # Only the first match of each pattern is printed per line,
        # so this does not yet produce every combination.
        if ($line =~ /($pattern1\S+)/i) { print $1; }
        if ($line =~ /($otherpattern\S+)/i)
        {
            print "\t".$1."\n";
        }
        elsif ($line =~ m/\bNo pairs found\b/)
        {
            print "\t".$line;
            print "\t"."No pairs Found"."\n";
        }
    }
    close $fh;


Last edited by Scrutinizer; 10-01-2014 at 04:41 PM.. Reason: CODE tags
# 2  
Old 10-01-2014
How about
Code:
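awk     '       {c++                                    # one output file per input line
                 for (i=1; i<=NF; i++) if ($i ~ /prt1/) A[$i]   # reference fields
                                         else           B[$i]   # all other fields
                 for (i in A) for (j in B) print i, j > ("file" c)
                 close("file" c)                        # avoid running out of open files
                 delete A; delete B
                }
        ' file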
file1:
prt1|bls9.2 prt2|bls0.2
prt1|bls9.2 prt3|als53.1
prt1|bls9.2 prt2|als550.1
prt1|bls9.2 prt2|als7.2
prt1|als28.1 prt2|bls0.2
prt1|als28.1 prt3|als53.1
prt1|als28.1 prt2|als550.1
prt1|als28.1 prt2|als7.2
file2:
prt1|bls871.1 prt4|als22.1
prt1|bls871.1 prt2|als872.1
prt1|bls871.1 prt2|bls43.2
prt1|bls871.1 prt2|als6.2
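
In case a single result file in input order is preferred, here is a minimal sketch of a variant. It is not part of the command above: the function name `pair_refs`, the awk variable `ref`, and writing to standard output are all assumptions made for illustration. It keeps the fields in indexed arrays, so the pairs come out in the order the fields appear, and it compares the part before the `|` exactly, so a reference of `prt1` would not also pick up something like `prt10`.

```shell
# pair_refs FILE [REF]: print every "reference other" pair, line by line.
# Hypothetical helper; REF defaults to prt1.
pair_refs() {
    awk -v ref="${2:-prt1}" '
    {
        na = nb = 0                      # reset the counters for each line
        for (i = 1; i <= NF; i++) {
            split($i, part, "|")         # part[1] is the text before the pipe
            if (part[1] == ref) A[++na] = $i
            else                B[++nb] = $i
        }
        # Indexed loops keep the input order, unlike "for (i in A)".
        for (i = 1; i <= na; i++)
            for (j = 1; j <= nb; j++)
                print A[i], B[j]
    }' "$1"
}
```

It could be called as `pair_refs file1 prt1 > file2`. Lines that contain no reference field simply produce no output.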

Does the output order matter?
This User Gave Thanks to RudiC For This Post:
# 3  
Old 10-01-2014
Quote:
Originally Posted by RudiC
Does the output order matter?
No, the order does not matter.
# 4  
Old 10-02-2014
If you are still interested in a Perl solution:

Code:
#!/usr/bin/perl

use strict;
use warnings;

my $filename = shift or die "Missing filename to operate on";
my $re = shift or die "Missing regex to match";

open my $fh, '<', $filename or die "Could not open $filename: $!\n";

while (my $line = <$fh>) {
    chomp $line;
    print "Line #$.\n";
    # split ' ' skips leading whitespace, so indented lines give no empty field
    my @fields = split ' ', $line;
    # fields matching the reference regex, e.g. prt1
    my @patterns = grep { /$re/ } @fields;

    # @NF holds every field that is not a reference field
    my %patterns = map { $_ => 1 } @patterns;
    my @NF = grep { !defined $patterns{$_} } @fields;

    # pair each reference field with each remaining field
    for my $pattern (@patterns) {
        for my $field (@NF) {
            print "$pattern $field\n";
        }
    }
    print "\n";
}
close $fh;

Result:

Code:
→ perl prog.pl filename prt1
Line #1
prt1|als28.1 prt3|als53.1
prt1|als28.1 prt2|als550.1
prt1|als28.1 prt2|als7.2
prt1|als28.1 prt2|bls0.2
prt1|bls9.2 prt3|als53.1
prt1|bls9.2 prt2|als550.1
prt1|bls9.2 prt2|als7.2
prt1|bls9.2 prt2|bls0.2

Line #2
prt1|bls871.1 prt2|als872.1
prt1|bls871.1 prt2|als6.2
prt1|bls871.1 prt4|als22.1
prt1|bls871.1 prt2|bls43.2
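
Since the order does not matter, the results of the different solutions in this thread can only be compared after sorting (awk's `for (i in A)` makes no promise about order). A small sketch of such a check follows. The function name `same_pairs` and its two file arguments are hypothetical, and it assumes any "Line #" headers and blank lines have already been stripped from the files being compared.

```shell
# same_pairs FILE_A FILE_B: succeed if both files hold the same lines,
# regardless of order. Hypothetical helper for checking results.
same_pairs() {
    a=$(sort "$1") || return 2           # return 2 if a file cannot be read
    b=$(sort "$2") || return 2
    [ "$a" = "$b" ]
}
```

For example, `same_pairs file1 expected1 && echo same` would print `same` only when the two files contain identical pairs.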

