Full Discussion: matching group of words
Post 302556746 by shamrock on Monday 19th of September 2011 11:56:21 AM
Quote:
Originally Posted by DGPickett
The sort merge builds three two-column files: file 1 key to each file 1 word, file 2 key to each file 2 word, and file 2 key to word count. Sort the first two by word and merge the sorted output to build a three-column file (file 2 key, file 1 key, word); sort that by file 2 key and merge it with the third file to see whether the number of matched words is right. In some respects this is simpler than the array solution, where you need to deal with uniqueness and with which field is the key to which array. It is robust against all file sizes and against duplicates: sort can remove duplicates, or the merge, knowing some files are not unique, can deal with them. You want to avoid the Cartesian join problem, where N records in one file match M records in the other on the key field, giving NxM output records. If that is the case, sort into flat files and use 'join' to do the walking.
I'm not sure how this solution is going to pan out, because sorting file1 and file2 still won't line up the records for a match. Could you post the script to show how it works? I am unable to visualize it.
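For anyone else trying to picture it, here is a rough guess at what such a sort-merge could look like. It is only a sketch, not DGPickett's script: it assumes both files use the "key : word word ..." layout, that keys and words contain no spaces, and the sample data and the intermediate file names (f1.words, f2.words, f2.count, pair.hits) are made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample data in the "key : word word ..." layout.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf '%s\n' 'A : red green blue' 'B : red yellow' > file1
printf '%s\n' 'X : red blue' 'Y : yellow' 'Z : black' > file2

export LC_ALL=C   # sort and join must agree on collation

# Turn "key : w1 w2 ..." into sorted, de-duplicated "word key" pairs.
explode() {
    awk -F' : ' '{ n = split($2, w, " "); for (i = 1; i <= n; i++) print w[i], $1 }' "$1" | sort -u
}

explode file1 > f1.words          # word  file1-key
explode file2 > f2.words          # word  file2-key

# Number of distinct words each file2 key requires.
awk '{ need[$2]++ } END { for (k in need) print k, need[k] }' f2.words > f2.count

# Merge on the word, then count shared words per (file1-key, file2-key) pair.
join f1.words f2.words |
    awk '{ hits[$2 " " $3]++ } END { for (p in hits) print p, hits[p] }' > pair.hits

# Keep the pairs where every word of the file2 record was found.
result=$(awk 'NR == FNR { need[$1] = $2; next }
              need[$2] == $3 { print $1 " : " $2 }' f2.count pair.hits | sort)
printf '%s\n' "$result"
```

Because each word file is de-duplicated before the join, a pair can never collect more hits than the file2 record has distinct words. This sketch prints one line per matching pair and says nothing about file1 keys that match no file2 record.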

Here's my solution, which stores file1 and file2 in arrays and matches the items of each file2 record against the items of each file1 record, incrementing a counter. A file2 record matches when the number of matches equals the number of items in that record.
Code:
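awk 'BEGIN {
   FS = " : "                            # set before the first record is read, so line 1 is split correctly
}
{
   if (FILENAME == "file1") x[$1] = $2   # file1: key -> word list
   if (FILENAME == "file2") y[$1] = $2   # file2: key -> word list
} END {
   for (i in x) {
      for (j in y) {
         n = split(y[j], a, " ")         # words in this file2 record
         s = 0
         for (p = 1; p <= n; p++)
            s += gsub(a[p], a[p], x[i])  # count occurrences of each word (as a regex) in the file1 record
         if (n == s) u[i] = u[i] ? u[i]" "j : i FS j
      }
      if (!u[i]) u[i] = i" : None"       # no file2 record matched this file1 key
   }
   for (i in u) print u[i]
}' file1 file2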
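One caveat with counting via gsub(): it treats each word as a regular expression and counts every hit, so "red" is also counted inside "redwood", and a word that occurs twice is counted twice. If whole-word matching is what you need, a variant along the following lines may work better. This is only a sketch: the sample files are made up, the array names are mine, and it assumes file1 is not empty (otherwise NR == FNR would mistake file2 for the first file).

```shell
#!/bin/sh
# Hypothetical sample data; "redwood" is there to show the substring case.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf '%s\n' 'A : red green blue' 'B : redwood yellow' > file1
printf '%s\n' 'X : red blue' 'Y : red' > file2

result=$(awk -F' : ' '
NR == FNR { have[$1] = $2; order1[++n1] = $1; next }   # first file: key -> word list
{ want[$1] = $2; order2[++n2] = $1 }                   # second file: key -> word list
END {
    for (a = 1; a <= n1; a++) {
        i = order1[a]
        # Build a set of the words in this file1 record.
        split("", set)
        m = split(have[i], w, " ")
        for (p = 1; p <= m; p++) set[w[p]] = 1
        line = ""
        for (b = 1; b <= n2; b++) {
            j = order2[b]
            n = split(want[j], q, " ")
            found = 0
            # Exact set membership, so each word counts once and only as a whole word.
            for (p = 1; p <= n; p++) if (q[p] in set) found++
            if (n > 0 && found == n) line = line (line == "" ? "" : " ") j
        }
        print i " : " (line == "" ? "None" : line)
    }
}' file1 file2)
printf '%s\n' "$result"
```

On this sample it prints "A : X Y" and "B : None". Keeping the keys in order1/order2 also makes the output follow the input order, which "for (i in x)" does not guarantee.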

All times are GMT -4.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.