The UNIX and Linux Forums  
04-27-2008
era (Forum Advisor)

Different registrars use different output formats. So unless you are querying a very restricted set of domains (for example, domains all registered by one person, or otherwise all registered with the same registrar or a small set of registrars), this may turn out to be more complex than you thought.

Perhaps it would be useful as a first step to split the entries into separate files at each [Querying ... line? Try the csplit command for that. Then you can write a parser for each of the formats you find in there.
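
As a rough sketch of the csplit step, assuming GNU csplit (the -z option and the '{*}' repeat count are GNU extensions). The sample data, the server names, and the rec. prefix are all made up for illustration:

```shell
# Hypothetical sample input; server names and field values are invented.
workdir=$(mktemp -d)
cat > "$workdir/whois.txt" <<'EOF'
[Querying whois.example.net]
OrgName: Example Org
NetRange: 192.0.2.0 - 192.0.2.255
[Querying whois.example.org]
inetnum: 198.51.100.0 - 198.51.100.255
descr: Example Network
EOF

# Split before every line that starts with "[Querying".
# -s: do not print byte counts
# -z: drop the empty piece that would precede the first match (GNU)
# -f: prefix for the output files
# '{*}': repeat the pattern until the input is exhausted (GNU)
csplit -s -z -f "$workdir/rec." "$workdir/whois.txt" '/^\[Querying/' '{*}'

ls "$workdir"
```

This should leave one rec.NN file per [Querying line, each starting with that line, so each file can be fed to whichever parser fits its format.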

How do you know when to stop? Often a record will include hierarchical information (especially for the ARIN information, which is what your ABCE.TSD example looks like) in which the later lines are more specific than the earlier ones. Then you often want the later lines, not the earlier ones. (But this depends on what you need this for, of course.)

Anyway, here's an attempt at implementing your current spec. This simply picks out the first occurrence of each wanted field after the Querying line:

Code:
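perl -ne 'if (/^\[Querying/) {
  # New record: print the header and reset the list of fields still wanted.
  print; @wanted = qw(OrgName NetRange inetnum descr owner Country);
  $wanted = &wanted(@wanted);
}
# Build a regex matching any of the given field names at the start of a line.
sub wanted {
  return "^(" . join ("|", map { quotemeta $_ } @_) . "):";
}
if ($wanted && $_  =~ m/$wanted/i) {
  print;
  # Drop the field just found. Compare case-insensitively, because /i lets
  # $1 differ in case from the name stored in @wanted.
  @wanted = grep { lc $_ ne lc $1 } @wanted;
  $wanted = @wanted ? &wanted(@wanted) : "";
}' file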
This came out a little more monstrous than I'd like it to be, but maybe you can use it as a starting point.

(In retrospect, maybe it would have been better to use a hash to keep track of which values are already captured, and not capture if the hash says we already have the one we are looking at. Push the captured ones to an array if preserving order is important.)

Last edited by era; 04-27-2008 at 07:53 AM. Reason: Add /i flag to make matching ignore case