shell script for extracting out the shortest substring from the given starting and en

# 8  
Old 10-19-2007
Hi, radoulov.
Quote:
Originally Posted by radoulov
Am I missing something, or did the OP want acd (not abcd) from abcdpqracdpqaserd with a and d?
I assumed the OP missed something, namely the b key. If not, he can explain how that should be obtained, give another example, etc. ... cheers, drl
# 9  
Old 10-19-2007
I assumed he wanted the shortest match.
# 10  
Old 10-19-2007
Hi.
Quote:
Originally Posted by radoulov
I assumed he wanted the shortest match.
If it were true that we could arbitrarily omit characters, then the shortest match would always be "ad", and we wouldn't need to work so hard.

Do you see any other algorithmic way to get "acd" from "abcdpqracdpqaserd"? -- or did I miss something this time? ... cheers, drl
# 11  
Old 10-19-2007
abcdpqracdpqaserd

acd is the shortest match of a[^d]*d
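
A minimal sketch of that observation, assuming a grep that supports -o (GNU and BSD do; it is not POSIX). The helper names list_matches and shortest_line are hypothetical and hard-wired to the keys a and d. Because grep -o reports non-overlapping matches only, this would miss a shorter match nested inside a longer one (for example ad inside aad), so treat it as an illustration rather than a full solution.

```shell
# Hypothetical helpers, hard-wired to the thread's keys a and d.

# Print each non-overlapping match of a[^d]*d on its own line.
# grep -o is a widespread extension (GNU, BSD), not POSIX.
list_matches() {
  printf '%s\n' "$1" | grep -o 'a[^d]*d'
}

# Keep the shortest input line; on a tie the first one wins.
shortest_line() {
  awk 'NR == 1 || length($0) < length(best) { best = $0 }
       END { if (NR) print best }'
}

list_matches abcdpqracdpqaserd                  # abcd, acd, aserd (one per line)
list_matches abcdpqracdpqaserd | shortest_line  # acd
```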
# 12  
Old 10-19-2007
Hi.
Quote:
Originally Posted by radoulov
abcdpqracdpqaserd

acd is the shortest match of a[^d]*d
Good eye; got it, thanks. I'll need to scan the entire string, as you did ... cheers, drl
# 13  
Old 10-19-2007
Hi.

Modified perl code to scan entire string:
Code:
#!/usr/bin/perl

# @(#) p1       Demonstrate non-greedy matching perl RE syntax.

use warnings;
use strict;

my ($debug);
$debug = 1;
$debug = 0;

my ($lines) = 0;

my ($usage) = "usage: $0 first last\n";
my ($first) = shift || die "$usage";
my ($last)  = shift || die "$usage";

my ($string);
my ($input);
my ($winner);
my ($min) = 1.0E300;

while (<>) {
  print " Bounds on this search: $first, $last\n" unless $lines;
  $lines++;
  chomp;
  print "\n";
  print " Initial string = \"$_\"\n";
  $input = $_;
  pos $input = 0;
  $min    = 1.0E300;
  $winner = 0;

  # See Perl Best Practices, p 250 ff for details on loops like
  # this.
  while ( pos $input < length $input ) {
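    if ( $input =~ m{ \G ($first.*?$last) }gcxms ) {
      $string = $1;

      # Resume one character past the start of this match, so that a
      # shorter match nested inside it (e.g. "ad" in "aad") is seen.
      pos($input) = $-[0] + 1;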
      print " matched string :$string:\n" if $debug;
      if ( length $string < $min ) {
        $winner = $string;
        $min    = length $winner;
      }
    }
    else {    # move pointer ahead
      $input =~ m/ \G (.) /gcxms;
    }
    print " so far, winner :$winner:, min :$min:\n" if $debug;
  }
  if ($winner) {
    print " Shortest substring = \"$winner\"\n";
  }
  else {
    print STDERR " No substring found, continuing.\n";
  }
}

print STDERR " ( Lines read: $lines )\n";

exit(0);

Producing:
Code:
% ./p1 a d data1
 Bounds on this search: a, d

 Initial string = "abcdpqracdpqaserd"
 Shortest substring = "acd"

 Initial string = "abc"
 No substring found, continuing.

 Initial string = "ad"
 Shortest substring = "ad"

 Initial string = "abcdabcadabcefdadabcd"
 Shortest substring = "ad"

 Initial string = "abc--------------------de"
 Shortest substring = "abc--------------------d"

 Initial string = "abc0123456789defgh   d"
 Shortest substring = "abc0123456789d"
 ( Lines read: 6 )

A tip of the hat to radoulov for noting the discrepancy ... cheers, drl
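
For comparison, here is a rough sketch of the same line-by-line scan in shell and awk. It swaps the regex for literal string search with substr() and index(), so the keys are taken as plain text rather than patterns; that is a different technique from the Perl above, not a translation of it. The function name shortest_between is hypothetical, and an empty line is printed when nothing is found.

```shell
# shortest_between FIRST LAST: for each line on stdin, print the shortest
# substring that begins with FIRST and ends with LAST (empty line if none).
# Hypothetical helper; keys are treated as literal text, not regexes.
shortest_between() {
  awk -v first="$1" -v last="$2" '{
    best = ""; min = length($0) + 1    # longer than any possible match
    for (i = 1; i <= length($0); i++) {
      if (substr($0, i, length(first)) != first) continue
      rest = substr($0, i + length(first))
      j = index(rest, last)            # nearest LAST after this FIRST
      if (j == 0) break                # no LAST remains to the right
      n = length(first) + j - 1 + length(last)
      if (n < min) { min = n; best = substr($0, i, n) }
    }
    print best
  }'
}

printf '%s\n' abcdpqracdpqaserd | shortest_between a d   # acd
```

Trying every start position is what lets a nested candidate such as ad inside aad win over the longer aad.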
# 14  
Old 10-20-2007
An Awk solution (not only GNU):

Code:
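awk -v s="abcdpqracdpqaserd" -v start="a" -v end="d" 'BEGIN{
	# start key, then anything but the end key, then the end key
	re=start"[^"end"]*"end
	# one more than any possible match, so a whole-string match still wins
	min=length(s)+1
	while (match(s,re)){
		if (RLENGTH<min)
			{min=RLENGTH;shortest=substr(s,RSTART,RLENGTH)}
		# resume one character past the start of this match,
		# so a shorter match nested inside it is not skipped
		s=substr(s,RSTART+1)
	}
	print shortest
}'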


Last edited by radoulov; 10-20-2007 at 03:23 PM.. Reason: ... some corrections
