The UNIX and Linux Forums  

#1
10-18-2007
Registered User
Join Date: Oct 2007
Posts: 11
shell script for extracting out the shortest substring from the given starting and en

Hi all,
I urgently need help writing a shell script that extracts and prints the shortest substring of a given string, where the first and last characters of that substring are supplied by the user.
For example:
if str="abcdpqracdpqaserd"
and the user gives 'a' and 'd' as the first and last characters of the substring (i.e. as command-line arguments), the script should extract acd as the shortest substring.
Please give a simple solution.
#2
10-19-2007
Registered User
Join Date: Sep 2006
Posts: 1,434
Code:
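str="abcpqracdpqaserd"
startch="a"
endch="d"
awk -v str="$str" -v st="$startch" -v end="$endch" 'BEGIN{ 
# positions of the first start character and the first end character
s=index(str,st)
e=index(str,end)
# the third argument of substr is a length, not an end position
print substr(str,s,e-s+1)
}'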
output:
Code:
# ./test.sh
abcpqracd
#3
10-19-2007
aigles
Registered User
Join Date: Apr 2004
Location: Bordeaux, France
Posts: 1,198
Another way with sed (first and last can't be special chars):
Code:
str="abcdpqracdpqaserd"
first=a
last=d
substr=$(echo "$str"| sed -n "s/^[^$first]*\($first[^$last]*$last\).*/\1/p")
echo "$substr"
Code:
$ sh -x substr.sh
+ str=abcdpqracdpqaserd
+ first=a
+ last=d
++ echo abcdpqracdpqaserd
++ sed 's/^[^a]*\(a[^d]*d\).*/\1/p'
+ substr=abcd
+ echo abcd
abcd
$
Jean-Pierre.

Last edited by aigles; 10-19-2007 at 04:38 AM.
#4
10-19-2007
Registered User
Join Date: Jun 2007
Posts: 350
awk

Hi,

It really took me a lot of effort. I have tested it on many cases and they all pass. I hope this is what you are after.

input:
Code:
abcdpqracdpqaserd
abcdpqracdpqaserd
abcdpqracdpqaserd
output (start:a end:d):
Code:
acd
acd
acd
output (start:a end:p):
Code:
acdp
acdp
acdp
output (start:a end:r):
Code:
abcdpqr
abcdpqr
abcdpqr
code:
Code:
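read a    # first character of the substring
read b    # last character of the substring
# wrap every run from $a to the next $b in | markers (the input file is named "a")
sed -e "s/$a[^$b]*$b/|&|/g" a > temp_a
sed 's/^|//' temp_a > temp_b

nawk -v st="$a" -v ed="$b" 'BEGIN{
FS="|"
}
{
tmp=""
for(i=1;i<=NF;i++)
{
	# only fields that begin with the start character are candidates
	if(index($i,st)==1)
	{
		if(tmp=="" || length($i)<length(tmp))
			tmp=$i
	}
}
print tmp
}
' temp_b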
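
For comparison, the same "list the candidates, keep the shortest" idea can be squeezed into one pipeline. This is only a sketch under some assumptions: `shortest_run` is a made-up name, `grep -o` is a widely available extension rather than POSIX, and the two characters are assumed to be plain (non-special) and different from each other. As a variation on the script above, the bracket expression excludes the start character as well as the end character, so a run such as aser is not hidden inside a longer one.

```shell
# shortest_run is a hypothetical helper name, not something from this thread.
# Usage: shortest_run STRING FIRST LAST
# Assumes FIRST and LAST are single plain characters and FIRST != LAST.
# grep -o lists every run that starts with FIRST, ends with LAST and
# contains neither character in between; awk keeps the shortest line.
shortest_run() {
    printf '%s\n' "$1" |
        grep -o "$2[^$2$3]*$3" |
        awk 'NR == 1 || length($0) < length(best) { best = $0 }
             END { if (NR) print best }'
}

shortest_run abcdpqracdpqaserd a d   # prints acd
```

With a and r on the same string this should give aser rather than abcdpqr, because the extra character in the bracket expression stops a candidate from swallowing a later start character.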
#5
10-19-2007
radoulov
addict
Join Date: Jan 2007
Location: Milan, Italy/Varna, Bulgaria
Posts: 1,391
With GNU Awk:
Code:
awk 'NF>1&&$0=(FS $NF RT){
	if(length<min){
		min=length;rec=$0}
	}END{
print rec
}' FS="$start" RS="$end" min=9^9 filename
Code:
$ cat file
abcdpqracdpqaserd
$ start=a
$ end=d
$ awk 'NF>1&&$0=(FS $NF RT){
if(length<min){
min=length;rec=$0}
}END{
print rec
}' FS="$start" RS="$end" min=9^9 file
acd
$ start=a
$ end=p
$ awk 'NF>1&&$0=(FS $NF RT){
if(length<min){
min=length;rec=$0}
}END{
print rec
}' FS="$start" RS="$end" min=9^9 file
acdp
$ start=a
$ end=r
$ awk 'NF>1&&$0=(FS $NF RT){
if(length<min){
min=length;rec=$0}
}END{
print rec
}' FS="$start" RS="$end" min=9^9 file
aser
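
RT is specific to GNU Awk, so here is a rough portable take on the same idea for anyone without gawk: split each line on the end character, and in every piece keep only the text from the last start character onward. Treat it as a sketch: `shortest_awk` and the variable names are invented for illustration, and it assumes two plain single characters that differ from each other.

```shell
# shortest_awk is a hypothetical name; it reads lines from standard input.
# Usage: shortest_awk FIRST LAST < file
# Assumes FIRST and LAST are single plain characters and FIRST != LAST.
shortest_awk() {
    awk -v first="$1" -v last="$2" '{
        n = split($0, piece, last)      # text between LAST characters
        best = ""
        for (i = 1; i < n; i++) {       # the final piece has no LAST after it
            p = piece[i]
            k = 0                       # position of the final FIRST in p
            while ((j = index(substr(p, k + 1), first)) > 0)
                k += j
            if (k == 0)
                continue                # no FIRST in this piece
            cand = substr(p, k) last
            if (best == "" || length(cand) < length(best))
                best = cand
        }
        if (best != "")
            print best
    }'
}

printf '%s\n' abcdpqracdpqaserd | shortest_awk a d   # prints acd
```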
#6
10-19-2007
drl
Registered User
Join Date: Apr 2007
Location: Saint Paul, MN USA / BSD, CentOS, Debian, OS X, Solaris
Posts: 481
Hi.

I like the solution from aigles. I don't see one in perl yet.

The perl RE syntax has a special feature for the shortest match: non-greedy quantifiers. Here is the entire code, along with diagnostic code, minimal argument processing, etc.:
Code:
#!/usr/bin/perl

# @(#) p1       Demonstrate non-greedy matching perl RE syntax.

use warnings;
use strict;

my ($debug);
$debug = 0;
$debug = 1;

my ($lines) = 0;

my ($usage) = "usage: $0 first last\n";
my ($first) = shift || die "$usage";
my ($last)  = shift || die "$usage";

my ($string);

while (<>) {
  print " Bounds on this search: $first, $last\n" unless $lines;
  $lines++;
  chomp;
  print "\n";
  print " Initial string = \"$_\"\n";
  if (/($first.*?$last)/) {
    $string = $1;
    print " Shortest substring = \"$string\"\n";
  }
  else {
    print STDERR " No substring found, continuing.\n";
  }
}

print STDERR " ( Lines read: $lines )\n";

exit(0);
Running this on your test line and a few others in file data1:
Code:
% ./p1 a d data1
 Bounds on this search: a, d

 Initial string = "abcdpqracdpqaserd"
 Shortest substring = "abcd"

 Initial string = "abc"
 No substring found, continuing.

 Initial string = "abcdddd"
 Shortest substring = "abcd"
 ( Lines read: 3 )
The heart of the match is these characters: .*?

See the man pages for:
Code:
perlre              Perl regular expressions, the rest of the story
perlreref           Perl regular expressions quick reference
for details ... cheers, drl
#7
10-19-2007
radoulov
addict
Join Date: Jan 2007
Location: Milan, Italy/Varna, Bulgaria
Posts: 1,391
Am I missing something, or did the OP want acd (not abcd) from abcdpqracdpqaserd with a and d?
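
To make the difference concrete: a leftmost match stops at abcd, while finding acd means checking every start character. Below is a minimal sketch in plain POSIX shell with no external tools. The name `shortest_between` and its variables are made up for illustration, and the variables are global because POSIX sh has no `local`. For each occurrence of the start string it pairs it with the nearest following end string and remembers the shortest pairing.

```shell
# shortest_between is a hypothetical helper, not code from this thread.
# Usage: shortest_between STRING FIRST LAST
shortest_between() {
    [ -n "$2" ] && [ -n "$3" ] || return 2   # empty bounds would loop forever
    rest=$1 first=$2 last=$3 best=
    while :; do
        # stop when no further start string is left
        case $rest in *"$first"*) ;; *) break ;; esac
        rest=${rest#*"$first"}               # text after the next FIRST
        # stop when no end string follows it
        case $rest in *"$last"*) ;; *) break ;; esac
        cand=$first${rest%%"$last"*}$last    # from this FIRST to the nearest LAST
        if [ -z "$best" ] || [ "${#cand}" -lt "${#best}" ]; then
            best=$cand
        fi
    done
    [ -n "$best" ] && printf '%s\n' "$best"
}

shortest_between abcdpqracdpqaserd a d   # prints acd
```

Because the patterns in the parameter expansions are quoted, the two arguments are taken literally, so regex-special characters should not need escaping here. The function returns a non-zero status when no substring is found.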
Tags
regex, regular expressions

All times are GMT -7.