shell script for extracting out the shortest substring from the given starting and en
Post 302141524 by drl on Friday 19th of October 2007 12:59:21 PM
Hi.

I like the solution from aigles. I don't see a perl solution yet, so here is one.

Perl's RE syntax has a special feature for shortest matches: the non-greedy quantifiers. Here is the entire script, along with diagnostic output, minimal argument processing, etc:
Code:
#!/usr/bin/perl

# @(#) p1       Demonstrate non-greedy matching perl RE syntax.

use warnings;
use strict;

my ($debug) = 1;    # diagnostic flag, left on (not consulted below)

my ($lines) = 0;

my ($usage) = "usage: $0 first last\n";
# Test with defined() so that a bound of "0" is not rejected as false.
my ($first) = shift;
my ($last)  = shift;
die $usage unless defined $first and defined $last;

my ($string);

while (<>) {
  print " Bounds on this search: $first, $last\n" unless $lines;
  $lines++;
  chomp;
  print "\n";
  print " Initial string = \"$_\"\n";
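  # Non-greedy .*? stops at the first $last after the leftmost $first.
  # The bounds are interpolated as patterns; use \Q$first\E for literal text.
  if (/($first.*?$last)/) {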
    $string = $1;
    print " Shortest substring = \"$string\"\n";
  }
  else {
    print STDERR " No substring found, continuing.\n";
  }
}

print STDERR " ( Lines read: $lines )\n";

exit(0);

Running this on your test line and a few others in file data1:
Code:
% ./p1 a d data1
 Bounds on this search: a, d

 Initial string = "abcdpqracdpqaserd"
 Shortest substring = "abcd"

 Initial string = "abc"
 No substring found, continuing.

 Initial string = "abcdddd"
 Shortest substring = "abcd"
 ( Lines read: 3 )
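
To see what the ? changes in the last line of that output, here is a minimal sketch comparing the greedy and non-greedy forms on the same input. The helper name between is made up for this illustration, and unlike p1 it quotes the bounds with \Q...\E, which assumes they are literal text rather than patterns:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the first match of $first ... $last in $text.
# A true $greedy selects .* (run to the last $last on the line);
# otherwise .*? is used (stop at the first $last after the start).
# \Q...\E quotes the bounds, so they are treated as literal text.
sub between {
    my ($text, $first, $last, $greedy) = @_;
    my $re = $greedy ? qr/(\Q$first\E.*\Q$last\E)/
                     : qr/(\Q$first\E.*?\Q$last\E)/;
    return $text =~ $re ? $1 : undef;
}

my $line = "abcdddd";
print "greedy     = ", between($line, "a", "d", 1), "\n";   # abcdddd
print "non-greedy = ", between($line, "a", "d", 0), "\n";   # abcd
```

The only difference between the two patterns is the ? after the star.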

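The heart of the match is in these characters .*? which match as few characters as possible, so the match ends at the first d after the leftmost a. Note that this gives the shortest match from the leftmost starting point, not necessarily the shortest one in the whole line: the first test line also contains "acd", which is shorter than "abcd".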
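
If the shortest candidate anywhere in the line is wanted, one possible approach is to try the non-greedy match at every starting position and keep the shortest result. The following is only a sketch under assumptions: the name shortest_between is invented here, the bounds are taken as literal text (hence \Q...\E), and it is not part of p1:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: shortest substring of $text that starts with $first
# and ends with $last, or undef if there is none.
sub shortest_between {
    my ($text, $first, $last) = @_;
    my $best;
    # The lookahead (?=...) matches without consuming any text, so the
    # /g loop retries at every position and also sees overlapping candidates.
    while ($text =~ /(?=(\Q$first\E.*?\Q$last\E))/g) {
        # Strict < keeps the leftmost candidate when lengths tie.
        $best = $1 if !defined $best or length($1) < length($best);
    }
    return $best;
}

print shortest_between("abcdpqracdpqaserd", "a", "d"), "\n";   # acd
```

On the first test line the candidates are "abcd", "acd" and "aserd", so this prints "acd" where p1 prints "abcd".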

See the man pages for:
Code:
perlre              Perl regular expressions, the rest of the story
perlreref           Perl regular expressions quick reference

for details ... cheers, drl
 
