Full Discussion: How to substitute?
Top Forums Shell Programming and Scripting How to substitute? Post 302268034 by vanitham on Sunday 14th of December 2008 11:12:18 PM
Hi,

Thanks a lot!!!


Quote:
Originally Posted by drl
Hi.

Here is a start that shows ParseWords:
Code:
#!/usr/bin/perl

# @(#) p1       Demonstrate parsing with quotes, operators AND, OR.
# http://search.cpan.org/~chorny/Text-ParseWords-3.27/ParseWords.pm

use warnings;
use strict;
use Text::ParseWords;

# Set to 0 to suppress the diagnostic output.
my $debug = 1;

my ( $i, $last, $line, @tokens );

while (<>) {
  chomp;
  @tokens = quotewords( '\s+', 0, $_ );
  print " input is :@tokens:\n" if $debug;
  $last = undef;
  foreach $i (@tokens) {
    if ( not defined($last) ) {
      $last = $i;
      print qin($i) . " ";
    }
    elsif ( $i eq "OR" or $i eq "AND" ) {
      $last = $i;
      print qin($i) . " ";
    }
    else {
      if ( $last ne "OR" and $last ne "AND" ) {
        print "AND " . qin($i) . " ";
      }
      else {
        print qin($i) . " ";
      }
      $last = $i;
    }
  }
  print "\n";
}

print STDERR " ( Lines read: $. )\n" if $debug;

# qin - quote if necessary.

sub qin {
  my ($phrase) = $_[0];
  if ( $phrase =~ / / ) {
    return '"' . $phrase . '"';
  }
  else {
    return $phrase;
  }
}

exit(0);

Producing (on your data in file data1):
Code:
% ./p1 data1
 input is :apple bannana:
apple AND bannana
 input is :apple bannana AND chickko:
apple AND bannana AND chickko
 input is :milk shake OR Graphes orange:
"milk shake" OR Graphes AND orange
 ( Lines read: 3 )

See perldoc Text::ParseWords on your system, or get the module from CPAN at the URL noted in the script. It handles the quoted strings and returns all the tokens as a list.

If the output is not what you desire, feel free to modify or adapt the code as necessary ... cheers, drl
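
To see what quotewords does with a quoted phrase, here is a small sketch using only the core Text::ParseWords module. The sample line is an assumption modelled on the third line of the output above (the original data1 file is not shown), and the helper name tokens is made up for illustration. The second argument (keep) controls whether the quote characters survive in the returned tokens.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Text::ParseWords;

# tokens - hypothetical helper: split a line on whitespace, honouring quotes.
# $keep = 0 strips the quote characters, $keep = 1 retains them.
sub tokens {
    my ( $keep, $line ) = @_;
    return quotewords( '\s+', $keep, $line );
}

# Assumed sample input, modelled on the third output line in this thread.
my $line = '"milk shake" OR Graphes orange';

print join( '|', tokens( 0, $line ) ), "\n";    # milk shake|OR|Graphes|orange
print join( '|', tokens( 1, $line ) ), "\n";    # "milk shake"|OR|Graphes|orange
```

With keep set to 0 the phrase comes back as a single token without its quotes, which is why the script above needs qin to put them back on output.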
 

Text::Shellwords::Cursor(3pm)				User Contributed Perl Documentation			     Text::Shellwords::Cursor(3pm)

NAME
       Text::Shellwords::Cursor - Parse a string into tokens

SYNOPSIS
        use Text::Shellwords::Cursor;
        my $parser = Text::Shellwords::Cursor->new();
        my $str = 'ab cdef "ghi" j"k"l "';
        my ($tok1) = $parser->parse_line($str);
          $tok1 = ['ab', 'cdef', 'ghi', 'j', 'k"l ']
        my ($tok2, $tokno, $tokoff) = $parser->parse_line($str, cursorpos => 6);
          as above, but $tokno=1, $tokoff=3  (under the 'f')

DESCRIPTION
       This module is very similar to Text::Shellwords and Text::ParseWords.
       However, it has one very significant difference: it keeps track of a
       character position in the line it's parsing. For instance, if you pass
       it ("zq fmgb", cursorpos=>6), it would return (['zq', 'fmgb'], 1, 3).
       The cursorpos parameter tells where in the input string the cursor
       resides (just before the 'b'), and the result tells you that the cursor
       was on token 1 ('fmgb'), character 3 ('b'). This is very useful when
       computing command-line completions involving quoting, escaping, and
       tokenizing characters (like '(' or '=').

       A few helper utilities are included as well. You can escape a string to
       ensure that parsing it will produce the original string (parse_escape).
       You can also reassemble the tokens with a visually pleasing amount of
       whitespace between them (join_line).

       This module started out as an integral part of Term::GDBUI using code
       loosely based on Text::ParseWords. However, it is now basically a
       ground-up reimplementation. It was split out of Term::GDBUI for
       version 0.8.

METHODS
       new Creates a new parser. Takes named arguments on the command line.

           keep_quotes
               Normally all unescaped, unnecessary quote marks are stripped.
               If you specify "keep_quotes=>1", however, they are preserved.
               This is useful if you need to know whether the string was
               quoted or not (string constants) or what type of quotes was
               around it (affecting variable interpolation, for instance).

           token_chars
               This argument specifies the characters that should be
               considered tokens all by themselves. For instance, if I pass
               token_chars=>'=', then 'ab=123' would be parsed to
               ('ab', '=', '123'). Without token_chars, 'ab=123' remains a
               single string.

               NOTE: you cannot change token_chars after the constructor has
               been called! The regexps that use it are compiled once (m//o).

               Also, until the GNU Readline library can accept "=[]," without
               diving into an endless loop, we will not tell history expansion
               to use token_chars (it uses " \t\n()<>;&|" by default).

           debug
               Turns on rather copious debugging to try to show what the
               parser is thinking at every step.

           space_none
           space_before
           space_after
               These variables affect how whitespace in the line is normalized
               when it is reassembled into a string. See the join_line
               routine.

           error
               This is a reference to a routine that should be called to
               display a parse error. The routine takes two arguments: a
               reference to the parser, and the error message to display as a
               string.

       parsebail(msg)
           If the parsel routine or any of its subroutines runs into a fatal
           error, they call parsebail to present a very descriptive
           diagnostic.

       parsel
           This is the heinous routine that actually does the parsing. You
           should never need to call it directly. Call parse_line instead.

       parse_line(line, named args)
           This is the entry point to this module's parsing functionality. It
           converts a line into tokens, respecting quoted text, escaped
           characters, etc. It also keeps track of a cursor position on the
           input text, returning the token number and offset within the token
           where that position can be found in the output.

           This routine originally bore some resemblance to Text::ParseWords.
           It has changed almost completely, however, to support keeping track
           of the cursor position. It also has nicer failure modes, modular
           quoting, token characters (see token_chars in "new"), etc. This
           routine now does much more.

           Arguments:

           line
               This is a string containing the command-line to parse.

           This routine also accepts the following named parameters:

           cursorpos
               This is the character position in the line to keep track of.
               Pass undef (by not specifying it) or the empty string to have
               the line processed with cursorpos ignored.

               Note that passing undef is not the same as passing some random
               number and ignoring the result! For instance, if you pass 0 and
               the line begins with whitespace, you'll get a 0-length token at
               the beginning of the line to represent the cursor in the middle
               of the whitespace. This allows command completion to work even
               when the cursor is not near any tokens. If you pass undef, all
               whitespace at the beginning and end of the line will be trimmed
               as you would expect.

               If it is ambiguous whether the cursor should belong to the
               previous token or to the following one (i.e. if it's between
               two quoted strings, say "a""b" or a token_char), it always
               gravitates to the previous token. This makes more sense when
               completing.

           fixclosequote
               Sometimes you want to try to recover from a missing close quote
               (for instance, when calculating completions), but usually you
               want a missing close quote to be a fatal error.
               fixclosequote=>1 will implicitly insert the correct quote if
               it's missing. fixclosequote=>0 is the default.

           messages
               parse_line is capable of printing very informative error
               messages. However, sometimes you don't care enough to print a
               message (like when calculating completions). Messages are
               printed by default, so pass messages=>0 to turn them off.

           This function returns a reference to an array containing three
           items:

           tokens
               The tokens that the line was separated into (ref to an array of
               strings).

           tokno
               The number of the token (index into the previous array) that
               contains cursorpos.

           tokoff
               The character offset into tokno of cursorpos.

           If the cursor is at the end of the token, tokoff will point to 1
           character past the last character in tokno, a nonexistent
           character. If the cursor is between tokens (surrounded by
           whitespace), a zero-length token will be created for it.

       parse_escape(lines)
           Escapes characters that would be otherwise interpreted by the
           parser. Will accept either a single string or an arrayref of
           strings (which will be modified in-place).

       join_line(tokens)
           This routine does a somewhat intelligent job of joining tokens back
           into a command line. If token_chars (see "new") is empty (the
           default), then it just escapes backslashes and quotes, and joins
           the tokens with spaces.

           However, if token_chars is nonempty, it tries to insert a visually
           pleasing amount of space between the tokens. For instance, rather
           than 'a ( b , c )', it tries to produce 'a (b, c)'. It won't
           reformat any tokens that aren't found in $self->{token_chars}, of
           course.

           To change the formatting, you can redefine the variables
           $self->{space_none}, $self->{space_before}, and
           $self->{space_after}. Each variable is a string containing all
           characters that should not be surrounded by whitespace, should have
           whitespace before, and should have whitespace after, respectively.
           Any character found in token_chars, but not in any of these space_
           variables, will have space placed both before and after.

BUGS
       None known.

LICENSE
       Copyright (c) 2003-2011 Scott Bronson, all rights reserved. This
       program is covered by the MIT license.

AUTHOR
       Scott Bronson <bronson@rinspin.com>

perl v5.14.2                      2012-02-03                      Text::Shellwords::Cursor(3pm)
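
The cursor bookkeeping described in the man page can be illustrated with a stripped-down sketch in core Perl. This is not the module's implementation: locate_cursor is a hypothetical helper that splits on whitespace only, ignores quoting, escaping and token_chars, and leaves tokno and tokoff undefined when the cursor is not on or directly after a token (the real module is documented to create a zero-length token in that case).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# locate_cursor - hypothetical helper, NOT part of Text::Shellwords::Cursor.
# Splits $line on whitespace and reports which token holds the character at
# $cursorpos, plus the cursor's offset inside that token.
sub locate_cursor {
    my ( $line, $cursorpos ) = @_;
    my ( @tokens, $tokno, $tokoff );
    while ( $line =~ /(\S+)/g ) {
        my $start = $-[1];    # offset of the token's first character
        my $end   = $+[1];    # offset one past the token's last character
        push @tokens, $1;

        # The first matching token wins, so a cursor sitting right at the
        # end of a token is attributed to that (previous) token.
        if ( !defined $tokno && $cursorpos >= $start && $cursorpos <= $end ) {
            ( $tokno, $tokoff ) = ( $#tokens, $cursorpos - $start );
        }
    }
    return ( \@tokens, $tokno, $tokoff );
}

# The example from the DESCRIPTION section: cursor on the 'b' of 'fmgb'.
my ( $tok, $tokno, $tokoff ) = locate_cursor( 'zq fmgb', 6 );
print "tokens=@$tok tokno=$tokno tokoff=$tokoff\n";    # tokens=zq fmgb tokno=1 tokoff=3
```

For whitespace-only input like this, the result matches the (['zq', 'fmgb'], 1, 3) example given in the man page; anything involving quotes needs the real module.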
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.