Full Discussion: Reg Ex question
Post 302263919 by vgersh99 on Tuesday 2nd of December 2008 04:42:26 PM
echo 'This "sentence is" a combination of "multiple words"' | nawk -f garric.awk

garric.awk:
Code:
# setcsv(str, sep) - parse CSV (MS specification) input
# str, the string to be parsed. (Most likely $0.)
# sep, the separator between the values.
#
# After a call to setcsv the parsed fields are found in $1 to $NF.
# setcsv returns 1 on success and 0 on failure.
#
# By Peter Strömberg aka PEZ.
# Based on setcsv by Adrian Davis. Modified to handle a separator
# of choice and embedded newlines. The basic approach is to take the
# burden off of the regular expression matching by replacing ambiguous
# characters with characters unlikely to be found in the input. For
# this the character "\035" is used.
#
# Note 1. Prior to calling setcsv you must set FS to a character which
#         can never be found in the input. (Consider SUBSEP.)
# Note 2. If setcsv can't find the closing double quote for the string
#         in str it will consume the next line of input by calling
#         getline and call itself until it finds the closing double
#         quote or no more input is available (considered a failure).
# Note 3. Only the "" representation of a literal quote is supported.
# Note 4. setcsv will probably misbehave if sep, used as a regular
#         expression, can match anything other than what a call to
#         index() would match.
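# FS must never occur in the input (Note 1); OFS is used when $0 is rebuilt.
BEGIN { FS=SUBSEP; OFS="|" }

{
  # Split the record on single spaces, keeping "quoted phrases" together.
  result = setcsv($0, " ")
  for(i=1;i<=NF;i++)
    printf("result[%d] = %s\n", i-1, $i)
  #print
}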

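function setcsv(str, sep, i, middle) {
  # Protect escaped quotes ("") and turn every separator into FS.
  gsub(/""/, "\035", str)
  gsub(sep, FS, str)

  # Inside each quoted string, turn FS back into the separator and
  # drop the surrounding quotes.
  while (match(str, /"[^"]*"/)) {
    middle = substr(str, RSTART+1, RLENGTH-2)
    gsub(FS, sep, middle)
    str = sprintf("%.*s%s%s", RSTART-1, str, middle,
      substr(str, RSTART+RLENGTH))
  }

  if (index(str, "\"")) {
    # Unbalanced quote: append the next input line and retry; if there is
    # no more input, close the quote, parse what is there, and return 0.
    # RT is set only by gawk; other awks fall back to RS here.
    return ((getline) > 0) ? setcsv(str (RT != "" ? RT : RS) $0, sep) : !setcsv(str "\"", sep)
  } else {
    # Restore literal quotes and let awk re-split the record on FS.
    gsub(/\035/, "\"", str)
    $0 = str

    # A field made up only of quotes has one quote too many (an empty
    # quoted field arrives here as a single quote), so drop one.
    for (i = 1; i <= NF; i++)
      if (match($i, /^"+$/))
        $i = substr($i, 2)

    # Touch $1 so that $0 is rebuilt with OFS.
    $1 = $1 ""
    return 1
  }
}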
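
If the full CSV handling (escaped "" quotes, quoted strings that span lines) is not needed, the core placeholder trick can be tried on its own. The following is only a minimal sketch, not part of the script above: split_quoted is a hypothetical helper name, it assumes a POSIX awk on the PATH, a single line of input with balanced quotes, and a plain space as the separator.

```shell
# Hypothetical helper: split one line on spaces, keeping "quoted phrases" whole.
# Assumes balanced double quotes and no SUBSEP character in the input.
split_quoted() {
  printf '%s\n' "$1" | awk '
    BEGIN { FS = SUBSEP }              # a separator that should never occur in the input
    {
      str = $0
      gsub(/ /, FS, str)               # 1. every space becomes the placeholder
      while (match(str, /"[^"]*"/)) {  # 2. find the next quoted phrase
        middle = substr(str, RSTART + 1, RLENGTH - 2)
        gsub(FS, " ", middle)          # 3. put the spaces back inside it
        str = substr(str, 1, RSTART - 1) middle substr(str, RSTART + RLENGTH)
      }
      $0 = str                         # 4. re-split the record on the placeholder
      for (i = 1; i <= NF; i++)
        printf("result[%d] = %s\n", i - 1, $i)
    }'
}

split_quoted 'This "sentence is" a combination of "multiple words"'
```

With the sample line this should give six fields, with result[1] = sentence is and result[5] = multiple words. The sketch uses substr() concatenation in place of the sprintf("%.*s...") call, which is a stylistic swap and not a change in technique.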

 
