Awk: split and gensub query Post: 303037765

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

Query In awk

Is it possible to use a pattern as RS in awk? For example, please go through the statement: " Account Serial Number: 88888888 TT00X000000XXXXXXXXXXXXX SS00X000000XX.000,XXXXXXXXXXXXXXXXXX0000XXXXXXX0000000000 WW00X0000000XX000000000000MMMMMMM MMMMMMM0000AA11110000000000000000000000000...

2. UNIX for Dummies Questions & Answers

Split a file with no pattern -- Split, Csplit, Awk

I have gone through all the threads in the forum and tested out different things. I am trying to split a 3GB file into multiple files. Some files are even larger than this. For example: split -l 3000000 filename.txt This is very slow and it splits the file with 3 million records in each...

3. UNIX for Dummies Questions & Answers

some query in awk

Hi, I want to assign the size of a file to a variable. For example, we have the following files: ls -ltr -rw-rw-r-- 1 dsadmin dstage 34 Oct 29 12:14 some.txt -rw-rw-r-- 1 dsadmin dstage 0 Oct 29 14:52 eg.txt -rwxrwxr-x 1 dsadmin dstage 1453 Oct...

4. Shell Programming and Scripting

gawk and gensub

Hi, $ echo "Hellooo" | gawk '{print gensub(/o{3}/, "z", 1)}' doesn't return "Hellz" as expected while: $ echo "Hellooo" | awk '{print gensub(/o+/, "z", 1)}' produces "Hellz" correctly. Are the {m,n} quantifiers not supported in gensub? I know that sub or gsub could do the job. It's just an...

5. UNIX for Dummies Questions & Answers

gensub and array with awk

Hi Unix.com! I need some help with something I don't understand. input: 111|2 Y Z blue. 333|4 W X blue.; 5 Y Z red. 666|7 W X red.; 8 Y Z blue. 999|10 U V red.; 11 W X blue.; 12 Y Z red. From $2, I would like to remove the sub-strings containing "blue" (and the...

6. Shell Programming and Scripting

awk to split one field and print the last two fields within the split part.

Hello; I have a file that consists of 4 columns separated by tabs. The problem is the third field. Some of them are very long but can be split by the vertical bar "|". Also, some of them do not contain the string "UniProt", but I can ignore that for the moment and sort the file afterwards. Here is...

7. Shell Programming and Scripting

awk query

Hi, I have a sample file in the following format. 000013560240|000013560240|001|P|155|99396|0||SS00325665| 000013560240|000013560240|002|P|17|99000|0||SS00325665| 000013560240|000013560240|002|F|-17|99000|0|R|SS00325665| 000013560240|000013560240|003|P|20|82270|0||SS00325665|...

8. Shell Programming and Scripting

Gawk gensub, match capital words and lowercase words

Hi I have strings like these : Vengeance mitt Men Vengeance gloves Women Quatro Windstopper Etip gloves Quatro Windstopper Etip gloves Girls Thermobite hooded jacket Thermobite Triclimate snow jacket Boys Thermobite Triclimate snow jacket and I would like to get the lower case words at...

9. Shell Programming and Scripting

awk split and awk calculation in the same command

I am trying to run the awk below. When I split the input and then run another awk to perform a calculation using that split as the input, there are no issues. When I try to combine them, the output is not correct. Is the split not working, or did I do it wrong? Thank you :). input ...

10. Programming

Need sql query to string split and normalize data

Hello gurus, I have data in one of the Oracle tables as below: Column 1 Column 2 1 NY,NJ,CA 2 US,UK, 3 AS,EU,NA fyi, Column 2 above has data delimited with a comma as shown. I need a SQL query to produce the below output in two columns...
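
For discussion 6 above (split one field and print the last two parts of the split), a minimal sketch with awk's split() might look like the following. The sample rows, the `last_two` function name, and the choice to print column 1 next to the two parts are assumptions made for illustration; the original thread's data is not shown here.

```shell
#!/bin/sh
# Hypothetical input: 4 tab-separated columns, where column 3 holds
# "|"-separated pieces. The rows below are invented for illustration.
last_two() {
    awk -F '\t' '{
        # split() returns the number of pieces it produced; "[|]" is a
        # bracket expression, so the bar is matched literally.
        n = split($3, part, "[|]")
        if (n >= 2)
            print $1, part[n-1], part[n]
        else
            print $1, $3    # nothing to split: keep the field whole
    }'
}

result=$(printf 'id1\tx\ta|b|c|d\ty\nid2\tx\tsolo\ty\n' | last_two)
printf '%s\n' "$result"
```

Because split() returns the piece count, the last two pieces can be addressed as part[n-1] and part[n] without looping over the array.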

regexp

regexp(n)						       Tcl Built-In Commands							 regexp(n)

__________________________________________________________________________________________________________________________________________________

NAME

       regexp - Match a regular expression against a string

SYNOPSIS

       regexp ?switches? exp string ?matchVar? ?subMatchVar subMatchVar ...?
_________________________________________________________________

DESCRIPTION

       Determines  whether  the  regular expression exp matches part or all of string and returns 1 if it does, 0 if it doesn't, unless -inline is
       specified (see below).  (Regular expression matching is described in the re_syntax reference page.)

       If additional arguments are specified after string then they are treated as the names of variables in which to return information about
       which part(s) of string matched exp.  MatchVar will be set to the range of string that matched all of exp.  The first subMatchVar will
       contain the characters in string that matched the leftmost parenthesized subexpression within exp, the next subMatchVar will contain the
       characters that matched the next parenthesized subexpression to the right in exp, and so on.

       If the initial arguments to regexp start with - then they are treated as switches.  The following switches are currently supported:

       -about	      Instead  of  attempting to match the regular expression, returns a list containing information about the regular expression.
		      The first element of the list is a subexpression count.  The second element is a list of property names that describe  vari-
		      ous attributes of the regular expression. This switch is primarily intended for debugging purposes.

       -expanded      Enables use of the expanded regular expression syntax where whitespace and comments are ignored.  This is the same as
                      specifying the (?x) embedded option (see METASYNTAX, below).

       -indices       Changes what is stored in the subMatchVars.  Instead of storing the matching characters from string, each variable will
                      contain a list of two decimal strings giving the indices in string of the first and last characters in the matching range
                      of characters.

       -line	      Enables newline-sensitive matching.  By default, newline is a completely ordinary character with no special  meaning.   With
		      this  flag,  `[^' bracket expressions and `.' never match newline, `^' matches an empty string after any newline in addition
		      to its normal function, and `$' matches an empty string before any newline in addition to its normal function.  This flag is
		      equivalent to specifying both -linestop and -lineanchor, or the (?n) embedded option (see METASYNTAX, below).

       -linestop      Changes the behavior of `[^' bracket expressions and `.' so that they stop at newlines.  This is the same as specifying the
                      (?p) embedded option (see METASYNTAX, below).

       -lineanchor    Changes the behavior of `^' and `$' (the ``anchors'') so they match the beginning and end of a line respectively.  This is
                      the same as specifying the (?w) embedded option (see METASYNTAX, below).

       -nocase        Causes upper-case characters in string to be treated as lower case during the matching process.

       -all           Causes the regular expression to be matched as many times as possible in the string, returning the total number of matches
                      found.  If this is specified with match variables, they will contain information for the last match only.

       -inline        Causes the command to return, as a list, the data that would otherwise be placed in match variables.  When using -inline,
                      match variables may not be specified.  If used with -all, the list will be concatenated at each iteration, such that a flat
                      list is always returned.  For each match iteration, the command will append the overall match data, plus one element for
                      each subexpression in the regular expression.  Examples are:
                          regexp -inline -- {\w(\w)} " inlined "
                       => {in n}
                          regexp -all -inline -- {\w(\w)} " inlined "
                       => {in n li i ne e}

       -start index   Specifies a character index offset into the string to start matching the regular expression at.  When using this switch, `^'
                      will not match the beginning of the line, and \A will still match the start of the string at index.  If -indices is
                      specified, the indices will be indexed starting from the absolute beginning of the input string.  index will be constrained
                      to the bounds of the input string.

       --	      Marks the end of switches.  The argument following this one will be treated as exp even if it starts with a -.

       If there are more subMatchVar's than parenthesized subexpressions within exp, or if a particular subexpression in exp doesn't match the
       string (e.g. because it was in a portion of the expression that wasn't matched), then the corresponding subMatchVar will be set to
       ``-1 -1'' if -indices has been specified or to an empty string otherwise.

SEE ALSO

       re_syntax(n), regsub(n)

KEYWORDS

       match, regular expression, string

Tcl									8.3								 regexp(n)
