Sponsored Content
Full Discussion: Awk: split and gensub query
Top Forums UNIX for Beginners Questions & Answers Awk: split and gensub query Post 303037765 by RudiC on Tuesday 13th of August 2019 06:31:59 AM
Old 08-13-2019
You are right - the pos variable is split into the var array. Then, in all input lines ($0), gensub (a specific gawk function) replaces the characters (no matter what they are!) at the positions in var by the pipe symbol, as said in man gawk:

Quote:
gensub(regexp, replacement, how [, target])
Search the target string target for matches of the regular expression regexp. If how is a string beginning with ‘g' or ‘G' (short for “global”¯), then replace all matches of regexp with replacement. Otherwise, treat how as a number indicating which match of regexp to replace. Treat numeric values less than one as if they were one. If no target is supplied, use $0. Return the modified string as the result of the function. The original target string is not changed.
This User Gave Thanks to RudiC For This Post:
 

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

Query In awk

Is it possible to have a pattern as RS in awk. For Example pl. go through the statement; " Account Serial Number: 88888888 TT00X000000XXXXXXXXXXXXX SS00X000000XX.000,XXXXXXXXXXXXXXXXXX0000XXXXXXX0000000000 WW00X0000000XX000000000000MMMMMMM MMMMMMM0000AA11110000000000000000000000000... (1 Reply)
Discussion started by: raguramtgr
1 Replies

2. UNIX for Dummies Questions & Answers

Split a file with no pattern -- Split, Csplit, Awk

I have gone through all the threads in the forum and tested out different things. I am trying to split a 3GB file into multiple files. Some files are even larger than this. For example: split -l 3000000 filename.txt This is very slow and it splits the file with 3 million records in each... (10 Replies)
Discussion started by: madhunk
10 Replies

3. UNIX for Dummies Questions & Answers

some query in awk

Hi , I want to assign a value to variable which will have size of the file that is we have following files for eg: ls -ltr -rw-rw-r-- 1 dsadmin dstage 34 Oct 29 12:14 some.txt -rw-rw-r-- 1 dsadmin dstage 0 Oct 29 14:52 eg.txt -rwxrwxr-x 1 dsadmin dstage 1453 Oct... (2 Replies)
Discussion started by: Amey Joshi
2 Replies

4. Shell Programming and Scripting

gawk and gensub

Hi, $ echo "Hellooo" | gawk '{print gensub(/o{3}/, "z", 1)}' doesn't return "Hellz" as expected while: $ echo "Hellooo" | awk '{print gensub(/o+/, "z", 1)}' produces "Hellz" correctly. Are the {m,n} quantifiers not supported in gensub? I know that sub or gsub could do the job. It's just an... (2 Replies)
Discussion started by: ripat
2 Replies

5. UNIX for Dummies Questions & Answers

gensub and arraywith awk

Hi Unix.com ! I would need some help for something I don't understand :confused: input: 111|2 Y Z blue. 333|4 W X blue.; 5 Y Z red. 666|7 W X red.; 8 Y Z blue. 999|10 U V red.; 11 W X blue.; 12 Y Z red. From $2, I would like to remove the sub-strings containing "blue" (and the... (4 Replies)
Discussion started by: beca123456
4 Replies

6. Shell Programming and Scripting

awk to split one field and print the last two fields within the split part.

Hello; I have a file consists of 4 columns separated by tab. The problem is the third fields. Some of the them are very long but can be split by the vertical bar "|". Also some of them do not contain the string "UniProt", but I could ignore it at this moment, and sort the file afterwards. Here is... (5 Replies)
Discussion started by: yifangt
5 Replies

7. Shell Programming and Scripting

awk query

Hi, I have a sample file in the following format. 000013560240|000013560240|001|P|155|99396|0||SS00325665| 000013560240|000013560240|002|P|17|99000|0||SS00325665| 000013560240|000013560240|002|F|-17|99000|0|R|SS00325665| 000013560240|000013560240|003|P|20|82270|0||SS00325665|... (3 Replies)
Discussion started by: nua7
3 Replies

8. Shell Programming and Scripting

Gawk gensub, match capital words and lowercase words

Hi I have strings like these : Vengeance mitt Men Vengeance gloves Women Quatro Windstopper Etip gloves Quatro Windstopper Etip gloves Girls Thermobite hooded jacket Thermobite Triclimate snow jacket Boys Thermobite Triclimate snow jacket and I would like to get the lower case words at... (2 Replies)
Discussion started by: louisJ
2 Replies

9. Shell Programming and Scripting

awk split and awk calculation in the same command

I am trying to run the awk below. My question is when I split the input, then run anotherawk to perform a calculation using that splitas the input there are no issues. When I try to combine them the output is not correct, is the split not working or did I do it wrong? Thank you :). input ... (8 Replies)
Discussion started by: cmccabe
8 Replies

10. Programming

Need sql query to string split and normalize data

Hello gurus, I have data in one of the oracle tables as as below: Column 1 Column 2 1 NY,NJ,CA 2 US,UK, 3 AS,EU,NA fyi, Column 2 above has data delimited with a comma as shown. I need a sql query the produce the below output in two columns... (5 Replies)
Discussion started by: calredd
5 Replies
regexp(n)						       Tcl Built-In Commands							 regexp(n)

__________________________________________________________________________________________________________________________________________________

NAME
regexp - Match a regular expression against a string SYNOPSIS
regexp ?switches? exp string ?matchVar? ?subMatchVar subMatchVar ...? _________________________________________________________________ DESCRIPTION
Determines whether the regular expression exp matches part or all of string and returns 1 if it does, 0 if it doesn't, unless -inline is specified (see below). (Regular expression matching is described in the re_syntax reference page.) If additional arguments are specified after string then they are treated as the names of variables in which to return information about which part(s) of string matched exp. MatchVar will be set to the range of string that matched all of exp. The first subMatchVar will con- tain the characters in string that matched the leftmost parenthesized subexpression within exp, the next subMatchVar will contain the char- acters that matched the next parenthesized subexpression to the right in exp, and so on. If the initial arguments to regexp start with - then they are treated as switches. The following switches are currently supported: -about Instead of attempting to match the regular expression, returns a list containing information about the regular expression. The first element of the list is a subexpression count. The second element is a list of property names that describe vari- ous attributes of the regular expression. This switch is primarily intended for debugging purposes. -expanded Enables use of the expanded regular expression syntax where whitespace and comments are ignored. This is the same as speci- fying the (?x) embedded option (see METASYNTAX, below). -indices Changes what is stored in the subMatchVars. Instead of storing the matching characters from string, each variable will con- tain a list of two decimal strings giving the indices in string of the first and last characters in the matching range of characters. -line Enables newline-sensitive matching. By default, newline is a completely ordinary character with no special meaning. With this flag, `[^' bracket expressions and `.' never match newline, `^' matches an empty string after any newline in addition to its normal function, and `$' matches an empty string before any newline in addition to its normal function. This flag is equivalent to specifying both -linestop and -lineanchor, or the (?n) embedded option (see METASYNTAX, below). -linestop Changes the behavior of `[^' bracket expressions and `.' so that they stop at newlines. This is the same as specifying the (?p) embedded option (see METASYNTAX, below). -lineanchor Changes the behavior of `^' and `$' (the ``anchors'') so they match the beginning and end of a line respectively. This is the same as specifying the (?w) embedded option (see METASYNTAX, below). -nocase Causes upper-case characters in string to be treated as lower case during the matching process. | -all | Causes the regular expression to be matched as many times as possible in the string, returning the total number of matches | found. If this is specified with match variables, they will continue information for the last match only. | -inline | Causes the command to return, as a list, the data that would otherwise be placed in match variables. When using -inline, | match variables may not be specified. If used with -all, the list will be concatenated at each iteration, such that a flat | list is always returned. For each match iteration, the command will append the overall match data, plus one element for | each subexpression in the regular expression. Examples are: | regexp -inline -- {w(w)} " inlined " | => {in n} | regexp -all -inline -- {w(w)} " inlined " | => {in n li i ne e} | -start index | Specifies a character index offset into the string to start matching the regular expression at. When using this switch, `^' | will not match the beginning of the line, and A will still match the start of the string at index. If -indices is speci- | fied, the indices will be indexed starting from the absolute beginning of the input string. index will be constrained to | the bounds of the input string. -- Marks the end of switches. The argument following this one will be treated as exp even if it starts with a -. If there are more subMatchVar's than parenthesized subexpressions within exp, or if a particular subexpression in exp doesn't match the string (e.g. because it was in a portion of the expression that wasn't matched), then the corresponding subMatchVar will be set to ``-1 -1'' if -indices has been specified or to an empty string otherwise. SEE ALSO
re_syntax(n), regsub(n) KEYWORDS
match, regular expression, string Tcl 8.3 regexp(n)
All times are GMT -4. The time now is 05:33 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy