Full Discussion: Awk: split and gensub query
Post 303037765 by RudiC on Tuesday 13th of August 2019 06:31:59 AM
You are right: the pos variable is split into the var array. Then, for every input line ($0), gensub (a gawk-specific function) replaces the characters at the positions listed in var (no matter what they are!) with the pipe symbol, as described in man gawk:

Quote:
gensub(regexp, replacement, how [, target])
Search the target string target for matches of the regular expression regexp. If how is a string beginning with ‘g’ or ‘G’ (short for “global”), then replace all matches of regexp with replacement. Otherwise, treat how as a number indicating which match of regexp to replace. Treat numeric values less than one as if they were one. If no target is supplied, use $0. Return the modified string as the result of the function. The original target string is not changed.
