Filter uniq field values (non-substring)
Post 302900799 by alister on Thursday 8th of May 2014 07:35:54 PM
Never mind me. I didn't register the "next".

Regards,
Alister

---------- Post updated at 07:35 PM ---------- Previous update was at 06:59 PM ----------

Quote:
Originally Posted by vgersh99
see simplified version - with no deletes - just next-ing...
You are right to correct me; not every line is added. However, if, as in the original problem, substrings can precede their superstrings, then your suggestion is inadequate.

Consider:
Code:
1 abcd    idx01    ijklm
2 abc    idx03    klm
3 abcd    idx05    jkl
4 cdef    idx06    ijklm
5 efgh    idx07    abcd
6 efg    idx09    abc
7 efx    idx11    abcd
8 fgh    idx12    bcd
9 fefx  blah  zabcdz

If, as in the original data sample, substrings can precede their superstrings, line 7 should be excluded because both its $2 and $4 are substrings of the corresponding fields of line 9. Your code won't catch that, because line 9 has not been seen yet when line 7 is tested.
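
To make the ordering issue concrete, here is a rough sketch in Python (not awk, and not vgersh99's actual code). It assumes the rule is "drop a line when both its $2 and $4 are substrings of the $2 and $4 of some other line", which is my reading of the problem and may not be what yifangt intends. The function names are made up for illustration. `filter_earlier_only` compares each line only against lines already read; `filter_all_pairs` compares against every line, before or after.

```python
# Sample data from the post; the leading number is field $1.
DATA = """\
1 abcd    idx01    ijklm
2 abc    idx03    klm
3 abcd    idx05    jkl
4 cdef    idx06    ijklm
5 efgh    idx07    abcd
6 efg    idx09    abc
7 efx    idx11    abcd
8 fgh    idx12    bcd
9 fefx  blah  zabcdz
"""


def parse(text):
    """Return (line, $2, $4) for every line that has at least four fields."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 4:
            rows.append((line, fields[1], fields[3]))
    return rows


def covered(a, b):
    """True if a's $2 and $4 are both substrings of b's $2 and $4."""
    return a[1] in b[1] and a[2] in b[2]


def filter_earlier_only(rows):
    """Single pass: a line is tested only against the lines read before it."""
    kept = []
    for i, row in enumerate(rows):
        if not any(covered(row, prev) for prev in rows[:i]):
            kept.append(row[0])
    return kept


def filter_all_pairs(rows):
    """Test each line against every other line, regardless of order."""
    kept = []
    for i, row in enumerate(rows):
        dropped = False
        for j, other in enumerate(rows):
            if i == j or not covered(row, other):
                continue
            # Identical ($2, $4) pairs cover each other; assumption: keep
            # the first occurrence and drop the later ones.
            if row[1:] != other[1:] or j < i:
                dropped = True
                break
        if not dropped:
            kept.append(row[0])
    return kept


if __name__ == "__main__":
    rows = parse(DATA)
    print([line.split()[0] for line in filter_earlier_only(rows)])
    # ['1', '4', '5', '7', '9']  (line 7 slips through)
    print([line.split()[0] for line in filter_all_pairs(rows)])
    # ['1', '4', '5', '9']
```

The all-pairs version is quadratic in the number of lines, which is fine for a sample this size. An alternative under the same assumed rule would be to sort so that longer field values come first and then make a single pass.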

Again, I could be mistaken. yifangt has not been strictly comprehensive in describing the problem.

I hope my nitpicking isn't getting on your last nerve.

Regards,
Alister
