Post 302916690 by brighty on Thursday 11th of September 2014 10:30:20 AM
Highlighting duplicate string on a line

Hi all

I have a grep written to pull out values; below (in the code snippet) is an example of the output.
What I'm struggling to do, and looking for assistance on, is identifying the lines that consist of one string repeated twice.
For example, 74859915K74859915K below is 74859915K repeated twice, but 32575310100014 is not a whole repeated value, so I don't want to see it.

In my head (and what I'm unable to do) I want to do something like count its length, split it in half and confirm that the first half matches the second half... I'm open to suggestions, as there may be a better way to do it.

Background: these values are in multiple files within an XML tag <foo></foo>. My grep extracts them and sed removes the XML tags, leaving just the output below... it's the next step where I want to keep only the true dupes.

Many thanks in advance.

Code:
74859915K74859915K
0B153858340B15385834
MUNS0-0000000001MUNS0-0000000001
10594556C10594556C
0B982730630B98273063
Q1818002FQ1818002F
78883385D78883385D
44871376D44871376D
B14513386B14513386
016797265C016797265C
0A120861950A12086195
025691290Z025691290Z
31262294G31262294G
B57312068B57312068
16803742B16803742B
723029268723029268
A50470772A50470772
B64841927B64841927
32575310100014
50836566B50836566B
499984
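
The length/split/compare idea described above could be sketched roughly as follows. This is only one possible approach, not a definitive answer: the function names true_dupes and true_dupes_grep are made up for illustration, and it assumes one value per line with no surrounding whitespace. The second function is an alternative that relies on a back-reference in a basic regular expression instead of measuring the length.

```shell
#!/bin/sh
# Hypothetical helper: print only the lines whose first half equals the second half.
# Reads the named files, or standard input when no file is given.
true_dupes() {
    awk '{
        n = length($0)
        # An odd-length or empty line cannot be one string repeated twice.
        if (n > 0 && n % 2 == 0 && substr($0, 1, n / 2) == substr($0, n / 2 + 1))
            print
    }' "$@"
}

# Hypothetical alternative: \(..*\) captures one or more characters and \1
# requires the same text again, anchored to the whole line.
true_dupes_grep() {
    grep '^\(..*\)\1$' "$@"
}

# Usage with three values taken from the sample output above.
printf '%s\n' 74859915K74859915K 32575310100014 499984 | true_dupes
```

Run against those three values, only 74859915K74859915K is printed; 32575310100014 and 499984 are dropped because their two halves differ. In the real pipeline the existing grep and sed output would be piped into the function instead of printf.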

 
