Highlighting duplicate string on a line Post: 302916703

Forum: Shell Programming and Scripting. Post 302916703 by rbatte1 on Thursday 11th of September 2014 11:50:52 AM

09-11-2014

Moderator

Well good for you. We all learn better by trying, rather than being spoon-fed. With a nice pun like that, are you British?

You might need $1 in your awk rather than $0, though it should still work.

That will give you the first half of each line, so you'd need to capture it and compare it with the original, something like:-

Code:

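while read -r line; do
   ((half=${#line}/2))                           # Half the line length (integer division)
   # Hand the shell variable to awk with -v; $half inside single quotes is never expanded by the shell
   halfline=$(printf '%s\n' "$line" | awk -v half="$half" '{print substr($0,1,half)}')
   if [ "${halfline}${halfline}" = "${line}" ]
   then
      echo "${line} is a duplicated entry"
   else
      echo "${line} is not repeated"
   fi
done < "$TEMP_1" > logfile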
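
If the awk is to stay, one alternative is to let a single awk process read the whole file and do the comparison itself, rather than starting awk once per line. This is only a sketch; the function name check_doubled_awk is made up for illustration, and it reads the files given as arguments or standard input if there are none:

```shell
# Hypothetical helper: one awk process checks every line.
# Reads the files named as arguments, or standard input if none are given.
check_doubled_awk() {
   awk '{
      half  = int(length($0) / 2)          # integer half of the line length
      first = substr($0, 1, half)          # first half of the line
      if ((first first) == $0)             # concatenation forces a string comparison
         print $0 " is a duplicated entry"
      else
         print $0 " is not repeated"
   }' "$@"
}
```

It would be called as check_doubled_awk "$TEMP_1" > logfile to match the loop above.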

Personally, I'd replace the awk with a shell parameter substitution, so you are not calling awk once for every line, something like this:-

Code:

while read -r line; do
   ((half=${#line}/2))
   h=1                                           # Set a counter
   mask=                                         # Null the variable
   until [ $h -gt $half ]                        # Loop until the mask is $half characters long
   do
      mask="${mask}?"                            # Add a ? (single character wildcard)
      ((h=h+1))
   done
   halfline="${line#${mask}}"                    # Strip the first half, keeping the rest of the line
   if [ "${halfline}${halfline}" = "${line}" ]   # Match twice the kept part with the original
   then
      echo "${line} is a duplicated entry"
   else
      echo "${line} is not repeated"
   fi
done < "$TEMP_1" > logfile
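
To see what the ${line#${mask}} step does, here is a small worked trace with the mask written out by hand. The sample values are just for illustration (the first is the doubled string from the original question):

```shell
line=74859915K74859915K       # 18 characters, so half is 9
mask='?????????'              # nine single-character wildcards
halfline=${line#$mask}        # strip the shortest leading match of the mask
echo "$halfline"              # prints 74859915K

odd=abcab                     # 5 characters, so half is 2
oddhalf=${odd#??}             # strips "ab", leaving the longer "cab"
echo "$oddhalf$oddhalf"       # prints cabcab, which cannot equal abcab
```

So a line of odd length can never pass the test, because the kept part is one character longer than the stripped part. An empty line does pass, since two empty halves equal an empty line; add a length check if that is unwanted. In bash or ksh93 the mask loop could probably be dropped in favour of the substring form ${line:0:half}, but that is not available in a plain POSIX sh.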

Does that suit? Does it even work?

Robin

LEARN ABOUT OPENSOLARIS

fgrep

fgrep(1)							   User Commands							  fgrep(1)

NAME

       fgrep - search a file for a fixed-character string

SYNOPSIS

       /usr/bin/fgrep [-bchilnsvx] -e pattern_list [file...]

       /usr/bin/fgrep [-bchilnsvx] -f file [file...]

       /usr/bin/fgrep [-bchilnsvx] pattern [file...]

       /usr/xpg4/bin/fgrep [-bchilnqsvx] -e pattern_list [-f file]
	    [file...]

       /usr/xpg4/bin/fgrep [-bchilnqsvx] [-e pattern_list] -f file
	    [file...]

       /usr/xpg4/bin/fgrep [-bchilnqsvx] pattern [file...]

DESCRIPTION

       The fgrep (fast grep) utility searches files for a character string and prints all lines that contain that string. fgrep is different from
       grep(1) and from egrep(1) because it searches for a string, instead of searching for a pattern that matches an expression. fgrep uses a
       fast and compact algorithm.

       The characters $, *, [, ^, |, (, ), and \ are interpreted literally by fgrep, that is, fgrep does not recognize full regular expressions as
       does egrep. These characters have special meaning to the shell. Therefore, to be safe, enclose the entire string within single quotes ('...').

       If no files are specified, fgrep assumes standard input. Normally, each line that is found is copied to the standard output. The file name
       is printed before each line that is found if there is more than one input file.

OPTIONS

       The following options are supported for both /usr/bin/fgrep and /usr/xpg4/bin/fgrep:

       -b		  Precedes each line by the block number on which the line was found. This can be useful in locating block numbers by
			  context. The first block is 0.

       -c		  Prints only a count of the lines that contain the pattern.

       -e pattern_list	  Searches for a string in pattern_list. This is useful when the string begins with a -.

       -f pattern-file	  Takes the list of patterns from pattern-file.

       -h		  Suppresses printing of files when searching multiple files.

       -i		  Ignores upper/lower case distinction during comparisons.

       -l		  Prints the names of files with matching lines once, separated by new-lines. Does not repeat the names of files when the
			  pattern is found more than once.

       -n		  Precedes each line by its line number in the file. The first line is 1.

       -s		  Works silently, that is, displays nothing except error messages. This is useful for checking the error status.

       -v		  Prints all lines except those that contain the pattern.

       -x		  Prints only lines that are matched entirely.

   /usr/xpg4/bin/fgrep
       The following options are supported for /usr/xpg4/bin/fgrep only:

       -q    Quiet. Does not write anything to the standard output, regardless of matching lines. Exits with zero status if an input line is
	     selected.

OPERANDS

       The following operands are supported:

       file    Specifies a path name of a file to be searched for the patterns. If no file operands are specified, the standard input will be
	       used.

   /usr/bin/fgrep
       pattern	  Specifies a pattern to be used during the search for input.

   /usr/xpg4/bin/fgrep
       pattern	  Specifies one or more patterns to be used during the search for input. This operand is treated as if it were specified as -e
		  pattern_list.

USAGE

       See largefile(5) for the description of the behavior of fgrep when encountering files greater than or equal to 2 Gbyte ( 2^31 bytes).

ENVIRONMENT VARIABLES

       See environ(5) for descriptions of the following environment variables that affect the execution of fgrep: LC_COLLATE, LC_CTYPE,
       LC_MESSAGES, and NLSPATH.

EXIT STATUS

       The following exit values are returned:

       0    If any matches are found

       1    If no matches are found

       2    For syntax errors or inaccessible files, even if matches were found.
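
The exit values above can be tested directly in a script. The following is a minimal sketch, assuming a POSIX grep that accepts -F and -q (the behaviour described for /usr/xpg4/bin/fgrep); the function name contains_fixed is hypothetical:

```shell
# Hypothetical helper: succeed if the fixed string $1 occurs in file $2.
# -F treats the pattern as a literal string, -q suppresses output,
# and -e protects patterns that begin with a -.
contains_fixed() {
   grep -F -q -e "$1" -- "$2"
}

# Typical use: branch on the exit status rather than on any output.
# if contains_fixed 'a.b*c' somefile; then echo "found"; fi
```

Because the pattern is taken literally, a.b matches only the three characters a.b and not, for example, axb.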

ATTRIBUTES

       See attributes(5) for descriptions of the following attributes:

   /usr/bin/fgrep
       +-----------------------------+-----------------------------+
       |      ATTRIBUTE TYPE	     |	    ATTRIBUTE VALUE	   |
       +-----------------------------+-----------------------------+
       |Availability		     |SUNWcsu			   |
       +-----------------------------+-----------------------------+

   /usr/xpg4/bin/fgrep
       +-----------------------------+-----------------------------+
       |      ATTRIBUTE TYPE	     |	    ATTRIBUTE VALUE	   |
       +-----------------------------+-----------------------------+
       |Availability		     |SUNWxcu4			   |
       +-----------------------------+-----------------------------+
       |CSI			     |Enabled			   |
       +-----------------------------+-----------------------------+

SEE ALSO

       ed(1), egrep(1), grep(1), sed(1), sh(1), attributes(5), environ(5), largefile(5), XPG4(5)

NOTES

       Ideally, there should be only one grep command, but there is not a single algorithm that spans a wide enough range of space-time tradeoffs.

       Lines are limited only by the size of the available virtual memory.

   /usr/xpg4/bin/fgrep
       The  /usr/xpg4/bin/fgrep  utility  is identical to /usr/xpg4/bin/grep -F (see grep(1)). Portable applications should use /usr/xpg4/bin/grep
       -F.

SunOS 5.11							    24 Mar 2006 							  fgrep(1)