Sponsored Content
Top Forums Shell Programming and Scripting Highlighting duplicate string on a line Post 302916700 by brighty on Thursday 11th of September 2014 11:30:24 AM
Old 09-11-2014
Thanks for the reply Robin; you're on the same page as me.

Not being one to sit back and expect it to be written for me I've had a go with that pointer you gave me but I'm getting some strange results from it.
I thought I'd start simple and built up. I was half expecting (excuse the pun) that the below would output half the length of the string and store it in $half then using that combined with an awk sub string I would be able to just out put the first half of the string. The idea being I could store that in a variable do the same for the second half by getting the awk substr to start at $half for $half and I could compare the two. If they match then output if they don't then bin them.

Code:
while read line; do
((half=${#line}/2))
echo $line | awk '{print substr($0,1,$half)}'
done < $TEMP_1

That doesn't give me the output I was expecting. $TEMP_1 is a file name which contains the values one per line as per my previous post.
 

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

removing line and duplicate line

Hi, I have 3 lines in a text file that is similar to this (as a result of a diff between 2 files): 35,36d34 < DATA.EVENT.EVENT_ID.s = "3661208" < DATA.EVENT.EVENT_ID.s = "3661208" I am trying to get it down to just this: DATA.EVENT.EVENT_ID.s = "3661208" How can I do this?... (11 Replies)
Discussion started by: ocelot
11 Replies

2. Shell Programming and Scripting

How to remove duplicate sentence/string in perl?

Hi, I have two strings like this in an array: For example: @a=("Brain aging is associated with a progressive imbalance between intracellular concentration of Reactive Oxygen Species","Brain aging is associated with a progressive imbalance between intracellular concentration of Reactive... (9 Replies)
Discussion started by: vanitham
9 Replies

3. Shell Programming and Scripting

filtering out duplicate substrings, regex string from a string

My input contains a single word lines. From each line data.txt prjtestBlaBlatestBlaBla prjthisBlaBlathisBlaBla prjthatBlaBladpthatBlaBla prjgoodBlaBladpgoodBlaBla prjgood1BlaBla123dpgood1BlaBla123 Desired output --> data_out.txt prjtestBlaBla prjthisBlaBla... (8 Replies)
Discussion started by: kchinnam
8 Replies

4. Shell Programming and Scripting

Delete duplicate in certain number of string

Hi, do you have awk or sed sommand taht will delete duplicate lines like. sample: server1-log1-14 server1-log2-14 superserver-time-2 superserver-log-2 output: server-log1-14 superserver-time-2 thansk (2 Replies)
Discussion started by: kenshinhimura
2 Replies

5. Shell Programming and Scripting

find duplicate string in many different files

I have more than 100 files like this: SVEAVLTGPYGYT 2 SVEGNFEETQY 10 SVELGQGYEQY 28 SVERTGTGYT 6 SVGLADYNEQF 21 SVGQGYEQY 32 SVKTVLGYEQF 2 SVNNEQF 12 SVRDGLTNSPLH 3 SVRRDREGLEQF 11 SVRTSGSYEQY 17 SVSVSGSPLQETQY 78 SVVHSTSPEAF 59 SVVPGNGYT 75 (4 Replies)
Discussion started by: xshang
4 Replies

6. Shell Programming and Scripting

Remove not only the duplicate string but also the keyword of the string in Perl

Hi Perl users, I have another problem with text processing in Perl. I have a file below: Linux Unix Linux Windows SUN MACOS SUN SUN HP-AUX I want the result below: Unix Windows SUN MACOS HP-AUX so the duplicate string will be removed and also the keyword of the string on... (2 Replies)
Discussion started by: askari
2 Replies

7. Shell Programming and Scripting

Honey, I broke awk! (duplicate line removal in 30M line 3.7GB csv file)

I have a script that builds a database ~30 million lines, ~3.7 GB .cvs file. After multiple optimzations It takes about 62 min to bring in and parse all the files and used to take 10 min to remove duplicates until I was requested to add another column. I am using the highly optimized awk code: awk... (34 Replies)
Discussion started by: Michael Stora
34 Replies

8. Red Hat

How to add a new string at the end of line by searching a string on the same line?

Hi, I have a file which is an extract of jil codes of all autosys jobs in our server. Sample jil code: ************************** permission:gx,wx date_conditions:yes days_of_week:all start_times:"05:00" condition: notrunning(appDev#box#ProductLoad)... (1 Reply)
Discussion started by: raghavendra
1 Replies

9. Shell Programming and Scripting

Highlighting duplicate string on a line

Hi all I have a grep written to pull out values; below (in the code snip-it) is an example of the output. What I'm struggling to do, and looking for assistance on, is identifying the lines that have duplicate strings. For example 74859915K74859915K in the below is 74859915K repeated twice but... (3 Replies)
Discussion started by: brighty
3 Replies

10. Shell Programming and Scripting

Shell script to get duplicate string

Hi All, I have a requirement where I have to get the duplicate string count and uniq error message. Below is my file: Rejected - Error on table TableA, column ColA. Error String 1. Rejected - Error on table TableA, column ColB. Error String 2. Rejected - Error on table TableA, column... (6 Replies)
Discussion started by: Deekhari
6 Replies
COL(1)                                                      BSD General Commands Manual                                                     COL(1)

NAME
col -- filter reverse line feeds from input SYNOPSIS
col [-bfhpx] [-l num] DESCRIPTION
The col utility filters out reverse (and half reverse) line feeds so that the output is in the correct order with only forward and half for- ward line feeds, and replaces white-space characters with tabs where possible. This can be useful in processing the output of nroff(1) and tbl(1). The col utility reads from the standard input and writes to the standard output. The options are as follows: -b Do not output any backspaces, printing only the last character written to each column position. -f Forward half line feeds are permitted (``fine'' mode). Normally characters printed on a half line boundary are printed on the fol- lowing line. -h Do not output multiple spaces instead of tabs (default). -l num Buffer at least num lines in memory. By default, 128 lines are buffered. -p Force unknown control sequences to be passed through unchanged. Normally, col will filter out any control sequences from the input other than those recognized and interpreted by itself, which are listed below. -x Output multiple spaces instead of tabs. In the input stream, col understands both the escape sequences of the form escape-digit mandated by Version 2 of the Single UNIX Specification (``SUSv2'') and the traditional BSD format escape-control-character. The control sequences for carriage motion and their ASCII values are as follows: ESC-BELL reverse line feed (escape then bell). ESC-7 reverse line feed (escape then 7). ESC-BACKSPACE half reverse line feed (escape then backspace). ESC-8 half reverse line feed (escape then 8). ESC-TAB half forward line feed (escape than tab). ESC-9 half forward line feed (escape then 9). In -f mode, this sequence may also occur in the output stream. backspace moves back one column (8); ignored in the first column carriage return (13) newline forward line feed (10); also does carriage return shift in shift to normal character set (15) shift out shift to alternate character set (14) space moves forward one column (32) tab moves forward to next tab stop (9) vertical tab reverse line feed (11) All unrecognized control characters and escape sequences are discarded. The col utility keeps track of the character set as characters are read and makes sure the character set is correct when they are output. If the input attempts to back up to the last flushed line, col will display a warning message. ENVIRONMENT
The LANG, LC_ALL and LC_CTYPE environment variables affect the execution of col as described in environ(7). EXIT STATUS
The col utility exits 0 on success, and >0 if an error occurs. SEE ALSO
colcrt(1), expand(1), nroff(1), tbl(1) STANDARDS
The col utility conforms to Version 2 of the Single UNIX Specification (``SUSv2''). HISTORY
A col command appeared in Version 6 AT&T UNIX. BSD May 10, 2015 BSD
All times are GMT -4. The time now is 11:55 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy