UNIX for Dummies Questions & Answers
Identify duplicate words in a line using command
Post 302115779 by srinivasan_85 on Friday 27th of April 2007 07:04:26 AM

Hi,
Let me explain the problem clearly:
Let the entries in my file be:
Code:
lion,tiger,bear
apple,mango,orange,apple,grape
unix,windows,solaris,windows,linux
red,blue,green,yellow
orange,maroon,pink,violet,orange,pink

Can we detect the lines in which one of the words (separated by the field separator) occurs more than once, using a command (or a command pipeline)?
In this case, the command should detect lines 2, 3, and 5.

I accomplished it using a Perl script (cited below), but I wonder whether this could be done with a command. The difficulty is that the number of columns is not constant.

Perl program that I used:
Code:
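use strict;
use warnings;

# Read the input file name from standard input.
my $fname = <STDIN>;
chomp $fname;
open(my $fh, '<', $fname) or die "Cannot open $fname: $!";
my $found_dups = 0;

while (my $line = <$fh>)
{
  chomp $line;
  my @arr = split(/,/, $line);
  # Compare every pair of fields, starting at index 0 so that the
  # first word on the line is checked as well.
  for (my $i = 0; $i <= $#arr; $i++)
  {
     for (my $j = $i + 1; $j <= $#arr; $j++)
     {
        if ($arr[$i] eq $arr[$j])
        {
           # $. holds the current input line number.
           print "line $.: $arr[$i]\n";
           $found_dups++;
        }
     }
  }
}
close($fh);
print "Found $found_dups duplicates\n";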
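
For comparison, here is a rough sketch of how the same check might look as a single awk command. It is only one possible approach, not a definitive answer. The file name words.txt and the function name find_dup_lines are made up for illustration, and the comma separator is assumed from the sample entries above. Because awk loops over NF fields, the varying number of columns should not matter.

```shell
# Hypothetical sample file holding the entries from the post.
cat > words.txt <<'EOF'
lion,tiger,bear
apple,mango,orange,apple,grape
unix,windows,solaris,windows,linux
red,blue,green,yellow
orange,maroon,pink,violet,orange,pink
EOF

# find_dup_lines is an illustrative helper name, not a standard command.
# For each line, count every comma-separated field in the array "seen"
# and print the line number and the line as soon as a field repeats.
find_dup_lines() {
  awk -F, '{
    split("", seen)                 # clear the per-line counts
    for (i = 1; i <= NF; i++)
      if (seen[$i]++) { print NR ": " $0; next }
  }' "$1"
}

find_dup_lines words.txt
```

With the sample entries this reports lines 2, 3 and 5, each printed once even when a line contains more than one repeated word.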

Thanks,
Srini
 
