Find duplicates in the first column of text file Post: 302432822

Template::Plugin::Table(3)				User Contributed Perl Documentation				Template::Plugin::Table(3)

NAME

       Template::Plugin::Table - Plugin to present data in a table

SYNOPSIS

	   [% USE table(list, rows=n, cols=n, overlap=n, pad=0) %]

	   [% FOREACH item IN table.row(n) %]
	      [% item %]
	   [% END %]

	   [% FOREACH item IN table.col(n) %]
	      [% item %]
	   [% END %]

	   [% FOREACH row IN table.rows %]
	      [% FOREACH item IN row %]
		 [% item %]
	      [% END %]
	   [% END %]

	   [% FOREACH col IN table.cols %]
	      [% col.first %] - [% col.last %] ([% col.size %] entries)
	   [% END %]

DESCRIPTION

       The "Table" plugin allows you to format a list of data items into a virtual table.  When you create a "Table" plugin via the "USE"
       directive, simply pass a list reference as the first parameter and then specify a fixed number of rows or columns.

	   [% USE Table(list, rows=5) %]
	   [% USE table(list, cols=5) %]

       The "Table" plugin name can also be specified in lower case as shown in the second example above.  You can also specify an alternative
       variable name for the plugin as per regular Template Toolkit syntax.

	   [% USE mydata = table(list, rows=5) %]

       The plugin then presents a table-based view of the data set.  The data isn't actually reorganised in any way but is available via the
       "row()", "col()", "rows()" and "cols()" methods as if formatted into a simple two dimensional table of "n" rows x "m" columns.

       So if we had a sample "alphabet" list containing the letters '"a"' to '"z"', the above "USE" directives would create plugins that
       represented the following views of the alphabet.

	   [% USE table(alphabet, ... %]

	   rows=5		   cols=5
	   a  f  k  p  u  z	   a  g  m  s  y
	   b  g  l  q  v	   b  h  n  t  z
	   c  h  m  r  w	   c  i  o  u
	   d  i  n  s  x	   d  j  p  v
	   e  j  o  t  y	   e  k  q  w
				   f  l  r  x
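
       The two views above follow from simple slicing arithmetic.  The sketch below is a Python model of that layout, not the plugin's Perl
       implementation: the "table_view" helper and its return shape are hypothetical, and it assumes a non-empty list and ignores padding.

```python
import math

def table_view(items, rows=None, cols=None):
    """Model the column-major view; returns (row_lists, col_lists), unpadded.

    Hypothetical helper for illustration only, not part of Template::Plugin::Table.
    """
    if rows is None:
        # cols=n fixes the column count, so derive the number of rows from it.
        rows = max(1, math.ceil(len(items) / cols))
    # Data runs downwards: each column is a consecutive slice of `rows` items.
    col_lists = [items[i:i + rows] for i in range(0, len(items), rows)]
    # Row r collects the r-th item of every column that is long enough.
    row_lists = [[c[r] for c in col_lists if r < len(c)] for r in range(rows)]
    return row_lists, col_lists

alphabet = [chr(c) for c in range(ord("a"), ord("z") + 1)]
rows5, _ = table_view(alphabet, rows=5)
_, cols5 = table_view(alphabet, cols=5)

print(" ".join(rows5[0]))  # a f k p u z
print(" ".join(cols5[4]))  # y z
```

       With "rows=5" the 26 letters need six columns, the last holding only "z"; with "cols=5" each column holds six letters except the last.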

       We can request a particular row or column using the "row()" and "col()" methods.

	   [% USE table(alphabet, rows=5) %]
	   [% FOREACH item = table.row(0) %]
	      # [% item %] set to each of [ a f k p u z ] in turn
	   [% END %]

	   [% FOREACH item = table.col(2) %]
	      # [% item %] set to each of [ k l m n o ] in turn
	   [% END %]

       Data in rows is returned from left to right, columns from top to bottom.  The first row/column is 0.  By default, rows or columns that
       are shorter than the others will be padded with the undefined value to bring them to the same size as all other rows or columns.

       For example, the last row (row 4) in the first example would contain the values "[ e j o t y undef ]". The Template Toolkit will safely
       accept these undefined values and print an empty string. You can also use the IF directive to test if the value is set.

	  [% FOREACH item = table.row(4) %]
	     [% IF item %]
		Item: [% item %]
	     [% END %]
	  [% END %]

       You can explicitly disable the "pad" option when creating the plugin to return shortened rows/columns where the data is empty.

	  [% USE table(alphabet, cols=5, pad=0) %]
	  [% FOREACH item = table.col(4) %]
	     # [% item %] set to each of 'y z'
	  [% END %]
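
       The effect of padding can be modelled in a few lines.  This is a Python sketch of the behaviour described above, not the plugin's
       code: "get_row" is a hypothetical helper, and Python's None stands in for Perl's undefined value.

```python
def get_row(items, rows, r, pad=True):
    """Return row r of a column-major table with `rows` rows (model only)."""
    ncols = -(-len(items) // rows)        # ceiling division
    row = []
    for c in range(ncols):
        i = c * rows + r                  # column-major index of cell (r, c)
        if i < len(items):
            row.append(items[i])
        elif pad:
            row.append(None)              # None stands in for Perl's undef
    return row

alphabet = list("abcdefghijklmnopqrstuvwxyz")
padded = get_row(alphabet, rows=5, r=4)
short = get_row(alphabet, rows=5, r=4, pad=False)

print(padded)  # ['e', 'j', 'o', 't', 'y', None]
print(short)   # ['e', 'j', 'o', 't', 'y']
```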

       The "rows()" method returns all rows in the table as a reference to a list of rows (themselves list references).  The "row()"
       method, when called without any arguments, calls "rows()" to return all rows in the table.

       Ditto for "cols()" and "col()".

	   [% USE table(alphabet, cols=5) %]
	   [% FOREACH row = table.rows %]
	      [% FOREACH item = row %]
		 [% item %]
	      [% END %]
	   [% END %]

       The Template Toolkit provides the "first", "last" and "size" virtual methods that can be called on list references to return the first/last
       entry or the number of entries in a list. The following example shows how we might use this to provide an alphabetical index split into 3
       roughly even parts.

	   [% USE table(alphabet, cols=3, pad=0) %]
	   [% FOREACH group = table.col %]
	      [ [% group.first %] - [% group.last %] ([% group.size %] letters) ]
	   [% END %]

       This produces the following output:

	   [ a - i (9 letters) ]
	   [ j - r (9 letters) ]
	   [ s - z (8 letters) ]
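
       The group sizes can be checked by hand: 26 letters over 3 columns gives ceil(26/3) = 9 rows, so the columns hold 9, 9 and 8 letters.
       A Python sketch of the same arithmetic follows; it models the output only and does not use the plugin.

```python
import math

alphabet = list("abcdefghijklmnopqrstuvwxyz")
cols = 3
rows = math.ceil(len(alphabet) / cols)   # 9 rows per column

# Each unpadded column is a consecutive slice of `rows` letters.
groups = [alphabet[i:i + rows] for i in range(0, len(alphabet), rows)]

# g[0], g[-1] and len(g) play the role of the first, last and size vmethods.
lines = [f"[ {g[0]} - {g[-1]} ({len(g)} letters) ]" for g in groups]
print("\n".join(lines))
```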

       We can also use the general purpose "join" virtual method which joins the items of the list using the connecting string specified.

	   [% USE table(alphabet, cols=5) %]
	   [% FOREACH row = table.rows %]
	      [% row.join(' - ') %]
	   [% END %]

       Data in the table is ordered downwards rather than across but can easily be transformed on output.  For example, to format our data in 5
       columns with data ordered across rather than down, we specify "rows=5" to order the data as such:

	   a  f  .  .
	   b  g  .
	   c  h
	   d  i
	   e  j

       and then iterate down through each column (a-e, f-j, etc.) printing the data across.

	   a  b  c  d  e
	   f  g  h  i  j
	   .  .
	   .

       Example code to do so would be much like the following:

	   [% USE table(alphabet, rows=3) %]
	   [% FOREACH cols = table.cols %]
	     [% FOREACH item = cols %]
	       [% item %]
	     [% END %]
	   [% END %]

       Output:

	   a  b  c
	   d  e  f
	   g  h  i
	   j  .  .
	   .
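
       The same transformation can be sketched outside the Template Toolkit.  The Python below is only a model, assuming a ten-letter list
       for brevity: fixing the number of rows at 3 and printing each column on one line gives data that reads across in 3 columns.

```python
letters = list("abcdefghij")
rows = 3  # rows=3 in the table becomes 3 columns once each column is printed across

# Columns of the rows=3 table are consecutive slices of 3 items.
columns = [letters[i:i + rows] for i in range(0, len(letters), rows)]

# Printing one table column per output line orders the data across.
lines = ["  ".join(col) for col in columns]
print("\n".join(lines))
```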

       In addition to a list reference, the "Table" plugin constructor may be passed a reference to a Template::Iterator object or subclass
       thereof. The Template::Iterator get_all() method is first called on the iterator to return all remaining items. These are then available
       via the usual Table interface.

	   [% USE DBI(dsn,user,pass) -%]

	   # query() returns an iterator
	   [% results = DBI.query('SELECT * FROM alphabet ORDER BY letter') %]

	   # pass into Table plugin
	   [% USE table(results, rows=8 overlap=1 pad=0) -%]

	   [% FOREACH row = table.cols -%]
	      [% row.first.letter %] - [% row.last.letter %]:
		 [% row.join(', ') %]
	   [% END %]

AUTHOR

       Andy Wardley <abw@wardley.org> <http://wardley.org/>

COPYRIGHT

       Copyright (C) 1996-2007 Andy Wardley.  All Rights Reserved.

       This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

SEE ALSO

       Template::Plugin

perl v5.12.1							    2009-05-20						Template::Plugin::Table(3)