Full Discussion: Duplicate files
Shell Programming and Scripting: Post 302720243 by DGPickett on Tuesday 23rd of October 2012 04:25:48 PM
Old 10-23-2012
Code:
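unset lk lv # clear the "last key" / "last values" state that the loop tests
( sort in_file
  echo EOF # sentinel: forces the last group out of the while (assumes no real key is "EOF")
 ) | while read -r k v
do
 if [ "$lk" = "" ]
 then
  # first line: start the first group
  lk="$k" lv="$v"
  continue
 fi
 if [ "$k" != "$lk" ]
 then
  # key changed: print the finished group, then start a new one
  echo "$lk $lv"
  lk="$k" lv="$v"
  continue
 fi
 lv="$lv $v" # same key: append this value
done >out_file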
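
To try the sort-based approach on a few lines, here is a minimal sketch that wraps the same idea in a function reading standard input. The name merge_by_key and the sample data are made up for illustration, and it still assumes no real key is the literal string EOF.

```shell
#!/bin/sh
# merge_by_key: hypothetical helper name, not from the original post.
# Reads "key value..." lines on stdin and prints one line per key
# with all of that key's values appended.
merge_by_key() {
  { sort; echo EOF; } | {      # EOF is a sentinel that flushes the last group
    lk='' lv=''
    while read -r k v
    do
      if [ -z "$lk" ]; then
        lk=$k lv=$v            # first line: start the first group
      elif [ "$k" != "$lk" ]; then
        echo "$lk $lv"         # key changed: print the finished group
        lk=$k lv=$v
      else
        lv="$lv $v"            # same key: append the value
      fi
    done
  }
}

# Sample input (made up):
printf '%s\n' 'b 2' 'a 1' 'b 3' 'a 4' | merge_by_key
# prints:
# a 1 4
# b 2 3
```

Because whole lines are sorted, the values within each key come out in sorted order as well.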

Alternatively, without sorting, you could collect the values in a ksh93/bash associative array indexed by key, concatenating the values for each key. It might not scale as well as the sort, since every key and value is held in memory. Output is in hash order, not sorted order.
Code:
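typeset -A ht # associative array (ksh93, or bash 4 and later)
while read -r k v
do
 ht[$k]="${ht[$k]} $v" # append the value; each stored list begins with a space
done <in_file
for k in "${!ht[@]}"
do
 echo "$k${ht[$k]}" # no separator needed, the list already begins with a space
done >out_file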
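
If a stable output order matters, the hash-order result can simply be piped through sort. The following is a minimal sketch assuming bash 4 or later; group_by_key and the sample data are hypothetical names for illustration, and blank input lines are skipped because bash rejects an empty subscript.

```shell
#!/usr/bin/env bash
# Requires bash 4+ (associative arrays).
# group_by_key: hypothetical helper name, not from the original post.
group_by_key() {
  local -A ht=()
  local k v
  while read -r k v
  do
    [ -n "$k" ] || continue               # skip blank lines
    ht[$k]="${ht[$k]:+${ht[$k]} }$v"      # append, with a space only between values
  done
  for k in "${!ht[@]}"
  do
    printf '%s %s\n' "$k" "${ht[$k]}"
  done | sort                             # hash order is arbitrary, so sort the result
}

# Sample input (made up):
printf '%s\n' 'b 2' 'a 1' 'b 3' 'a 4' | group_by_key
# prints:
# a 1 4
# b 2 3
```

Here the values for each key stay in input order. Only the finished lines are sorted, which is usually far less data than the raw input.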

Arrays (Learning the Korn Shell, 2nd Edition)

Last edited by DGPickett; 10-23-2012 at 06:00 PM..
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.