finding duplicate files by size and finding pattern matching and its count


 
# 1  
Old 12-01-2006
finding duplicate files by size and finding pattern matching and its count

Hi,

I have a challenging task: I need to find duplicate files by name and size, then pick any one file from each set of duplicates. Then I need to open that file, search it for several patterns, and count the occurrences of each pattern.

Note: These are samples of two files, but I can have more duplicate and original pairs.

Input:
------
File_1 and File_2

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
???????????????????????????????????
Name=Jerome
City=chicago
Name/city:Jerome-Chicago
Address#???????????????????
Place:/Chicago
counry::/US

Name=John
City=Detroit
Name/city:John-Detroit
Address#???????????????????
Place:/Detroit
counry::/US

Name=Josephine
City=Chicago
Name/city:Josephine-Chicago
Address#???????????????????
counry::/US

Check 1:
------------
-rwxrwxrwx 1 tstibill tstibill 374 Dec 1 13:03 File1
-rwxrwxrwx 1 tstibill tstibill 374 Dec 1 13:02 File2

374 bytes

Check 2:
-----------
Take any one of the files, say File_1, and count the occurrences of each of these patterns:
Name/city:
Address#
Place:/
counry::/

Output
----------
pattern,count,filename
Name/city:,3,File_1
Address#,3,File_1
Place:/,2,File_1
counry::/,3,File_1


I hope I didn't confuse anyone.
# 2  
Old 12-01-2006
Finding duplicates not only by size but by file naming convention too

Hi All,
Sorry for rephrasing.
To find duplicates I will use the file naming convention (the substring of the file name, characters 1 to 4) as well as the file size.
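
One way the name-and-size rule above could be sketched in plain sh. The function names `list_keys` and `pick_duplicates` are made up for illustration, and the sketch assumes all files sit in one directory and that file names contain no whitespace. Equal size does not prove equal content, so a `cmp` on each pair would be a sensible extra check.

```shell
#!/bin/sh
# Sketch: group regular files by (first four characters of the name, size in
# bytes) and print one representative for every group with duplicates.
# Assumption: file names contain no spaces, tabs, or newlines.

list_keys() {
    # Print "prefix:size name" for every regular file in directory $1.
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        name=${f##*/}
        prefix=$(printf '%s' "$name" | cut -c1-4)
        # wc -c may pad its output with spaces on some systems; strip them.
        size=$(wc -c < "$f" | tr -d ' ')
        printf '%s:%s %s\n' "$prefix" "$size" "$name"
    done
}

pick_duplicates() {
    # Keep the first file (in sorted order) of every key seen more than once.
    list_keys "$1" | sort | awk '
        { count[$1]++; if (!($1 in first)) first[$1] = $2 }
        END { for (k in count) if (count[k] > 1) print first[k] }
    ' | sort
}
```

Calling `pick_duplicates /some/dir` would then print one file name per duplicate set, for example `File1` when `File1` and `File2` have the same size.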
# 3  
Old 12-01-2006
This will help:
man find (-name -size)
man grep (-c)
man diff
Regards.
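
Following the `grep -c` hint, a minimal sketch of the counting step could look like this. The function name `count_patterns` is hypothetical. Note that `grep -c` counts matching lines, not total occurrences; in the sample data each pattern appears at most once per line, so the two numbers agree.

```shell
#!/bin/sh
# Sketch: print "pattern,count,filename" for each literal pattern given.
# Usage: count_patterns FILE PATTERN...

count_patterns() {
    file=$1
    shift
    echo "pattern,count,filename"
    for pat in "$@"; do
        # -F treats the pattern as a fixed string, so characters such as
        # '#', ':' and '/' need no escaping; -e guards patterns starting with '-'.
        n=$(grep -c -F -e "$pat" "$file")
        printf '%s,%s,%s\n' "$pat" "$n" "${file##*/}"
    done
}
```

For the sample file this would be called as `count_patterns File_1 'Name/city:' 'Address#' 'Place:/' 'counry::/'`.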