finding duplicate files by size and finding pattern matching and its count


 
Old 12-01-2006

Hi,

I have a challenging task in which I have to find duplicate files by name and size, and then take any one of the duplicates. Then I need to open that file, search it for more than one pattern, and count the occurrences of each pattern.

Note: These are samples of two files, but I can have more duplicate/original pairs.

Input:
------
File_1 and File_2

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
???????????????????????????????????
Name=Jerome
City=chicago
Name/city:Jerome-Chicago
Address#???????????????????
Place:/Chicago
counry::/US

Name=John
City=Detroit
Name/city:John-Detroit
Address#???????????????????
Place:/Detroit
counry::/US

Name=Josephine
City=Chicago
Name/city:Josephine-Chicago
Address#???????????????????
counry::/US

Check 1:
------------
-rwxrwxrwx 1 tstibill tstibill 374 Dec 1 13:03 File1
-rwxrwxrwx 1 tstibill tstibill 374 Dec 1 13:02 File2

Both files are 374 bytes.
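
The size check above could be sketched roughly like this. It is only a minimal sketch, not a tested solution for the real data: the function name same_size_files is made up for illustration, it assumes "duplicate" means "same byte size" within one directory, and it assumes file names contain no tabs or newlines.

```shell
#!/bin/sh
# same_size_files DIR  (hypothetical helper name, not from the original post)
# Prints "size<TAB>path" for every regular file in DIR whose byte size
# is shared with at least one other file in DIR.
same_size_files() {
    dir=$1
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        # wc -c gives the byte count; strip the padding some systems add
        size=$(wc -c < "$f" | tr -d '[:space:]')
        printf '%s\t%s\n' "$size" "$f"
    done |
    sort -n |
    awk -F '\t' '
        # remember every line and count how often each size occurs
        { seen[$1]++; size[NR] = $1; line[NR] = $0 }
        END {
            for (i = 1; i <= NR; i++)
                if (seen[size[i]] > 1) print line[i]
        }'
}
```

Calling same_size_files on the directory that holds the files would list both 374-byte files one after the other. Equal size alone does not prove the contents are equal, so a byte comparison with cmp on each candidate pair would be a sensible extra step before treating them as duplicates.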

Check 2:
-----------
Take any one file, say File_1, and find each of these patterns and its count:
Name/city:
Address#
Place:/
counry::/

Output
----------
pattern,count,filename
Name/city:,3,File_1
Address#,3,File_1
Place:/,2,File_1
counry::/,3,File_1
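
One possible way to produce the output above is sketched below. The function name count_patterns is an assumption made for illustration. It treats each pattern as a fixed string and counts matching lines, which equals the number of occurrences only when a pattern appears at most once per line, as in the sample.

```shell
#!/bin/sh
# count_patterns FILE PATTERN...  (hypothetical helper name)
# Prints a CSV header, then one "pattern,count,filename" row per pattern.
count_patterns() {
    file=$1
    shift
    echo 'pattern,count,filename'
    for pat in "$@"; do
        # -F: match the pattern literally; -c: print the number of matching lines
        n=$(grep -c -F -e "$pat" "$file")
        printf '%s,%s,%s\n' "$pat" "$n" "$(basename "$file")"
    done
}
```

For the sample it could be called as: count_patterns File_1 'Name/city:' 'Address#' 'Place:/' 'counry::/'. The patterns are quoted so the shell does not treat the # as the start of a comment.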


I hope I didn't confuse anyone.
 