Find distinct files in directory


 
# 1  
Old 01-14-2015

Hi All,

I am developing a script for my project in which I need to find the distinct types of files in a given directory, based on a pattern that is provided.

For example, the directory listing is:

Code:
abc123.dat.20141212_021012
abc123.dat.20141312_041012
abc123.dat.20141112_031012
cdf023.dat.20141112_031012
cdf023.dat.20141112_031012
cdf023.dat.20141112_031012

Run the script : ksh find_distinct_files.ksh -Pattern *.dat.????????_??????

The output should be:

Code:
1) abc123.dat
2) cdf023.dat

Similarly, the files in the given directory can follow any other pattern. Some more examples:

1) Input pattern: ???_*.dat
Files in directory:
Code:
123_abc.dat
234_abc.dat

Output: abc.dat

2) Input pattern: *_????.dat
Files in directory:
Code:
abc_2014.dat
abc_2013.dat

Output: abc.dat

My point is that files following any pattern may exist in the directory, so I need to find all the files matching the input pattern and keep the distinct file types. In other words: from a given string, remove the portion matching the pattern and keep the rest.

Can anybody give me some pointers or ideas on how this can be done?

Thanks in advance.

Thanks
Abhijeet R

Last edited by Corona688; 01-14-2015 at 01:03 PM.. Reason: code tags please
# 2  
Old 01-14-2015
Your specifications are vague, so I will make an assumption:
file names have 3 dot-separated fields in them - the first two fields are to be considered.

Code:
cd /some/directory
ls | awk -F '.' '{print $1 "." $2}' | sort -u

If you want a better answer, please provide better specifications.
# 3  
Old 01-14-2015
Thanks for the quick reply. My bad that I did not make myself clear.

My point was that files following any pattern may exist in the directory, so I need to find all the files matching the input pattern and keep the distinct file types. In other words: from a given string, remove the portion matching the pattern and keep the rest.

For example:
1) Input pattern: ???_*.dat
Files in directory:
123_abc.dat
234_abc.dat
Output: abc.dat

2) Input pattern: *_????.dat
Files in directory:
abc_2014.dat
abc_2013.dat
Output: abc.dat

Please tell me if this explains the problem statement. I am adding the above to the original post as well.

Thanks
Abhijeet R
# 4  
Old 01-14-2015
So you want to ls the files matching the pattern and then remove the portion matched by the single-character wildcards (?), but keep whatever the more generic wildcard(s) (*) matched? And you want to keep characters given literally, except for single characters that prefix, follow, or are interspersed among the ?s?
First, you'll have to quote the parameter when calling the script to prevent the shell from expanding it. find . -name "$1" might provide you with the desired file names. Let me think for a while about removing the name portions.
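A minimal sketch of the quoting point, assuming the pattern reaches the script as one quoted argument. The function name list_matches is made up for illustration, and unlike a plain ls, this find call also descends into subdirectories:

```shell
# list_matches DIR PATTERN: print the base names of the files under DIR
# whose name matches the glob PATTERN. (list_matches is a hypothetical name.)
list_matches() {
    # "$2" stays quoted so that find, not the shell, interprets the wildcards.
    find "$1" -type f -name "$2" | sed 's#.*/##' | sort
}
```

It would be called as list_matches /some/directory '???_*.dat'. Without the single quotes, the calling shell expands the pattern first whenever matching files exist in the current directory.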
# 5  
Old 01-14-2015
That is the right requirement. I could not think of any solution to this problem.
# 6  
Old 01-15-2015
For the first question, a possible solution is not too hard. Put
Code:
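# Prefix "\(." so the leading * becomes regex .* inside a group; each ? becomes a dot.
Y=\\\(.${1//\?/.}
# Close the group before the first two consecutive dots (literal "." plus first ?).
Y=${Y/../\\\)..}
# $1 is left unquoted on purpose so that the shell expands the glob for ls.
ls $1 | sed "s#$Y#\1#"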

into a script and run it with your (quoted!) pattern as parameter 1. For above files, it will yield
Code:
abc123.dat
abc123.dat
abc123.dat
cdf023.dat
cdf023.dat
cdf023.dat

; you may want to pipe the result through a sort -u.
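A small sketch of that last step, assuming the numbered "1) name" layout from the original post is wanted. The function name number_distinct is invented here:

```shell
# number_distinct: read names on stdin, drop duplicates, number what is left.
number_distinct() {
    sort -u | awk '{ print NR ") " $0 }'
}
```

Appending | number_distinct to the ls ... | sed ... pipeline above would then print one numbered line per distinct name.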

---------- Post updated at 14:58 ---------- Previous update was at 14:47 ----------

The second question is driving me crazy; I think it can't be resolved without making further assumptions about the structure of the filenames to be selected. The pattern alone does not discriminate between the two file types given. Run (combined with the question 1 solution from above)
Code:
Y=${1//\?/.}
[[ "${1:1:1}" == "?" ]] &&
        Y=${Y%${1#*_}}\\\(.${1#*_}\\\) ||
        Y=\\\(.${Y/../\\\)..}

ls $1 | sed "s#$Y#\1#"

as a script with "???_*.dat" as parameter 1, and it will yield
Code:
abc.dat
abc.dat
2013.dat
2014.dat
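The overlap can be checked without touching the file system. This is only a sketch: glob_match is a hypothetical helper that relies on case to do the same kind of pattern matching the shell uses for file names:

```shell
# glob_match NAME PATTERN: succeed if NAME matches the shell glob PATTERN.
glob_match() {
    case $1 in
        $2) return 0 ;;   # $2 is unquoted here, so it is treated as a pattern
        *)  return 1 ;;
    esac
}

# Report which sample name satisfies which of the two patterns.
for name in 123_abc.dat abc_2014.dat; do
    for pattern in '???_*.dat' '*_????.dat'; do
        if glob_match "$name" "$pattern"; then
            echo "$name matches $pattern"
        fi
    done
done
```

abc_2014.dat satisfies both patterns, while 123_abc.dat satisfies only ???_*.dat, which is why the run with "???_*.dat" also picks up the abc_2013.dat and abc_2014.dat files.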

The third one I don't even dare to tackle...

---------- Post updated at 15:41 ---------- Previous update was at 14:58 ----------

Simplified version anchoring solutions for 2 and 3 at the "_" char, still not resolving above issue:
Code:
case $1 in
        (*\?)   Y=${1#"${1%%.\?\?*}"}$ ;;
        (\?*)   Y=^${1%"${1#*_}"} ;; # ||
        (*)     Y=${1//[^\?_]/} ;;
esac
Y=${Y//\?/.}
ls $1 | sed "s#$Y##"
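For anyone without ksh93 or bash, the same three branches can be restated using POSIX sh constructs only. Treat this as a sketch under assumptions, not a tested replacement: the names build_regex and strip_names are made up, the file names are read from stdin instead of coming from ls, and it inherits the limitation described above for patterns that match more than one file type.

```shell
# build_regex PATTERN: print the sed regex for the part of a name to delete.
build_regex() {
    p=$1
    case $p in
        *\?) y=${p#"${p%%.\?\?*}"}'$' ;;              # ends in ?: trailing ".??..." part
        \?*) y='^'${p%"${p#*_}"} ;;                   # starts with ?: up to the first _
        *)   y=$(printf '%s' "$p" | tr -cd '?_') ;;   # otherwise: keep only ? and _
    esac
    # Every ? of the glob becomes "any single character" in the regex.
    printf '%s\n' "$y" | sed 's/?/./g'
}

# strip_names PATTERN: delete that part from each name on stdin, then
# print the distinct results. Assumes the pattern contains no "#".
strip_names() {
    re=$(build_regex "$1")
    sed "s#$re##" | sort -u
}
```

A call could look like ls | strip_names '*_????.dat', with the pattern quoted as before.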


Last edited by RudiC; 01-15-2015 at 10:08 AM..