The UNIX and Linux Forums  



UNIX for Advanced & Expert Users Advanced UNIX and Linux questions go here. Expert-to-Expert.

  #1 (permalink)  
Old 04-23-2008
Registered User
 

Join Date: Apr 2008
Location: Montreal, Quebec
Posts: 2
I need some help matching my file database to my filesystem.

Hi there. I have a big problem, but maybe a simple question.

I'm attempting something that is turning out to be a huge job, but maybe it could be simpler if I knew some more advanced commands or techniques.

My problem is this: I work for a company that has a primary file server with 27T of capacity. I have a database keeping track of the files on this server, but it reports only ~10T in total when I sum the file sizes.

I need to recover the wasted space by deleting files that are not referenced in the database, and then prevent this storage leak from occurring again. I know that my predecessor has "lost" the database before on at least one occasion and restored an out-of-date backup.

Some details:
Files are stored on the file server under their MySQL autoincrement id, like this:

id -> padded with leading zeros and broken into two-digit folders; each file is stored in its own folder.
1194649 -> /ifs/data/00/00/01/19/46/49/filename.ext
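
The mapping above can be sketched as a small shell function. This is only an illustration: `id_to_dir` is a made-up helper name, and the 12-digit padding width is inferred from the single example path, not from any documented layout. It also assumes ids are passed without leading zeros.

```shell
# id_to_dir: hypothetical helper that turns a database id into its folder.
# Assumption: ids are zero-padded to 12 digits (six two-digit folders),
# as suggested by the example 1194649 -> /ifs/data/00/00/01/19/46/49/.
id_to_dir() {
    # Pad the id with leading zeros to 12 digits.
    padded=$(printf '%012d' "$1") || return 1
    # Put a slash after every two digits, then drop the trailing slash.
    subdir=$(printf '%s\n' "$padded" | sed -e 's|..|&/|g' -e 's|/$||')
    printf '/ifs/data/%s\n' "$subdir"
}

id_to_dir 1194649
```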

So far I have made a list of all directories on the file server with `find -type d` and produced from it a list of all file ids, one id per line.
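
One way the directory list could be turned into ids is sketched below. `dirs_to_ids` is a hypothetical name, and the sketch assumes that only directories exactly six levels below `/ifs/data` correspond to a file id; the intermediate directories that `find -type d` also prints are filtered out.

```shell
# dirs_to_ids: hypothetical filter, reads directory paths on stdin and
# prints one id per line.
# Assumption: a file's folder is exactly six two-digit components below
# /ifs/data; shallower directories are intermediate levels and are skipped.
dirs_to_ids() {
    grep -E '^/ifs/data(/[0-9]{2}){6}$' |
        sed -e 's|^/ifs/data/||' -e 's|/||g' -e 's|^0*||'
}

# Possible usage (not run here):
#   find /ifs/data -type d | dirs_to_ids > file_ids_from_filesystem
printf '%s\n' /ifs/data/00/00 /ifs/data/00/00/01/19/46/49 | dirs_to_ids
```

Stripping the leading zeros matters because the ids exported from the database would normally not be padded, and the two lists have to use the same textual form before they can be compared.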

I tried to get the ids that do not exist in both the database and the file server directory tree by doing:

Code:
cat file_ids_from_db file_ids_from_filesystem | sort | uniq -u > list_of_unique_ids
This didn't work: I tested a sample of the ids against the database before trying to delete the files, and I found matches.
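
A likely explanation for those matches, shown with made-up ids: `uniq -u` keeps every line that occurs exactly once in the combined input, so the result contains ids that are only in the database as well as ids that are only on the filesystem.

```shell
# Toy data, invented for illustration: ids 2 and 3 are in both lists,
# id 1 is only in the database, ids 4 and 5 are only on the filesystem.
work=$(mktemp -d)
printf '%s\n' 1 2 3 > "$work/file_ids_from_db"
printf '%s\n' 2 3 4 5 > "$work/file_ids_from_filesystem"

# Same pipeline as in the post: keeps every id that occurs exactly once.
only_once=$(cat "$work/file_ids_from_db" "$work/file_ids_from_filesystem" | sort | uniq -u)
printf '%s\n' "$only_once"
# Prints 1, 4 and 5. Id 1 is in the database but has no folder, so a
# spot check against the database finds it.

rm -rf "$work"
```

A side effect worth noting: an id that is missing from the database but appears twice in the filesystem list would be dropped from the result entirely.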

My question is this: What is the best way to do a left join between these files, only getting ids that are in the filesystem and not the database?
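
One possible approach that needs no sorting is sketched below with awk. `fs_not_in_db` is a hypothetical name, and the sketch assumes the database id list fits comfortably in memory and that both files hold one id per line in the same textual form.

```shell
# fs_not_in_db: hypothetical helper.
#   $1 = file with ids from the database
#   $2 = file with ids from the filesystem
# Prints the ids from $2 that do not appear in $1, in their original order.
fs_not_in_db() {
    awk -v db="$1" '
        BEGIN {
            # Load every database id into an in-memory lookup table.
            while ((getline line < db) > 0) in_db[line] = 1
            close(db)
        }
        # Print a filesystem id only if the database does not know it.
        !($0 in in_db)
    ' "$2"
}
```

Called as `fs_not_in_db file_ids_from_db file_ids_from_filesystem`, this prints only the filesystem-side ids, which is the one-directional difference the question asks for.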

Another question: why am I doing it this way? Is there a better solution?

Thank you in advance for any help.

-Charles
  #2 (permalink)  
Old 04-23-2008
Registered User
 

Join Date: May 2007
Posts: 4
Hi,

Please check the following command:

comm -13 <(sort file1) <(sort file2)
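
A sketch of how this might be applied to the two id lists from the first post, assuming file1 is the database list and file2 is the filesystem list. `comm -13` suppresses column 1 (lines only in file1) and column 3 (lines in both), leaving the lines found only in file2. `fs_only_ids` is a hypothetical helper name. The `<(...)` form needs bash, ksh or zsh, so the sketch uses temporary files instead, and it pins `LC_ALL=C` so that `sort` and `comm` agree on the ordering.

```shell
# fs_only_ids: hypothetical wrapper around comm -13.
#   $1 = file with ids from the database
#   $2 = file with ids from the filesystem
# Prints ids that are in $2 but not in $1.
fs_only_ids() {
    tmp=$(mktemp -d) || return 1
    # comm needs both inputs sorted in the same collation order;
    # -u also removes duplicate ids within each list.
    LC_ALL=C sort -u "$1" > "$tmp/db.sorted"
    LC_ALL=C sort -u "$2" > "$tmp/fs.sorted"
    # -1 drops lines only in the database, -3 drops lines in both.
    LC_ALL=C comm -13 "$tmp/db.sorted" "$tmp/fs.sorted"
    rm -rf "$tmp"
}
```

The output is in byte-wise sorted order rather than numeric order, which does not matter for building a deletion candidate list but is worth knowing when eyeballing the result.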

Thanks,
Thangaraju.

Last edited by rajx; 04-23-2008 at 05:15 PM.
  #3 (permalink)  
Old 04-23-2008
Registered User
 

Join Date: Apr 2008
Location: Montreal, Quebec
Posts: 2
Thanks, I'm trying the comm command out. I'll post tomorrow to let you know if I solved this problem.

