Help with parsing mailbox folder list (identify similar folders)
List sample:
I'm helping to migrate a mail server from a case-sensitive folder namespace to a case-insensitive one.
The case-sensitive system could hold folders like "test", "TEST" and "Test" as different folders;
the new system will allow only one of these (at the same level, per user).
The current system will _not_ allow two folders at the same level to have an identical name. Where the output below appears to show the opposite (e.g. Trainees),
it means Trainees is a parent folder:
Given this behavior, I don't care about the repeated Trainees entries: being identical and on the same level indicates a parent folder.
But I do care about Psychology externs and psychology eXterns, given their shared parent.
In this example:
I don't care about the first "/Goose/" - it's a parent with children,
but "moose" I do care about because it's in the same container ("/Goose/") as "/Moose/" - so the new system will not allow this.
Similarly, I care about "Goose/Moose/goose" and "Goose/Moose/Goose" because "goose" and "Goose" are in the same "Goose/Moose/" container,
and again this is unacceptable to the new system.
For each user, I'd like to identify the folder path levels that are identical except in case - e.g.
Again, I don't care about folder paths that are completely identical (case included) as this will indicate it's a parent folder.
Any ideas or working pseudo code?
Thanks for any info. And I hope this was clear and I didn't miss any edge cases.
Bill
Last edited by Scott; 01-23-2011 at 11:53 AM..
Reason: Code tags
It's not clear what you are asking for.
Are you asking for a mapping strategy?
I would encourage users to rename everything that differs only by case themselves, and adopt a straightforward rule for those that ignore you. Maybe something like this:
* Keep everything the same until there is a conflict
* Resolve the first conflicting name by adding a trailing underscore
* Add a trailing digit after the underscore if there are multiple conflicts
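A minimal sketch of the rule above, assuming the input is one unique folder path per line on stdin. The function name `propose_names` is made up for illustration. It does not check whether a generated name (e.g. "Test_") already exists in the list, and it does not rewrite the paths of children when a parent is renamed, so treat it as a starting point only.

```shell
#!/bin/sh
# propose_names (hypothetical helper): reads folder paths on stdin and
# prints "original<TAB>proposed" for each one.
propose_names() {
    awk '
    {
        key = tolower($0)      # compare names without regard to case
        n = seen[key]++        # how many earlier paths share this key
        if (n == 0)
            new = $0                 # no conflict yet: keep the name
        else if (n == 1)
            new = $0 "_"             # first conflict: trailing underscore
        else
            new = $0 "_" (n - 1)     # later conflicts: underscore plus digit
        printf "%s\t%s\n", $0, new
    }'
}
```

For example, `printf 'test\nTest\nTEST\n' | propose_names` keeps "test" and proposes "Test_" and "TEST_1" for the other two.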
We first create a list of all relevant directories.
Then extract the names that become duplicates once case is ignored, and re-search the original list case-insensitively to show every original spelling.
This is a reasonably efficient approach for a large number of directories and a moderate number of duplicates.
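The code originally posted here is not preserved, so the following is only a sketch of the two-pass idea described above, not the original script. It assumes a file with one folder path per line, a grep that accepts -i, -x and -F (older system greps may not), and ASCII folder names. The function name `find_case_conflicts` is made up.

```shell
#!/bin/sh
# find_case_conflicts (hypothetical helper): $1 is a file of folder paths,
# one per line. Prints every original spelling that collides with another
# path once case is ignored.
find_case_conflicts() {
    list=$1
    # Pass 1: drop exact duplicates, lowercase everything, and keep only
    # the lowercased paths that occur more than once.
    LC_ALL=C sort -u "$list" | tr '[:upper:]' '[:lower:]' |
    LC_ALL=C sort | uniq -d |
    while IFS= read -r key; do
        # Pass 2: re-search the original list for whole-line,
        # fixed-string, case-insensitive matches of that key.
        grep -i -x -F -- "$key" "$list" | LC_ALL=C sort -u
    done
}
```

Exact duplicates are removed before counting, so a path that is simply listed twice with the same case is not reported.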
Footnote:
It always helps to know what Operating System and version you have and what Shell you prefer.
The code posted should work with most versions of Unix or Linux with a Bourne-like shell (sh, bash, ksh etc.).
For the benefit of the "UUOC" police, I prefer left-to-right processing and have yet to find anything faster than "cat" for placing text records on a pipeline.
Last edited by methyl; 01-23-2011 at 05:13 PM..
Reason: Layout
I use Bash on Solaris 10, OS X 10.6, or Red Hat Enterprise Linux 5 - if necessary.
tr '[:upper:]' '[:lower:]' | sort | uniq
was what I needed - from there it was pretty clear which were the duplicates.
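As a possible refinement of that pipeline (a sketch, not what was actually run), counting the lowercased names shows only the ones that collide. It assumes one folder path per line on stdin and ASCII names; `list_collisions` is a made-up name.

```shell
#!/bin/sh
# list_collisions (hypothetical helper): reads folder paths on stdin and
# prints "count<TAB>lowercased path" for every path that has more than
# one case variant.
list_collisions() {
    LC_ALL=C sort -u |                  # drop exact (same-case) duplicates
    tr '[:upper:]' '[:lower:]' |        # fold case
    LC_ALL=C sort | uniq -c |           # count occurrences of each folded path
    awk '$1 > 1 {
        n = $1
        sub(/^ *[0-9]+ /, "")           # strip the count column, keep spaces in names
        print n "\t" $0
    }'
}
```

Fed "test", "TEST", "Test" and "Inbox", this prints a single line with the count 3 and the name "test".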
Your UUOC is fine by me - definitely not an egregious case.
Thanks again!
Bill
I think the bit about identical parent folder paths was unnecessary and confusing - apologies. The paths still need to be unique, and uniqueness is determined by the entire path, whatever leading components they happen to share.
Last edited by spacegoose; 01-25-2011 at 08:22 PM..
Glad the code works.
I had a comparable problem some years ago when consolidating multiple smaller servers into one large server where many users had accounts on more than one of the original computers ... and were not consistent in the upper/lower case naming of their directories.