First off, very well done so far. You worked most of it out, but you made it more complicated for yourself than necessary.
Quote:
Originally Posted by sudon't
OK, I numbered the directories by hand so that they would sort in the canonical order. Now, they had the files within the directories numbered using single digit enumeration, so naturally they don't sort correctly:
Actually they don't have to sort correctly - i gave you a two-step plan: first produce a filelist, then work through that list with a loop:
The second line will work through the listfile (actually a list of filenames, one every line) sequentially, but "find" will probably not write the files into the list in the order you want. This is why i told you to reorder the listfile by reordering the files - you would just have to move the lines around.
On second thought, you don't even have to move the lines around, there is a utility for that: "sort". So, here is what you do:
1. Prepare the initial listfile:
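A minimal sketch of this step, assuming a throwaway directory tree in place of the real one (the sandbox and the names "top" and "listfile" are only for illustration):

```shell
# Hypothetical sandbox standing in for the real directory tree.
top=$(mktemp -d)
mkdir -p "$top/Old Testament/Genesis" "$top/New Testament/Mark"
echo "In the beginning" > "$top/Old Testament/Genesis/Genesis1.txt"
echo "The beginning of the gospel" > "$top/New Testament/Mark/Mark1.txt"

listfile="$top/listfile"

# Write one filename per line; the order is whatever find happens to produce.
find "$top" -type f -name '*.txt' -print > "$listfile"

cat "$listfile"
```

Note that find prints paths as they are, spaces included, so anything reading the list later has to quote the names.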
The result will probably look like this:
2. Sort the listfile
Now this is not sorted canonically, because Mark and John come before all the letters. Use your editor to add an order number at the beginning of the line:
Never mind that the numbers will not all have the same number of digits. For the nifty little tool i show you now this is just peanuts: "sort". This, you guessed it, sorts things - not only alphabetically, but also numerically. Read the man page of "sort" and you will see how much it can do.
So, after you have added the numbers, use "sort" to sort the file:
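As a sketch, assuming a listfile whose lines start with the hand-added order number (the paths here are made up for illustration):

```shell
work=$(mktemp -d)

# Hypothetical listfile with hand-added order numbers, deliberately out of order.
printf '%s\n' \
  '10 /path/to/Mark/Mark1.txt' \
  '2 /path/to/Exodus/Exodus1.txt' \
  '1 /path/to/Genesis/Genesis1.txt' > "$work/listfile"

# -n compares the leading number numerically, so 2 sorts before 10.
sort -n "$work/listfile" > "$work/listfile.sorted"

cat "$work/listfile.sorted"
```

Without -n the comparison would be character by character and "10" would land before "2".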
Your file should now look like this:
Check the file again with an editor, to see if all worked out. Note that you still have the listfile, so you can change the numbers in there and re-run the "sort" command if not everything is to your satisfaction.
3. Concatenate the files
Finally use the sorted listfile to create the output. As we have added numbers we need to modify the loop i showed you slightly:
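One way such a loop could look, as a sketch only: the sandbox, the file contents and the names "listfile.sorted", "spacerfile" and "output" are assumptions for illustration.

```shell
work=$(mktemp -d)
mkdir -p "$work/Old Testament/Genesis"
printf 'chapter one\n' > "$work/Old Testament/Genesis/Genesis1.txt"
printf 'chapter two\n' > "$work/Old Testament/Genesis/Genesis2.txt"
printf '\n' > "$work/spacerfile"   # a single empty line between files

# Hypothetical sorted listfile: order number, one space, then the path.
printf '%s\n' \
  "1 $work/Old Testament/Genesis/Genesis1.txt" \
  "2 $work/Old Testament/Genesis/Genesis2.txt" > "$work/listfile.sorted"

# read puts the first word into num and the rest of the line,
# spaces included, into file. Quoting "$file" keeps the path in one piece.
while read -r num file; do
  cat "$file" "$work/spacerfile"
done < "$work/listfile.sorted" > "$work/output"

cat "$work/output"
```

The number is read into its own variable and simply ignored, so there is no need to strip it from the listfile first.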
If your files are well-formed you can remove the spacerfile from the call:
A few words about your solution:
Quote:
So I worked out a regex to place a zero in front of single digit filenames:
You shouldn't use perl for that. "perl" is a full-blown programming language - a full orchestra of its own. You don't invite a whole orchestra and then tell them you need only one triangle player, for the other instruments you have an orchestra of your own. You can use perl to do all you want to do and if you prefer "perl" above shell code that is ok. But don't write shell code and then use "perl" as a simple regex machine. The shell has its own regexp machines for that (sed, awk, ...).
Quote:
No matter, because I want things to behave in the real world, too.
So, now that I have the magic regex in hand, how can I use it to change the actual filenames?
The usual way is to use the regexp to create the modified name, store this information in a variable and then use this variable content to change the filename. See below.
Quote:
What I've been able to glean from the web is that there is a system call called "rename" which somehow should work with perl. But there is no mention of "rename" in the perl man page. On the other hand, there is a man page for rename, but it doesn't contain anything that I found illuminating. I'm guessing this is something that has to be called from a script?
"rename" is probably a "perl"-command and internal to this language. In shell code you use "mv", which is short for "move".
The sketch for renaming files would look like this:
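A minimal version, assuming filenames of the form Genesis1.txt and using "sed" as the regexp machine (the sandbox directory and the exact regexp are illustrative, not the one from the quoted post):

```shell
work=$(mktemp -d)
touch "$work/Genesis1.txt" "$work/Genesis10.txt"

for old in "$work"/*.txt; do
  # Build the new name: put a 0 in front of a single digit before .txt.
  # In sed, \10 means backreference 1 followed by a literal 0.
  new=$(printf '%s\n' "$old" | sed 's/\([^0-9]\)\([0-9]\)\.txt$/\10\2.txt/')

  # Rename only when the name actually changed.
  [ "$old" = "$new" ] || mv "$old" "$new"
done

ls "$work"
```

Genesis10.txt is left alone because the character in front of its last digit is itself a digit, so the regexp does not match.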
Yes, I do find this all extremely helpful and enlightening. Believe it or not, using sort did occur to me.
But you are right - for the immediate job at hand, renaming files is an unnecessary distraction. Unfortunately, I often get distracted with trying to order things - a symptom of my illness. On the other hand, I thought to keep the originals, and would like them to sort properly. But yes, let's leave that exercise for another time.
OK, since all directories are sorted into canonical order, and since all files have been renumbered with my little regex, this was all that was needed:
They are now all in perfect order, so let's take a moment to grab a beer out of the fridge, and go back to your original instructions....
---------- Post updated at 02:22 AM ---------- Previous update was at 01:05 AM ----------
As you can see, it lost Ecclesiastes12.txt because there's an unescaped space between Old and Testament. And it sees Song of Solomon as three different (non-existent) directories.
Also, changing the filenames in the list files was a bad idea. And in retrospect, it is clear why. So, find does not escape any spaces in filenames in its print output. It's funny, if I just drag a file onto the Terminal, it shows the path with all spaces escaped. You would expect the opposite since drag & drop is such a Mac thing, while find is a real unix program.
Is it possible to simply pipe the stdout of find directly to cat? Perhaps that could eliminate the problem of how it prints paths? Or, better yet, pipe find to sort to cat? Am I over-estimating the omnipotence of unix? I have to admit, it's powerful one-liners that get me excited. It's what really drew me into wanting to learn unix in the first place.
Then again, it may pay to go ahead and fix the actual filenames first. Since it's 02:00 where I'm at, it may be best if I come back to it tomorrow.
The problem is in the listfile find generates. I need to find an app that will output properly escaped filenames, or fix the actual filenames. find's output to the list looks like this:
I need it to look like this:
Notice that the space between "Old" and "Testament" is not escaped, and so it breaks down. I was thinking the -d{n} flag might get it (by skipping the directories altogether), then I realized cat probably needs the full path to find the files. I couldn't find a flag that would 'fix' the output of find, either.
I have to fix the list file, first.
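For what it's worth, both ideas from this post (spaces in paths, and piping find through sort to cat) can be sketched in one pipeline. This assumes a find with -print0, a sort with -z and an xargs with -0, which the GNU and BSD versions have but plain POSIX does not guarantee, and it assumes zero-padded chapter numbers. The sandbox is made up.

```shell
top=$(mktemp -d)
book="$top/Old Testament/Song of Solomon"
mkdir -p "$book"
printf 'chapter 1\n' > "$book/Song of Solomon01.txt"
printf 'chapter 2\n' > "$book/Song of Solomon02.txt"

# Names are separated by NUL bytes instead of newlines, so spaces
# never need escaping anywhere along the pipe.
find "$top" -type f -name '*.txt' -print0 | sort -z | xargs -0 cat > "$top/out"

cat "$top/out"
```

The sort here is a plain text sort, so with single-digit names chapter 10 would still come before chapter 2.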
Since the original filenames are predictable (identical to the containing directory followed by an incrementing index and the .txt extension), we can just build them until we construct one that doesn't exist. There is no need to sort.
The only information any solution to this problem needs to know is the sequence of books and where to find them.
The following script takes two arguments, $1, the path to the old testament books and, $2, the path to the new testament books. The sequence of book names is embedded in the script. The script begins looking for books in the old testament until a blank line in the embedded list signals it to switch to the new testament.
NOTE: Each book's name in the embedded list must be identical to the directory basename ("Genesis" in the case of "/home/your/Desktop/Bible/Old Testament/Genesis"). Same case. Same spacing.
Note the blank line before Matthew (iirc, beginning of the NT); it's critical.
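A rough sketch of the approach described above, not the original script: the embedded list is cut down to four books, the logic is wrapped in a function called bible (a name chosen here), and the demo tree is a sandbox invented for illustration.

```shell
# $1 = path to the old testament books, $2 = path to the new testament books
bible() {
  dir=$1                       # start in the old testament
  while IFS= read -r book; do
    if [ -z "$book" ]; then    # blank line: switch to the new testament
      dir=$2
      continue
    fi
    i=1
    # Build Genesis1.txt, Genesis2.txt, ... until one does not exist.
    while [ -f "$dir/$book/$book$i.txt" ]; do
      cat "$dir/$book/$book$i.txt"
      i=$((i + 1))
    done
  done <<'EOF'
Genesis
Song of Solomon

Matthew
Mark
EOF
}

# Demo on a hypothetical tree: ten Genesis chapters and one of Matthew.
top=$(mktemp -d)
mkdir -p "$top/Old Testament/Genesis" "$top/New Testament/Matthew"
n=1
while [ "$n" -le 10 ]; do
  printf 'Genesis %s\n' "$n" > "$top/Old Testament/Genesis/Genesis$n.txt"
  n=$((n + 1))
done
printf 'Matthew 1\n' > "$top/New Testament/Matthew/Matthew1.txt"

bible "$top/Old Testament" "$top/New Testament" > "$top/bible"
cat "$top/bible"
```

Because the counter is an integer, chapter 10 follows chapter 9 without any sorting or renaming, and books missing from the tree are simply skipped.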
If the script were stored in a file named bible.sh, the following would generate a single text file bible (using pathnames derived from your posts):
Regards,
Alister
I knew that, eventually, someone reading this thread would get frustrated and whip up a script to solve all my problems. It must be the same feeling I get when I meet someone who can barely read or write. Script writing is so far beyond my capabilities that it feels like cheating, somehow. ; )
The way they have the files set up might be a problem for your script. Indeed, it is thee problem.
Each book constitutes a directory, while each chapter constitutes a numbered file. Genesis, for instance, is broken up into fifty separate files. Correct me if I'm wrong, but it seems like your script is expecting each book to be one file. Could your embedded list contain a wildcard character? Even so, it seems to me we still have the problem of sorting. As you can see, they used single digit enumeration. But I'm going to try to fix the actual filenames, first.
---------- Post updated at 03:04 PM ---------- Previous update was at 02:36 PM ----------
OK, found out that rename is a perl script someone made up. Downloaded the code, et voila!
This should give us properly sorted lists. Now, a little find/replace to eliminate spaces.... I am now having fun.
Quote:
Originally Posted by sudon't
Each book constitutes a directory, while each chapter constitutes a numbered file. Genesis, for instance, is broken up into fifty separate files.
Understood. That is exactly what my script expects.
Quote:
Originally Posted by sudon't
Correct me if I'm wrong, but it seems like your script is expecting each book to be one file. Could your embedded list contain a wildcard character?
You are wrong. Wildcards are not necessary.
Quote:
Originally Posted by sudon't
Even so, it seems to me we still have the problem of sorting. As you can see, they used single digit enumeration. But I'm going to try to fix the actual filenames, first.
My script does not require filenames to be modified, even though they do not sort properly because the numeric indices are not of equal digits. The inner while-loop generates the filenames itself.
My script is intended to work with the original filenames, unmodified.