Full Discussion: Concatenate Numerous Files
Operating Systems Linux Fedora Concatenate Numerous Files Post 302722875 by bakunin on Monday 29th of October 2012 12:30:45 AM
First off, very well done so far. You worked most of it out, but you made it more complicated for yourself than necessary.

Quote:
Originally Posted by sudon't
OK, I numbered the directories by hand so that they would sort in the canonical order. Now, they had the files within the directories numbered using single digit enumeration, so naturally they don't sort correctly:
Code:
./01_Old Testament/01_Genesis/Genesis1.txt
./01_Old Testament/01_Genesis/Genesis10.txt
./01_Old Testament/01_Genesis/Genesis11.txt

Actually, they don't have to sort correctly. I gave you a two-step plan: first produce a file list, then work through that list with a loop:

Code:
# the directory name contains a space, so the path and "$file" must be quoted
find ~/Desktop/"New Testament" -name "*.txt" -type f -print > listfile
rm -f resultfile ; while IFS= read -r file ; do cat "$file" spacerfile >> resultfile ; done < listfile

The second line will work through the listfile (actually a list of filenames, one per line) sequentially, but "find" will probably not write the files into the list in the order you want. This is why I told you to reorder the listfile: you would just have to move the lines around.

On second thought, you don't even have to move the lines around. There is a utility for that: "sort". So, here is what you do:

1. Prepare the initial listfile:

Code:
find ~/Desktop/"New Testament" -name "*.txt" -type f -print > listfile

The result will probably look like this:

Code:
/home/user/Desktop/New Testament/Colossians/Colossians1.txt
/home/user/Desktop/New Testament/Colossians/Colossians2.txt
/home/user/Desktop/New Testament/Colossians/Colossians3.txt
/home/user/Desktop/New Testament/Colossians/Colossians4.txt
/home/user/Desktop/New Testament/John/John1.txt
/home/user/Desktop/New Testament/Mark/Mark1.txt
...

2. Sort the listfile

Now this is not in canonical order: the gospels (Mark, John) belong before all the letters, but here Colossians comes first. Use your editor to add an order number, followed by a space, at the beginning of each line:

Code:
3 /home/user/Desktop/New Testament/Colossians/Colossians1.txt
4 /home/user/Desktop/New Testament/Colossians/Colossians2.txt
5 /home/user/Desktop/New Testament/Colossians/Colossians3.txt
6 /home/user/Desktop/New Testament/Colossians/Colossians4.txt
2 /home/user/Desktop/New Testament/John/John1.txt
1 /home/user/Desktop/New Testament/Mark/Mark1.txt
...

Never mind that the numbers will not all have the same number of digits. For the nifty little tool I show you now this is just peanuts: "sort". This, you guessed it, sorts things, not only alphabetically but also numerically. Read the man page of "sort" and you will see how much it can do.

So, after you have added the numbers, use "sort" to sort the file:

Code:
sort -nk1 listfile > listfile.sorted

Your file should now look like this:

Code:
1 /home/user/Desktop/New Testament/Mark/Mark1.txt
2 /home/user/Desktop/New Testament/John/John1.txt
3 /home/user/Desktop/New Testament/Colossians/Colossians1.txt
4 /home/user/Desktop/New Testament/Colossians/Colossians2.txt
5 /home/user/Desktop/New Testament/Colossians/Colossians3.txt
6 /home/user/Desktop/New Testament/Colossians/Colossians4.txt
...

Check the file again with an editor to see if everything worked out. Note that you still have the original listfile, so you can change the numbers in there and re-run the "sort" command if something is not to your satisfaction.
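
To see why the differing digit counts do not matter, here is a small sketch that compares a plain sort with a numeric one. The file name demo.list and the paths in it are made up for the illustration, and it assumes "mktemp" is available for a throwaway directory.

```shell
# Throwaway working directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A hand-numbered list with one- and two-digit order numbers (made-up paths).
printf '%s\n' \
    '10 /tmp/demo/New Testament/Acts/Acts1.txt' \
    '2 /tmp/demo/New Testament/John/John1.txt' \
    '1 /tmp/demo/New Testament/Mark/Mark1.txt' > demo.list

# A plain sort compares character by character, so "10" lands before "2".
# LC_ALL=C pins the comparison to byte order for this demonstration.
LC_ALL=C sort demo.list > demo.alpha

# -n compares the leading number by value; -k1,1 restricts the key to field 1.
sort -n -k1,1 demo.list > demo.numeric

cat demo.numeric
```

In demo.numeric the line numbered 10 comes last, while in demo.alpha it sits between 1 and 2.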

3. Concatenate the files

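Finally, use the sorted listfile to create the output. As we have added numbers, we need to modify the loop I showed you slightly: "read" now gets two variable names, so the number goes into "num" and the rest of the line (the filename, spaces included) goes into "file".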

Code:
rm -f resultfile ; while read -r num file ; do cat "$file" spacerfile >> resultfile ; done < listfile.sorted

If your files are well-formed you can remove the spacerfile from the call:

Code:
rm -f resultfile ; while read -r num file ; do cat "$file" >> resultfile ; done < listfile.sorted
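
Put together, the three steps can be tried out on throwaway data before touching the real files. The following is only a sketch: the directory layout, the file contents and the spacer line are made up, and the order numbers are written by "printf" here instead of by hand in an editor.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Made-up sample tree; note the space in the directory name.
mkdir -p "New Testament/Mark" "New Testament/John"
echo "Mark chapter 1" > "New Testament/Mark/Mark1.txt"
echo "John chapter 1" > "New Testament/John/John1.txt"
echo "-----" > spacerfile

# Steps 1 and 2: a numbered list (typed by hand in the real case), then sorted.
printf '%s\n' \
    "2 $workdir/New Testament/John/John1.txt" \
    "1 $workdir/New Testament/Mark/Mark1.txt" > listfile
sort -n -k1,1 listfile > listfile.sorted

# Step 3: "read -r num file" puts the number into num and the rest of the
# line, spaces included, into file; the quotes keep the path in one piece.
rm -f resultfile
while read -r num file ; do
    cat "$file" spacerfile >> resultfile
done < listfile.sorted

cat resultfile
```

The resultfile should then hold the Mark text, the spacer, the John text and the spacer, in that order.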

A few words about your solution:

Quote:
So I worked out a regex to place a zero in front of single digit filenames:
Code:
perl -pi -e 's/(?<=[a-z])(?=[0-9]\.txt)/0/g' ./OTfilelist.txt

You shouldn't use perl for that. "perl" is a full-blown programming language, a full orchestra of its own. You don't invite a whole orchestra and then tell them you only need the triangle player because you have an orchestra of your own for the other instruments. You can use perl for everything you want to do, and if you prefer "perl" over shell code that is ok. But don't write shell code and then use "perl" as a simple regex machine. The shell has its own regexp tools for that (sed, awk, ...).
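
As an illustration of that point, the same zero-padding can be done on the list file with "sed" alone. This is a sketch on a made-up three-line list, not a drop-in replacement for the perl command quoted above: "sed" has no lookarounds, so the letter and the digit are captured and written back, and the result goes to a new file (OTfilelist.padded, a name chosen here) instead of being edited in place.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' \
    './01_Old Testament/01_Genesis/Genesis1.txt' \
    './01_Old Testament/01_Genesis/Genesis10.txt' \
    './01_Old Testament/01_Genesis/Genesis2.txt' > OTfilelist.txt

# \([a-z]\) is the last letter of the book name, \([0-9]\) a single digit that
# is directly followed by ".txt" at the end of the line. In the replacement,
# \1 is the letter, 0 is a literal zero and \2 is the digit.
sed 's/\([a-z]\)\([0-9]\)\.txt$/\10\2.txt/' OTfilelist.txt > OTfilelist.padded

cat OTfilelist.padded
```

Genesis1.txt and Genesis2.txt get a leading zero; Genesis10.txt does not match the pattern and is left alone.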

Quote:
No matter, because I want things to behave in the real world, too.
So, now that I have the magic regex in hand, how can I use it to change the actual filenames?
The usual way is to use the regexp to create the modified name, store it in a variable and then use the content of that variable to change the filename. See below.

Quote:
What I've been able to glean from the web is that there is a system call called "rename" which somehow should work with perl. But there is no mention of "rename" in the perl man page. On the other hand, there is a man page for rename, but it doesn't contain anything that I found illuminating. I'm guessing this is something that has to be called from a script?
"rename" is probably a "perl"-command and internal to this language. In shell code you use "mv", which is short for "move".

The sketch for renaming files would look like this:

Code:
<some pipeline providing a list of filenames> | while IFS= read -r filename ; do
     # insert a 0 between the letter and a single digit directly before ".txt"
     filenew="$(echo "$filename" | sed 's/\([a-z]\)\([0-9]\)\.txt$/\10\2.txt/')"
     mv "$filename" "$filenew"
done
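
Filled in with "find" as the pipeline, the sketch might look like the following. It runs in a made-up throwaway directory; the directory and file names are assumptions for the demonstration, and it is worth trying such a loop on a copy before renaming real files.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/01_Old Testament/01_Genesis"
cd "$workdir/01_Old Testament/01_Genesis" || exit 1
touch Genesis1.txt Genesis2.txt Genesis10.txt

find . -name "*.txt" -type f -print | while IFS= read -r filename ; do
    # Build the new name: insert a 0 between the letter and a lone digit.
    filenew="$(printf '%s\n' "$filename" | sed 's/\([a-z]\)\([0-9]\)\.txt$/\10\2.txt/')"
    # Only rename when the name actually changed (Genesis10.txt stays as it is).
    if [ "$filename" != "$filenew" ] ; then
        mv "$filename" "$filenew"
    fi
done

ls
```

The extra comparison before "mv" avoids a complaint about moving a file onto itself when the name needs no change.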

I hope this helps.

bakunin
 
