Combining multiple files into one with the same name/different extension


 
# 1  
Old 04-29-2010

I've been trying to find information in regard to creating a script that will generate HTML files. I currently have a series of files that contain code I need to surround with a <textarea> tag for easy viewing. I have about a thousand files that contain code, one file that contains the HTML code up to the textarea tag, and then a second file that contains everything from /textarea down. I need a script that will perform this process:

1) Concatenate the contents of file1.txt into an HTML file
2) Concatenate the contents of a code file into the HTML file
3) Concatenate the contents of file2.txt into the HTML
4) Set the HTML file name to the same name as the code file in step 2 with the .html extension
5) Move onto the next file

I looked around online and with a little help from another forum, I wound up with this:

Code:
#!/bin/bash

FILES="*"
for f in "$FILES"
do
        echo "Processing $f file..."
        cat file1.txt $f file2.txt > $f.html
        cat $f
done

Naturally, this doesn't work. Can anyone help me figure out why or perhaps offer a better way? I'm rather new to Linux, so I'd appreciate if you could be as specific and descriptive as possible. Thank you in advance.

Best regards.
# 2  
Old 04-29-2010
If I am understanding you correctly, you have:

1) A header file that contains HTML code that you wish to prepend to:

2) A series of 1,000 files that have text in them followed by:

3) A file with HTML in it that you want to append to the first fixed file and the second variable file.

So the processed file would be:

| FIXED HTML HEADER | + | file 1 - 1,000 files each in turn | + | FIXED HTML tails to first two |

Is this correct? If so, your bash script is close. How is it not working?
# 3  
Old 04-29-2010
Thank you for the reply. When I run the script, it creates one file. I'm looking for one HTML file per code file, so the loop needs to run once for each file in the directory. So using your example:

| HTML HEADER | + | Firstfile.c | + | HTML tails to first two | into Firstfile.html
| HTML HEADER | + | Secondfile.c | + | HTML tails to first two | into Secondfile.html
...
...
| HTML HEADER | + | Lastfile.c | + | HTML tails to first two | into Lastfile.html
# 4  
Old 04-29-2010
...deleted potentially malfunctioning code...

---------- Post updated at 02:16 ---------- Previous update was at 02:05 ----------

Quote:
Originally Posted by 12o
When I run the script, it creates one file.
Code:
for f in "$FILES"

If you quote $FILES, then the "*" does not get "translated" into filenames.
Remove the quotes and it will work.
Besides I think the "cat $f" line is unnecessary.
When the script ends, you will need to manually delete the file1.txt.html, file2.txt.html and nameofyourscript.txt.html files.
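
To see the difference in isolation, here is a minimal sketch, assuming bash and a scratch directory created with mktemp. The file names a.c, b.c and c.c are made up for illustration. It only counts how many times each form of the loop runs:

```shell
#!/bin/bash
# Scratch directory with three made-up code files (hypothetical names).
dir=$(mktemp -d)
cd "$dir" || exit 1
touch a.c b.c c.c

FILES="*"

# Quoted: "$FILES" stays a single word, so the body runs once with f='*'.
quoted=0
for f in "$FILES"; do
    quoted=$((quoted + 1))
done

# Unquoted: $FILES undergoes pathname expansion, one iteration per file.
unquoted=0
for f in $FILES; do
    unquoted=$((unquoted + 1))
done

echo "quoted: $quoted iteration(s), unquoted: $unquoted iteration(s)"
```

In a real directory the unquoted loop would also visit file1.txt, file2.txt and the script itself, which is where the extra .html files mentioned above come from.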

Last edited by pseudocoder; 04-29-2010 at 11:40 PM..
# 5  
Old 04-29-2010
Hi, pseudocoder:

Quote:
Originally Posted by pseudocoder
Code:
$ for i in `ls | egrep -v -e '(file1.txt|file2.txt)'`
> do
> echo "Processing $i file..."
> cat file1.txt $i file2.txt > $i.html
> done
$


I don't mean to nitpick your contribution, but I just wanted to point out an easy way to improve its reliability and efficiency.

That for loop list cannot handle filenames that include IFS characters (by default, this includes spaces, tabs, and newlines). Also, the egrep will not strictly match a file named "file1.txt" or "file2.txt"; "afile1.txt" would match as well.

I would suggest the following alternative, which remedies both issues and doesn't require external executables:

Code:
for i in *; do
    case $i in
        file[12].txt) continue;;
    esac
    ... further processing ...
done
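
As a rough sketch of how the elided processing step could be filled in for the original question: this assumes bash, that file1.txt holds the HTML header and file2.txt the footer as described in the first post, and that each output name should drop the code file's own extension. The scratch directory and the sample file contents are invented for the demonstration.

```shell
#!/bin/bash
# Scratch directory with invented sample data, so nothing real is overwritten.
dir=$(mktemp -d)
cd "$dir" || exit 1
printf '<html><textarea>\n' > file1.txt     # header (placeholder content)
printf '</textarea></html>\n' > file2.txt   # footer (placeholder content)
printf 'int main(void) { return 0; }\n' > "First file.c"
printf 'int x;\n' > Second.c

for i in *; do
    case $i in
        file[12].txt) continue;;   # skip the header and footer themselves
        *.html)       continue;;   # skip output left over from an earlier run
    esac
    [ -f "$i" ] || continue        # skip directories and other non-files
    # ${i%.*} strips the last extension: "Second.c" -> "Second"
    cat file1.txt "$i" file2.txt > "${i%.*}.html"
done

ls
```

If the script itself is stored in the same directory, it would need its own pattern in the case statement, otherwise it gets wrapped in HTML as well.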

Regards,
Alister
# 6  
Old 04-29-2010
First create the fixed header and tail files (text in my case, HTML in your case...)

Code:
$ echo "I am the header" > header.txt
$ echo "I am the tail" > tail.txt
$ cat header.txt
I am the header
$ cat tail.txt 
I am the tail
$ for f in {1..10}; do echo "file name=$f.txt">$f.txt; done
$ cat {1..10}.txt
file name=1.txt
file name=2.txt
file name=3.txt
file name=4.txt
file name=5.txt
file name=6.txt
file name=7.txt
file name=8.txt
file name=9.txt
file name=10.txt

Next combine the header, body (1-10), and tail into new files (1-10) with a new extension:

Code:
$ for f in {1..10}.txt; do cat header.txt "$f" tail.txt > "${f%.txt}.html"; done
$ cat {1..10}.html
I am the header
file name=1.txt
I am the tail
I am the header
file name=2.txt
I am the tail
...
I am the header
file name=10.txt
I am the tail
$

The expansion "${f%.txt}.html" strips the .txt suffix from the file name and appends .html for the output. This form of substitution is documented under Parameter Expansion in the Bash manual.
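
For reference, a short sketch of the four related operators (% %% # ##), assuming bash or any POSIX shell. The file name archive.tar.gz is a made-up example, chosen because it has two extensions:

```shell
#!/bin/bash
# Hypothetical file name with two extensions, to show all four operators.
f="archive.tar.gz"

echo "${f%.*}"    # shortest match removed from the end:   archive.tar
echo "${f%%.*}"   # longest match removed from the end:    archive
echo "${f#*.}"    # shortest match removed from the start: tar.gz
echo "${f##*.}"   # longest match removed from the start:  gz

# The form used above: strip a known suffix, then append the new one.
g="1.txt"
echo "${g%.txt}.html"   # 1.html
```

None of these change the variable itself; they only affect the expanded value.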

---------- Post updated at 07:17 PM ---------- Previous update was at 05:41 PM ----------

Quote:
Originally Posted by pseudocoder
If you quote $FILES, then the "*" does not get "translated" into filenames.
Remove the quotes and it will work.
Not exactly. Quoting does have special meaning, such as creating file names with spaces. Please follow along with these examples:
Code:
$ mkdir test
$ cd test
$ touch "one two three.txt" '3 two 1.txt' "1 2 3.txt" '9 10 11.txt'

But a quoted pattern can still end up being globbed later, when the variable holding it is used without quotes, such as:

Code:
$ for i in "*two*.txt"; do ls -l $i; done
-rw-r--r--  1 andrew  andrew  0 Apr 29 18:32 3 two 1.txt
-rw-r--r--  1 andrew  andrew  0 Apr 29 18:29 one two three.txt
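# notice that the quoted "*two*.txt" stays literal in the for list (the loop runs once);
# the * glob is only expanded later, when the unquoted $i is passed to ls
# you can also use:  for i in *two*.txt; do ls -l "$i"; done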

# or

$ string="one 2 three 4 five 6 seven ocho"
$ for i in $string; do echo $i; done
one
2
three
4
five
6
seven
ocho
# unquoted $string is split on spaces (IFS) and the loop is executed 8 times
$ for i in "$string"; do echo $i; done
one 2 three 4 five 6 seven ocho
# "$string" is quoted, so its value is not split on IFS
$ for i in '$string'; do echo $i; done
$string
# single quotes, no variable expansion....

Finally, if you do not quote the loop variable, you will get unexpected results entirely, such as with a glob that uses a POSIX character class:
Code:
$ for i in [[:digit:]]*.txt; do ls -l $i; done
ls: 1: No such file or directory
ls: 2: No such file or directory
ls: 3.txt: No such file or directory
ls: 1.txt: No such file or directory
ls: 3: No such file or directory
ls: two: No such file or directory
ls: 10: No such file or directory
ls: 11.txt: No such file or directory
ls: 9: No such file or directory
# the unquoted glob with the POSIX [:digit:] character class found the files, but the unquoted $i is then split on IFS inside the loop...
$ for i in "[[:digit:]]*"; do ls -l $i; done
-rw-r--r--  1 andrew  andrew  0 Apr 29 18:59 1 2 3.txt
-rw-r--r--  1 andrew  andrew  0 Apr 29 18:59 3 two 1.txt
-rw-r--r--  1 andrew  andrew  0 Apr 29 18:59 9 10 11.txt
#alternatively:
$ for i in [[:digit:]]*; do ls -l "$i"; done
-rw-r--r--  1 andrew  andrew  0 Apr 29 18:59 1 2 3.txt
-rw-r--r--  1 andrew  andrew  0 Apr 29 18:59 3 two 1.txt
-rw-r--r--  1 andrew  andrew  0 Apr 29 18:59 9 10 11.txt
# BUT! quoting both does not work:
$ for i in "[[:digit:]]*"; do ls -l "$i"; done
ls: [[:digit:]]*: No such file or directory


So long story short:

1) Globs [*?] are not expanded inside double or single quotes; a quoted pattern is only expanded later if the variable holding it is used unquoted,
2) Variable expansion happens inside double quotes but not single quotes.
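
A quick way to check where expansion actually happens is to count the arguments the shell passes along. This is only a sketch, assuming bash and a scratch directory; count_args is a hypothetical helper written for this demonstration, and the file names are the ones from the examples above:

```shell
#!/bin/bash
# Scratch directory with the file names used in the examples above.
dir=$(mktemp -d)
cd "$dir" || exit 1
touch "one two three.txt" '3 two 1.txt' "1 2 3.txt" '9 10 11.txt'

# Hypothetical helper: report how many arguments the shell actually passed.
count_args() { echo "$#"; }

count_args *two*.txt      # 2: unquoted glob, one argument per matching file
count_args "*two*.txt"    # 1: quoted, the pattern itself is passed literally

p="*two*.txt"
count_args $p             # 2: unquoted variable is expanded, then globbed
count_args "$p"           # 1: expanded, but neither split nor globbed
count_args '$p'           # 1: single quotes, the literal text $p

f="one two three.txt"
count_args $f             # 3: unquoted value is split on IFS
count_args "$f"           # 1: the file name stays in one piece
```

The usual way to stay out of trouble is to leave the glob unquoted in the for list and quote the loop variable wherever it is used.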

Alister: PLEASE correct that if incorrect!!! :-}}


Quote:
Originally Posted by pseudocoder
When the script ends, you will need to manually delete file1.txt.html, file2.txt.html and nameofyourscript.txt.html file.
Not if you do the string substitution in Bash using the % # %% ##

Last edited by drewk; 04-29-2010 at 10:12 PM.. Reason: Added link to bash doc...
# 7  
Old 04-29-2010
Alister,
you are completely right! Thank you for the hint(s).

---------- Post updated at 04:37 ---------- Previous update was at 04:26 ----------

drewk,
did you even run 12o's script?
Quote:
Originally Posted by drewk
Not exactly. Quoting does have special meaning, such as creating file names with spaces.
I only referred to 12o's script.
Quote:
Originally Posted by drewk
Not if you do the string substitution in Bash using the % # %% ##
You did not run 12o's script.