Combining multiple files into one with the same name/different extension


 
# 8  
Old 04-30-2010
Quote:
Originally Posted by drewk
Please follow along with these examples:
Code:
$ mkdir test
$ cd test
$ touch "one two three.txt" '3 two 1.txt' "1 2 3.txt" '9 10 11.txt'

But you can still glob and expand variables inside quotes, such as:
Code:
$ for i in "*two*.txt"; do ls -l $i; done
-rw-r--r--  1 andrew  andrew  0 Apr 29 18:32 3 two 1.txt
-rw-r--r--  1 andrew  andrew  0 Apr 29 18:29 one two three.txt
# notice that the * glob expands inside the "*two*.txt"
# you can also use:  for i in *two*.txt; do ls -l "$i"; done

Globbing (aka pathname expansion) never occurs within any quotes (single or double).

In that case, the string "*two*.txt" in the for loop's list is not expanded, so the loop always executes exactly once, with $i set to the literal string *two*.txt. The globbing happens later, when the ls command is executed: $i expands to *two*.txt and, because that expansion is unquoted, the result then undergoes pathname expansion. In short, file globbing occurs after variable expansion, provided the variable expansion is unquoted.

Because globbing never occurs within quotes, nothing is expanded at all when you quote both the for loop's list and the variable inside the loop.
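
A minimal sketch of that difference, assuming a POSIX-style sh and a scratch directory created with mktemp (the directory and the counter variable names are only for illustration; the file names are the ones from the quoted example). It counts how many times each loop body runs:

```shell
# Scratch directory (an assumption for this sketch) holding the thread's example files.
dir=$(mktemp -d) && cd "$dir" || exit 1
touch "one two three.txt" "3 two 1.txt" "1 2 3.txt" "9 10 11.txt"

# Quoted list: no pathname expansion, so the loop runs once with the literal pattern.
quoted_count=0
for i in "*two*.txt"; do
    quoted_count=$((quoted_count + 1))
    quoted_value=$i
done

# Unquoted list: the pattern expands to one word per matching file name.
unquoted_count=0
for i in *two*.txt; do
    unquoted_count=$((unquoted_count + 1))
done

printf 'quoted list:   %d iteration(s), i=%s\n' "$quoted_count" "$quoted_value"
printf 'unquoted list: %d iteration(s)\n' "$unquoted_count"
```

With the four files above, the quoted loop should report one iteration with i set to the pattern itself, and the unquoted loop one iteration per matching file.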


Quote:
Originally Posted by drewk
Finally, if you do not quote certain forms, you will get unexpected results entirely, such as with a regex:
Code:
$ for i in [[:digit:]]*.txt; do ls -l $i; done
ls: 1: No such file or directory
ls: 2: No such file or directory
ls: 3.txt: No such file or directory
ls: 1.txt: No such file or directory
ls: 3: No such file or directory
ls: two: No such file or directory
ls: 10: No such file or directory
ls: 11.txt: No such file or directory
ls: 9: No such file or directory
# the unquoted regex with POSIX [:digit:] character class found the files but then the shell breaks the return on IFS...

That's correct, but let's be clear about where the field splitting is occurring. What is being split is the unquoted expansion of $i when the ls command is run. Pathname expansion, when it occurs, happens after field splitting (it actually happens after everything except quote removal).
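
The splitting can be observed without ls at all. This is a sketch assuming a POSIX-style sh with the default IFS; the variable names unquoted_words and quoted_words are made up for the illustration:

```shell
i='1 2 3.txt'    # one file name, as the for loop would hand it to the loop body

set -- $i        # unquoted: the expansion is split on IFS into separate words
unquoted_words=$#

set -- "$i"      # quoted: the expansion stays a single word
quoted_words=$#

printf 'unquoted: %d words, quoted: %d word\n' "$unquoted_words" "$quoted_words"
```

The three words from the unquoted form correspond to the three "No such file or directory" complaints that ls printed for this file name.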


Quote:
Originally Posted by drewk
Code:
$ for i in "[[:digit:]]*"; do ls -l $i; done
-rw-r--r--  1 andrew  andrew  0 Apr 29 18:59 1 2 3.txt
-rw-r--r--  1 andrew  andrew  0 Apr 29 18:59 3 two 1.txt
-rw-r--r--  1 andrew  andrew  0 Apr 29 18:59 9 10 11.txt
#alternatively:
$ for i in [[:digit:]]*; do ls -l "$i"; done
-rw-r--r--  1 andrew  andrew  0 Apr 29 18:59 1 2 3.txt
-rw-r--r--  1 andrew  andrew  0 Apr 29 18:59 3 two 1.txt
-rw-r--r--  1 andrew  andrew  0 Apr 29 18:59 9 10 11.txt

In this particular instance, those two loops give the same result, but they are not interchangeable. Two very different things are happening. In the first instance, ls is called once, with three arguments. In the second, it is called three times, with one argument each time.

In the first, "[[:digit:]]*" is quoted, so no globbing occurs when the for loop's list is evaluated. The for loop will only execute once. After $i is expanded, the resulting "[[:digit:]]*" will not be split into fields since it does not contain any IFS characters (by default, space, tab, and newline). And then, finally, pathname expansion will occur and correctly deal with the whitespace that those filenames contain, because the globbing occurs after the field splitting step. But, remember, since the for loop list is quoted, it can only ever result in one word and the loop will only execute once (not much point in having a loop, in that case ... just write: ls [[:digit:]]*).

In the second version, since the for loop's list is not quoted, [[:digit:]]* will expand to all the filenames that match the pattern and the loop will execute once per filename. So, obviously, these two forms are not equivalent. Depending on what you are doing, you may need access to each individual filename, which is not possible with the first version.

Example that makes it clear (1 iteration versus 3):
Code:
$ j=0; for i in "[[:digit:]]*"; do printf '%d: ' $((++j)); ls $i; done
1: 1 2 3.txt    3 two 1.txt     9 10 11.txt
$ j=0; for i in [[:digit:]]*; do printf '%d: ' $((++j)); ls "$i"; done
1: 1 2 3.txt
2: 3 two 1.txt
3: 9 10 11.txt
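
The argument counts themselves can be made visible in a similar way. A rough sketch, assuming a POSIX-style sh, a scratch directory from mktemp, and a hypothetical helper named count_args that stands in for ls:

```shell
# Scratch directory (an assumption for this sketch) holding the thread's example files.
dir=$(mktemp -d) && cd "$dir" || exit 1
touch "one two three.txt" "3 two 1.txt" "1 2 3.txt" "9 10 11.txt"

count_args() { echo "$#"; }   # hypothetical helper: prints how many arguments it received

# Quoted list, unquoted $i: one call, and the glob expands into several arguments.
for i in "[[:digit:]]*"; do first=$(count_args $i); done

# Unquoted list, quoted "$i": one call per file, one argument each.
second=
for i in [[:digit:]]*; do second="$second$(count_args "$i") "; done

printf 'first form:  %s\n' "$first"
printf 'second form: %s\n' "$second"
```

The first form should print a single count covering all three matching files, while the second prints a count of 1 for each file.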

Quote:
Originally Posted by drewk
Code:
# BUT! quoting both does not work:
$ for i in "[[:digit:]]*"; do ls -l "$i"; done
ls: [[:digit:]]*: No such file or directory

So long story short:

1) Globs [*?] glob inside double or single quotes at the shell to expand file names,
2) Variable expansion happens inside double quotes but not single quotes.

Alister: PLEASE correct that if incorrect!!! :-}}

Globs inside quoted strings are never expanded. :)

Regards,
Alister

Last edited by alister; 04-30-2010 at 01:24 AM..
# 9  
Old 04-30-2010
Thank you very much for all of the information!!!! That's awesome! I tried pseudocoder's first script that was deleted and it seems to have worked as far as I could tell. But the warning by Alister made me a bit paranoid. And everything after that is a bit over my head. I wasn't sure just how to adapt this part:

Code:
for i in *; do
    case $i in
        file[12].txt) continue;;
    esac
    ... further processing ...
done

I would definitely like to know exactly how this works. I'm trying to understand the problem that arises with 'IFS characters' and how to control/manipulate text in files. How can this script be set up to perform what I need to happen? Thanks again.

Best regards.
# 10  
Old 04-30-2010
I think drewk's solution should fit your needs. Assuming that the files you want to process all have the extension .c, that your HTML header file (everything up to textarea) is h.html, and that your HTML footer file (everything after textarea) is f.html:

Code:
for f in *.c; do
    cat h.html "$f" f.html > "${f%.c}.html"
done
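
To try the loop safely before running it on real data, something like the following could be used. It is only a sketch: the scratch directory, the placeholder header and footer contents, and the sample file name 'my prog.c' are assumptions, and the `[ -e "$f" ]` guard is an addition for the case where no .c files exist (the unmatched pattern would otherwise be passed to cat literally):

```shell
# Scratch directory with placeholder files (all assumptions for this sketch).
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '<html><body><textarea>\n' > h.html       # placeholder header
printf '</textarea></body></html>\n' > f.html    # placeholder footer
printf 'int main(void) { return 0; }\n' > 'my prog.c'

for f in *.c; do
    [ -e "$f" ] || continue                   # nothing matched: skip the literal pattern
    cat h.html "$f" f.html > "${f%.c}.html"   # ${f%.c} strips the trailing .c
done

cat 'my prog.html'
```

Because "$f" and "${f%.c}.html" are quoted, a file name containing spaces is handled as a single word throughout.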

# 11  
Old 04-30-2010
Alister, you are the sh glob god. :)