Combining multiple files into one with the same name/different extension

04-30-2010

Quote:

Originally Posted by drewk

Please follow along with these examples:

Code:

$ mkdir test
$ cd test
$ touch "one two three.txt" '3 two 1.txt' "1 2 3.txt" '9 10 11.txt'

But you can still glob and expand variables inside quotes, such as:

Code:

$ for i in "*two*.txt"; do ls -l $i; done
-rw-r--r--  1 andrew  andrew  0 Apr 29 18:32 3 two 1.txt
-rw-r--r--  1 andrew  andrew  0 Apr 29 18:29 one two three.txt
# notice that the * glob expands inside the "*two*.txt"
# you can also use:  for i in *two*.txt; do ls -l "$i"; done

Globbing (aka pathname expansion) never occurs within any quotes (single or double).

In your example, the string "*two*.txt" in the for loop's list is not expanded. The loop always executes exactly once, and during that single iteration the value of $i is the literal string *two*.txt. When the ls command runs, $i expands to *two*.txt and, because that expansion is unquoted, the globbing occurs at that point. In short, pathname expansion happens after variable expansion, provided the variable expansion is unquoted.

Because globbing never occurs within quotes, nothing is expanded when you quote both the for loop's list and the variable inside the loop.
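To see the difference in iteration counts for yourself, here is a minimal sketch. It assumes a scratch directory created with mktemp -d (not part of the original example) and uses the same filenames as the test directory above; the counter variables are made up for the demonstration.

```shell
# Scratch directory holding the thread's sample files.
# (mktemp -d is an assumption for the demo; any empty directory works.)
dir=$(mktemp -d) && cd "$dir" || exit 1
touch "one two three.txt" "3 two 1.txt" "1 2 3.txt" "9 10 11.txt"

# Quoted list: no pathname expansion, so the body runs once
# and $i holds the pattern itself.
quoted_runs=0
for i in "*two*.txt"; do
    quoted_runs=$((quoted_runs + 1))
    quoted_value=$i
done

# Unquoted list: the pattern expands to one word per matching file.
unquoted_runs=0
for i in *two*.txt; do
    unquoted_runs=$((unquoted_runs + 1))
done

printf 'quoted:   %d iteration(s), i=%s\n' "$quoted_runs" "$quoted_value"
printf 'unquoted: %d iteration(s)\n' "$unquoted_runs"
```

The quoted loop reports a single iteration with i set to the pattern text, while the unquoted loop iterates once for each of the two matching files.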

Quote:

Originally Posted by drewk

Finally, if you do not quote certain forms, you will get unexpected results entirely, such as with a regex:

Code:

$ for i in [[:digit:]]*.txt; do ls -l $i; done
ls: 1: No such file or directory
ls: 2: No such file or directory
ls: 3.txt: No such file or directory
ls: 1.txt: No such file or directory
ls: 3: No such file or directory
ls: two: No such file or directory
ls: 10: No such file or directory
ls: 11.txt: No such file or directory
ls: 9: No such file or directory
# the unquoted regex with POSIX [:digit:] character class found the files but then the shell breaks the return on IFS...

That's correct, but let's be clear about where the field splitting is occurring. What is being split is the unquoted expansion of $i when the ls command is run. Pathname expansion, when it occurs, happens after field splitting (it is the last step of all, apart from quote removal).
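A small sketch of that ordering, assuming a scratch directory from mktemp -d that holds a single file. It uses set -- to count the words each expansion produces; the variable names are only for illustration.

```shell
# Scratch directory with one filename that contains spaces.
# (mktemp -d is an assumption for the demo.)
dir=$(mktemp -d) && cd "$dir" || exit 1
touch "1 2 3.txt"

for i in [[:digit:]]*.txt; do
    # Unquoted: $i is split on IFS into three words: 1, 2 and 3.txt
    set -- $i
    unquoted_words=$#
    # Quoted: the filename stays a single word
    set -- "$i"
    quoted_words=$#
done

# Pathname expansion runs after field splitting, so the filename
# produced by the glob is not split again.
pattern='*.txt'
set -- $pattern
glob_words=$#
glob_first=$1

printf 'unquoted $i: %d word(s)\n' "$unquoted_words"
printf 'quoted $i:   %d word(s)\n' "$quoted_words"
printf 'unquoted $pattern: %d word(s): %s\n' "$glob_words" "$glob_first"
```

The unquoted filename turns into three words, which is exactly what ls complained about. The unquoted pattern yields one intact filename, because the spaces only appear after the splitting step is already over.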

Quote:

Originally Posted by drewk

Code:

$ for i in "[[:digit:]]*"; do ls -l $i; done
-rw-r--r--  1 andrew  andrew  0 Apr 29 18:59 1 2 3.txt
-rw-r--r--  1 andrew  andrew  0 Apr 29 18:59 3 two 1.txt
-rw-r--r--  1 andrew  andrew  0 Apr 29 18:59 9 10 11.txt
#alternatively:
$ for i in [[:digit:]]*; do ls -l "$i"; done
-rw-r--r--  1 andrew  andrew  0 Apr 29 18:59 1 2 3.txt
-rw-r--r--  1 andrew  andrew  0 Apr 29 18:59 3 two 1.txt
-rw-r--r--  1 andrew  andrew  0 Apr 29 18:59 9 10 11.txt

In this particular instance, those two loops give the same result, but they are not interchangeable. Two very different things are happening. In the first instance, ls is called once, with three arguments. In the second, it is called three times, with one argument each time.

In the first, "[[:digit:]]*" is quoted, so no globbing occurs when the for loop's list is evaluated. The for loop will only execute once. After $i is expanded, the resulting "[[:digit:]]*" will not be split into fields since it does not contain any IFS characters (by default, space, tab, and newline). And then, finally, pathname expansion will occur and correctly deal with the whitespace that those filenames contain, because the globbing occurs after the field splitting step. But, remember, since the for loop list is quoted, it can only ever result in one word and the loop will only execute once (not much point in having a loop, in that case ... just write: ls [[:digit:]]*).

In the second version, since the for loop's list is not quoted, [[:digit:]]* will expand to all the filenames that match the pattern and the loop will execute once per filename. So, obviously, these two forms are not equivalent. Depending on what you are doing, you may need access to each individual filename, which is not possible with the first version.

Example that makes it clear (1 iteration versus 3):

Code:

$ j=0; for i in "[[:digit:]]*"; do printf '%d: ' $((++j)); ls $i; done
1: 1 2 3.txt    3 two 1.txt     9 10 11.txt
$ j=0; for i in [[:digit:]]*; do printf '%d: ' $((++j)); ls "$i"; done
1: 1 2 3.txt
2: 3 two 1.txt
3: 9 10 11.txt

Quote:

Originally Posted by drewk

Code:

# BUT! quoting both does not work:
$ for i in "[[:digit:]]*"; do ls -l "$i"; done
ls: [[:digit:]]*: No such file or directory

So long story short:

1) Globs [*?] glob inside double or single quotes at the shell to expand file names,
2) Variable expansion happens inside double quotes but not single quotes.

Alister: PLEASE correct that if incorrect!!! :-}}

Globs inside quoted strings are never expanded.
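As a quick check of both points in that summary, here is a minimal sketch. The scratch directory, the filenames a.txt and b.txt, and the variable name are all assumptions made for the demonstration.

```shell
# Scratch directory with two files to match against.
# (mktemp -d and the filenames are assumptions for the demo.)
dir=$(mktemp -d) && cd "$dir" || exit 1
touch a.txt b.txt
name=world

set -- *.txt;   unquoted_glob=$#   # 2: the pattern is expanded
set -- "*.txt"; double_glob=$1     # literal *.txt, no expansion
set -- '*.txt'; single_glob=$1     # literal *.txt, no expansion

double_var="hello $name"           # variable expanded: hello world
single_var='hello $name'           # left alone: hello $name

printf '%s\n' "$unquoted_glob" "$double_glob" "$single_glob" \
    "$double_var" "$single_var"
```

Only the unquoted pattern is expanded. The variable is expanded inside double quotes and left untouched inside single quotes.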

Regards,
Alister

Last edited by alister; 04-30-2010 at 01:24 AM..

alister

04-30-2010

Thank you very much for all of the information! That's awesome! I tried pseudocoder's first script, the one that was deleted, and it seems to have worked as far as I could tell. But the warning by Alister made me a bit paranoid, and everything after that is a bit over my head. I wasn't sure just how to adapt this part:

Code:

for i in *; do
    case $i in
        file[12].txt) continue;;
    esac
    # ... further processing ...
done

I would definitely like to know exactly how this works. I'm trying to understand the problem that arises with 'IFS characters' and how to control/manipulate text in files. How can this script be set up to perform what I need to happen? Thanks again.

Best regards.

12o

04-30-2010

I think drewk's solution should fit your needs. Assuming that the files you want to process all have the extension .c, and that your html header file (everything up to textarea) is h.html, and that your html footer file (everything after textarea) is f.html:

Code:

for f in *.c; do
    cat h.html "$f" f.html > "${f%.c}.html"
done
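If you want to try that loop safely first, here is a rough sketch that runs it in a scratch directory and also shows how the case/continue guard you asked about fits in. Everything in the setup is an assumption for the demonstration: the mktemp -d directory, the contents of h.html and f.html, and the sample filenames "my prog.c" and skip.c.

```shell
# Scratch directory with stand-in header, footer and source files.
# (All names and contents below are made up for the demo.)
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '<html><body><textarea>\n' > h.html
printf '</textarea></body></html>\n' > f.html
printf 'int main(void) { return 0; }\n' > "my prog.c"
printf '/* not wanted */\n' > skip.c

for f in *.c; do
    # Skip any file whose name matches a pattern listed here.
    case $f in
        skip.c) continue;;
    esac
    # Quoting "$f" keeps a name with spaces together as one argument.
    # ${f%.c} strips the trailing .c so the output becomes NAME.html.
    cat h.html "$f" f.html > "${f%.c}.html"
done

ls
```

The unquoted *.c in the for list produces one word per file, and the quoted "$f" inside the loop prevents field splitting, so "my prog.c" ends up as "my prog.html". The continue statement jumps straight to the next filename, so no skip.html is created.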

alister

04-30-2010

Alister you are the sh glob god.

drewk
