Parameter expansion not working for all strings...


 
# 1  
Old 11-01-2011

I'm trying to write a script that parses my music collection and hard-links files whose names my media player doesn't like to new names.

To do this I need to extract the name, remove all non-ASCII characters from it, and do a cp -l with the result.

Problem is this:

Code:
22:16:58 $ find . -wholename "*" -print
./Simon & Garfunkel - The Essential Simon & Garfunkel (2003)/CD1/15 - Simon & Garfunkel - The Dangling Conversation (Album Version).flac
./José González - In Our Nature/06 Abram.flac
./Ane Brun (2004) - A Temporary Dive [FLAC]/09 Ane Brun - Song No. 6.flac

Code:
22:18:28 $ find . -wholename "*" -print| while read line; do echo ${line//[^a-z]/};done
SimonGarfunkelTheEssentialSimonGarfunkelCDSimonGarfunkelTheDanglingConversationAlbumVersionflac
./José González - In Our Nature/06 Abram.flac
AneBrunATemporaryDiveFLACAneBrunSongNoflac

Of course I realize that those names are gibberish, but what puzzles me is why the "./José González - In Our Nature/06 Abram.flac" line is unaffected.


Code:
22:21:12 $ bash --version
bash --version
GNU bash, version 4.2.10(1)-release (x86_64-pc-linux-gnu)
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>

This is free software; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

My guess is that it has something to do with é but I wouldn't know.

Any ideas what could be the problem?

Thanks
# 2  
Old 11-01-2011
I'm guessing those "spaces" aren't spaces at all but some weird Unicode space-like character.

Try feeding the output of find there into hexdump -C so we can see what the hex bytes in that filename are.
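
A minimal sketch of that check, for anyone following along. The `-name '*Abram*'` filter is only an assumption to narrow find down to the one file, and the runnable part uses `od -An -tx1` instead of `hexdump -C` because od is specified by POSIX; either tool shows the same bytes.

```shell
#!/bin/bash
# On the real collection (the -name filter is an assumption for illustration):
#   find . -name '*Abram*' -print | hexdump -C

# Self-contained variant: dump the raw bytes of any string as hex.
# od -An -tx1 prints one hex byte per column without the offset column.
dump_bytes() {
    printf '%s' "$1" | od -An -tx1
}

dump_bytes 'José'
```

If the script is saved as UTF-8, the é should show up as the two bytes c3 a9 rather than a single byte.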
# 3  
Old 11-01-2011
But they are. Hexdump and translation to Unicode gives:
Code:
U+002E FULL STOP character (.)
U+002F SOLIDUS character (/)
U+004A LATIN CAPITAL LETTER J character
U+006F LATIN SMALL LETTER O character
U+0073 LATIN SMALL LETTER S character
U+00E9 LATIN SMALL LETTER E WITH ACUTE character (é)
U+0020 SPACE character
U+0047 LATIN CAPITAL LETTER G character
U+006F LATIN SMALL LETTER O character
U+006E LATIN SMALL LETTER N character
U+007A LATIN SMALL LETTER Z character
U+00E1 LATIN SMALL LETTER A WITH ACUTE character (á)
U+006C LATIN SMALL LETTER L character
U+0065 LATIN SMALL LETTER E character
U+007A LATIN SMALL LETTER Z character
U+0020 SPACE character
U+002D HYPHEN-MINUS character (-)
U+0020 SPACE character
U+0049 LATIN CAPITAL LETTER I character
U+006E LATIN SMALL LETTER N character
U+0020 SPACE character
U+004F LATIN CAPITAL LETTER O character
U+0075 LATIN SMALL LETTER U character
U+0072 LATIN SMALL LETTER R character
U+0020 SPACE character
U+004E LATIN CAPITAL LETTER N character
U+0061 LATIN SMALL LETTER A character
U+0074 LATIN SMALL LETTER T character
U+0075 LATIN SMALL LETTER U character
U+0072 LATIN SMALL LETTER R character
U+0065 LATIN SMALL LETTER E character
U+002F SOLIDUS character (/)
U+0030 DIGIT ZERO character (0)
U+0036 DIGIT SIX character (6)
U+0020 SPACE character
U+0041 LATIN CAPITAL LETTER A character
U+0062 LATIN SMALL LETTER B character
U+0072 LATIN SMALL LETTER R character
U+0061 LATIN SMALL LETTER A character
U+006D LATIN SMALL LETTER M character
U+002E FULL STOP character (.)
U+0066 LATIN SMALL LETTER F character
U+006C LATIN SMALL LETTER L character
U+0061 LATIN SMALL LETTER A character
U+0063 LATIN SMALL LETTER C character
U+000A <control> character

And even if they weren't, wouldn't they be changed by ${line//[^a-z]/} since they are not [a-z]?

:/


[edit]:

And by the way, if I use sed to do the substitution it works on the José... lines too; it even removes some of them completely.

Code:
22:56:50 $ find . -iname "*" -print| while read line; do echo $(line | sed -e 's/[^a-zA-Z]//g' );done
SimonGarfunkelTheEssentialSimonGarfunkelCDSimonGarfunkelTheDanglingConversationAlbumVersionflac
AneBrunATemporaryDiveFLACAneBrunToLetMyselfGoflac
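
A note on the command above: `$(line | sed ...)` does not expand the variable, it runs a command called `line`. If such a utility is installed (some systems of that era shipped one that copies a single line of standard input), it would read the next filename from the pipe itself, which may be why some lines vanish and the last line differs from the earlier run. A sketch of what was presumably intended follows; the function name `letters_only` and the `LC_ALL=C` prefix are additions for illustration, not part of the original command.

```shell
#!/bin/bash
# Assumed intent: pass each filename through sed, keeping ASCII letters only.
# LC_ALL=C is an addition so that [^a-zA-Z] is matched byte by byte.
letters_only() {
    while IFS= read -r line; do
        printf '%s\n' "$line" | LC_ALL=C sed -e 's/[^a-zA-Z]//g'
    done
}

# On the real collection:
#   find . -iname "*" -print | letters_only
printf '%s\n' './Ane Brun (2004) - A Temporary Dive [FLAC]/09 Ane Brun - Song No. 6.flac' | letters_only
```

`IFS= read -r` keeps leading spaces and backslashes in filenames intact, and quoting "$line" stops the shell from globbing the `[FLAC]` part.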


Last edited by refuser; 11-01-2011 at 07:00 PM..
# 4  
Old 11-02-2011
In translating it to unicode, you've translated it to unicode...

What was it originally?
# 5  
Old 11-02-2011
Oh! =)

Hexdump gives:
2E 2F 4A 6F 73 C3 A9 20 47 6F 6E 7A C3 A1 6C 65 7A 20 2D 20 49 6E 20 4F 75 72 20 4E 61 74 75 72 65 2F 30 36 20 41 62 72 61 6D 2E 66 6C 61 63 0A

It looks to me as if the spaces are ordinary ones (20) and that é and á are the only strange letters.

But shouldn't any weird character be handled by ${line//[^a-z]/}? It would be NOT a-z, so it matches [^a-z] and should therefore be replaced with nothing.
# 6  
Old 11-02-2011
I do this to get that exact string:

Code:
STRING=$(echo " 2E 2F 4A 6F 73 C3 A9 20 47 6F 6E 7A C3 A1 6C 65 7A 20 2D 20 49 6E 20 4F 75 72 20 4E 61 74 75 72 65 2F 30 36 20 41 62 72 61 6D 2E 66 6C 61 63 0A" |
        sed 's/ /\\\\x/g' | xargs echo -e)

echo "${STRING//[^a-z]/}"
osonzleznuraturebramflac
$ bash --version
GNU bash, version 4.1.7(2)-release (i686-pc-linux-gnu)
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>

This is free software; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
$

It ought to substitute, I don't know why yours doesn't. Perhaps a bug, or an older shell with limited features.
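
One variable worth ruling out before calling it a bug is the locale. The bash manual notes that outside the C locale a range such as [a-z] may follow dictionary order, so it can include uppercase (and accented) letters. The sketch below forces the C locale for just the expansion; the function name is hypothetical, and this is a test to try, not a confirmed explanation of the behaviour reported here.

```shell
#!/bin/bash
# Hypothetical helper: keep only the bytes a-z, matching in the C locale.
# The subshell stops the LC_ALL change from leaking into the caller.
strip_to_lowercase_ascii() {
    (
        LC_ALL=C
        printf '%s\n' "${1//[^a-z]/}"
    )
}

strip_to_lowercase_ascii './José González - In Our Nature/06 Abram.flac'
```

With bytewise matching this should print the same osonzleznuraturebramflac string shown above. If the affected system prints something else even with LC_ALL=C, the locale is probably not the culprit.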
# 7  
Old 11-02-2011
Must be a bug then, since I have bash 4.2.10(1)... Guess that's what I get for updating to the latest Ubuntu release =/

Thanks for your help

---------- Post updated at 09:40 PM ---------- Previous update was at 08:50 PM ----------

So, tested on Ubuntu 10.04, 11.04, 11.10 and Debian 6.0.3 and they all have their quirks.

Debian & Ubuntu 10.04/11.04: JoséGonzálesInOurNatureAbramflac
That is not correct: the characters JéGáIONA should have been removed.

Ubuntu 11.10: ./José Gonzáles - In Our Nature/06 Abram.flac
Does not work at all.

Any idea how I could report this bug?
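
On reporting: bash ships a `bashbug` script that mails a report to the maintainers, and the distributions have their own trackers too. Either way, a report is easier to act on with a small reproducer that records the version, the locale, and the exact input. A rough sketch, with the filename rebuilt from the bytes in post #5 so no music files are needed (the function name `report` is made up for this example):

```shell
#!/bin/bash
# Reproducer sketch: the name is assembled from the hex bytes in post #5.
name=$'./Jos\xc3\xa9 Gonz\xc3\xa1lez - In Our Nature/06 Abram.flac'

report() {
    printf 'bash:    %s\n' "$BASH_VERSION"
    printf 'locale:  LANG=%s LC_ALL=%s\n' "${LANG-unset}" "${LC_ALL-unset}"
    printf 'input:   %s\n' "$name"
    # The expansion under test, exactly as used in the first post.
    printf 'result:  %s\n' "${name//[^a-z]/}"
}

report
```

Running it once as-is and once with LC_ALL=C set in the environment gives two outputs that can be pasted into the report side by side.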