Equivalence classes don't work


 
# 8  
Old 12-16-2019
Quote:
Originally Posted by jim mcnamara
Looks okay. Next, tr has problems with equivalence classes.
Code:
[aªáàâãäå]

This is the long form of an equivalence class. Try it (use whatever letter is handy)
Code:
echo "aªáàâãäå" | sed 's/[aªáàâãäå...]/a/g'

On Linux this fails for me:
Code:
$ echo "aªáàâãäå" | sed 's/[=a=]/x/g'
xªáàâãäå

The tr man page I have:

Try sed and use full classes to get past GNU problems. For Solaris I have no good answers: my home version is Solaris 9, which is not POSIX compliant.
I see. I was aware GNU sed had issues with multi-byte characters, as I mentioned in my first post. I was just confused about why it didn't work on Solaris either.
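
A side note on the quoted sed test: an equivalence class is only recognised inside a bracket expression, so `[=a=]` on its own is an ordinary set containing `=` and `a`, while `[[=a=]]` is the equivalence class of `a`. A minimal sketch with ASCII-only input, assuming a sed whose regex library supports equivalence classes (the helper names `bare` and `nested` are made up for illustration):

```shell
# Bare [=a=] is a plain bracket expression: it matches '=' or 'a'.
bare() { printf '%s\n' "$1" | sed 's/[=a=]/x/g'; }

# [[=a=]] places the equivalence class inside a bracket expression,
# so only 'a' and whatever the locale treats as equivalent to it match.
nested() { printf '%s\n' "$1" | sed 's/[[=a=]]/x/g'; }

bare 'a=b'     # prints xxb
nested 'a=b'   # prints x=b when equivalence classes are supported
```

Whether `nested` also replaces accented forms such as á still depends on the locale and the regex implementation.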


Quote:
Originally Posted by Scrutinizer
On Solaris 10, I tried the following, using the POSIX compliant utilities which are in /usr/xpg[46]/bin:

Code:
$ export PATH=/usr/xpg6/bin:/usr/xpg4/bin:$PATH
$ printf "%s\n" Estrés Miraré http://ën.wikipedia.org | LC_CTYPE=es_MX.UTF-8 LC_COLLATE=es_MX.UTF-8 tr '[=ë=]' x
Estrés
Miraré
http://xn.wikipedia.org
$ printf "%s\n" Estrés Miraré http://ën.wikipedia.org | LC_CTYPE=es_MX.UTF-8 LC_COLLATE=es_MX.UTF-8 sed 's/[[=ë=]]/x/g'
xstrxs
Mirarx
http://xn.wikipxdia.org

So tr did not work, but sed did.

On Linux I had the same experience, except that tr also gave an error message. It appears that tr only handles single-byte characters and does not understand equivalence classes. sed, however, worked:

Code:
$ printf "%s\n" Estrés Miraré http://ën.wikipedia.org | LC_CTYPE=es_MX.UTF-8 LC_COLLATE=es_MX.UTF-8 tr '[=ë=]' x
tr: \303\253: equivalence class operand must be a single character
$ printf "%s\n" Estrés Miraré http://ën.wikipedia.org | LC_CTYPE=es_MX.UTF-8 LC_COLLATE=es_MX.UTF-8 sed 's/[[=ë=]]/x/g'
xstrxs
Mirarx
http://xn.wikipxdia.org

This works! I forgot that you can use bracket expressions like this in a sed substitution.
Thank you all for your help!
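
Where equivalence classes turn out not to be usable, the "long form" from the quoted advice can be pushed one step further: one substitution per accented character, so nothing depends on how bracket expressions treat multi-byte characters. A rough sketch, assuming the script and the input are both UTF-8 with precomposed characters; `fold_a` is a made-up helper name, not something from the thread:

```shell
# fold_a: read stdin and replace each listed variant of "a" with a plain "a".
# Each variant gets its own s command, so no bracket expression is involved.
fold_a() {
    sed -e 's/á/a/g' -e 's/à/a/g' -e 's/â/a/g' \
        -e 's/ã/a/g' -e 's/ä/a/g' -e 's/å/a/g' -e 's/ª/a/g'
}

printf '%s\n' 'aªáàâãäå' | fold_a   # prints aaaaaaaa
```

The list has to be maintained by hand for every letter of interest, which is the price for not relying on the locale's collation data.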

Now, I assume that in this case the problem wasn't that equivalence classes didn't work, but it had something to do with tr. But I don't understand why they don't work in globs either:
Code:
$ ls -1
bin
Descargas
Documentos
Escritorio
Imágenes
Música
Plantillas
Público
Vídeos

$ printf '%s\n' *[[=u=]]*
Documentos
Estudio

Shouldn't Música and Público have appeared in the output of printf?
# 9  
Old 12-16-2019
It appears that it has been implemented in the system's regex engine, but that it does not work with globbing. On Linux, in bash 4 compare:
Code:
$ touch Miraré
$ for file in M*; do if [[ $file == M*[[=e=]]* ]]; then echo "$file"; fi; done
$ for file in M*; do if [[ $file =~ ^M.*[[=e=]] ]]; then echo "$file"; fi; done
Miraré
$

I found this in the Linux Standard Base Core Specification 4.1, Pattern Matching Notation:
Quote:
Utilities that perform filename pattern matching (also known as Filename Globbing) shall do it as specified in POSIX 1003.1-2001 (ISO/IEC 9945-2003), Pattern Matching Notation, with the following exceptions:

Pattern bracket expressions (such as [a-z]) can be based on code point order instead of collating element order.

Equivalence class expression (such as [=a=]) and multi-character collating element expression (such as [.ch.]) are optional.

Handling of a multi-character collating element is optional.

This affects at least the following utilities: cpio (cpio), find and tar (tar).
I do not know what the case is with Solaris 10. It may be that equivalence classes were not specified in POSIX.1-2001. Perhaps it was added in Solaris 11; you would have to try that out...
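
Since the glob does not honour equivalence classes here, one possible workaround is to expand a plain glob and filter the names with a regular expression. The sketch below uses `expr` so that it also runs in a plain POSIX sh; in bash the `[[ $file =~ ... ]]` test shown above does the same job. `match_names` is a hypothetical helper, and the pattern is assumed to be a basic regular expression without `\( \)` groups:

```shell
# match_names REGEX NAME...: print each NAME that contains a match for REGEX.
match_names() {
    re=$1
    shift
    for name in "$@"; do
        # expr anchors the pattern at the start, so allow any prefix with .*;
        # the leading X keeps odd names from being read as expr operators.
        if expr "X$name" : "X.*$re" >/dev/null; then
            printf '%s\n' "$name"
        fi
    done
}

# Names containing u or, locale permitting, a letter equivalent to u.
match_names '[[=u=]]' bin Documentos Música Público
```

In a directory, `match_names '[[=u=]]' *` would play the role of `*[[=u=]]*`. Whether Música and Público are printed still depends on the locale and on the regex library, as with the sed examples.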

Last edited by Scrutinizer; 12-16-2019 at 11:58 PM..
# 10  
Old 12-19-2019
Quote:
Originally Posted by Scrutinizer
It appears that it has been implemented in the system's regex engine, but that it does not work with globbing. On Linux, in bash 4 compare:
Code:
$ touch Miraré
$ for file in M*; do if [[ $file == M*[[=e=]]* ]]; then echo "$file"; fi; done
$ for file in M*; do if [[ $file =~ ^M.*[[=e=]] ]]; then echo "$file"; fi; done
Miraré
$

I found this in the Linux Standard Base Core Specification 4.1, Pattern Matching Notation


I do not know what the case is with Solaris 10. It may be that equivalence classes were not specified in POSIX.1-2001. Perhaps it was added in Solaris 11; you would have to try that out...
I see. Well, on the one hand it's good they're available in regular expressions but it would be convenient to have them available as globs as well, especially for characters that I can't easily type using my keyboard without a long key combination and memorizing numbers.


Thanks for your help, Scrutinizer!