Unable to identify the special characters beyond the range of "[\x80-\xFF]"
Post 302957752 by jim mcnamara on Wednesday 14th of October 2015 02:41:32 PM
I know this is not about Python per se, but there are regex tools for extended character sets, Unicode being one of them:

regex - matching unicode characters in python regular expressions - Stack Overflow
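
As a rough sketch of the kind of thing that link covers (my own illustration, not code from the linked answer): Python 3's `re` module works on Unicode strings, so a negated range can pick out every character above U+00FF, which is exactly what a byte-oriented `[\x80-\xFF]` class cannot express. The names `BEYOND_LATIN1`, `NON_ASCII`, `find_special` and the sample string are assumptions made up for the example.

```python
import re

# Characters above U+00FF, i.e. beyond what [\x80-\xFF] can describe.
BEYOND_LATIN1 = re.compile(r"[^\x00-\xff]")
# Any character outside 7-bit ASCII.
NON_ASCII = re.compile(r"[^\x00-\x7f]")


def find_special(text):
    """Return (index, character, code point) for each character above U+00FF."""
    return [
        (m.start(), m.group(), f"U+{ord(m.group()):04X}")
        for m in BEYOND_LATIN1.finditer(text)
    ]


# Hypothetical input: e-acute (U+00E9), euro sign (U+20AC), en dash (U+2013).
sample = "caf\u00e9 costs 5\u20ac \u2013 ok"

print(find_special(sample))       # the euro sign and the en dash
print(NON_ASCII.findall(sample))  # all three non-ASCII characters
```

This only works once the input has been decoded to `str`. On raw bytes, the same characters show up as multi-byte sequences whose individual bytes all fall inside `\x80-\xFF`.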

UNIX in general is not Unicode-centric, so Corona's answer pretty much stands for most regex engines.

PCRE supports a lot of encoded character sets. You can download it here:
PCRE - Browse /pcre/8.30 at SourceForge.net
 

GLOB(7)                BSD Miscellaneous Information Manual                GLOB(7)

NAME
     glob -- shell-style pattern matching

DESCRIPTION
     Globbing characters (wildcards) are special characters used to perform
     pattern matching of pathnames and command arguments in the csh(1),
     ksh(1), and sh(1) shells as well as the C library functions fnmatch(3)
     and glob(3).  A glob pattern is a word containing one or more unquoted
     '?' or '*' characters, or ``[..]'' sequences.

     Globs should not be confused with the more powerful regular expressions
     used by programs such as grep(1).  While there is some overlap in the
     special characters used in regular expressions and globs, their meaning
     is different.

     The pattern elements have the following meaning:

     ?       Matches any single character.

     *       Matches any sequence of zero or more characters.

     [..]    Matches any of the characters inside the brackets.  Ranges of
             characters can be specified by separating two characters by a
             '-' (e.g. ``[a0-9]'' matches the letter 'a' or any digit).  In
             order to represent itself, a '-' must either be quoted or the
             first or last character in the character list.  Similarly, a
             ']' must be quoted or the first character in the list if it is
             to represent itself instead of the end of the list.  Also, a
             '!' appearing at the start of the list has special meaning (see
             below), so to represent itself it must be quoted or appear
             later in the list.

             Within a bracket expression, the name of a character class
             enclosed in '[:' and ':]' stands for the list of all characters
             belonging to that class.  Supported character classes:

                   alnum    cntrl    lower    space
                   alpha    digit    print    upper
                   blank    graph    punct    xdigit

             These match characters using the macros specified in ctype(3).
             A character class may not be used as an endpoint of a range.

     [!..]   Like [..], except it matches any character not inside the
             brackets.

     \       Matches the character following it verbatim.  This is useful to
             quote the special characters '?', '*', '[', and '\' such that
             they lose their special meaning.  For example, the pattern
             ``\*\[x]\?'' matches the string ``*[x]?''.
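
The pattern elements above can be tried out quickly in Python. This is only a sketch under an assumption: it uses Python's `fnmatch.fnmatchcase` as a stand-in for the shell and for fnmatch(3). That module implements `?`, `*`, `[..]` and `[!..]`, but as far as I know it does not implement backslash quoting or the `[:class:]` names, so those are left out. `CASES` and `run_cases` are names invented for the example.

```python
from fnmatch import fnmatchcase

# (pattern, string) pairs exercising ?, *, [..] and [!..].
CASES = [
    ("?at", "cat"),        # ? matches exactly one character
    ("?at", "at"),         # ... so it cannot match zero characters
    ("*.txt", "notes.txt"),
    ("*.txt", ".txt"),     # * may match an empty sequence
    ("[a0-9]", "a"),       # the letter 'a' or any digit
    ("[a0-9]", "7"),
    ("[a0-9]", "b"),
    ("[!a0-9]", "b"),      # negated list
    ("[!a0-9]", "7"),
]


def run_cases(cases):
    """Return (pattern, string, matched) for every pair in cases."""
    return [(pattern, name, fnmatchcase(name, pattern)) for pattern, name in cases]


for pattern, name, matched in run_cases(CASES):
    print(f"{pattern!r:12} {name!r:14} {matched}")
```

`fnmatchcase` is used rather than `fnmatch` so that the result does not depend on the host platform's case-folding rules.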
     Note that when matching a pathname, the path separator '/' is not
     matched by a '?' or '*' character or by a ``[..]'' sequence.  Thus,
     /usr/*/*/X11 would match /usr/X11R6/lib/X11 and /usr/X11R6/include/X11
     while /usr/*/X11 would not match either.  Likewise, /usr/*/bin would
     match /usr/local/bin but not /usr/bin.

SEE ALSO
     fnmatch(3), glob(3), re_format(7)

HISTORY
     In early versions of UNIX, the shell did not do pattern expansion
     itself.  A dedicated program, /etc/glob, was used to perform the
     expansion and pass the results to a command.  In Version 7 AT&T UNIX,
     with the introduction of the Bourne shell, this functionality was
     incorporated into the shell itself.

BSD                            November 30, 2010                            BSD
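
The pathname rule from the DESCRIPTION above (wildcards never match '/') can be checked with the man page's own examples. This is a sketch under an assumption: it uses `pathlib.PurePosixPath.match` as an approximation of shell pathname matching, not the shell itself. It is contrasted with `fnmatch.fnmatchcase`, which does not treat '/' specially. `glob_path_match` and `EXAMPLES` are hypothetical names.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def glob_path_match(path, pattern):
    """Match a POSIX path against an absolute pattern; wildcards stay within one component."""
    return PurePosixPath(path).match(pattern)


# (path, pattern) pairs taken from the man page text.
EXAMPLES = [
    ("/usr/X11R6/lib/X11", "/usr/*/*/X11"),
    ("/usr/X11R6/include/X11", "/usr/*/*/X11"),
    ("/usr/X11R6/lib/X11", "/usr/*/X11"),
    ("/usr/local/bin", "/usr/*/bin"),
    ("/usr/bin", "/usr/*/bin"),
]

for path, pattern in EXAMPLES:
    print(f"{pattern:14} {path:24} pathname-aware={glob_path_match(path, pattern)}")

# Plain string matching lets * run across '/', unlike the rule in the man page.
print(fnmatchcase("/usr/X11R6/lib/X11", "/usr/*/X11"))
```

The last line is the interesting one. A matcher that is not pathname-aware accepts /usr/*/X11 for /usr/X11R6/lib/X11, which the man page says a shell glob would reject.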