Regex issue with \s in character class. Post 303042635 by jim mcnamara on Wednesday 1st of January 2020 11:07:58 PM
Old 01-02-2020
\s is a shorthand character class, like \d and others. In most regex flavors it expands to roughly [ \t\r\n\f\v].

Explained here:

Regexp Tutorial - Shorthand Character Classes has this for \s:

Quote:
\s stands for “whitespace character”. Again, which characters this actually includes, depends on the regex flavor. In all flavors discussed in this tutorial, it includes [ \t\r\n\f]. That is: \s matches a space, a tab, a line break, or a form feed. Most flavors also include the vertical tab, with Perl (prior to version 5.18) and PCRE (prior to version 8.34) being notable exceptions. In flavors that support Unicode, \s normally includes all characters from the Unicode “separator” category. Java and PCRE are exceptions once again. But JavaScript does match all Unicode whitespace with \s.
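
A quick way to see what \s covers in a particular flavor is to test each candidate character one at a time. The sketch below does that with Python's re module. Python is not one of the flavors named in the quote, so take it as an illustration for that one flavor only; the helper name matches_s is made up for this example.

```python
import re

# The ASCII whitespace characters mentioned in the post and the quote.
ASCII_WHITESPACE = {
    "space": " ",
    "tab": "\t",
    "carriage return": "\r",
    "line feed": "\n",
    "form feed": "\f",
    "vertical tab": "\v",
}


def matches_s(ch, flags=0):
    """Return True if the single character ch is matched by \\s (hypothetical helper)."""
    return re.fullmatch(r"\s", ch, flags) is not None


for name, ch in ASCII_WHITESPACE.items():
    print(f"{name:16} matched by \\s: {matches_s(ch)}")

# U+00A0 (no-break space) is Unicode whitespace but not ASCII whitespace.
NBSP = "\u00a0"
print("NBSP, default (Unicode) mode:", matches_s(NBSP))
print("NBSP, re.ASCII mode:         ", matches_s(NBSP, re.ASCII))
```

With str patterns, Python treats \s as Unicode-aware by default, and the re.ASCII flag narrows it to the six ASCII characters. Other tools (grep, awk, sed, Perl, PCRE) can differ, which is the point of the quote above.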

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

awk and POSIX character class

can anyone tell me why this doesn't work? I've been trying to play with character classes and I seem to be missing something here..! echo "./comparecdna.summary" | awk '/^compare+]summary$/' # returns nothing echo "./compare_cdna.summary" | awk '/^compare_+]summary$/' # returns nothing echo... (5 Replies)
Discussion started by: anthalamus

2. Shell Programming and Scripting

regex to find font class

So, I need to find the instances of a certain font and remove it....so far in my testing I am using the find command with regex to find a font I want to pull out. However, I seem to be slightly stuck, and I am sure the beard stroking Unix geniuses here can help me. My example code: find... (7 Replies)
Discussion started by: tlarkin

3. Shell Programming and Scripting

perl regex issue

Hi, I find it really strange while writing a simple regex to match and print the matched string, dibyajyo@fwtest:~ #perl -e '$x = "root@rashmi>"; print "matched string:$1\n" if ($x =~ /(root@rashmi)/);' matched string:root dibyajyo@fwtest:~ #perl -e '$x = "root@rashmi>"; print... (1 Reply)
Discussion started by: rrd1986

4. Shell Programming and Scripting

Regex escape special character in AWK if statement

I am having issues escaping special characters in my AWK script as follows: for id in `cat file` do grep $id in file2 | awk '\ BEGIN {var=""} \ { if ( /stringwith+'|'+'50'chars/ ) { echo "do this" } else if ( /anotherString/ ) { echo "do that" } else { ... (4 Replies)
Discussion started by: purebc

5. Shell Programming and Scripting

Regex:search/replace but not for escaped character

Hi Input: - -- --- ---- aa-bb-cc aa--bb--cc aa---bb---cc aa----bb----cc Output: . - -. -- aa.bb.cc (7 Replies)
Discussion started by: chitech

6. UNIX for Advanced & Expert Users

Get pointer for existing device class (struct class) in Linux kernel module

Hi all! I am trying to register a device in an existing device class, but I am having trouble getting the pointer to an existing class. I can create a class in a module, get the pointer to it and then use it to register the device with: *cl = class_create(THIS_MODULE, className);... (0 Replies)
Discussion started by: hdaniel@ualg.pt

7. Shell Programming and Scripting

Regex space character

Hi, I have following regex condition, however it does not work with different logs having same visible string.I believe it is because of some difference with space character, is it possible to make it work everywhere. Can someone suggest a better string? /BIND dn=" uid=/ Thanks. (8 Replies)
Discussion started by: susankoperna1

8. Programming

Size of Derived class, upon virtual base class inheritance

I have the two class definition as follows. class A { public: int a; }; class B : virtual public A{ }; The size of class A is shown as 4, and size of class B is shown as 16. Why is this effect ?. (2 Replies)
Discussion started by: techmonk

9. Shell Programming and Scripting

Match string against character class in bash

Hello, I want to check whether string has only numeric characters. The following code doesn't work for me #!/usr/local/bin/bash if ]]; then echo "true" else echo "False" fi # ./yyy '346' False # ./yyy 'aaa' False I'm searching for solution using character classes, not regex.... (5 Replies)
Discussion started by: urello

10. Programming

C++ : Base class member function not accessible from derived class

Hello All, I am a learner in C++. I was testing my inheritance knowledge with following piece of code. #include <iostream> using namespace std; class base { public : void display() { cout << "In base display()" << endl; } void display(int k) {... (2 Replies)
Discussion started by: anand.shah
iso2022(5)							File Formats Manual							iso2022(5)

NAME
iso2022, iso-2022, ISO-2022 - A character encoding mechanism standardized by the International Standards Organization (ISO)

DESCRIPTION
The ISO-2022 standard defines a mechanism for handling single-byte and multibyte characters. The standard specifies four classes of character sets:

The 94-charset class, which contains character sets with 94 positions (single-byte characters). Examples are the ASCII and JIS X0201 character sets.
The 96-charset class, which contains character sets with 96 positions (single-byte characters). Examples are the ISO Latin series of character sets.
The 94x94-charset class, which contains character sets with 94x94 positions (2-byte characters). Examples are the GB 2312 and the CNS 11643 character sets.
The 96x96-charset class, which contains character sets with 96x96 positions (2-byte characters).

In the ISO-2022 standard, four registers, called G0, G1, G2 and G3, are used to reference a character set. Before a character set can be used, the character set must be assigned, or designated, to one of these registers. The designation of a character set is done by using an escape sequence in the following format:

ESC [I] F

In this format:

I    Is an intermediate character that is used to designate a character set to one of the registers (G0, G1, G2, or G3).
F    Is a unique final character of a particular character set.

The designation of a character set, whose final character is F, to different registers is as follows:

Designates a multibyte character set (94x94 or 96x96) to G0.
Designates a character set in the 94-charset class to G0.
Designates a character set in the 94-charset class to G1.
Designates a character set in the 94-charset class to G2.
Designates a character set in the 94-charset class to G3.
Designates a character set in the 96-charset class to G1.
Designates a character set in the 96-charset class to G2.
Designates a character set in the 96-charset class to G3.

SEE ALSO
Commands: locale(1)
Others: ascii(5), i18n_intro(5), iso2022jp(5), l10n_intro(5)
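
As a rough illustration of the designation scheme in the DESCRIPTION, here is a small Python sketch that classifies a designation escape sequence by register and charset class. The page text above does not spell out the intermediate characters, so the ones used here are taken from the ISO/IEC 2022 (ECMA-35) conventions as an assumption. The names DESIGNATORS, is_final and parse_designation are hypothetical, and the short form ESC $ F is handled loosely (the standard restricts it to a few final characters).

```python
ESC = "\x1b"

# Intermediate character -> (register, positions per byte).
# Assumed from ISO/IEC 2022 (ECMA-35), not from the man page text.
DESIGNATORS = {
    "(": ("G0", "94"),
    ")": ("G1", "94"),
    "*": ("G2", "94"),
    "+": ("G3", "94"),
    "-": ("G1", "96"),
    ".": ("G2", "96"),
    "/": ("G3", "96"),
}


def is_final(ch):
    """Final characters fall in the range 0x30..0x7E."""
    return len(ch) == 1 and 0x30 <= ord(ch) <= 0x7E


def parse_designation(seq):
    """Return (register, charset class, final character), or None if seq
    is not a designation sequence this sketch recognizes."""
    if len(seq) < 3 or seq[0] != ESC:
        return None
    body = seq[1:]
    multibyte = body[0] == "$"  # '$' marks a multibyte character set
    if multibyte:
        body = body[1:]
    if multibyte and len(body) == 1 and is_final(body):
        # Short form ESC $ F: multibyte set designated to G0.
        return ("G0", "94x94", body)
    if len(body) == 2 and body[0] in DESIGNATORS and is_final(body[1]):
        register, size = DESIGNATORS[body[0]]
        charset_class = f"{size}x{size}" if multibyte else size
        return (register, charset_class, body[1])
    return None


print(parse_designation(ESC + "(B"))   # 94-charset to G0 (final B is ASCII)
print(parse_designation(ESC + "-A"))   # 96-charset to G1
print(parse_designation(ESC + "$)C"))  # multibyte charset to G1

# Python's iso2022_jp codec emits such sequences when switching charsets.
encoded = "\u65e5\u672c\u8a9e".encode("iso2022_jp")  # three kanji
print(encoded)
```

The encoded bytes begin with ESC $ B, which switches G0 to a 2-byte Japanese character set, and end with ESC ( B, which switches G0 back to ASCII.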