Enter your regex: [bcr]at
Enter input string to search: hat
No match found.
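In the examples above, the overall match succeeds only when the first character matches one of the characters defined by the character class.
3.1.1 Negation
To match every character except those listed, insert the ^ metacharacter at the beginning of the character class. This technique is known as negation.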
Enter your regex: [^bcr]at
Enter input string to search: bat
No match found.
Enter your regex: [^bcr]at
Enter input string to search: cat
No match found.
Enter your regex: [^bcr]at
Enter input string to search: rat
No match found.
Enter your regex: [^bcr]at
Enter input string to search: hat
I found the text "hat" starting at index 0 and ending at index 3.
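The match succeeds only when the first character of the input string is not one of the characters defined by the character class.
3.1.2 Ranges
Sometimes you will want to define a character class that covers a range of values, such as the letters "a through h" or the digits "1 through 5". To specify a range, insert the - metacharacter between the first and last characters to be matched, as in [1-5] or [a-h]. You can also place different ranges side by side within the same class to widen the match. For example, [a-zA-Z] matches any single character from a to z (lowercase) or from A to Z (uppercase).
Here are some examples of ranges and negation: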
Enter your regex: [a-c]
Enter input string to search: a
I found the text "a" starting at index 0 and ending at index 1.
Enter your regex: [a-c]
Enter input string to search: b
I found the text "b" starting at index 0 and ending at index 1.
Enter your regex: [a-c]
Enter input string to search: c
I found the text "c" starting at index 0 and ending at index 1.
Enter your regex: [a-c]
Enter input string to search: d
No match found.
Enter your regex: foo[1-5]
Enter input string to search: foo1
I found the text "foo1" starting at index 0 and ending at index 4.
Enter your regex: foo[1-5]
Enter input string to search: foo5
I found the text "foo5" starting at index 0 and ending at index 4.
Enter your regex: foo[1-5]
Enter input string to search: foo6
No match found.
Enter your regex: foo[^1-5]
Enter input string to search: foo1
No match found.
Enter your regex: foo[^1-5]
Enter input string to search: foo6
I found the text "foo6" starting at index 0 and ending at index 4.
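The same range and negation checks can be reproduced without typing into the console. The following is a minimal sketch, not the tutorial's own test harness: the class name RangeDemo and the helper search are hypothetical names introduced here, and the helper only reports the first match.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RangeDemo {
    // Hypothetical stand-in for the test harness: reports the first match, if any.
    static String search(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (m.find()) {
            return String.format(
                "I found the text \"%s\" starting at index %d and ending at index %d.",
                m.group(), m.start(), m.end());
        }
        return "No match found.";
    }

    public static void main(String[] args) {
        System.out.println(search("[^bcr]at", "hat"));   // negation: 'h' is not b, c, or r
        System.out.println(search("foo[1-5]", "foo6"));  // range: 6 is outside 1-5
        System.out.println(search("foo[^1-5]", "foo6")); // negated range: 6 is accepted
        System.out.println(search("[a-zA-Z]", "Q"));     // two ranges side by side
    }
}
```

The second call prints "No match found."; the other three report a match starting at index 0.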
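3.1.3 Unions
You can use a union to build a single character class out of two or more separate character classes. To create a union, nest one class inside another, as in [0-4[6-8]]. This particular union produces a single character class that matches the digits 0, 1, 2, 3, 4, 6, 7, and 8.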
Enter your regex: [0-4[6-8]]
Enter input string to search: 0
I found the text "0" starting at index 0 and ending at index 1.
Enter your regex: [0-4[6-8]]
Enter input string to search: 5
No match found.
Enter your regex: [0-4[6-8]]
Enter input string to search: 6
I found the text "6" starting at index 0 and ending at index 1.
Enter your regex: [0-4[6-8]]
Enter input string to search: 8
I found the text "8" starting at index 0 and ending at index 1.
Enter your regex: [0-4[6-8]]
Enter input string to search: 9
No match found.
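To see the whole set a union accepts at once, one option is to test every digit in a loop. This is only an illustrative sketch; UnionDemo and matchedDigits are hypothetical names, not part of the Pattern API.

```java
import java.util.regex.Pattern;

public class UnionDemo {
    // Returns the digits 0-9 that the given character class matches, in order.
    static String matchedDigits(String regex) {
        Pattern p = Pattern.compile(regex);
        StringBuilder sb = new StringBuilder();
        for (char d = '0'; d <= '9'; d++) {
            if (p.matcher(String.valueOf(d)).matches()) {
                sb.append(d);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(matchedDigits("[0-4[6-8]]")); // 01234678
        System.out.println(matchedDigits("[0-46-8]"));   // same set, written as adjacent ranges
    }
}
```

Both calls print the same digits, which suggests that for simple cases a nested union and two ranges placed side by side are interchangeable.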
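3.1.4 Intersections
To build a single character class that matches only the characters common to all of its nested classes, use &&, as in [0-9&&[345]]. This particular intersection produces a single character class that matches only the digits shared by both classes: 3, 4, and 5.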
Enter your regex: [0-9&&[345]]
Enter input string to search: 3
I found the text "3" starting at index 0 and ending at index 1.
Enter your regex: [0-9&&[345]]
Enter input string to search: 4
I found the text "4" starting at index 0 and ending at index 1.
Enter your regex: [0-9&&[345]]
Enter input string to search: 5
I found the text "5" starting at index 0 and ending at index 1.
Enter your regex: [0-9&&[345]]
Enter input string to search: 2
No match found.
Enter your regex: [0-9&&[345]]
Enter input string to search: 6
No match found.
Here is an example showing the intersection of two ranges:
Enter your regex: [2-8&&[4-6]]
Enter input string to search: 3
No match found.
Enter your regex: [2-8&&[4-6]]
Enter input string to search: 4
I found the text "4" starting at index 0 and ending at index 1.
Enter your regex: [2-8&&[4-6]]
Enter input string to search: 5
I found the text "5" starting at index 0 and ending at index 1.
Enter your regex: [2-8&&[4-6]]
Enter input string to search: 6
I found the text "6" starting at index 0 and ending at index 1.
Enter your regex: [2-8&&[4-6]]
Enter input string to search: 7
No match found.
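The two intersections above can also be checked by filtering a string character by character. This is a rough sketch under the assumption that a stream-based filter is acceptable; IntersectionDemo and keep are hypothetical names introduced for illustration.

```java
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class IntersectionDemo {
    // Keeps only the characters of input that the character class matches.
    static String keep(String regex, String input) {
        Pattern p = Pattern.compile(regex);
        return input.chars()
                .mapToObj(c -> String.valueOf((char) c))
                .filter(s -> p.matcher(s).matches())
                .collect(Collectors.joining());
    }

    public static void main(String[] args) {
        System.out.println(keep("[0-9&&[345]]", "0123456789")); // 345
        System.out.println(keep("[2-8&&[4-6]]", "0123456789")); // 456
    }
}
```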
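3.1.5 Subtraction
Finally, you can use subtraction to negate one or more nested character classes, as in [0-9&&[^345]]. This example builds a single character class that matches every digit from 0 to 9 except 3, 4, and 5.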
Enter your regex: [0-9&&[^345]]
Enter input string to search: 2
I found the text "2" starting at index 0 and ending at index 1.
Enter your regex: [0-9&&[^345]]
Enter input string to search: 3
No match found.
Enter your regex: [0-9&&[^345]]
Enter input string to search: 4
No match found.
Enter your regex: [0-9&&[^345]]
Enter input string to search: 5
No match found.
Enter your regex: [0-9&&[^345]]
Enter input string to search: 6
I found the text "6" starting at index 0 and ending at index 1.
Enter your regex: [0-9&&[^345]]
Enter input string to search: 9
I found the text "9" starting at index 0 and ending at index 1.
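Subtraction is often easier to follow when applied to a longer input with repeated find() calls. The sketch below is illustrative only: SubtractionDemo and collect are hypothetical names, and the consonant pattern [a-z&&[^aeiou]] is an extra example that does not appear in the transcripts above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SubtractionDemo {
    // Concatenates every match of the regex found in the input, in order.
    static String collect(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuilder sb = new StringBuilder();
        while (m.find()) {
            sb.append(m.group());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Digits 0-9 with 3, 4, and 5 subtracted.
        System.out.println(collect("[0-9&&[^345]]", "0123456789")); // 0126789
        // Lowercase letters with the vowels subtracted (illustrative assumption).
        System.out.println(collect("[a-z&&[^aeiou]]", "regex"));     // rgx
    }
}
```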
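That covers how character classes are built. Before moving on to the next section, you may want to review the character classes table.
4 Predefined Character Classes
The Pattern API contains a number of useful predefined character classes, which offer convenient shorthand for commonly used regular expressions.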
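Predefined character classes
Construct   Description
.           Any character (may or may not match line terminators)
\d          A digit: [0-9]
\D          A non-digit: [^0-9]
\s          A whitespace character: [ \t\n\x0B\f\r]
\S          A non-whitespace character: [^\s]
\w          A word character: [a-zA-Z_0-9]
\W          A non-word character: [^\w]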
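In the table above, each construct in the left-hand column is shorthand for the character class in the right-hand column. For example, \d means a range of digits (0-9), and \w means a word character (any lowercase letter, any uppercase letter, the underscore, or any digit). Use the predefined classes whenever possible: they make your code easier to read and help eliminate errors caused by malformed character classes.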
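One way to convince yourself that a shorthand and its bracketed form agree is to compare them over every ASCII character. This is a minimal sketch that checks the ASCII range only and uses default Pattern flags. PredefinedDemo and sameOnAscii are hypothetical names. The doubled backslashes are Java string literal escaping.

```java
import java.util.regex.Pattern;

public class PredefinedDemo {
    // True if both patterns accept exactly the same single ASCII characters.
    static boolean sameOnAscii(String shorthand, String explicit) {
        Pattern a = Pattern.compile(shorthand);
        Pattern b = Pattern.compile(explicit);
        for (char c = 0; c < 128; c++) {
            String s = String.valueOf(c);
            if (a.matcher(s).matches() != b.matcher(s).matches()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(sameOnAscii("\\d", "[0-9]"));                // true
        System.out.println(sameOnAscii("\\D", "[^0-9]"));               // true
        System.out.println(sameOnAscii("\\s", "[ \\t\\n\\x0B\\f\\r]")); // true
        System.out.println(sameOnAscii("\\w", "[a-zA-Z_0-9]"));         // true
    }
}
```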
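Constructs that begin with a backslash (\) are called escaped constructs. Escaped constructs were previewed in the section on string literals, which mentioned the use of the backslash and of \Q and \E for quotation. If you use an escaped construct inside a string literal, you must precede the backslash with another backslash for the string to compile. For example: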
private final String REGEX = "\\d"; // a single digit
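In this example, \d is the regular expression; the extra backslash is required for the code to compile. The test harness, however, reads expressions directly from the console, so the extra backslash is not needed there.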
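The effect of the doubled backslash can be checked directly: the compiler turns the literal "\\d" into a two-character string, and that string is what the regex engine receives. A minimal sketch follows; the class name EscapeDemo is a hypothetical name chosen for illustration.

```java
import java.util.regex.Pattern;

public class EscapeDemo {
    // In Java source, "\\d" denotes two characters: a backslash followed by 'd'.
    static final String REGEX = "\\d";

    public static void main(String[] args) {
        System.out.println(REGEX.length());                 // 2
        System.out.println(REGEX);                          // prints \d
        System.out.println(Pattern.matches(REGEX, "7"));    // true: 7 is a digit
        System.out.println(Pattern.matches(REGEX, "a"));    // false
    }
}
```

Text typed at the console is never processed by the Java compiler, which is why a single backslash is enough there.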
The following examples demonstrate the use of predefined character classes:
Enter your regex: .
Enter input string to search: @
I found the text "@" starting at index 0 and ending at index 1.
Enter your regex: .
Enter input string to search: 1
I found the text "1" starting at index 0 and ending at index 1.
Enter your regex: .
Enter input string to search: a
I found the text "a" starting at index 0 and ending at index 1.
Enter your regex: \d
Enter input string to search: 1
I found the text "1" starting at index 0 and ending at index 1.
Enter your regex: \d
Enter input string to search: a
No match found.
Enter your regex: \D
Enter input string to search: 1
No match found.
Enter your regex: \D
Enter input string to search: a
I found the text "a" starting at index 0 and ending at index 1.
Enter your regex: \s
Enter input string to search: 
I found the text " " starting at index 0 and ending at index 1.
Enter your regex: \s
Enter input string to search: a
No match found.
Enter your regex: \S
Enter input string to search: 
No match found.
Enter your regex: \S
Enter input string to search: a
I found the text "a" starting at index 0 and ending at index 1.
Enter your regex: \w
Enter input string to search: a