chore(deps): update dependencies

This commit is contained in:
Christopher Allen Lane
2022-08-02 20:30:31 -04:00
parent 0737af2fec
commit 7406ebfb5e
25 changed files with 5681 additions and 42 deletions

View File

@@ -39,6 +39,24 @@ Group 0 is embedded in the Match. Group 0 is an automatically-assigned group th
The __last__ capture is embedded in each group, so `g.String()` will return the same thing as `g.Capture.String()` and `g.Captures[len(g.Captures)-1].String()`.
If you want to find multiple matches from a single input string you should use the `FindNextMatch` method. For example, to implement a function similar to `regexp.FindAllString`:
```go
func regexp2FindAllString(re *regexp2.Regexp, s string) []string {
var matches []string
m, _ := re.FindStringMatch(s)
for m != nil {
matches = append(matches, m.String())
m, _ = re.FindNextMatch(m)
}
return matches
}
```
`FindNextMatch` is optmized so that it re-uses the underlying string/rune slice.
The internals of `regexp2` always operate on `[]rune` so `Index` and `Length` data in a `Match` always reference a position in `rune`s rather than `byte`s (even if the input was given as a string). This is a dramatic difference between `regexp` and `regexp2`. It's advisable to use the provided `String()` methods to avoid having to work with indices.
## Compare `regexp` and `regexp2`
| Category | regexp | regexp2 |
| --- | --- | --- |
@@ -62,6 +80,8 @@ The default behavior of `regexp2` is to match the .NET regexp engine, however th
* add support for named ascii character classes (e.g. `[[:foo:]]`)
* add support for python-style capture groups (e.g. `(P<name>re)`)
* change singleline behavior for `$` to only match end of string (like RE2) (see [#24](https://github.com/dlclark/regexp2/issues/24))
* change the character classes `\d` `\s` and `\w` to match the same characters as RE2. NOTE: if you also use the `ECMAScript` option then this will change the `\s` character class to match ECMAScript instead of RE2. ECMAScript allows more whitespace characters in `\s` than RE2 (but still fewer than the the default behavior).
* allow character escape sequences to have defaults. For example, by default `\_` isn't a known character escape and will fail to compile, but in RE2 mode it will match the literal character `_`
```go
re := regexp2.MustCompile(`Your RE2-compatible pattern`, regexp2.RE2)
@@ -72,6 +92,10 @@ if isMatch, _ := re.MatchString(`Something to match`); isMatch {
This feature is a work in progress and I'm open to ideas for more things to put here (maybe more relaxed character escaping rules?).
## ECMAScript compatibility mode
In this mode the engine provides compatibility with the [regex engine](https://tc39.es/ecma262/multipage/text-processing.html#sec-regexp-regular-expression-objects) described in the ECMAScript specification.
Additionally a Unicode mode is provided which allows parsing of `\u{CodePoint}` syntax that is only when both are provided.
## Library features that I'm still working on
- Regex split