295 lines
No EOL
16 KiB
HTML
295 lines
No EOL
16 KiB
HTML
<!DOCTYPE html>
|
||
<html lang="en">
|
||
<head>
|
||
<meta charset="utf-8">
|
||
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
||
<meta name="generator" content="rustdoc">
|
||
<meta name="description" content="API documentation for the Rust `regex_syntax` crate.">
|
||
<meta name="keywords" content="rust, rustlang, rust-lang, regex_syntax">
|
||
|
||
<title>regex_syntax - Rust</title>
|
||
|
||
<link rel="stylesheet" type="text/css" href="../normalize.css">
|
||
<link rel="stylesheet" type="text/css" href="../rustdoc.css"
|
||
id="mainThemeStyle">
|
||
|
||
<link rel="stylesheet" type="text/css" href="../dark.css">
|
||
<link rel="stylesheet" type="text/css" href="../light.css" id="themeStyle">
|
||
<script src="../storage.js"></script>
|
||
|
||
|
||
|
||
|
||
</head>
|
||
<body class="rustdoc mod">
|
||
<!--[if lte IE 8]>
|
||
<div class="warning">
|
||
This old browser is unsupported and will most likely display funky
|
||
things.
|
||
</div>
|
||
<![endif]-->
|
||
|
||
|
||
|
||
<nav class="sidebar">
|
||
<div class="sidebar-menu">☰</div>
|
||
|
||
<p class='location'>Crate regex_syntax</p><div class="sidebar-elems"><div class="block items"><ul><li><a href="#modules">Modules</a></li><li><a href="#structs">Structs</a></li><li><a href="#enums">Enums</a></li><li><a href="#functions">Functions</a></li><li><a href="#types">Type Definitions</a></li></ul></div><p class='location'></p><script>window.sidebarCurrent = {name: 'regex_syntax', ty: 'mod', relpath: '../'};</script></div>
|
||
</nav>
|
||
|
||
<div class="theme-picker">
|
||
<button id="theme-picker" aria-label="Pick another theme!">
|
||
<img src="../brush.svg" width="18" alt="Pick another theme!">
|
||
</button>
|
||
<div id="theme-choices"></div>
|
||
</div>
|
||
<script src="../theme.js"></script>
|
||
<nav class="sub">
|
||
<form class="search-form js-only">
|
||
<div class="search-container">
|
||
<input class="search-input" name="search"
|
||
autocomplete="off"
|
||
placeholder="Click or press ‘S’ to search, ‘?’ for more options…"
|
||
type="search">
|
||
</div>
|
||
</form>
|
||
</nav>
|
||
|
||
<section id='main' class="content"><h1 class='fqn'><span class='in-band'>Crate <a class="mod" href=''>regex_syntax</a></span><span class='out-of-band'><span id='render-detail'>
|
||
<a id="toggle-all-docs" href="javascript:void(0)" title="collapse all docs">
|
||
[<span class='inner'>−</span>]
|
||
</a>
|
||
</span><a class='srclink' href='../src/regex_syntax/lib.rs.html#11-228' title='goto source code'>[src]</a></span></h1><div class='docblock'><p>This crate provides a robust regular expression parser.</p>
|
||
<p>This crate defines two primary types:</p>
|
||
<ul>
|
||
<li><a href="ast/enum.Ast.html"><code>Ast</code></a> is the abstract syntax of a regular expression.
|
||
An abstract syntax corresponds to a <em>structured representation</em> of the
|
||
concrete syntax of a regular expression, where the concrete syntax is the
|
||
pattern string itself (e.g., <code>foo(bar)+</code>). Given some abstract syntax, it
|
||
can be converted back to the original concrete syntax (modulo some details,
|
||
like whitespace). To a first approximation, the abstract syntax is complex
|
||
and difficult to analyze.</li>
|
||
<li><a href="hir/struct.Hir.html"><code>Hir</code></a> is the high-level intermediate representation
|
||
("HIR" or "high-level IR" for short) of regular expression. It corresponds to
|
||
an intermediate state of a regular expression that sits between the abstract
|
||
syntax and the low level compiled opcodes that are eventually responsible for
|
||
executing a regular expression search. Given some high-level IR, it is not
|
||
possible to produce the original concrete syntax (although it is possible to
|
||
produce an equivalent conrete syntax, but it will likely scarcely resemble
|
||
the original pattern). To a first approximation, the high-level IR is simple
|
||
and easy to analyze.</li>
|
||
</ul>
|
||
<p>These two types come with conversion routines:</p>
|
||
<ul>
|
||
<li>An <a href="ast/parse/struct.Parser.html"><code>ast::parse::Parser</code></a> converts concrete
|
||
syntax (a <code>&str</code>) to an <a href="ast/enum.Ast.html"><code>Ast</code></a>.</li>
|
||
<li>A <a href="hir/translate/struct.Translator.html"><code>hir::translate::Translator</code></a>
|
||
converts an <a href="ast/enum.Ast.html"><code>Ast</code></a> to a <a href="hir/struct.Hir.html"><code>Hir</code></a>.</li>
|
||
</ul>
|
||
<p>As a convenience, the above two conversion routines are combined into one via
|
||
the top-level <a href="struct.Parser.html"><code>Parser</code></a> type. This <code>Parser</code> will first
|
||
convert your pattern to an <code>Ast</code> and then convert the <code>Ast</code> to an <code>Hir</code>.</p>
|
||
<h1 id="example" class="section-header"><a href="#example">Example</a></h1>
|
||
<p>This example shows how to parse a pattern string into its HIR:</p>
|
||
|
||
<pre class="rust rust-example-rendered">
|
||
<span class="kw">use</span> <span class="ident">regex_syntax</span>::<span class="ident">Parser</span>;
|
||
<span class="kw">use</span> <span class="ident">regex_syntax</span>::<span class="ident">hir</span>::{<span class="self">self</span>, <span class="ident">Hir</span>};
|
||
|
||
<span class="kw">let</span> <span class="ident">hir</span> <span class="op">=</span> <span class="ident">Parser</span>::<span class="ident">new</span>().<span class="ident">parse</span>(<span class="string">"a|b"</span>).<span class="ident">unwrap</span>();
|
||
<span class="macro">assert_eq</span><span class="macro">!</span>(<span class="ident">hir</span>, <span class="ident">Hir</span>::<span class="ident">alternation</span>(<span class="macro">vec</span><span class="macro">!</span>[
|
||
<span class="ident">Hir</span>::<span class="ident">literal</span>(<span class="ident">hir</span>::<span class="ident">Literal</span>::<span class="ident">Unicode</span>(<span class="string">'a'</span>)),
|
||
<span class="ident">Hir</span>::<span class="ident">literal</span>(<span class="ident">hir</span>::<span class="ident">Literal</span>::<span class="ident">Unicode</span>(<span class="string">'b'</span>)),
|
||
]));</pre>
|
||
<h1 id="concrete-syntax-supported" class="section-header"><a href="#concrete-syntax-supported">Concrete syntax supported</a></h1>
|
||
<p>The concrete syntax is documented as part of the public API of the
|
||
<a href="https://docs.rs/regex/%2A/regex/#syntax"><code>regex</code> crate</a>.</p>
|
||
<h1 id="input-safety" class="section-header"><a href="#input-safety">Input safety</a></h1>
|
||
<p>A key feature of this library is that it is safe to use with end user facing
|
||
input. This plays a significant role in the internal implementation. In
|
||
particular:</p>
|
||
<ol>
|
||
<li>Parsers provide a <code>nest_limit</code> option that permits callers to control how
|
||
deeply nested a regular expression is allowed to be. This makes it possible
|
||
to do case analysis over an <code>Ast</code> or an <code>Hir</code> using recursion without
|
||
worrying about stack overflow.</li>
|
||
<li>Since relying on a particular stack size is brittle, this crate goes to
|
||
great lengths to ensure that all interactions with both the <code>Ast</code> and the
|
||
<code>Hir</code> do not use recursion. Namely, they use constant stack space and heap
|
||
space proportional to the size of the original pattern string (in bytes).
|
||
This includes the type's corresponding destructors. (One exception to this
|
||
is literal extraction, but this will eventually get fixed.)</li>
|
||
</ol>
|
||
<h1 id="error-reporting" class="section-header"><a href="#error-reporting">Error reporting</a></h1>
|
||
<p>The <code>Display</code> implementations on all <code>Error</code> types exposed in this library
|
||
provide nice human readable errors that are suitable for showing to end users
|
||
in a monospace font.</p>
|
||
<h1 id="literal-extraction" class="section-header"><a href="#literal-extraction">Literal extraction</a></h1>
|
||
<p>This crate provides limited support for
|
||
<a href="hir/literal/struct.Literals.html">literal extraction from <code>Hir</code> values</a>.
|
||
Be warned that literal extraction currently uses recursion, and therefore,
|
||
stack size proportional to the size of the <code>Hir</code>.</p>
|
||
<p>The purpose of literal extraction is to speed up searches. That is, if you
|
||
know a regular expression must match a prefix or suffix literal, then it is
|
||
often quicker to search for instances of that literal, and then confirm or deny
|
||
the match using the full regular expression engine. These optimizations are
|
||
done automatically in the <code>regex</code> crate.</p>
|
||
</div><h2 id='modules' class='section-header'><a href="#modules">Modules</a></h2>
|
||
<table>
|
||
<tr class=' module-item'>
|
||
<td><a class="mod" href="ast/index.html"
|
||
title='mod regex_syntax::ast'>ast</a></td>
|
||
<td class='docblock-short'>
|
||
<p>Defines an abstract syntax for regular expressions.</p>
|
||
|
||
</td>
|
||
</tr>
|
||
<tr class=' module-item'>
|
||
<td><a class="mod" href="hir/index.html"
|
||
title='mod regex_syntax::hir'>hir</a></td>
|
||
<td class='docblock-short'>
|
||
<p>Defines a high-level intermediate representation for regular expressions.</p>
|
||
|
||
</td>
|
||
</tr></table><h2 id='structs' class='section-header'><a href="#structs">Structs</a></h2>
|
||
<table>
|
||
<tr class=' module-item'>
|
||
<td><a class="struct" href="struct.Parser.html"
|
||
title='struct regex_syntax::Parser'>Parser</a></td>
|
||
<td class='docblock-short'>
|
||
<p>A convenience parser for regular expressions.</p>
|
||
|
||
</td>
|
||
</tr>
|
||
<tr class=' module-item'>
|
||
<td><a class="struct" href="struct.ParserBuilder.html"
|
||
title='struct regex_syntax::ParserBuilder'>ParserBuilder</a></td>
|
||
<td class='docblock-short'>
|
||
<p>A builder for a regular expression parser.</p>
|
||
|
||
</td>
|
||
</tr></table><h2 id='enums' class='section-header'><a href="#enums">Enums</a></h2>
|
||
<table>
|
||
<tr class=' module-item'>
|
||
<td><a class="enum" href="enum.Error.html"
|
||
title='enum regex_syntax::Error'>Error</a></td>
|
||
<td class='docblock-short'>
|
||
<p>This error type encompasses any error that can be returned by this crate.</p>
|
||
|
||
</td>
|
||
</tr></table><h2 id='functions' class='section-header'><a href="#functions">Functions</a></h2>
|
||
<table>
|
||
<tr class=' module-item'>
|
||
<td><a class="fn" href="fn.escape.html"
|
||
title='fn regex_syntax::escape'>escape</a></td>
|
||
<td class='docblock-short'>
|
||
<p>Escapes all regular expression meta characters in <code>text</code>.</p>
|
||
|
||
</td>
|
||
</tr>
|
||
<tr class=' module-item'>
|
||
<td><a class="fn" href="fn.escape_into.html"
|
||
title='fn regex_syntax::escape_into'>escape_into</a></td>
|
||
<td class='docblock-short'>
|
||
<p>Escapes all meta characters in <code>text</code> and writes the result into <code>buf</code>.</p>
|
||
|
||
</td>
|
||
</tr>
|
||
<tr class=' module-item'>
|
||
<td><a class="fn" href="fn.is_meta_character.html"
|
||
title='fn regex_syntax::is_meta_character'>is_meta_character</a></td>
|
||
<td class='docblock-short'>
|
||
<p>Returns true if the give character has significance in a regex.</p>
|
||
|
||
</td>
|
||
</tr>
|
||
<tr class=' module-item'>
|
||
<td><a class="fn" href="fn.is_word_byte.html"
|
||
title='fn regex_syntax::is_word_byte'>is_word_byte</a></td>
|
||
<td class='docblock-short'>
|
||
<p>Returns true if and only if the given character is an ASCII word character.</p>
|
||
|
||
</td>
|
||
</tr>
|
||
<tr class=' module-item'>
|
||
<td><a class="fn" href="fn.is_word_character.html"
|
||
title='fn regex_syntax::is_word_character'>is_word_character</a></td>
|
||
<td class='docblock-short'>
|
||
<p>Returns true if and only if the given character is a Unicode word
|
||
character.</p>
|
||
|
||
</td>
|
||
</tr></table><h2 id='types' class='section-header'><a href="#types">Type Definitions</a></h2>
|
||
<table>
|
||
<tr class=' module-item'>
|
||
<td><a class="type" href="type.Result.html"
|
||
title='type regex_syntax::Result'>Result</a></td>
|
||
<td class='docblock-short'>
|
||
<p>A type alias for dealing with errors returned by this crate.</p>
|
||
|
||
</td>
|
||
</tr></table></section>
|
||
<section id='search' class="content hidden"></section>
|
||
|
||
<section class="footer"></section>
|
||
|
||
<aside id="help" class="hidden">
|
||
<div>
|
||
<h1 class="hidden">Help</h1>
|
||
|
||
<div class="shortcuts">
|
||
<h2>Keyboard Shortcuts</h2>
|
||
|
||
<dl>
|
||
<dt><kbd>?</kbd></dt>
|
||
<dd>Show this help dialog</dd>
|
||
<dt><kbd>S</kbd></dt>
|
||
<dd>Focus the search field</dd>
|
||
<dt><kbd>↑</kbd></dt>
|
||
<dd>Move up in search results</dd>
|
||
<dt><kbd>↓</kbd></dt>
|
||
<dd>Move down in search results</dd>
|
||
<dt><kbd>↹</kbd></dt>
|
||
<dd>Switch tab</dd>
|
||
<dt><kbd>⏎</kbd></dt>
|
||
<dd>Go to active search result</dd>
|
||
<dt><kbd>+</kbd></dt>
|
||
<dd>Expand all sections</dd>
|
||
<dt><kbd>-</kbd></dt>
|
||
<dd>Collapse all sections</dd>
|
||
</dl>
|
||
</div>
|
||
|
||
<div class="infos">
|
||
<h2>Search Tricks</h2>
|
||
|
||
<p>
|
||
Prefix searches with a type followed by a colon (e.g.
|
||
<code>fn:</code>) to restrict the search to a given type.
|
||
</p>
|
||
|
||
<p>
|
||
Accepted types are: <code>fn</code>, <code>mod</code>,
|
||
<code>struct</code>, <code>enum</code>,
|
||
<code>trait</code>, <code>type</code>, <code>macro</code>,
|
||
and <code>const</code>.
|
||
</p>
|
||
|
||
<p>
|
||
Search functions by type signature (e.g.
|
||
<code>vec -> usize</code> or <code>* -> vec</code>)
|
||
</p>
|
||
</div>
|
||
</div>
|
||
</aside>
|
||
|
||
|
||
|
||
<script>
|
||
window.rootPath = "../";
|
||
window.currentCrate = "regex_syntax";
|
||
</script>
|
||
<script src="../main.js"></script>
|
||
<script defer src="../search-index.js"></script>
|
||
</body>
|
||
</html> |