Crate combine
This crate contains parser combinators, roughly based on the Haskell library parsec.
A parser in this library can be described as a function which takes some input and, if it
is successful, returns a value together with the remaining input.
A parser combinator is a function which takes one or more parsers and returns a new parser.
For instance, the `many` parser can be used to convert a parser for single digits into one that
parses multiple digits. Modeling parsers in this way makes it simple to compose complex
parsers in an almost declarative way.
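To make the "parser as a function" idea concrete, here is a minimal sketch in plain Rust that does not use combine at all. The names `digit` and `many` mirror the parsers mentioned above, but the signatures are simplified assumptions chosen for illustration and are not combine's actual API.

```rust
// In this simplified model a parser is any function from input to an optional
// (value, remaining input) pair. combine's real `Parser` trait is richer than this.

/// Parses a single ASCII digit from the front of the input.
fn digit(input: &str) -> Option<(char, &str)> {
    let c = input.chars().next()?;
    if c.is_ascii_digit() {
        Some((c, &input[c.len_utf8()..]))
    } else {
        None
    }
}

/// A combinator: takes a parser and returns a new parser which applies it
/// zero or more times, collecting the results.
fn many<'a, T>(
    p: impl Fn(&'a str) -> Option<(T, &'a str)>,
) -> impl Fn(&'a str) -> Option<(Vec<T>, &'a str)> {
    move |mut input| {
        let mut values = Vec::new();
        while let Some((value, rest)) = p(input) {
            values.push(value);
            input = rest;
        }
        Some((values, input))
    }
}

fn main() {
    // Turn the single-digit parser into a parser for multiple digits.
    let digits = many(digit);
    let (values, rest) = digits("123abc").unwrap();
    let number: String = values.into_iter().collect();
    println!("parsed {:?}, remaining {:?}", number, rest);
}
```

The point of the sketch is only that `many` never inspects characters itself: it is built entirely out of the parser it is given, which is what makes composition feel declarative.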
Overview
`combine` limits itself to creating LL(1) parsers (it is possible to opt in to LL(k) parsing
using the `try` combinator), which makes the parsers easy to reason about in both function
and performance while sacrificing some generality. Besides making the parsers you construct
easier to reason about, the library also uses the knowledge that it is an LL parser to
automatically construct good error messages.
extern crate combine;
use combine::{Parser, State};
use combine::char::{digit, letter};

const MSG: &'static str = r#"Parse error at line: 1, column: 1
Unexpected `|`
Expected `digit` or `letter`
"#;

fn main() {
    // Wrapping a `&str` with `State` provides automatic line and column tracking. If `State`
    // was not used the positions would instead only be pointers into the `&str`
    if let Err(err) = digit().or(letter()).parse(State::new("|")) {
        assert_eq!(MSG, format!("{}", err));
    }
}
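The LL(1) restriction and the `try` opt-in described in this overview can be pictured with a small model written in plain Rust. This is a sketch under assumptions, not combine's implementation: the `Outcome` type and the `literal`, `or` and `attempt` functions are hypothetical names introduced here (`attempt` stands in for `try`, which is a reserved word in recent Rust editions).

```rust
// Simplified parse outcome which records whether input was consumed before failing.
#[derive(Debug, PartialEq)]
enum Outcome<'a, T> {
    Ok(T, &'a str),
    // Failed without consuming input: another alternative may still be tried.
    EmptyErr,
    // Failed after consuming input: an LL(1) choice commits and reports the failure.
    ConsumedErr,
}

/// Matches `word` at the start of `input`, one character at a time.
fn literal<'a>(word: &'static str, input: &'a str) -> Outcome<'a, &'static str> {
    let mut rest = input;
    for (i, expected) in word.chars().enumerate() {
        match rest.chars().next() {
            Some(c) if c == expected => rest = &rest[c.len_utf8()..],
            _ if i == 0 => return Outcome::EmptyErr,
            _ => return Outcome::ConsumedErr,
        }
    }
    Outcome::Ok(word, rest)
}

/// LL(1) choice: the second parser only runs if the first failed without consuming input.
fn or<'a, T>(first: Outcome<'a, T>, second: impl FnOnce() -> Outcome<'a, T>) -> Outcome<'a, T> {
    match first {
        Outcome::EmptyErr => second(),
        other => other,
    }
}

/// Opt-in backtracking: a failure after consumption is reported as a failure
/// without consumption, so a following alternative gets its chance.
fn attempt<'a, T>(result: Outcome<'a, T>) -> Outcome<'a, T> {
    match result {
        Outcome::ConsumedErr => Outcome::EmptyErr,
        other => other,
    }
}

fn main() {
    let input = "let";
    // "lambda" consumes the 'l' before failing, so the plain choice never tries "let".
    let ll1 = or(literal("lambda", input), || literal("let", input));
    assert_eq!(ll1, Outcome::ConsumedErr);

    // Wrapping the first alternative in `attempt` allows the second one to run.
    let with_try = or(attempt(literal("lambda", input)), || literal("let", input));
    assert_eq!(with_try, Outcome::Ok("let", ""));
}
```

Committing after the first consumed token is what keeps the cost of a choice predictable: alternatives are only re-run from the same position when backtracking is requested explicitly.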
This library currently contains five modules:
- `combinator` contains the aforementioned parser combinators and thus the main building blocks for creating any sort of complex parser. It consists of free functions such as `many` and `satisfy` as well as a few methods on the `Parser` trait, such as `or`, which are more natural to use as method calls.
- `primitives` contains the `Parser` and `Stream` traits, which are the core abstractions in combine, as well as various structs dealing with input streams and errors. You usually only need to use this module if you want more control over parsing and input streams.
- `char` and `byte` provide parsers specifically working with streams of characters (`char`) and bytes (`u8`) respectively. As a few examples, they have parsers for accepting digits, letters or whitespace.
- `range` provides some zero-copy parsers for `RangeStream`s.
Examples
extern crate combine;
use combine::char::{spaces, digit, char};
use combine::{many1, sep_by, Parser, ParseError};

fn main() {
    //Parse spaces first and use the with method to only keep the result of the next parser
    let integer = spaces()
        //parse a string of digits into an i32
        .with(many1(digit()).map(|string: String| string.parse::<i32>().unwrap()));

    //Parse integers separated by commas, skipping whitespace
    let mut integer_list = sep_by(integer, spaces().skip(char(',')));

    //Call parse with the input to execute the parser
    let input = "1234, 45,78";
    let result: Result<(Vec<i32>, &str), ParseError<&str>> =
        integer_list.parse(input);
    match result {
        Ok((value, _remaining_input)) => println!("{:?}", value),
        Err(err) => println!("{}", err)
    }
}
If we need a parser that is mutually recursive, we can define a free function and use it
as a parser via the `parser` function, which turns a function with the correct signature
into a parser. In this case we define `expr` to work on any type of `Stream`, which is
combine's way of abstracting over different data sources such as array slices, string
slices, iterators etc. If you only need to parse a string already in memory, you could
instead define `expr` as `fn expr(input: &str) -> ParseResult<Expr, &str>`.
#[macro_use]
extern crate combine;
use combine::char::{char, letter, spaces};
use combine::{between, many1, parser, sep_by, Parser};
use combine::primitives::{State, Stream, ParseResult};

#[derive(Debug, PartialEq)]
pub enum Expr {
    Id(String),
    Array(Vec<Expr>),
    Pair(Box<Expr>, Box<Expr>)
}

// The `parser!` macro can be used to define parser producing functions in most cases
// (for more advanced uses standalone functions can be defined to handle parsing)
parser!{
    fn expr[I]()(I) -> Expr
        where [I: Stream<Item=char>]
    {
        let word = many1(letter());

        //Creates a parser which parses a char and skips any trailing whitespace
        let lex_char = |c| char(c).skip(spaces());

        let comma_list = sep_by(expr(), lex_char(','));
        let array = between(lex_char('['), lex_char(']'), comma_list);

        //We can use tuples to run several parsers in sequence
        //The resulting type is a tuple containing each parsers output
        let pair = (lex_char('('),
                    expr(),
                    lex_char(','),
                    expr(),
                    lex_char(')'))
            .map(|t| Expr::Pair(Box::new(t.1), Box::new(t.3)));

        word.map(Expr::Id)
            .or(array.map(Expr::Array))
            .or(pair)
            .skip(spaces())
    }
}

fn main() {
    let result = expr()
        .parse("[[], (hello, world), [rust]]");
    let expr = Expr::Array(vec![
          Expr::Array(Vec::new())
        , Expr::Pair(Box::new(Expr::Id("hello".to_string())),
                     Box::new(Expr::Id("world".to_string())))
        , Expr::Array(vec![Expr::Id("rust".to_string())])
    ]);
    assert_eq!(result, Ok((expr, "")));
}
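For comparison, the recursion in `expr` can also be written out by hand as an ordinary function over `&str`. The following is a sketch in plain Rust that does not use combine: `SketchResult` is a hypothetical stand-in for `ParseResult<Expr, &str>`, the error type is just a `String`, and the whitespace handling is an assumption that only approximates the combinator version.

```rust
#[derive(Debug, PartialEq)]
pub enum Expr {
    Id(String),
    Array(Vec<Expr>),
    Pair(Box<Expr>, Box<Expr>),
}

// Hypothetical result type for this sketch: the parsed value plus the remaining input.
type SketchResult<'a, T> = Result<(T, &'a str), String>;

/// Consumes the character `c` and any whitespace following it.
fn lex_char(c: char, input: &str) -> SketchResult<'_, ()> {
    match input.strip_prefix(c) {
        Some(rest) => Ok(((), rest.trim_start())),
        None => Err(format!("expected `{}` at {:?}", c, input)),
    }
}

/// Parses an identifier, an array or a pair, calling itself for nested expressions.
fn expr(input: &str) -> SketchResult<'_, Expr> {
    match input.chars().next() {
        Some(c) if c.is_alphabetic() => {
            let end = input.find(|c: char| !c.is_alphabetic()).unwrap_or(input.len());
            Ok((Expr::Id(input[..end].to_string()), input[end..].trim_start()))
        }
        Some('[') => {
            let (_, mut rest) = lex_char('[', input)?;
            let mut items = Vec::new();
            if !rest.starts_with(']') {
                loop {
                    // Recursive call for each element of the comma separated list.
                    let (item, after) = expr(rest)?;
                    items.push(item);
                    match lex_char(',', after) {
                        Ok((_, after_comma)) => rest = after_comma,
                        Err(_) => {
                            rest = after;
                            break;
                        }
                    }
                }
            }
            let (_, rest) = lex_char(']', rest)?;
            Ok((Expr::Array(items), rest))
        }
        Some('(') => {
            let (_, rest) = lex_char('(', input)?;
            let (left, rest) = expr(rest)?;
            let (_, rest) = lex_char(',', rest)?;
            let (right, rest) = expr(rest)?;
            let (_, rest) = lex_char(')', rest)?;
            Ok((Expr::Pair(Box::new(left), Box::new(right)), rest))
        }
        _ => Err(format!("expected an identifier, `[` or `(` at {:?}", input)),
    }
}

fn main() {
    println!("{:?}", expr("[[], (hello, world), [rust]]"));
}
```

The hand-written version makes the control flow explicit, at the cost of repeating the sequencing and error plumbing that the combinators otherwise provide.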
Re-exports
pub extern crate byteorder;
Modules
byte | Module containing parsers specialized on byte streams.
char | Module containing parsers specialized on character streams.
combinator | Module containing all specific parsers.
primitives | Module containing the primitive types which are used to create and compose more advanced parsers.
range | Module containing zero-copy parsers.
Macros
choice | Takes a number of parsers and tries to apply them each in order. Fails if all the parsers fail or if an applied parser consumes input before failing.
ctry
parser | Declares a named parser which can easily be reused.
struct_parser | Sequences multiple parsers and builds a struct out of them.
Structs
ParseError | Struct which holds information about an error that occurred at a specific position. Can hold multiple instances of
State | The
Traits
Parser | By implementing the
Stream | A stream of tokens which can be duplicated
StreamOnce
Functions
any | Parses any token.
between | Parses
chainl1 | Parses
chainr1 | Parses
choice | Takes an array of parsers and tries to apply them each in order. Fails if all the parsers fail or if an applied parser consumes input before failing.
count | Parses
count_min_max | Parses
env_parser | Constructs a parser out of an environment and a function which needs the given environment to do the parsing. This is commonly useful to allow multiple parsers to share some environment while still allowing the parsers to be written in separate functions.
eof | Succeeds only if the stream is at end of input, fails otherwise.
from_iter | [Deprecated]
look_ahead
many | Parses
many1 | Parses
none_of | Extracts one token and succeeds if it is not part of
not_followed_by | Succeeds only if
one_of | Extracts one token and succeeds if it is part of
optional | Parses
parser | Wraps a function, turning it into a parser.
position | Parser which just returns the current position in the stream.
satisfy | Parses a token and succeeds depending on the result of
satisfy_map | Parses a token and passes it to
sep_by | Parses
sep_by1 | Parses
sep_end_by | Parses
sep_end_by1 | Parses
skip_count | Parses
skip_count_min_max | Parses
skip_many | Parses
skip_many1 | Parses
token | Parses a character and succeeds if the character is equal to
tokens | Parses multiple tokens.
try
unexpected | Always fails with
value | Always returns the value
Type Definitions
ConsumedResult | A
ParseResult | A type alias over the specific