Teaching Myself Rust, Part 2

Parsing some math

Beating my head against this problem for two days has finally borne fruit: A crate called prec.

Step 1 - The Data Structure

Let’s say you want to parse a math equation. It looks like this.

( 18 / 6 * 5 ) - 14 / 7

Given that string, we want to write a program that gives us back a number. Not the most esoteric of problems, there’s entire subsets of algorithms with imposing-sounding names like shunting-yard algorithm and stack-sortable permutation and a bunch of other extremely cool math problems that make way more sense to people way smarter than me. As a humble web developer, all knowledge of data structures exited my brain about when I finished Object Oriented Programming 101 and now it’s too full of CSS properties and the differences between null and undefined to make room for them again.

And yet I needed to create a math equation parser in Rust anyways.

My first strategy was the same as any self-respecting Node dev’s primal instinct: use somebody else’s package. Of course there’s plenty of those out there, but I had a few problems that none of these really handled:

  • I already have parsed tokens. Using these would mean re-converting these tokens into a string, then back out into my parsed data structure again. Yuck.
  • I need to include more than just numbers, and more than just the standard + - * / () operators.
  • I’m doing my math in integers, not floats. This means that division will always round down, but I want to supply another operator, /u, that rounds the result up.
  • …I also need to be able to parse dice notation inline.
  • This means creating a reusable Expression structure that I can roll over and over and over again, because there’s no guarantee the output will be the same every time….

All this is sounding like using an external crate is out of the question. So, time to roll my own.

Finding Some Reference Material

So my spec is:

  • Given some pre-parsed structure of tokens that represent an equation, return the equation’s result.
  • The actual parsing of the string is handled elsewhere.

The closest thing to this I found was the PrecClimber available in pest, a seriously cool crate for parsing language based on rules you write. It’s is a built-in utility class that does pretty much exactly what I want, using the operator-precedence parsing algorithm, right?

…But it’s not quite good enough.

PrecClimber takes an array of tokens that look something like this:

( 18 / 6 \* 5 ) - 14 / 7

[
  Operator (paren),
  Number (18),
  Operator (divide),
  Number (6),
  Operator (multiply),
  Number (5),
  Operator (paren),
  Operator (subtract),
  Number (14),
  Number (divide),
  Number (7)
]

A flat array of each piece of the equation, one after the other. This makes sense, but can’t we do better?

What if the array somehow ends up looking like this?

( ( 4 + / 8

[
  Operator (paren),
  Operator (paren),
  Number (4),
  Operator (add),
  Operator (divide),
  Number (8)
]

That’s technically a valid array, but that’s not at all a valid equation. That means more errors, which means handling more errors, which means more ways your code can fail…

Let’s use a better structure for an equation than that!

A valid expression, without any parentheses in it:

  1. Always starts with a number.
  2. After the first number, is always operator, number, operator, number, operator, number, . . . operator, number

A Rust struct that reflects these rules looks like… this!

struct Expression<Op, To> {
	pub first_token: To,
	pub pairs: Vec<(Op, To)>
}

And including parentheses is as easy as writing your operator / token structures like so:

enum Operator {
	Add,
	Sub,
	Mul,
	Div,
	Exp
}

struct Token {
	Number(i64),
	Parentheses(Box<Expression<Operator, Token>>)
}

Suddenly, we can write ( 18 / 6 * 5 ) - 14 / 7 like this:

use Operator::*;
use Token::*;

let expression = Expression {
	first_token: Parentheses(Box::new(Expression {
		first_token: Number(18),
		pairs: vec![(Div, Number(6)), (Mul, Number(5))]
	},
	pairs: vec![(Sub, Number(14)), (Div, Number(7))]
};

With a struct that reflects the rules of the problem you’re trying to solve, you can make it impossible to write invalid code. I never have to worry about handling errors related to the shape of the structure after the actual string-to-Expression parsing step.

We haven’t done anything with our cool struct yet, but that’s the next step:

Step 2 - The Algorithm

Just one problem: Every example of an operator-precedence climber I found online works off of a possibly-incorrect array, as shown further above. That means, yet again, rolling my own.

The next step happened during a two-day fugue state of random attempts at translating those examples I found into this new struct I’ve made, so I couldn’t tell you exactly how I did it. But I did do it, and the code’s there, so I must have written it?! It appears to be very heavily based on the code of PrecClimber.

I was the latter during this stage. No memory of this time exists.

End Result: Generic Operator-Precedence Parser

The final result: prec! A crate for extremely generic operator-precedence parsing. You supply everything around the actual algorithm itself:

  • A structure that represents a token
    • An implementation of Into that turns the token into some final value, likely f64 or i64, but it could be anything!
  • A structure that represents an operator
  • A function that determines what operators actually do
  • A set of rules to determine which operators take precedence over others (hence the name, operator precedence!)

Your end result might look something like this:

fn handler(lhs: f64, op: Operator, rhs: f64) {
	match op {
		Operator::Add => lhs + rhs,
		Operator::Sub => lhs - rhs,
		Operator::Mul => lhs * rhs,
		Operator::Div => lhs / rhs,
		Operator::Exp => lhs.powf(rhs)
	}
}

let climber = Climber::new(
	vec![
		Rule::new(Operator::Add, Assoc::Left) | Rule::new(Operator::Sub, Assoc::Right),
		Rule::new(Operator::Mul, Assoc::Left) | Rule::new(Operator::Div, Assoc::Right),
		Rule::new(Operator::Exp, Assoc::Right)
	],
	handler
);

// 2 + 2 * 3
// 2 + 6
// 8
let expression = Expression::new(
	2.0f64,
	vec![
		(Operator::Add, 2.0f64),
		(Operator::Mul, 3.0f64)
	]
);

assert_eq!(climber.process(&expression), 8.0f64);

You can use standard ol’ operators like add, multiply, divide, etc. or you can leverage what makes the crate so cool, and roll some operators of your own. You can conceivably use this to write some sort of operator-precedence string concatenation thing, or add in stuff like trigonometry. The included example already extends it to use parentheses, so the sky’s really the limit.

Check the documentation out, write your own parser for something!

prec - Rust

mcpar-land/prec

crates.io: Rust Package Registry

↑ Top


Teaching Myself Rust, Part 1
Baduk & Bevy
← Previous
Bevy Fly Camera
Pleasant utility for Bevy
Next →