Regular Expressions

I’ve released another gem: fabulator-grammar. It will grow into a grammar engine modeled loosely on Perl 6 grammars. For now, it provides just one function, ‘g:match($regex, $string)’, which returns a boolean.

I’m continuing to work on the Donne concordance and needed a way to pick out just the transcription pages from a manuscript. That lets me put various other views of the manuscript under the same parent page without mistaking them for transcriptions. With the ‘g:match()’ function, I can sift the transcription pages out when building the concordance database (or doing other work with the list of transcription pages).
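The sifting step can be sketched roughly as follows. This is only an illustration in Python, not Fabulator code: the page names, the ‘transcription-’ prefix, and the helper name transcription_pages are all assumptions made up for the example, since the post does not say how the pages are actually named.

```python
import re

# Hypothetical child pages of one manuscript. The real naming
# scheme is not given in the post; these names are made up.
PAGES = [
    "transcription-f1r",
    "transcription-f1v",
    "facsimile-f1r",
    "notes-f1r",
]


def transcription_pages(pages, pattern=r"^transcription-"):
    """Keep only the pages whose name matches the pattern.

    The default pattern is an assumption for this sketch.
    """
    regex = re.compile(pattern)
    return [page for page in pages if regex.search(page)]


print(transcription_pages(PAGES))
```

Everything that does not match (facsimiles, notes, and so on) can then live under the same parent page without ending up in the concordance.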

Right now, the regular expressions support the basics: ( ) for grouping, [ ] for character classes, . for matching any character, the basic counters +, *, and ?, and the minimizer ? (placed after a counter to make it match as little as possible). There’s also support for the anchors ^ (beginning of the string) and $ (end of the string). Most other characters simply match themselves.
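For readers less familiar with this notation, here is a small sketch of each feature using Python’s re module as a stand-in. It is not the fabulator-grammar engine, and the match helper below only mimics the shape of ‘g:match($regex, $string)’. It assumes the match may occur anywhere in the string unless anchored, which the post does not state.

```python
import re


def match(regex, string):
    """Return True if regex matches somewhere in string.

    Stand-in for g:match($regex, $string); unanchored search
    semantics are an assumption, not documented behavior.
    """
    return re.search(regex, string) is not None


# Most characters match themselves.
assert match("sonnet", "holy sonnet 10")

# ( ) groups a sequence so a counter applies to the whole group.
assert match("^(ab)+$", "ababab")
assert not match("^(ab)+$", "aba")

# [ ] is a character class; + means one or more.
assert match("^f[0-9]+r$", "f12r")

# . matches any single character; * means zero or more.
assert match("^c.t$", "cat")
assert match("^ca*t$", "ct")

# ? after an item makes it optional.
assert match("^colou?r$", "color")

# ? after a counter is the minimizer: match as little as possible.
greedy = re.search("<(.+)>", "<a><b>").group(1)
minimal = re.search("<(.+?)>", "<a><b>").group(1)
assert greedy == "a><b"
assert minimal == "a"

# ^ anchors at the beginning, $ at the end.
assert match("^Death", "Death be not proud")
assert match("proud$", "Death be not proud")
assert not match("^proud", "Death be not proud")
```

The greedy and minimal cases are the ones most worth trying by hand, since the boolean result is the same for both and only the extent of the match differs.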

The long-term goal is to support building grammars out of rules and tokens, each of which can have an action associated with it that runs when it matches. This will make it possible to write parsers in the Fabulator environment without having to break out into Ruby. Since how we parse text can be an important part of an electronic editorial statement, it’s important that we capture this kind of processing in the Fabulator environment instead of hiding it in Ruby.

Published by

James

James is a software developer and self-published author. He received his B.S. in Math and Physics and his M.A. in English from Texas A&M University. After spending almost two decades in academia, he now works in the Washington, DC, startup world.