Presentation Management

Fabulator 0.0.9 is going to be full of changes. Some are relatively minor internal changes that only matter to people who have written tag libraries (i.e., me). The biggest change that will have an impact on you, the user, is presentation layer management.

XML is at the heart of the Fabulator system. Applications are written in XML; libraries are written in XML. The views are written in whatever markup is comfortable, but the interactive elements (the forms) are easiest in XML.

Tag libraries will have two ways to influence the system beyond defining types, functions, and actions: they will be able to specify two different XSLTs, one that can be applied to the application to transform it, and one that can be applied in a presentation context to provide new presentation elements. The first can be used as a macro language for applications; the second plays the same role for presentations.

I’m considering using Fluid Infusion as the foundational JavaScript library for presentation. I’m not settled yet on which gem will get it, though it might be easiest to bundle it with the Radiant extension.

To add presentation transformations, you do something like the following in your Ruby tag library. This example is taken from the core action library.
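
A minimal sketch of the idea (don't read too much into the exact names here; presentation_xslt, interactive_elements, and path_elements are placeholders rather than the actual core-library methods):

    # Sketch only: the real class-method names in the core action library may differ.
    module Fabulator
      module Core
        class Lib < Fabulator::ActionLib
          # the XSLT applied in the presentation context to turn this
          # namespace's markup into HTML
          presentation_xslt File.join(File.dirname(__FILE__), 'core.xsl')

          # elements that should automatically be given default values and captions
          interactive_elements :text, :textline, :password, :submit

          # elements whose 'id' attribute contributes to the path used to find
          # the default value (and thus to the HTML input element's name)
          path_elements :form, :group
        end
      end
    end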

Now the system knows which XSLT file to use when transforming the markup to HTML, which elements in the tag library's namespace should automatically be given default values and captions, and which elements contribute their 'id' attribute to the path pointing to the default value in the application's memory (and thus to the name of the HTML input element as well).

You can have, for example, the following form in Radiant:
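
Something along these lines, where the group, text, caption, and submit element names are illustrative and only the r:form wrapper and the name and email ids come from the discussion below:

    <r:form>
      <group id="account">
        <text id="name"><caption>Name</caption></text>
        <text id="email"><caption>Email</caption></text>
      </group>
      <submit id="save"><caption>Save</caption></submit>
    </r:form>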

The name and email fields will be populated from the data at '/account/name' and '/account/email' if they exist, and the HTML input fields will be given names derived from those paths.

The Radiant tag (<r:form />) automatically surrounds the enclosed XML in a <form xmlns="http://…" /> element with the appropriate namespace declaration to allow simple form markup without prefixes.

Because different tag libraries may have structural elements that contribute to the path, the system will also inject a ‘path’ attribute into interactive elements so they know how to name the HTML elements. The tag library doesn’t have to know what other tag libraries are contributing.

It makes sense for one library to build on another, so the XSLT associated with a library will be run before the XSLT of any library whose namespace is mentioned in that library's root element (for libraries defined in XML) or in the root element of its XSLT (for libraries defined in Ruby). This applies to both the presentation and the application XML transformations.

This is the start of a push away from explicit HTML in projects. We’ve done a lot to make the algorithms transportable through time — we don’t have to revisit projects to update code when we update the framework (or won’t once we hit 1.0), but there’s more to a project than just the algorithms. There’s also the HTML and CSS that tends to grow stale over time. By creating a presentation layer, we can capture the presentation intent of the project while providing a (hopefully) robust path forward.

Ultimately, a project may be able to expose the presentation and algorithm layers to advanced researchers and let them create their own algorithmic presentations built from humanities data. That’s a number of years off, but that’s the direction we’re headed.

Next on the to-do list: come up with a reasonable set of presentation tags for building Exhibit displays. For those thinking this looks similar to things like Cocoon, you're not too far off as far as the general idea of the implementation goes. Some details are a bit different, of course.


I’ve been working a little here and there on getting a workflow extension together. I still have a bit to do, but the skeleton is shaping up and passing some simple tests.

The plan right now is for workflows to act like grammars in that they augment a library instead of standing alone. Workflow definitions will attach themselves to the namespace of a library and have a name that can be used (when qualified by the library’s namespace) to reference the workflow definition.

Workflows are very similar to applications in that they are a collection of states and state transitions. The important differences are that you can add data over time to the workflow and test which transitions (actions) can be taken at any given time without automatically following a transition.

This allows an application to add data to a workflow and give feedback to the user as to which actions can be taken on the workflow. For example, I'm considering building a workflow that can manage the process from proposal submission to final approval. Such a system would let anyone upload a project proposal to a queue (resulting in a workflow in the 'pending' state) and would let the advisory committee review pending proposals and either approve them (moving them to the 'approved' state) or request more information (moving them to the 'more-info-needed' state).

The resulting workflow might look something like the following:
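
A hypothetical sketch of that markup, where the workflow, state, and transition element names are assumptions and only the state names come from the description above:

    <!-- attaches to a library's namespace and is referenced by name -->
    <workflow name="proposal-review">
      <state name="pending">
        <transition name="approve" to="approved"/>
        <transition name="request-more-info" to="more-info-needed"/>
      </state>
      <state name="more-info-needed">
        <!-- assumed: a resubmission moves the proposal back into review -->
        <transition name="resubmit" to="pending"/>
      </state>
      <state name="approved"/>
    </workflow>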

We still need ways to manage authentication and authorization. The final version of this will also require some framework-specific code to manage workflow storage. There's still a lot to be figured out to get this close to right, but we're getting there.

Gems updated

I went ahead and pushed out new gems for fabulator, fabulator-grammar, and radiant-fabulator-extension. The last requires running a database migration to add the library table.

This update brings to Radiant the ability to define libraries that can be referenced from applications through the namespace declarations.

I’m starting to put together some project management stuff using the fabulator setup in Radiant. I’ll use that to flesh out the next set of updates to the gems. Once I have something workable, I’ll see where I can post the XML for the libraries and some of the pages as examples of how to use the Fabulator+Radiant system.

We’re still bootstrapping, but we’re far enough along now that we can start doing meatier projects.

Libraries, some more

I think we’re getting very close to having libraries sufficiently developed that we can return to focusing on actual projects. Upcoming gem releases (tomorrow or early next week) will have a good start for building libraries and grammars. A lot of work remains, but there’s enough there that we can get real work done now.

Our current play library is the following:
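
Something along these lines; this is a structural sketch only, with assumed l:library, l:mapping, l:function, l:action, and g:rule element names and with the rule and expression bodies elided:

    <!-- namespace declarations for the l:, g:, and f: prefixes omitted;
         the l:ns value is a placeholder -->
    <l:library l:ns="http://example.com/ns/play">
      <!-- grammar rules, usable as filters, constraints, and functions -->
      <g:rule name="something"> ... </g:rule>
      <g:rule name="something2"> ... </g:rule>

      <!-- a mapping: called with its argument as the current node -->
      <l:mapping name="double"> ... </l:mapping>

      <!-- a regular function: arguments arrive as $1, $2, ..., with $0 as the whole list -->
      <l:function name="fctn"> ... </l:function>

      <!-- an action whose enclosed elements are run as actions -->
      <l:action name="actn"> ... </l:action>
    </l:library>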

In an application (assuming the 'm' prefix is mapped to the library's namespace), we can use the grammar rules 'something' and 'something2' as filters and constraints. We can also use them as functions for matching and parsing strings.

We also have the functions ‘m:double’ (a mapping) and ‘m:fctn’ (a regular function).

Finally, we have the action ‘m:actn’ which accepts enclosed elements as actions and returns thrice the value returned by running the actions. The ‘f:eval’ function evaluates the referenced code using the current node in the context.

In general, mappings will be called with their argument as the current node. Functions will be called with each argument as a separate variable ($1, $2, $3, …) and with $0 representing all of the arguments as one list. Reductions will be called with a long list in $0.

We still have a few more things we need to do to make these XML-based libraries as flexible as the Ruby ones, but that will come as we need that flexibility.

We also need to think about how we want to document libraries… but enough for one day.

Libraries and Grammars

We've released a new set of Fabulator gems. These give us better ways of defining structural elements and implementing classes. We'll be building on these in the next releases to provide a general library scheme in XML.

The goal is to allow the creation of extension tag libraries in XML without having to write Ruby code. We want to reserve Ruby (or whatever lower-level language is used to build the enabling framework) for those extensions that cannot be written given existing extensions and language capabilities.

The library concept is part of the core Fabulator gem. It provides a separate namespace for the library. With a few new class methods that will be in the next release of Fabulator (0.0.8), the grammar extension is able to place itself within the library container:

If we map the 'm' prefix to the library's namespace (the value of the l:ns attribute), then we can run 'm:something?('a0A')' and get back 'true'. We can also run 'm:something('a0A')/NUMBER' and get back the string '0' (not the integer 0, though). We won't get anything, though, when we run it against the string 'a0a' because the second 'a' doesn't match the LETTER token in the 'upper' mode.

Our next step is to provide a mechanism in the Radiant Fabulator extension for managing libraries. Hopefully we can have that by the end of this week. We’ll release all of the relevant gems at that point.

Having a library capability and the grammar extension will allow us to put in one location all of the text patterns that we use in a project. Regular expressions won’t be magic values embedded in programs.

Grammars and Structures

We’ve been busy getting a couple of different things done. These will show up in the next releases of the core Fabulator gem and the grammar gem.


The following test grammar works.
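
A sketch of its shape, with assumed g:grammar, g:token, and g:rule markup and an approximated pattern notation; the token and rule names match the discussion below:

    <!-- namespace declaration for the g: prefix omitted -->
    <g:grammar>
      <g:token name="LETTER">[:alpha:]</g:token>
      <g:token name="NUMBER">[:digit:]</g:token>

      <g:rule name="something"> [LETTER] [NUMBER] [LETTER] </g:rule>
      <g:rule name="other"> a := [LETTER] b := [NUMBER] c := [LETTER] </g:rule>
      <g:rule name="or"> other+ [LETTER]+ </g:rule>
      <g:rule name="ooor"> other (',' other)* [LETTER]+ </g:rule>
    </g:grammar>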

This is a little different than what I mentioned in the previous post. Tokens are the same, but the rules are very different. They borrow some ideas from the Perl Parse::RecDescent module with a little Perl 6 thrown in.

None of the rules are anchored, so they will skip any leading text until they find something that matches.

The ‘something’ rule will match any sequence of letter-number-letter and return a structure representing the matches. For example, matching ‘a1c’ will result in ./LETTER equalling ‘a’ and ‘c’ and ./NUMBER equalling ‘1’.

The ‘other’ rule will match the same sequence as the ‘something’ rule, but instead of naming the resulting values LETTER and NUMBER, it will use ‘a’, ‘b’, ‘c’ as the names (the := construct). In the case of matching against ‘a1c’, the result is ./a = ‘a’, ./b = ‘1’, ./c = ‘c’.

Things start getting interesting with the ‘or’ rule. This will match any sequence of one or more ‘other’ rules followed by one or more LETTER tokens.

Finally, the ‘ooor’ rule will match one or more ‘other’ rules that are separated by commas followed by one or more LETTER tokens.

We’ll start getting modes and additional quantifiers working next.


We’ve had the Fabulator::Action class for a while for defining new actions, but structural elements have still been hand-coded with a lot of duplicated code. Today’s work changes that.

First, in the action lib class that defines all of the components of the library (we'll probably be changing 'ActionLib' to 'TagLib'), you specify that an element is structural like so (borrowed from the core library):
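
A sketch of the sort of declaration, assuming a 'structural' class method; the element names come from the examples that follow:

    # Sketch only: 'structural' is an assumed class-method name.
    structural :view
    structural :param
    structural :constraint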

Each of these can be used in an action or structural class as a structural element instead of an action.

For example, the view element class is defined as follows (minus some run-time stuff):
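
Roughly along these lines; the Fabulator::Structural base class and the has_structure declaration are placeholder names for this sketch:

    module Fabulator
      module Core
        class View < Fabulator::Structural
          # contained 'param' elements collect in @params (an array by default)
          has_structure :param
        end
      end
    end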

Parameters (the ‘param’ element):
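
And a similar sketch for the parameter class, with placeholder names for the contained elements:

    module Fabulator
      module Core
        class Param < Fabulator::Structural
          # contained 'constraint' and 'filter' elements collect in
          # @constraints and @filters; a ':storage => :hash' option would
          # key them by their 'name' method instead
          has_structure :constraint
          has_structure :filter
        end
      end
    end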

By default, the contained structural elements are stored in an instance variable named after the element, pluralized (e.g., 'constraint' elements will be stored in @constraints). This is an array by default; it can be a hash if the ':storage => :hash' option is set. If no ':name => :method' option is given (for whichever method is appropriate), the 'name' method is used to determine the key for the hash. If two structural elements are assigned to the same variable, they are combined instead of one overwriting the other.

This gets us a little closer to being able to have libraries or packages as useful concepts in the Fabulator+Radiant system.

More Regex Work

One of the overall guiding principles of my work is that anything that can be considered an editorial statement within the context of a particular DH project should be exposed to the project owner instead of being hidden away in Ruby code (or any other language of the day). The Fabulator+Radiant combination allows the researcher to see in the content management system everything that pertains to their project. Any tweaks in how they interpret their data are done within the CMS, not within Ruby code.

With that in mind, I’m slowly making progress on a general grammar engine that will let us write markup parsers within the CMS. This is useful when we are working with markup that was developed before TEI, such as in the Donne project. Right now, the parsing of the transcriptions is done in Ruby, so any modification in the parsing and possible translation to HTML or TEI is done in Ruby, away from the eyes of the people with an editorial interest in that translation.

First, I’ll mention the limitations of the Ruby 1.8 regex engine.

It doesn't really get Unicode, or if it does, Ruby integers don't. I can only create characters in the range 0x00-0xff, so the initial grammar engine will only work with ASCII. I know this is a major limitation for projects using something other than simple Latin characters. I have good reason to believe that this limitation can be removed when Radiant moves to Ruby 1.9. In the meantime, patches are welcome once I push the current development code to GitHub. :-)

Ruby doesn't support conditional branching within a regular expression. This limits the complexity we can support. As a result, we've had to rethink how we handle character class algebra, and we will require the 'bitset' library. However, I think the resulting facility for specifying character sets is worth the extra installation effort.

We are currently working on two different regular expression languages. One is used to specify tokens and the other for rules. The major difference is that the atomic unit in a token is the character while the atomic unit in a rule is the token. Rules can also have actions associated with them while tokens do not.

Both tokens and rules will act like functions in a library. You can call them in an expression just like a function. If you call the token or rule without a trailing question mark, you’ll get any data structure that results from matching the token/rule against the provided string. If you call it with the trailing question mark, you’ll get a boolean response indicating the success/failure of trying to match the string.


As a test case, I’m starting to put together a grammar for the Donne markup. For tokens, I have:
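
A sketch of the flavor of those definitions, with assumed g:context and g:token markup; the NL token and the linenumber and linetext modes anticipate the rules discussed below, while the other token names and patterns are placeholders:

    <!-- no mode, so active everywhere; [:blank:] is space and tab,
         so this matches runs of newlines -->
    <g:token name="NL">[:space: - :blank:]+</g:token>

    <g:context mode="linenumber">
      <g:token name="NUMBER">[:digit:]+</g:token>
      <g:token name="LETTER">[:alpha:]</g:token>
    </g:context>

    <g:context mode="linetext">
      <!-- an optional control character, then the transcription text -->
      <g:token name="CONTROL">[:punct:]</g:token>
      <g:token name="TEXT">[:alnum: + :blank:]+</g:token>
    </g:context>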

You’ll notice that I’ve specified a mode for almost all of the tokens (using a g:context element as shorthand). This is similar to the mode attribute in XSLT in that the token is only active if the grammar engine is in that mode. If no mode is specified, then the token is active in all modes.

I’ve also used named character classes instead of specifying explicit ranges. This will help when the engine supports Unicode since the named classes will automatically expand to encompass the standard Unicode equivalents. It also makes the expressions a little more readable.

Character Set Algebra

Character sets can be more than just simple expressions as in the above token definitions. You can add and subtract them to get exactly the set of characters you want.

Some examples of character set algebra (assuming strict 7-bit ASCII):

  • [:xdigit:] == [:digit: + [a-f] + [A-F]]
  • [:alnum:] == [:alpha: + :digit:] == [:upper: + :lower: + :digit:]
  • consonants: [:alpha: - [aeiouAEIOU]]
  • everything except vowels: [ - [aeiouAEIOU] ]

Ordering is important. Set operations are evaluated left to right, so the following are not equivalent: [:alpha: - [aeiouAEIOU]] and [ - [aeiouAEIOU] + :alpha: ]. The first is the set of consonants (alpha from which we remove the vowels) while the second is everything (all characters from which we remove the vowels and then add all alphabetical characters).

Set operations can be parenthesized if it helps make things clearer. For example, the following two are equivalent: [:alpha: - [aeiou] - [AEIOU]] and [:alpha: - ([aeiou] + [AEIOU])].


Rules describe how tokens are related. For example, the top-level rule in the Donne transcription grammar is:
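
Approximately, with the pattern notation sketched from the description that follows (the actual rule syntax may differ):

    <g:rule name="lines">
      <g:when>
        <!-- a line, then zero or more further lines separated by newlines,
             ending with an optional newline -->
        line ([NL]+ line)* [NL]?
      </g:when>
    </g:rule>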

This says that a document consists of a line followed by zero or more lines separated by one or more new lines and ending with an optional newline. If this successfully matches, then the result of the match should be whatever was produced by matching the individual ‘line’ rules.

The ‘[NL]’ is just the token from before. The ‘line’ rule is defined as:

This rule is a bit more complex than the ‘lines’ rule. It also shows us how we might use the mode to change the token set we have available.

Here, we expect to begin in the ‘linenumber’ mode so that we don’t have to worry about the linetext tokens matching when we expect only the ‘linenumber’ rule and tokens to match. Once we are past the ‘linenumber’, we switch to expecting tokens in the ‘linetext’ mode. If we find an optional control character and text, then we successfully match a line and run the associated code that sets up the data for that line.

The linenumber rule is defined as:

You'll also notice that the actual match is in a g:when element. A rule can match multiple patterns, each with its own associated code. Here, we are using the linenumber tokens defined above and expecting them to be separated by dots (.). You can think of the '.' notation (a literal string in quotes) as defining an anonymous token that matches that string. Spaces in the rule pattern are ignored, so to match a space in the string you're parsing, you need to include it explicitly in the pattern.

If we match a line number, we just translate the names of the items from uppercase to lowercase in the result that is passed on to the ‘line’ rule.


Grammars are collections of rules and tokens. The current vision is that grammars will live inside libraries (yet to be coded) and make available as functions those rules and tokens that are not attached to a particular mode.

The resulting system will provide named regular expressions (tokens) and parsers (rules). I’m debating how I want to construct the parser: either top-down or bottom-up. Top-down is easier in some ways because we don’t need to construct a state machine to manage building up more complex rules from less-complex sets of tokens/rules. Bottom-up is nice because it allows the higher-level patterns to emerge from the collection of tokens/rules produced from processing the given string without having to determine beforehand what our target is: we can stop processing as soon as we have a single rule encompassing everything we’ve seen.

Right now, we have a working token expression parser that gives us basic regular expression support. We’re developing the rule engine and the general library support in Fabulator in which we’ll embed the grammar definitions. Once we have libraries and grammars available in the Radiant+Fabulator combination, we’ll remove the ‘matches’ function from the grammar extension. Regular expressions shouldn’t be embedded in a program if they are subject to change. Instead, they can be documented by putting them in a project-specific library/grammar.

Presentations: Manuscript Viewer

As we wrap up the development of the core data transformation engine, we are starting to think about the presentation layer. How can we specify our presentations so that we don’t have to modify them when the JavaScript library APIs, HTML standards, etc., change?

One of the things we’ve worked on is a manuscript viewer that shows an image of a page alongside a transcription. This is a common need in textual digital humanities.

I’m thinking something along the lines of the following based on the work we’ve done for the Donne project (using Radiant to build up an XML description that can be transformed via XSLT):
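
A rough sketch of the kind of description I have in mind; every element name here is provisional, since the point is the semantics rather than the exact markup:

    <manuscript-viewer>
      <page n="1">
        <image src="page-001.jpg"/>
        <transcription src="page-001.xml"/>
      </page>
    </manuscript-viewer>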

This is based on the following skeleton that we’re using now:

Using the XML description frees us to use a different implementation without having to revisit the project as long as we are using the same semantics. We’ll be looking to a few more projects to figure out some patterns before we start making design decisions, but this is a good indication of how we’re planning on approaching the presentation layer of DH projects.


One of the fundamental concepts I’ve been working with in the Fabulator expression engine is that a set of items should be as easy to work with as a single item. However, if we know (or think) we are working with sets of items, then we can start asking different questions as soon as that set has more than one element.

Junctions are variables that act as if they are set to multiple values at the same time. The default junction (and the only one supported for now in the engine) is ‘any()’.


Instead of saying something like ‘/a[. = 1 or . = 2 or . = 3]’, a junction allows ‘/a[. = any(1,2,3)]’, which is more readable and maintainable.

You could also do something like ‘/a/*[f:name(.) = any($list-of-names)]’, though I might like ‘/a/{any($list-of-names)}’ better.

Given two sets of values, you could do comparisons such as ‘any($list-a) > any($list-b)’ to see if there’s a value in $list-a that is larger than a value in $list-b.


Another junction is 'all()'. It is used to ensure that every value satisfies the comparison. For example, 'all($list-a) > any($list-b)' would mean that every value in $list-a had to exceed a value in $list-b (possibly a different value in $list-b for each value in $list-a). 'all($list-a) > all($list-b)' would be true only if the minimum value of $list-a exceeded the maximum value in $list-b.

The construct ‘/a/{all($list-of-names)}’ wouldn’t be very useful unless $list-of-names was single-valued.


The 'none()' junction is, in a sense, the opposite of 'all()'. Instead of requiring complete agreement from all values, it requires complete non-agreement.

The expression 'none($list-a) > none($list-b)' would mean that no value from $list-a could exceed a value not in $list-b. That's not very useful, given the number of negative numbers that are less than the minimum value that could be in $list-b. Replacing one of the junctions with an 'all()' or 'any()' would work, though: 'none($list-a) > any($list-b)' would be true if no value in $list-a exceeded a value in $list-b (equivalent to 'all($list-a) <= all($list-b)').

The path ‘/a/{none($list-of-names)}’ would select all of the children of ‘/a’ that didn’t have a name in the $list-of-names.

We're still figuring out how these will be worked into the language. They might be limited to use within predicates and tests or other ephemeral run-time processing. This is the beginning of the thinking-out-loud design phase.

Moving Towards Stability

I’ve been working with a rapid release cycle, but I think we’re getting close to a stable core that can support new work soon.

The current list of Ruby gems:

  • fabulator 0.0.5
  • fabulator-exhibit 0.0.2
  • fabulator-grammar 0.0.1
  • fabulator-xml 0.0.3
  • radiant-fabulator-extension 0.0.3
  • radiant-fabulator_exhibit-extension 0.0.1

With this set, if you make a mistake in the XML defining an application, Radiant will report a validation error and give you an opportunity to edit the page again. Before, it would just throw an error and give you an otherwise blank page with no option other than hitting the back button.

Predicates should be a bit more DWIMy (Do What I Mean). If the expression evaluates to a number, then it is matched against the position in the set. If it's a string, then it is true if non-blank. If it's boolean, then it's just boolean. The f:position(), f:first(), and f:last() functions should now work in predicates. We will be working to extend this support to other iterative contexts.
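
For example (the /list/item paths are hypothetical; the behavior follows the rules just described):

    /list/item[3]                          (a number: matched against position, so the third item)
    /list/item[./note]                     (a string: true when non-blank)
    /list/item[f:position() = f:last()]    (a boolean: selects the last item)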

The next week or two should be continued bug fixes as we come to the close of the project season and launch into the next. We’ll post more later on what might come starting in September.

We’re also working on developing some web-based resources that will explain how to use all of these libraries to build a project site.