Accessing concordance data

I’m making good progress on the concordance front.  I can now do the following:

c:concordance("some text")/some/count

This will convert the string to a concordance object (implicitly compiling the concordance) and give the frequency/count of the word “some”.
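To make the idea concrete, here is a minimal Ruby sketch of what compiling a string into a word-count table could look like. The helper name `build_concordance` and the tokenizing rule (lowercased runs of letters and apostrophes) are assumptions for illustration, not the actual internal representation.

```ruby
# Hypothetical helper: compiles a string into a word => count table.
# Tokenizing on runs of letters/apostrophes is an assumption of this sketch.
def build_concordance(text)
  counts = Hash.new(0) # unknown words report a count of 0
  text.downcase.scan(/[[:alpha:]']+/) { |word| counts[word] += 1 }
  counts
end

concordance = build_concordance("some text")
p concordance["some"] # prints 1
```

The expression above would then amount to looking up the key "some" and reading its count.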

My goal now is to get the “with ./*/foo := bar” fragment working with an internal Ruby representation of a concordance.  This will allow me to annotate the words in a concordance.  The internal object already preserves annotations when combining concordances.
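As a rough sketch of what preserving annotations while combining might mean, the Ruby below merges two concordances whose entries carry a count and an annotation hash. The entry layout (`:count`, `:annotations`), the helper name `combine`, and the rule that the right-hand annotation wins on a clash are all assumptions made for this example.

```ruby
# Hypothetical layout: word => { count: Integer, annotations: Hash }.
# Combining sums the counts and merges the annotation hashes; when both
# sides annotate the same key, the right-hand value wins (an assumption).
def combine(left, right)
  left.merge(right) do |_word, a, b|
    {
      count: a[:count] + b[:count],
      annotations: a[:annotations].merge(b[:annotations])
    }
  end
end

first  = { "some" => { count: 1, annotations: { foo: "bar" } } }
second = { "some" => { count: 2, annotations: {} },
           "text" => { count: 1, annotations: {} } }

merged = combine(first, second)
p merged["some"][:count]             # prints 3
p merged["some"][:annotations][:foo] # prints "bar"
```

Words that appear in only one of the two concordances are carried over unchanged.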

Once I have a good serialization format for the concordance, I will be able to persist it in some form (perhaps RDF, but not necessarily).  That, combined with annotations recording where a word appears, will let me search for words:


to give me the lines on which a word appears.

c:concordance($doc)/*[f:matches(node-name(.), "^f")]/line

to give me the lines on which any word beginning with the letter “f” appears.
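The two kinds of line lookup described above can be sketched in plain Ruby as follows. The helper name `line_index`, the 1-based line numbering, and the sample document are assumptions for illustration; the real concordance object may store positions quite differently.

```ruby
# Hypothetical helper: maps each word to the 1-based numbers of the lines
# on which it appears (each line is listed once per word).
def line_index(text)
  index = Hash.new { |hash, word| hash[word] = [] }
  text.each_line.with_index(1) do |line, number|
    line.downcase.scan(/[[:alpha:]']+/).uniq.each { |word| index[word] << number }
  end
  index
end

doc = "the fox ran far\nsome text\na fine day\n"
index = line_index(doc)

# Lines on which a single word appears (fetch avoids creating empty entries).
p index.fetch("some", []) # prints [2]

# Words beginning with "f", mirroring the f:matches(node-name(.), "^f") filter.
f_words = index.select { |word, _lines| word.match?(/^f/) }
p f_words.keys                # prints ["fox", "far", "fine"]
p f_words.values.flatten.uniq # prints [1, 3]
```

The prefix filter here scans every key, which is exactly the kind of idiom that would need optimizing for a large concordance.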

At that point, the challenge will be to optimize these idioms so they don’t take forever to run.