Accessing concordance data

I’m making good progress on the concordance front.  I can now do the following:

c:concordance("some text")/some/count

This will convert the string to a concordance object (implicitly compiling the concordance) and give the frequency/count of the word “some”.
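As a rough illustration of what compiling a concordance and asking for a count involves, here is a minimal Ruby sketch. The class name WordCounts, its count method, and the tokenizing rule (lowercased runs of letters and apostrophes) are assumptions made for this example; they are not the actual internal representation.

```ruby
# Hypothetical stand-in for a compiled concordance; not the real internal object.
class WordCounts
  def initialize(text)
    # Hash.new(0) makes any unseen word report a count of zero.
    @counts = Hash.new(0)
    # Assumed tokenizer: lowercase the text and take runs of letters/apostrophes.
    text.downcase.scan(/[a-z']+/) { |word| @counts[word] += 1 }
  end

  # Frequency of one word, loosely mirroring the ".../some/count" path.
  def count(word)
    @counts[word.downcase]
  end
end

concordance = WordCounts.new("some text and some more text")
puts concordance.count("some")
```

The path step "some" selects the entry for that word, and "count" reads one property of it, which is all the lookup in the sketch does.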

My goal now is to get the “with ./*/foo := bar” fragment working with an internal Ruby representation of a concordance.  This will allow me to annotate the words in a concordance.  The internal object already preserves annotations when combining concordances.
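One way to picture annotations that survive a merge is sketched below in Ruby. Everything here is hypothetical: the AnnotatedConcordance class, the annotate_all and merge methods, and the rule that counts are summed while other annotations are carried over are assumptions for illustration, and treating "with ./*/foo := bar" as "set foo to bar on every word" is only my reading of that fragment.

```ruby
# Hypothetical sketch: each word maps to a hash of annotations (including :count).
class AnnotatedConcordance
  attr_reader :entries

  def initialize(entries = {})
    @entries = entries
  end

  # Assumed analogue of "with ./*/key := value": set one annotation on every word.
  def annotate_all(key, value)
    @entries.each_value { |annotations| annotations[key] = value }
    self
  end

  # Combine two concordances: counts are summed, and for any other annotation
  # the value from the second concordance wins when both define it.
  def merge(other)
    combined = @entries.transform_values(&:dup)
    other.entries.each do |word, annotations|
      combined[word] = (combined[word] || {}).merge(annotations) { |key, mine, theirs|
        key == :count ? mine + theirs : theirs
      }
    end
    AnnotatedConcordance.new(combined)
  end
end

left   = AnnotatedConcordance.new({ "foo" => { count: 2 } }).annotate_all(:source, "draft")
right  = AnnotatedConcordance.new({ "foo" => { count: 1 }, "bar" => { count: 4 } })
merged = left.merge(right)
p merged.entries
```

The merge builds a new object rather than mutating either input, so the original concordances and their annotations remain usable afterwards.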

Once I have a good serialization format for the concordance, I will be able to persist it in some form, perhaps RDF, but not necessarily. That, combined with annotations recording where each word appears, will let me search for words:

c:concordance($doc)/foo/line

to give me the lines on which a word appears.

c:concordance($doc)/*[f:matches(node-name(.), "^f")]/line

to give me the lines on which any word beginning with the letter “f” appears.
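The two line queries can be sketched in Ruby as follows. The LineConcordance class and its lines_for and lines_matching methods are hypothetical names for this example, and flattening the result into a single sorted list of line numbers is an assumption; the real queries may well return lines per word instead.

```ruby
# Hypothetical sketch: record, for each word, the 1-based lines on which it occurs.
class LineConcordance
  def initialize(text)
    @lines = Hash.new { |hash, word| hash[word] = [] }
    text.each_line.with_index(1) do |line, number|
      # uniq keeps a line from being listed twice for a repeated word.
      line.downcase.scan(/[a-z']+/).uniq.each { |word| @lines[word] << number }
    end
  end

  # Loose analogue of ".../foo/line": lines containing one exact word.
  def lines_for(word)
    @lines.fetch(word.downcase, [])
  end

  # Loose analogue of the matches(node-name(.), "^f") filter:
  # lines containing any word whose name matches the pattern.
  def lines_matching(pattern)
    @lines.select { |word, _| word.match?(pattern) }.values.flatten.uniq.sort
  end
end

doc   = "The fox found food.\nNo luck here.\nFoo again."
index = LineConcordance.new(doc)
p index.lines_for("foo")      # lines with the exact word "foo"
p index.lines_matching(/^f/)  # lines with any word starting with "f"
```

Note that the pattern query scans every word in the index, which hints at why these idioms may need optimizing once the concordance grows.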

At that point, the challenge will be to optimize these idioms so they don’t take forever to run.

Published by

James