Category Archives: Digital Humanities

Declarative Perl

I'm making rapid progress in getting OokOok to a stable programming scheme. I haven't made a lot of changes in its capabilities, though I did add the ability to archive themes and projects as BagIt files yesterday. Instead, I've been working on making the important stuff declarative. By hiding all the details behind a veneer of relationships, I can fiddle with how I manage those relationships without having to touch every relationship every time I make a change in the underlying schemas (and schemes).

For those used to an older style of Perl programming, this might come as a surprise. For those who have dealt with things like MooseX::Declare and CatalystX::Declare, you'll be shaking your head at my foolhardiness in jumping into making an OokOok::Declare that hides the details of how to construct certain types of classes.

Behind the scenes, OokOok consists of controllers, models, views, REST collections/resources, SQL result objects, a template engine, and tag libraries for the templates. Almost two hundred classes in all.

If I built all of these the usual Perl way, there'd be a lot of boilerplate code around. By moving to a declarative approach, I can isolate all the boilerplate in a few core meta-classes. When the boilerplate has to change, I only have to touch one place. Everything else comes along for the ride.

For the rest of this post, I want to walk through how I use some of these declarative constructions. I won't get into the scary details of how to make declarative constructions in Perl (at least, not in this post).

Continue Reading Declarative Perl

OokOok Progress

OokOok is coming along nicely. It's been a couple of months since the last update, so I'll outline a bit of what I've done since the last post. I'm nowhere near being able to throw up a demonstration server for anyone to play with, but I'm getting closer. With a little more testing, a reasonably decent administrative interface, some simple themes, and full authorization management, we'll be good to go on a first demo. I'm aiming for the end of the year. I'm trying to think about what a good, simple demonstration project might be that is just text online. Perhaps a curated collection of Creative Commons-licensed works on a subject?

OokOok isn't meant to do everything for everyone. I'm designing it with opinions. I think they are well researched and thought out opinions, but they are opinions. I hope the pros can outweigh the cons, but that's something you'll need to decide when considering which platform to use for your project.

I'm designing the system to enable citation, reproduction, sustainability, and description. You should be able to point someone at exactly the version of the page that you saw (citation), be able to see the same content each time you view that version of the page (reproduction), see that content "forever" (sustainability), and leverage computation through description (composing the rules) instead of prescription (composing the ways). I've based all the opinionated choices in the system on trying to meet the needs of those four "axioms."

Continue Reading OokOok Progress

A Citation is Forever

“Miscellaneous fancy work.” From the Project Gutenberg eBook of Encyclopedia of Needlework. (Photo credit: SWANclothing)

Last week, I talked about the basic model I'm considering for managing static web content in a way that lets us find it based on when we looked at it. The idea is that if I want to cite something, I should be able to point at what I'm citing and know that someone else following my citations will see the same thing I did.

Today, I want to explore what it means for something to be citable.

I come from the sciences, where citation is a shorthand for bringing in a body of work that you don't want to reproduce in your text. It's like linking in a library in a program. You're asserting that something is important to your argument and anyone can find out why they should believe it by following the citation. You don't have to explain the reasoning behind what you're referencing.

If you use citations to give shout outs to people in your field, then you don't need what I'm thinking about. Readers understand that these citations are to remind them about the other people and their body of work, not the particular passage pointed to in the citation. The details aren't important enough to look up.

I'm interested in the citations that people need to follow.

Continue Reading A Citation is Forever

OokOok: Timelines, Pages, and Editions

I've made some good progress on the OokOok project over the last week. The system has a minimal page management interface now, so you can create, edit, and delete new pages and place them in the project's sitemap. You can create project editions that freeze the content in time, and you can see the different versions of a page using time-based URLs.
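As a rough illustration of the idea (not OokOok's actual code, which is written in Perl), here is a minimal Python sketch of a time-based lookup. It assumes each edition records the moment it was frozen along with a snapshot of the page content. The names `Edition` and `page_as_of` are hypothetical and exist only for this example.

```python
from bisect import bisect_right
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Edition:
    """A frozen snapshot of a project's pages (hypothetical structure)."""
    frozen_at: datetime
    pages: Dict[str, str]  # page path -> content at freeze time


def page_as_of(editions: List[Edition], path: str, when: datetime) -> Optional[str]:
    """Return a page's content from the latest edition frozen at or before `when`."""
    ordered = sorted(editions, key=lambda e: e.frozen_at)
    freeze_times = [e.frozen_at for e in ordered]
    # bisect_right counts the editions frozen at or before `when`.
    index = bisect_right(freeze_times, when) - 1
    if index < 0:
        return None  # no edition existed yet at that time
    return ordered[index].pages.get(path)


# Made-up sample data: two editions of the same page.
editions = [
    Edition(datetime(2012, 9, 1), {"/about": "First draft"}),
    Edition(datetime(2012, 10, 1), {"/about": "Revised text"}),
]
```

Asking for `/about` as of mid-September would pick the first edition, while any date on or after October 1 would pick the second. That is the property a citation needs: the same timestamp always resolves to the same content.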

You know you have a real software project when you have a list of things that won't be in the current version. So it is with OokOok. Eventually, I want to support any dynamic web-based digital humanities project and allow it to run forever without any project-specific maintenance. For now, I'll be happy creating a simple text content management system that has all the time-oriented features. We can add support for algorithms later.

Today, I want to talk a bit about the model I'm using to keep track of the different versions and the impact this has on the user interface.

Continue Reading OokOok: Timelines, Pages, and Editions

Curating Dynamic Digital Humanities

Perl (Photo credit: Wikipedia)

I'm using part of my research time to dive back into a development thread I've been working on for the last decade. It started with the various Gestinanna packages through 2004 that are now on BackPAN, the archive of CPAN modules no longer considered actively distributed. I followed it a few years later with a set of Ruby fabulator gems that are in my GitHub repository. I designed these to run in Radiant, a nice CMS for small teams. I used the Ruby/Radiant system to teach a course at DHSI in 2011 covering data acquisition, management, and presentation.

Now, I'm combining my experience and starting a new coding effort. I'll be posting code to github.com/jgsmith/perl-ookook/. There are a lot of design decisions that I hope to discuss in a series of blog posts. For example, why did I choose Perl over something new and shiny like Ruby or Node.js? Why didn't I use a venerable environment like Java and the JVM? Why have a backend SQL database instead of a NoSQL database? For now, I'll try to discuss a few questions that have shorter answers.

Continue Reading Curating Dynamic Digital Humanities

Going Digital

keystone 8mm model B8 (Photo credit: B.S. Wise)

You might think that working in a digital humanities group would mean a lot less paper, but that's not the case. I have a folder for each project I'm working on, each filled with papers showing things like milestones, budgets, and work plans. Every time I have a meeting about a project, I pull out the folder(s) related to it and go through the papers to catch up with where we are.

The problem with having everything on paper is that I have to be where the paper is. If I'm at home, I don't have access to it. Same goes for the bus, or if I'm out-of-town. If I had everything digitized, or at least in some digital form, and available in the cloud, perhaps in Evernote, then I could use it anywhere, as long as I had wi-fi or cellular access.

What got me started thinking about this was the fact that in a few months, I'm going to have a 600-page (more or less) manuscript to edit. I don't want to have to print it out and lug it around, or take sixty pages at a time with me on the bus. It wastes a lot of paper and is difficult to manage.

Continue Reading Going Digital

Hacking Information

This image shows a technique that can be used to plot prime numbers in binary. (Photo credit: Wikipedia)

While eating breakfast this morning, I decided to finish watching "Will We Survive First Contact?," an episode of Morgan Freeman's Through the Wormhole, a nice series on science that does a reasonable job of translating science into layman's terms without simplifying too much. This episode dealt with how we might know when we encountered alien communication. The topic of aliens was just a vehicle for talking about information theory. Topic modeling made its appearance, though no one called it that.

One of the segments talked about efforts to understand dolphins. The problem with all the languages we already know is that they all come from the human mind. Trying to understand a language developed by a non-human mind helps us know what problems might crop up when trying to understand a language not from Earth.

Continue Reading Hacking Information

An Adventure!

In my last post, I talked some about the need to look across projects and find common elements that could be factored out. I'd like to start a series of posts in which I talk about some of the work I'm doing at MITH in developing some foundational libraries that we are using to build digital humanities projects. Along the way, I'll discuss some of the philosophy behind those libraries and our approach to the projects.

Today, I want to walk through the design of an example application I'm working on that implements the classic Adventure game as a JavaScript web application. I'm not finished with it yet, but the framework is there. I'm just adding content and tweaking some behavior now, such as handling darkness. You can go into the hut, pick up the key, and then go down and open the grate to get into the cave.

Continue Reading An Adventure!

Coding and Digital Humanities

Miriam Posner's post, "Some things to think about before you exhort everyone to code," has touched off a series of conversations on Twitter and elsewhere. My own feeling is that she's nailing some things square on the head and, fortunately, doesn't conclude by saying that we should banish coding from the digital humanities. We just need to be careful how we cast the need for coding.

I've tossed around a nugget in my mind for the last few weeks, and Miriam's post is making me focus more intently on it: A digital humanist afraid of the digital is like a scholar of French literature who is afraid of French. You can't be a digital humanist if you don't understand the digital. That doesn't mean you have to be able to code any more than being a scholar of French literature means you have to be able to write French literature. You just have to be able to understand the nuances of what you're studying and how you are studying it. Otherwise, how can you properly interpret the results?

Continue Reading Coding and Digital Humanities

The Role of Statistics

This entry is part 3 of 3 in the series Narrative Statistics

English: Hydrogen Density Plots for n up to 4.
Image via Wikipedia

In the Narrative Statistics series of posts, I'm exploring different ways to characterize fiction using statistics. I'm recovering from a flu or cold as well as a nasty cough that followed, so instead of delving into deep math, I want to review what I see as the role of statistics, at least for this series. Many people consider statistics to be magical formulae that give questionable answers. In the humanities, there seems to be a lot of mistrust for statistics because people don't understand them.

I've been in the audience when someone has presented some statistical results and someone else comments that because the outliers obviously don't agree with what they already believe to be true, the outliers must be mistakes and thus the statistical method must be suspect. They then turn around and ask what statistics can provide other than reinforcing what they already know. They first throw out any new information and then ask what new information the methods can provide. The profound lack of logic mystifies me.

Continue Reading The Role of Statistics