Curating Dynamic Digital Humanities

I'm using part of my research time to dive back into a development thread I've been working on for the last decade. It started with the various Gestinanna packages through 2004 that are now on BackPAN, the archive of CPAN modules no longer considered actively distributed. I followed it a few years later with a set of Ruby fabulator gems that are in my GitHub repository. I designed these to run in Radiant, a nice CMS for small teams. I used the Ruby/Radiant system to teach a course at DHSI in 2011 covering data acquisition, management, and presentation.

Now, I'm combining my experience and starting a new coding effort. I'll be posting code to github.com/jgsmith/perl-ookook/. There are a lot of design decisions that I hope to discuss in a series of blog posts. For example, why did I choose Perl over something new and shiny like Ruby or Node.js? Why didn't I use a venerable environment like Java and the JVM? Why have a backend SQL database instead of a NoSQL database? For now, I'll try to discuss a few questions that have shorter answers.

Why "OokOok"? For two related reasons: I have the ookook.net and ookook.org domains, where I could host a copy of the software once I get it far enough along, and I like to think that I'm enabling the long-term preservation of digital projects. Perhaps I'm like an ape of sorts in digital libraries, one who doesn't fit in with the rest of the field but is trying to have an impact anyway. Probably all I can say is "Ook Ook," like the Librarian in the Discworld series.

What's the point of the project? Primarily, to make an online, web-based environment for building data-oriented research projects with a narrative context, and to provide for the long-term availability of those projects. Present-day digital libraries are good at holding static documents such as PDFs, images, movies, and audio recordings. They can even manage collections of static documents, such as might be produced by spidering a website. What they can't do is capture the dynamic, data-oriented, interactive nature of a web-based project backed by server-side programming.

What about the past code efforts? The Gestinanna series of Perl modules never took off. The fabulator series fared a little better. My intent is to offer a migration path from a fabulator-based project to the OokOok system. It will be no different from saving a file from one program and loading it into another. However, there is not a significant install base for the fabulator-based system, so I won't consider that a significant constraint on the overall development of OokOok. It just means that OokOok may have more or different capabilities: a fabulator-based project will fit well enough to run, but it won't take advantage of everything.

I've learned a lot about what to do and what not to do from the past projects. I intend to use those lessons here. If nothing else, I'll try to make new mistakes.

Take a look through my fabulator-oriented posts to get an idea of what I was aiming for in that project. The main problem with fabulator was that you had to manage the raw XML descriptions of the applications. I intend to correct that with OokOok so that you just have to worry about how components connect to each other using an easy-to-use interface.

What's the basic model? I'm still working this out, but the idea is to offer a way to create project editions that you can refer to without worrying that other people will see something other than what you saw. This would let you cite a project based on the date you accessed it. Whether this applies to the data in the project or just to the applications and textual content of a project is still up in the air. It is conceivable (and desirable) to have a project edition that allows you to change data (such as adding new records or correcting information). The question is whether or not to tie any data available to the project's components to the access date in the citation.
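To make the citation idea concrete, here is a minimal sketch of how an access date might resolve to an edition. It is written in Python purely for illustration and is not OokOok code. The `Edition` class, the `edition_as_of` function, and the sample dates are all hypothetical, and the sketch assumes that an edition is a frozen snapshot identified by its publication date.

```python
from bisect import bisect_right
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Edition:
    """A frozen snapshot of a project (hypothetical structure, not OokOok's)."""
    published: date
    label: str


def edition_as_of(editions, accessed):
    """Return the latest edition published on or before `accessed`.

    Returns None when nothing had been published by that date.
    """
    ordered = sorted(editions, key=lambda e: e.published)
    dates = [e.published for e in ordered]
    # bisect_right counts the editions published on or before `accessed`.
    index = bisect_right(dates, accessed)
    return ordered[index - 1] if index else None


# Made-up editions, for illustration only.
editions = [
    Edition(date(2012, 3, 1), "first edition"),
    Edition(date(2012, 9, 15), "second edition"),
]

cited = edition_as_of(editions, date(2012, 6, 30))
print(cited.label if cited else "no edition yet")
```

In this sketch, a citation with an access date between the two publication dates always resolves to the first edition, no matter how many editions are published later. Whether project data should be frozen along with the edition is the open question, and the sketch does not try to answer it.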

Why a new system? I'm trying to make a platform that lets you create a project and forget about it. The project should not require further investment in per-project maintenance. We tend to write projects in languages and frameworks that have a short lifetime. When we build projects with languages such as PHP, language updates become a maintenance nightmare because someone has to go through the code and make sure it still does what it's supposed to do under the new version of the language. Otherwise, the project disappears.

Instead of trying to keep PHP, Perl, or some other lower-level language alive for the next fifty years, I'm trying to create a system that lets you describe your project in a way that the underlying hosting system can change without affecting your project. Think of it as a player for your project. Just as you can play an audio recording on different players, you shouldn't have to worry about which player is running your project.

One way around this is to build your project in a system such as WordPress or Omeka. As long as you don't need any project-specific programming, someone is maintaining the frameworks and plugins, and someone is paying the hosting fees and applying updates, your project will be available.

OokOok won't do away with all that. Someone will still need to install releases and updates. It will need to run somewhere. There will probably be project-specific programming. What OokOok can do is amortize the costs across a large number of projects so that the incremental cost for a project is negligible.

OokOok also seeks to intervene early in the project's development. Rather than preserving a project after it's built, OokOok wants to be there from the beginning to curate the project, a sort of "cradle to grave" system for building and managing projects.

I have a lot of other ideas floating around that could make OokOok the platform of choice for a certain class of web-based digital humanities projects. I'll be posting about them over the next few months as I get code fleshed out.

I'll be using somewhat agile development methods on the code, so what's there now may change as we move from testing simple database-oriented features towards the creator-oriented features. This means you can check out the tests to see which capabilities should work.

What features would you consider critical in an environment for building interactive, data-oriented digital humanities projects that are citable and available long-term?
