My DH Agenda

A few months ago, I accepted a job outside the academy. This doesn't mean that I'm abandoning digital humanities. In this post, I lay out what I want to do in DH going forward. The common thread through all this is that I believe linked open data is the best way to break down the silo walls that keep digital humanities projects from sharing and building on existing data.

DHData.Org

I've been working on this site for the last year. It embodies my belief that data is the most important ongoing artifact of a digital humanities project. Sharing open source software and open access publications is great, but sharing the data completes the triad. Instead of producing beautiful websites that are barriers to data reuse, scholars should be publishing their data in ways that let other scholars build on their work. When the beautiful website fails because no one is around to maintain the server, the data will still work.

The site combines sections for cataloguing data sets with recipes for working with data. Both assume that contributors have moderate expertise in using computers. This is the digital humanities.

This site also serves as an example of a project designed around scholarly concerns. The site will remain readable regardless of changes in Ruby or any other programming language: it won't disappear just because there's no one to update the software. The complete history of the site's content is available for anyone wanting to see what the site said in the past. Anyone can make a copy of the entire site and do with it whatever they want within the limits of the Creative Commons licenses.

DHData.Tumblr.Com

This Tumblr is a companion to the DHData.Org site. About once a day, I will post a link, video, quote, or other snippet related to linked data and digital humanities. Feel free to follow along or browse the history. Links to posts are tweeted through @DHData_Org.

Linked Data

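Publishing data is great, but for the data to be useful it must fit with other data. Linked data provides the means by which different sets of data can mesh together: when two projects use the same identifier for the same thing, their data can be combined without negotiation between the projects. In the context of the digital humanities, linked data and the semantic web are close enough to being the same thing that I use the terms interchangeably.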
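To make the meshing concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the two data sets, the example.org URIs, the vocabulary terms, and the helper names `objects` and `author_names` are all invented for this post, and a plain set of tuples stands in for a real RDF store.

```python
# Two hypothetical data sets from two hypothetical projects.
# Each is a set of (subject, predicate, object) triples.
# Every example.org URI below is invented for illustration.
LETTER = "http://example.org/letter/42"
PERSON = "http://example.org/person/ada"
AUTHOR = "http://example.org/vocab/author"
NAME = "http://example.org/vocab/name"

# Project one catalogues letters and only knows the author's URI.
letters = {
    (LETTER, AUTHOR, PERSON),
}

# Project two describes people and knows nothing about letters.
biographies = {
    (PERSON, NAME, "Ada Example"),
}


def objects(graph, subject, predicate):
    """Return the sorted objects of all triples matching subject and predicate."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


def author_names(graph, letter):
    """Find the names of a letter's authors by following two links."""
    names = []
    for author in objects(graph, letter, AUTHOR):
        names.extend(objects(graph, author, NAME))
    return names


# Because both projects use the same URI for the person,
# merging the data sets is nothing more than a set union.
merged = letters | biographies

print(author_names(letters, LETTER))
print(author_names(merged, LETTER))
```

Neither data set can answer "what is the name of the person who wrote this letter?" on its own. The merged set can, and the only thing the two projects had to agree on was the identifier for the person.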

When linked data is published on the web so that a computer can traverse it without human intervention, the web of data behaves like a form of random access memory: every URI is an address, and fetching it returns whatever is stored there, including further addresses. One of my side projects is to develop a processing system that can use this RAM as if it were local to the processor. I've dropped some hints in past posts.

I'll post more when I have code to share. For now, I'm prototyping in Perl with plans to move to Scala if/when performance becomes a factor.
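Until then, the "web as memory" idea can be sketched with a toy in Python. This is not the prototype, just an illustration under loud assumptions: `FAKE_WEB` is an in-memory dictionary standing in for real HTTP requests, and the URIs, the predicate names, and the functions `fetch` and `follow` are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# A list of (predicate, object) pairs describing one resource.
Links = List[Tuple[str, str]]

# Stand-in for the web: what each URI would return if dereferenced.
# All URIs and predicate names here are invented for illustration.
FAKE_WEB: Dict[str, Links] = {
    "http://example.org/letter/42": [
        ("author", "http://example.org/person/ada"),
    ],
    "http://example.org/person/ada": [
        ("name", "Ada Example"),
        ("bornIn", "http://example.net/place/7"),
    ],
    "http://example.net/place/7": [
        ("label", "Exampleton"),
    ],
}


def fetch(uri: str) -> Links:
    """Pretend to dereference a URI; unknown addresses return nothing."""
    return FAKE_WEB.get(uri, [])


def follow(start: str, path: Sequence[str],
           load: Callable[[str], Links] = fetch) -> List[str]:
    """Follow a chain of predicates from start, one 'memory read' per hop."""
    current = [start]
    for predicate in path:
        reached = []
        for node in current:
            for found_predicate, obj in load(node):
                if found_predicate == predicate:
                    reached.append(obj)
        current = reached
    return current


# Where was the author of letter 42 born? The answer lives on another host.
print(follow("http://example.org/letter/42", ["author", "bornIn", "label"]))
```

The caller never needs to know which server holds which fact. Swapping `fetch` for a function that makes real HTTP requests and parses the response would leave `follow` unchanged, which is the sense in which the data looks local to the processor.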