The NEH and other U.S. federal agencies are pushing digital humanities projects to produce something that can be shared. If that something is an application people can use, especially one that resides on a central server, then the NEH also wants provisions for long-term maintenance. Ultimately, digital humanities projects should aim to be resources that other scholarly work can build on. In this post, I want to explore what this might mean for web-based applications.
I was looking around the web for references about EAD, an XML vocabulary mentioned in a Digital Humanities Working Group meeting Monday. I could see cases where people would want to have documents marked up with both TEI and EAD.
XSLTs basically describe a function that is applied to an XML document, resulting in another document (not necessarily XML): D = f(X), where X is a subset of D (for a particular document, I'd say d = f(x)). We usually are given f and asked for D, but I'm wondering if we could be given X and D and asked to find f.
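The function view can be sketched concretely. Below is a minimal, hand-written f that maps a tiny TEI-like document x to an HTML-like document d = f(x); the element names and the tag mapping are invented for illustration, not drawn from the actual TEI or HTML specifications. The inverse problem discussed above would be: given many such (x, d) pairs, recover f.

```python
# A minimal sketch of "a transform is a function": a hand-written f
# mapping a TEI-like document x to an HTML-like document d = f(x).
# The element names and TAG_MAP are invented for illustration.
import xml.etree.ElementTree as ET

# Hypothetical mapping from semantic (TEI-like) elements to HTML tags.
TAG_MAP = {"persName": "b", "placeName": "i", "p": "p"}

def f(x: ET.Element) -> ET.Element:
    """Apply the transform to one document: d = f(x)."""
    d = ET.Element(TAG_MAP.get(x.tag, "span"))
    d.text = x.text   # text inside the element
    d.tail = x.tail   # text following the element
    for child in x:
        d.append(f(child))
    return d

x = ET.fromstring('<p><persName>Hrothgar</persName> ruled <placeName>Heorot</placeName>.</p>')
d = f(x)
print(ET.tostring(d, encoding="unicode"))
# -> <p><b>Hrothgar</b> ruled <i>Heorot</i>.</p>
```

A real XSLT engine does essentially this, with the mapping expressed declaratively as templates rather than as a dictionary.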
This is definitely a pure computer science problem, but it has digital humanities applications. A web search shows some work in this direction, but it usually involves having people manually map elements between the two document sets to generate the XSLT.
Another thing that would come from this is a way to rank XML vocabularies by their expressive range. If we have two sets of documents (A and B) based on two different XML vocabularies, and an XSLT exists that maps A to B but no XSLT exists that maps B to A, then the vocabulary for A could be seen as having a larger expressive range than that used for B. That would give us a more solid foundation for saying that TEI is more expressive than Docbook (which I believe it is, but don't have good data to base that belief on at the moment).
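A toy version of that expressive-range test can be run on element mappings alone. In the sketch below, vocabulary A distinguishes persons from places while vocabulary B collapses both into a generic name element; the vocabularies and the mapping are invented for illustration. A lossless A-to-B mapping exists, but no inverse does, which is the asymmetry the ranking argument rests on.

```python
# Toy expressive-range test: vocabulary A distinguishes persons from
# places; vocabulary B collapses both to a generic "name" element.
# Both vocabularies and the mapping are invented for illustration.

A_to_B = {"persName": "name", "placeName": "name", "p": "p"}

def invert(mapping):
    """Return the inverse element mapping, or None if the mapping is
    not injective (no lossless transform exists in the other direction)."""
    inverse = {}
    for a, b in mapping.items():
        if b in inverse:
            return None  # two A elements collapse to one B element
        inverse[b] = a
    return inverse

B_to_A = invert(A_to_B)
print(B_to_A)  # None: B cannot recover A's distinctions, so A's
               # vocabulary has the larger expressive range
```

Real vocabularies also differ in attributes and structure, not just element names, so this is only the simplest case of the comparison.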
I can manually create XSLTs to go from TEI to Docbook to HTML because I believe there's a loss of information from one format to the next (ignoring the pushing of that information into CSS at the final HTML stage), and because Docbook is a publishing vocabulary while HTML, with CSS, is a de facto typesetting vocabulary. The information isn't so much lost as transformed from semantic to presentational, with the person reading the resulting document adding the semantic information back based on the presentation. The semantic information, though, is removed from a form computers can readily work with: it has gone from context-free to context-dependent.
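The "transformed, not lost" point can be shown in miniature. In this sketch (element names invented for illustration), the semantic tag survives only as a class attribute, so recovering it depends on knowing a styling convention rather than on the markup's own grammar.

```python
# Sketch of semantic-to-presentational transformation: the semantic tag
# survives only as a class attribute. Element names are illustrative.
import xml.etree.ElementTree as ET

def to_html(x):
    """Flatten a semantic element into a presentational <span>,
    parking the old tag name in a class attribute."""
    d = ET.Element("span", {"class": x.tag})
    d.text = x.text
    return d

x = ET.fromstring('<persName>Grendel</persName>')
d = to_html(x)
# The information is now context-dependent: a program must know that
# class="persName" encodes what the element <persName> once asserted.
print(ET.tostring(d, encoding="unicode"))
# -> <span class="persName">Grendel</span>
```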
I've given the novel I'm writing for my thesis the working title Of Fish and Swimming Swords. I don't have names for the second or third novels yet, but ideas are beginning to come together. They'll complete the arc begun in the thesis.

The last two nights, I've woken with fairly vivid dreams. Dreams aren't useful in their raw state. If you transcribe a dream verbatim, it won't make much sense, because dream logic isn't sufficiently realistic. But dreams can provide interesting settings and plot pointers. That's what these two dreams have done.
This series of posts, written for the Emerald Dream forums, tries to walk through the design of World of Warcraft and Emerald Dream. We will explore how WoW is designed and where guilds fit into that design. We will also look at how guilds should be organized, based on similarities to real-world organizations. Finally, we will look at Emerald Dream and its structure, with a focus on understanding why it is designed the way it is. Hopefully, by the end of this series, everyone will have a better understanding of how everything works and how they can best fit in. We want everyone to feel that they are part of a family.
One of the problems in web application design is the disconnect between traditional programming languages and the statelessness of the web. There are ways to work around this: storing session information in hidden fields, or setting cookies and tracking session state there or on the server. There are languages designed for the web, such as PHP and ASP, and traditional languages made to work with the web, Java and Perl being two big examples. But none of these capture the nature of the client/server model fundamental to web applications. All of them require some reinvention of the wheel each time an application is built.
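The workaround all those approaches share can be sketched in a few lines: the server keeps no state between calls in the handler itself, and continuity exists only because the client echoes an identifier back. The names here (handle_request, sessions) are invented for illustration; real frameworks wrap the same pattern in cookie and session APIs.

```python
# Minimal sketch of cookie-style sessions over a stateless protocol.
# Each call to handle_request stands alone; continuity exists only
# because the client sends the session id back. Names are illustrative.
import uuid

sessions = {}  # server-side store, keyed by the id the client echoes back

def handle_request(cookie=None):
    """Handle one stateless request; all state is looked up fresh."""
    if cookie not in sessions:
        cookie = uuid.uuid4().hex      # issue a new session id
        sessions[cookie] = {"visits": 0}
    sessions[cookie]["visits"] += 1
    return cookie, sessions[cookie]["visits"]

cookie, n1 = handle_request()      # first request: no cookie yet
_, n2 = handle_request(cookie)     # client sends the cookie back
print(n1, n2)
# -> 1 2
```

Hidden form fields work the same way, with the identifier (or the state itself) carried in the page body instead of a cookie header.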
Abstract: We explore string comparison, graph theory, and dimensional analysis, and their implications for computational textual analysis. In the process, we develop some expectations that could be tested on a large text such as Beowulf, though we only lay out those expectations and do not test them, due to the computational requirements of doing so. We draw on Old English vocabulary for our examples.