Today, I want to explore what it means for something to be citable.
I come from the sciences, where citation is a shorthand for bringing in a body of work that you don't want to reproduce in your text. It's like linking a library into a program. You're asserting that something is important to your argument, and anyone can find out why they should believe it by following the citation. You don't have to explain the reasoning behind what you're referencing.
If you use citations to give shout-outs to people in your field, then you don't need what I'm thinking about. Readers understand that these citations are there to acknowledge other people and their bodies of work, not the particular passage the citation points to. The details aren't important enough to look up.
I'm interested in the citations that people need to follow.
I've made some good progress on the OokOok project over the last week. The system has a minimal page management interface now, so you can create, edit, and delete pages and place them in the project's sitemap. You can create project editions that freeze the content in time, and you can see the different versions of a page using time-based URLs.
You know you have a real software project when you have a list of things that won't be in the current version. So it is with OokOok. Eventually, I want to support any dynamic web-based digital humanities project and allow it to run forever without any project-specific maintenance. For now, I'll be happy creating a simple text content management system that has all the time-oriented features. We can add support for algorithms later.
Today, I want to talk a bit about the model I'm using to keep track of the different versions and the impact this has on the user interface.
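Before getting into the details, here is a rough illustration of the time-based lookup idea. This is my own hypothetical sketch, not OokOok's actual code (which is in Perl): treat each page as a list of dated revisions, and resolve a time-based URL to the latest revision created at or before the requested time.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical data: each page keeps a time-ordered list of
# (timestamp, content) revisions. A time-based URL such as
# /pages/about/2012-03-01 asks for the page as it stood on that date.
revisions = [
    (datetime(2012, 1, 10), "first draft"),
    (datetime(2012, 2, 5), "expanded draft"),
    (datetime(2012, 3, 15), "published text"),
]

def content_as_of(revisions, when):
    """Return the revision content in effect at time `when`,
    or None if the page did not yet exist."""
    stamps = [t for t, _ in revisions]
    i = bisect_right(stamps, when)  # count of revisions made at or before `when`
    if i == 0:
        return None
    return revisions[i - 1][1]
```

Under this model, freezing an edition is cheap: an edition is just a named timestamp, and every page lookup through that edition resolves against the same moment in time.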
I'm using part of my research time to dive back into a development thread I've been working on for the last decade. It started with the various Gestinanna packages through 2004 that are now on BackPAN, the archive of CPAN modules no longer considered actively distributed. I followed it a few years later with a set of Ruby fabulator gems that are in my GitHub repository. I designed these to run in Radiant, a nice CMS for small teams. I used the Ruby/Radiant system to teach a course at DHSI in 2011 covering data acquisition, management, and presentation.
Now, I'm combining my experience and starting a new coding effort. I'll be posting code to github.com/jgsmith/perl-ookook/. There are a lot of design decisions that I hope to discuss in a series of blog posts. For example, why did I choose Perl over something new and shiny like Ruby or Node.js? Why didn't I use a venerable environment like Java and the JVM? Why have a backend SQL database instead of a NoSQL database? For now, I'll try to discuss a few questions that have shorter answers.
I'm almost halfway to my goal of 150,000 words for my next novel. Given how it's paced so far, I might need to aim for 200,000. However long the first draft ends up being, I intend to cut 20%. Hopefully, I'll cut the worst 20%, leaving a fairly decent 80%.
I'm a process kind of guy. If I know that I'll get to something later because of the process I'm going through, then I won't worry about it now. I'm this way when I program, and I'm this way when I write. Processes can make it easier to get around the tendency to overlook things that we're already familiar with.
I'm planning on a twelve step process for editing based on the chapters in Self-Editing for Fiction Writers. The book is an excellent resource for anyone who wants to try their hand at editing their manuscript. You still might want to pass your work by someone else, but going through Self-Editing will make subsequent edits less painful.
You might think that working in a digital humanities group would mean a lot less paper, but that's not the case. I have a folder for each project I'm working on, each filled with papers showing things like milestones, budgets, and work plans. Every time I have a meeting about a project, I pull out the folder(s) related to it and go through the papers to catch up with where we are.
The problem with having everything on paper is that I have to be where the paper is. If I'm at home, I don't have access to it. The same goes for the bus, or if I'm out of town. If I had everything digitized, or at least in some digital form, and available in the cloud, perhaps in Evernote, then I could use it anywhere, as long as I had wi-fi or cellular access.
What got me started thinking about this was the fact that in a few months, I'm going to have a 600-page (more or less) manuscript to edit. I don't want to have to print it out and lug it around, or take sixty pages at a time with me on the bus. It wastes a lot of paper and is difficult to manage.
While eating breakfast this morning, I decided to finish watching "Will We Survive First Contact?," an episode of Morgan Freeman's Through the Wormhole, a nice series on the Science Channel that does a reasonable job of translating science into layman's terms without simplifying too much. This episode dealt with how we might know when we encountered alien communication. The topic of aliens was just a vehicle for talking about information theory. Topic modeling made its appearance, though no one called it that.
One of the segments talked about efforts to understand dolphins. The problem with all the languages we already know is that they all come from the human mind. Trying to understand a language developed by a non-human mind helps us know what problems might crop up when trying to understand a language not from Earth.
I'm a quarter of the way through the first draft! I'm on schedule to finish the first draft by mid-June. Then, I'll spend the rest of June and all of July editing. If that goes well, I'll be formatting in August and publishing in September. I'll be writing about the editing process as I go through it. For now, I do most of my writing on the weekends. Evenings can net me about 500 words. I had hoped to get a lot more written during our spring break, but the days we had off weren't good for me. I did get other things done, and I've gotten back to some fast action, which is always easier to write.
If I divide the novel up into thirds, then we're almost at a third. Only 12,500 words to go. That's enough for about three more broad scenes or bits-of-things-happening. This matters because the first third of the novel needs to set up the overall problem, the second third needs to find the solution, and the last third needs to carry it out. There are always complications along the way, but that's the big picture for me.
The working title for my new novel is Silent Rain. When the novel opens, it's already been raining non-stop for a week or two. The reservoir up river from Sherman's family is overflowing and the dam is showing signs that it might go at any time. Pretty soon, it does collapse and all the water races downstream to wipe out the town below it. This sets off a series of events that finds Sherman searching for his family after he sees them get taken by an armed gang.
At this point, I have almost 31,000 words. Sherman hasn't found his family yet, but he has an idea of where they might be. He's run into a monster, scavenged for food, and escaped from someone. I think he'll eventually meet up with the rest of his family, but it may be a little while. Or it might not. He's about to open a door and explore a place where he might find them, eventually.
In my last post, I talked some about the need to look across projects and find common elements that could be factored out. I'd like to start a series of posts in which I talk about some of the work I'm doing at MITH in developing some foundational libraries that we are using to build digital humanities projects. Along the way, I'll discuss some of the philosophy behind those libraries and our approach to the projects.
Miriam Posner's post, "Some things to think about before you exhort everyone to code," has touched off a series of conversations on Twitter and elsewhere. My own feeling is that she's nailing some things square on the head and, fortunately, doesn't conclude by saying that we should banish coding from the digital humanities. We just need to be careful how we cast the need for coding.
I've tossed around a nugget in my mind for the last few weeks, and Miriam's post is making me focus more intently on it: A digital humanist afraid of the digital is like a scholar of French literature who is afraid of French. You can't be a digital humanist if you don't understand the digital. That doesn't mean you have to be able to code any more than being a scholar of French literature means you have to be able to write French literature. You just have to be able to understand the nuances of what you're studying and how you are studying it. Otherwise, how can you properly interpret the results?