Linked Open Code: Libraries

A while back, I talked about how linked open code complements linked open data if we consider the web to be a computer system. This time, I explore what a linked open code (LOC) library might look like. That is, how would a collection of functions be published and used?

The emerging standard for sharing linked open data (LOD or LD) is JSON-LD, mainly because it's easy to use and plays well with JSON-based REST APIs. That was by design. Consider JSON to be the modern XML, but for data rather than documents, and JSON-LD the modern RDF/XML. Everything I talk about in this post could be done with RDF/XML, but it's a lot easier with JSON-LD.

A library is just a collection of functions. A class is a collection of functions operating on a common data structure (or type). We should be able to do both with LOC since LOD has the concept of types.

Let's say that we're building a genealogy REST API and have a resource representing a person. Each person will have a set of names, links to families, and birth parents. Families and events like births and deaths are resources with their own URLs and REST endpoints.
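
For example, a person resource might be serialized as JSON-LD along these lines (the identifiers and property names are made up for illustration):

```
{
  "@context": { "@vocab": "http://example.com/gene/1.0/" },
  "@id": "http://example.com/gene/people/1234",
  "@type": "http://example.com/gene/1.0/Person",
  "name": [ "Jane Doe" ],
  "family": [ { "@id": "http://example.com/gene/families/56" } ],
  "birthParents": [
    { "@id": "http://example.com/gene/people/87" },
    { "@id": "http://example.com/gene/people/88" }
  ]
}
```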

Let's assume that a person resource has a type of http://example.com/gene/1.0/Person. Then we could publish a document at that URL describing what properties to expect for a person resource (e.g., your typical RDF schema) along with a set of functions that work with person resources.

We set up our context first, bringing together the namespaces and properties for LOC and those for the genealogy API.
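
Something like the following, where the loc: namespace for linked open code terms and the members term are placeholders of my own choosing rather than an established vocabulary:

```
{
  "@context": {
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "loc": "http://example.com/ns/loc/1.0/",
    "gene": "http://example.com/gene/1.0/",
    "members": { "@id": "loc:members", "@container": "@set" }
  },
  "@id": "http://example.com/gene/1.0/Person",
  "@type": [ "rdfs:Class", "loc:Library" ]
}
```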

We probably don't have to provide the @id property, but it helps keep identifiers consistent across serializations since some web servers use file extensions instead of content negotiation to determine which serialization to send to the client.

At this point, we've defined what kind of resource we're describing: an RDF class and linked open code library. Now to jump into the nitty-gritty of creating functions.

For this, I'm using a language I've code-named Dallycot.
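
Here's a rough sketch of the shape of the code. The by -> option syntax and the by-nature/by-nurture names are the ones used below, while the definition operators, the conditional, and the birth-parents-of/family-parents-of helpers are only stand-ins, and the innermost private functions are elided:

```
{
  "@context": {
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "loc": "http://example.com/ns/loc/1.0/",
    "gene": "http://example.com/gene/1.0/"
  },
  "@id": "http://example.com/gene/1.0/Person",
  "@type": [ "rdfs:Class", "loc:Library" ],
  "members": {
    parents := (person, by -> "Nature") :> (
      by-nature  := (p) :> birth-parents-of(p);
      by-nurture := (p) :> family-parents-of(p);
      (
        (by = "Nature") : by-nature(person),
        (             ) : by-nurture(person)
      )
    )
  }
}
```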

This is really just one function definition with a couple of private functions (by-nature and by-nurture, each with their own private functions in turn).

This looks like we're mixing a programming language into JSON, but just as Scala allows XML literals, Dallycot allows JSON literals, with Dallycot data types as values. This lets members be an array of named function objects, with the @id for each function object built from the @id of the library and the name to which the object is assigned.
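
After serialization, the members come out as something like this (again, the loc: terms are placeholders, and the function bodies themselves would be serialized as linked data too, which I'm not showing here):

```
{
  "@context": {
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "loc": "http://example.com/ns/loc/1.0/",
    "members": { "@id": "loc:members", "@container": "@set" }
  },
  "@id": "http://example.com/gene/1.0/Person",
  "@type": [ "rdfs:Class", "loc:Library" ],
  "members": [
    { "@id": "http://example.com/gene/1.0/Person#parents", "@type": "loc:Function" }
  ]
}
```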

Behind the scenes, the JSON builder is running the code and inspecting the environment to see what was defined. Nothing too magical there.

The resulting function takes a person resource and an option that selects whether to find the person's parents by birth or by family (e.g., if the person was adopted).

You might find it odd to provide an option that can tweak the behavior of the function instead of having one function for finding birth parents and another for possibly adoptive parents.

The option allows us to concentrate the difference in this one function. We don't have to worry about which parent-finding function to call elsewhere.

For example, let's define a function to find the grandparents of a person.
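
In the same sketched syntax as before, and assuming a map function for applying a function to each element of a list:

```
grandparents := (person, by -> "Nature") :>
  map(
    (p) :> parents(p, by -> by),
    parents(person, by -> by)
  );
```

A real definition would presumably also flatten and deduplicate the nested lists; the point is that the by option is simply handed through to parents.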

Now, we can call grandparents(person) and get the person's birth parents, or we can call grandparents(person, by -> "Nurture") and get the person's possibly adoptive parents. The grandparents function just passes along to parents which set we're interested in.

The great thing about this is that we can specify the algorithm for determining parents and grandparents with precision by referencing the library definitions. We can even create examples that can act as tests: given this data, when we apply the grandparents function, we expect to get these people.
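
As a rough sketch of what such an example might look like as linked data (every term here other than the function's @id is made up for illustration):

```
{
  "@context": { "loc": "http://example.com/ns/loc/1.0/" },
  "@id": "http://example.com/gene/1.0/Person#grandparents-example",
  "@type": "loc:Example",
  "loc:function": { "@id": "http://example.com/gene/1.0/Person#grandparents" },
  "loc:arguments": [ { "@id": "http://example.com/gene/people/1234" } ],
  "loc:expects": [
    { "@id": "http://example.com/gene/people/42" },
    { "@id": "http://example.com/gene/people/43" }
  ]
}
```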

We could add more functions for managing genealogies, but these are enough to make the point that we can create libraries of algorithms with precise definitions that can be shared as if they were data. We can point to them using techniques from linked data to allow a machine to reason about the algorithms on two different levels: semantically using the tools of the semantic web, and computationally through executing the definition of the algorithm.

Of course, we could do this with any language's abstract syntax tree. We just choose to represent it with linked data and call it code.