- Openness I - Open Data
- Text and Multimedia
- D. Schmüdde and Alexander Rakoczy
- Samyuktha Varma, Hollis Coats
Introduction: Ancient Digital Culture
It took a group of engineers most of 2015 to resurrect two megabytes of information from a recently-discovered stash of 5 1/4″ floppy disks authored by Star Trek creator Gene Roddenberry. Roddenberry used a rare computer called a Lexoriter. No technical documents relating to its custom-built word processor remain, so the team had to painstakingly reverse-engineer the software and convert the data into files that a contemporary computer could read.
The engineers faced a problem of “interoperability,” the ability of one machine to read another machine’s data. For the most part, we take this for granted today. The internet is the sum of a large number of interoperable machines exchanging information. The vaunted cultural values of the internet age – global interconnectivity and openness – fundamentally depend on this non-trivial operation.
Although interoperability may seem like a technical term for a technical problem, most obstacles that prevent the sharing of information are created by economic, social, and political conflicts. Roddenberry’s two megabytes aren’t just any two megabytes: they could hold anything from lost Star Trek episodes to enigmatic personal notes. Resurrecting this information was only possible because of the intense curiosity and financial resources associated with Roddenberry’s fame.
Part I: The Trust Quotient
I learned that my friend was diagnosed with Ebola from the American Centers for Disease Control. They called me after his diagnosis because I had just seen him the night before, which meant I needed to be monitored. Even though infection was highly unlikely, I was entered into a database as “Patient 7.”
Enumerating the diagnosed and those with whom they came into contact is a core strategy when fighting a burgeoning epidemic. Accurate data speeds up diagnoses, and early treatment curbs a disease’s spread. Whether in the United States or Western Africa, the protocol is similar, and its effectiveness depends on disciplined enforcement and reliable, interoperable systems – both of which were scarce during the Ebola crisis.
It’s tempting to dismiss this as a problem inherent to working within Africa. Limited infrastructure often forced teams to collect information through in-person interviews recorded on paper. Paper databases are inherently non-interoperable with digital databases, so their contents must be transferred by hand. The process is slow and prone to error. The Liberian Ministry of Health estimates that fifty independent technology systems were created in the Ebola response alone, leading to redundant data and an incomplete picture of what was happening on the ground.
Trust was also an issue. Civil war was a recent memory in Liberia and the outside aid organizations were filled with unfamiliar faces. People were being asked to enter into a database they did not understand, ushered in by people they did not know, while a strange disease ravaged their communities.
When humans are the data entered into a database, the information must be handled with extreme sensitivity to ensure that it can be open and shareable while also respecting the dignity of the individual. Without the latter, people quickly lose confidence and hide or provide inaccurate information.
Broad legal and technical frameworks are necessary because they are the foundation of interoperable systems and human rights: two concepts that are quickly merging into different sides of the same coin.
My anonymous moniker, “Patient 7,” was issued to protect me from the disease’s stigma; it took only the first four Ebola diagnoses in the United States to cause the nation to panic. Well-conceived protocols were broken and information was leaked. Friends of mine received death threats and others I didn’t know personally were placed into quarantine tents. The City of New York put me under house arrest for no medical reason, affecting my family and friends and compromising my employment.
When protocols break down, digital systems break down, and database interoperability becomes more difficult as the quality of the data comes into question. Humans are messy creatures, and we do not fit neatly into the digital world. Interoperability as a property of a system is more than just good design; it’s a critical feature. Combined with openness and transparency, interoperable systems can respect the rights of individuals and save lives in a time of crisis.
Part II: The Encoded Self
I’m from the first generation of Gen Xers and Gen Yers to grow up with the home computer. My personal history isn’t just a collection of photographs and letters; it’s a box of 5 1/4″ disks like the ones that Gene Roddenberry used, along with 3 1/2″ disks, Zip disks, burned CDs, and hard drives with obsolete connections.
Worse yet, most of the information these disks contain was created using long-forgotten software: Splash! (images), Framework III (spreadsheets), Telix (communication logs), and Notator (songs). Megabytes of personal history sit inaccessible, short of rebuilding an original machine that can read and interpret the data.
Simple text files have proven to be the only format interoperable through the decades. As a result, they are the only artifacts that remain from my own digital history.
Text is “simple” because personal computers, tablets, and smartphones all use a common encoding format that dates back to the 1960s. The standard has been expanded to include character sets like Chinese and Arabic. Through decades of deliberate effort, Japanese websites can display text in Japanese on my American smartphone.
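The essay does not name the standard. A reasonable assumption is that it means ASCII, first published in the 1960s, and its later extension through Unicode, which is most often stored as UTF-8. Under that assumption, a minimal Python sketch suggests why decades-old plain text still opens today: for unaccented English text the old and new encodings produce the same bytes, while other scripts simply use more bytes per character.

```python
# Assumption: the "common encoding format" is ASCII, later extended by
# Unicode and stored as UTF-8. The sample strings are illustrative only.
english = "Open data"
japanese = "日本語"

ascii_bytes = english.encode("ascii")
utf8_bytes = english.encode("utf-8")

# For plain English text the two encodings yield identical bytes,
# so a file saved as ASCII decades ago is already valid UTF-8.
same_bytes = ascii_bytes == utf8_bytes

# Characters outside ASCII are stored as multi-byte sequences in UTF-8.
japanese_bytes = japanese.encode("utf-8")
round_trip = japanese_bytes.decode("utf-8")

print("ASCII and UTF-8 bytes match:", same_bytes)
print("Characters:", len(japanese), "Bytes:", len(japanese_bytes))
print("Round trip preserved the text:", round_trip == japanese)
```

Any device that agrees on this encoding can decode the same bytes back into the same characters, which is all that displaying Japanese text on an American phone requires at the byte level.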
The only machines in common use today that shun this encoding format are IBM mainframes. They use an even older character standard. The importance of a shared encoding becomes clear when common web text is converted to IBM’s EBCDIC standard: the characters are all still there, but a machine expecting the web’s encoding reads them as gibberish.
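The conversion can be approximated in a few lines of Python. This is only a sketch: it assumes the `cp037` codec (one common EBCDIC code page that ships with Python) as a stand-in for "IBM's EBCDIC standard," since the essay does not say which code page is meant, and the sample message is invented for illustration.

```python
# Assumption: cp037, a US/Canada EBCDIC code page, represents "EBCDIC" here.
message = "Hello, world"

web_bytes = message.encode("utf-8")     # bytes as a web page would carry them
ebcdic_bytes = message.encode("cp037")  # the same text in an EBCDIC code page

# A machine that does not know the bytes are EBCDIC misreads them.
# Latin-1 is used for the misreading because it maps every byte to a character.
misread = ebcdic_bytes.decode("latin-1")

# Decoding with the matching code page recovers the original text.
recovered = ebcdic_bytes.decode("cp037")

print("Web bytes:   ", web_bytes.hex())
print("EBCDIC bytes:", ebcdic_bytes.hex())
print("Misread as:  ", repr(misread))
print("Recovered:   ", recovered)
```

Nothing is lost in the conversion; the text only becomes unreadable when the reader and the writer disagree about the encoding.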
The data still exists; it’s just useless if it’s encoded in a way that your machine can’t interpret. Digital telecommunications have forced machines to become more interoperable, and the web browser is the ultimate manifestation of this trend. Webpages and web apps can be run on your Android phone, your Macintosh laptop, and your office Windows PC. More importantly, information created on any of these devices can be transferred, manipulated, and passed on to another device.
However, corporate interests have intervened in recent years to create artificial walls between these services. One cannot simply transfer a Twitter profile over to Facebook, or message a Snapchat user with Apple’s iMessage. In the sense that they are all built to transmit text and images, these platforms aren’t particularly novel; they’re just designed to be incompatible.
We are increasingly encoding and storing our personal life histories on computer servers beyond our physical reach and the legal boundaries of government enforcement. Private enterprises are now the stewards of our data. It will be bought and sold at a price determined by the market.
If we look at the data that has persisted over the last half-century, it’s clear that interoperable systems that adopt some degree of openness fare better than those with a single organizational or individual stakeholder. Market forces don’t think long term and individuals aren’t good at predicting the value of information over a thirty-year span.
For example, take a look at this post from 1981. It’s from Usenet, an early internet network that predates the World Wide Web. The designers of Usenet valued openness. Therefore, it was relatively simple to place Usenet’s gigabyte-sized archive on the web and retain its hierarchy and essential character. At the time of this writing, the latest reply was April 2015 – almost thirty-five years after the thread began.
Interoperability is the principle that ensures that data from 1981 can be used in 2016 and then easily shared between Monrovia and New York City. From the humanities to health services, our lives are becoming increasingly intertwined with digital networks. As the newly integrated world takes shape, access and openness are no longer aspects of a technical specification; they are a moral imperative.