Open Context's Technology
Open Context integrates 100% open-source technologies to publish archaeological data via the Web. These technologies include:
- Django (Python): Web application framework
- PostgreSQL: primary database
- Apache Solr: for fast faceted search and querying
- Redis: memory cache for performance
- Nginx: Web server
This page introduces Open Context's technology stack and explains how you can use it and contribute to it.
Internally, Open Context has a highly abstracted and generalized global schema for representing data. This general approach takes its inspiration from the data structure developed by the OCHRE project (originally called "ArchaeoML"). However, to reduce development and maintenance costs, we opted to implement simplified versions of these generalized models in common and easily deployed relational database systems (in our case PostgreSQL).
Over the years, Open Context has evolved from ArchaeoML and moved toward Linked Open Data approaches to data organization. In essence, the current internal data model of Open Context largely resembles the graph-database structure commonly used for RDF triple-stores. However, we do not actually use a triple-store. Our experience managing data from many sources over the past 10 years has shown us that we often need additional attributes describing data provenance, context, etc. We decided that our day-to-day reliance on these additional attributes would make RDF-only triple-stores awkward and cumbersome for our typical information management needs. Thus, Open Context mainly emphasizes RDF and Linked Open Data to relate the data it publishes with the data curated by external sources.
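To make the idea concrete, here is a minimal sketch of a triple-like table in a relational database that also carries provenance and context columns. It is an illustration only, not Open Context's actual schema: the table name `assertion`, every column name, and the sample rows are hypothetical, and SQLite from the Python standard library stands in for PostgreSQL so the example runs without a database server.

```python
import sqlite3

# Hypothetical table: one row per subject-predicate-object statement,
# plus extra columns that a plain RDF triple has no slot for.
SCHEMA = """
CREATE TABLE assertion (
    subject    TEXT NOT NULL,
    predicate  TEXT NOT NULL,
    object     TEXT NOT NULL,
    source_id  TEXT NOT NULL,      -- provenance: which import produced the row
    project    TEXT NOT NULL,      -- context: which project the row belongs to
    sort_order INTEGER DEFAULT 0   -- display order within an item
);
"""

# Made-up sample data for illustration.
SAMPLE_ROWS = [
    ("item:1", "rdf:type", "class:animal-bone", "import-1", "project-a", 1),
    ("item:1", "has-taxon", "taxon:ovis-aries", "import-2", "project-a", 2),
    ("item:2", "rdf:type", "class:pottery", "import-2", "project-b", 1),
]


def build_demo_db():
    """Create an in-memory database holding the sample assertions."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT INTO assertion VALUES (?, ?, ?, ?, ?, ?)", SAMPLE_ROWS
    )
    return conn


def triples_from_source(conn, source_id):
    """Return plain (subject, predicate, object) triples from one import."""
    cur = conn.execute(
        "SELECT subject, predicate, object FROM assertion "
        "WHERE source_id = ? ORDER BY subject, sort_order",
        (source_id,),
    )
    return cur.fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    for triple in triples_from_source(conn, "import-2"):
        print(triple)
```

Filtering by `source_id` is a single `WHERE` clause here. In a store that holds only triples, the same provenance would typically have to be modeled through named graphs or reification, which is the kind of awkwardness described above.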
Source Code and Version Control
Open Context is a Python 3 application built on the Django project. Where feasible, we try to keep to "plain vanilla" coding patterns and use of Django components. Because Open Context emphasizes Web interoperability, it uses its own APIs to generate views of individual item records and search results. As described here, most of Open Context's APIs provide JSON-LD formatted data.
The source code for Open Context carries a GNU General Public License (Version 3). We use GitHub for software version control, issue tracking, documentation and collaboration. Relevant code repositories are:
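As a rough illustration of what consuming JSON-LD involves, the sketch below parses a small record and expands its prefixed keys into full IRIs using the record's `@context`. The sample record, its `example.org` URLs, and the helper names `expand_key` and `expand_record` are all hypothetical and do not reproduce an actual Open Context API response. The expansion is deliberately simplified; a full JSON-LD processor handles many more cases.

```python
import json

# Hypothetical JSON-LD record; not an actual Open Context response.
SAMPLE = """
{
  "@context": {"dc-terms": "http://purl.org/dc/terms/"},
  "@id": "https://example.org/subjects/example-item",
  "dc-terms:title": "Example item",
  "dc-terms:isPartOf": {"@id": "https://example.org/projects/example-project"}
}
"""


def expand_key(key, context):
    """Expand a compact key such as 'dc-terms:title' into a full IRI.

    JSON-LD keywords (starting with '@') and keys whose prefix is not
    defined in the context are returned unchanged.
    """
    if key.startswith("@"):
        return key
    prefix, sep, local = key.partition(":")
    if sep and prefix in context:
        return context[prefix] + local
    return key


def expand_record(record):
    """Return a copy of the record with top-level keys expanded."""
    context = record.get("@context", {})
    return {
        expand_key(key, context): value
        for key, value in record.items()
        if key != "@context"
    }


if __name__ == "__main__":
    expanded = expand_record(json.loads(SAMPLE))
    for key, value in expanded.items():
        print(key, "=>", value)
```

Because each key resolves to a globally unique IRI, a client can tell that `dc-terms:title` in one dataset means the same thing as the Dublin Core title property used by another publisher.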
- Open Context Python Application: the primary software code repository for Open Context. This has source code, deployment instructions, and additional documentation.
- Open Context Ontologies / Controlled Vocabularies: Open Context uses a variety of ontologies and controlled vocabularies described in OWL and SKOS. While the vocabularies are still incompletely documented, their versions are tracked in this repository.
- Open Context Data Repositories: Open Context has used GitHub for version control of datasets. Currently these repositories provide access to older legacy versions of Open Context data in XML format. We will be updating these shortly to add current data in JSON-LD format once we've finished GitHub API integration.