Sustainability
LINCS was conceived and proposed with both data and software sustainability front and centre in the context of a changing funding environment.
Sustainability funding
LINCS was funded as a multi-institutional collaboration through a program intended to establish national research cyberinfrastructure for data mobilization and sharing. The Canada Foundation for Innovation (CFI) Cyberinfrastructure program, started in 2015, was intended to be iterative with funding criteria less focused on innovation and more on operations and maintenance. It was created in light of increasing awareness that the usual operating funds provided by other CFI programs were not adequate to support ongoing, multi-institutional funding of production-level infrastructure.
In 2019, the federal government established the Digital Research Alliance of Canada (DRAC); with its responsibility for research software, DRAC was intended to provide a more stable, integrated program for such support. Consequently, the CFI Cyberinfrastructure program and the longer-running CANARIE Research Software program (which addressed the need for operations funding) were both dissolved. As of 2025, DRAC has not established any open, competitive programs to replace CFI and CANARIE support, so there is a funding void within Canada at the relatively modest levels of funding required for humanities research software. The only national program for infrastructure operations has as its eligibility requirement five years of prior operations with an annual budget of $1M, effectively excluding infrastructure for the arts and humanities.
Canada, in fact, does not fund research infrastructure on a long-term basis at all, unlike other countries: for instance, France has a record of providing multi-year mandates to its substantial array of humanities research infrastructure within an integrated national system. Even DRAC itself is funded through annual budget submissions to Innovation, Science and Economic Development (ISED) Canada.
Despite the federal attempt to establish more stable funding for research software such as LINCS, when LINCS completed its development phase in May 2024 there was no funding program in Canada to support research software operations beyond the modest amounts provided by CFI. The CFI model is not designed for human-centric infrastructure. LINCS has therefore continued to work towards sustainability by hosting symposia, strengthening partnerships in the broader knowledge and cultural heritage ecosystem, seeking continued funding through the CFI Innovation Fund competition, consulting on other funding possibilities, and partnering with research projects on funding applications.
People and partnerships
When we talk about sustainability for LINCS and other humanities infrastructure projects, we are talking almost entirely about people. Thankfully, LINCS has been able to rely on DRAC’s allocation of advanced research computing resources through the peer-reviewed Research Platforms and Portals competition. This program provides LINCS’s physical resources, as it has covered LINCS server, storage, and backup needs from inception. What keeps LINCS ticking, however, is people: the professional staff, faculty participants, and student assistants who consult with researchers, model the data, assist with data transformation processes, ingest and publish the data, maintain and enhance the software, provide training to grow the linked data knowledge community, and much more. The process of creating high-quality linked open data tailored to individual research questions is iterative, laborious, and requires substantial expertise.
The LINCS team is already somewhat smaller than it was during the development period, which was to be expected. Given the lack of significant sustainability funding, the team—and therefore the operational capacity of the infrastructure—will have to shrink further in the short term. From the outset, we have been aware of the possibility of fluctuation in funding and have therefore documented and designed our tools and workflows to allow for contraction as well as expansion. Contraction does, however, entail careful prioritization, and it limits our ability to support new projects and initiatives without dedicated funding.
LINCS has already benefited from ongoing partnerships with other stakeholders in the knowledge ecosystem, including generous support from organizations such as Library and Archives Canada and Scholars Portal, as well as fruitful partnerships with researchers who are keen to explore the potential of linked open data within their work. We look forward to continuing our work through these and other partnerships.
Interfaces
Humanities scholars tend to focus on interfaces when thinking about data, and for good reason: representation matters, and digital interfaces are powerful forms of representation. However, interfaces age quickly, both technically and aesthetically, and so pose major sustainability challenges. A major advantage of linked open data (LOD) is that it is interface-agnostic, so it can move from one interface to another or serve more than one interface at a time. LINCS data is modelled with portability in mind: the knowledge graphs produced using LINCS methods and workflows carry with them provenance and other information to help contextualize and situate the knowledge they contain. Such nuanced LOD is not human-readable given the complexity of the knowledge it represents, but that nuance means that it can be represented in varying ways by multiple interfaces for different purposes.
The CFI Cyberinfrastructure program was for mobilizing data, so the focus of LINCS was on developing a suite of tools, workflows, and documentation for transforming cultural data into linked open data. The project recognized the need for interface access, however, and took the lightweight approach of adopting existing interfaces for access to LINCS data.
LINCS data is accessible to humans and machines through multiple interfaces:
- SPARQL Interfaces: The SPARQL query language underlies the “standard” interface for LOD, making it accessible as part of the Semantic Web. The LINCS SPARQL endpoint is embedded in the LINCS Portal, which was built as a static website with sustainability in mind. ResearchSpace also embeds a SPARQL endpoint. Machines can read from SPARQL endpoints, and researchers can use SPARQL queries to retrieve targeted answers to questions or create subsets of data for analysis and visualization.
- Application Programming Interfaces (APIs): The LINCS API provides machine access to the LINCS knowledge graph and serves as a gateway to related LINCS services, including Named Entity Recognition and entity linking, both of which are used by tools for linked data creation such as LEAF-Writer and NERVE. The LINCS Linked Data Enhancement API supports the LINCS data transformation workflows. APIs make it possible for other interfaces to draw on, and draw in, LINCS data.
- ResearchSpace Platform: ResearchSpace is a web-based platform for publishing and accessing LOD. It can be used to create new LOD, edit and enhance existing LOD, and search, browse, and visualize the contents of the LINCS triplestore.
- Vocabulary Browser: The LINCS Vocabulary Browser provides access to vocabularies created by the LINCS team and community. These living vocabularies can be used to create and enhance datasets that require semantic specificity, particularly with respect to cultural identities and activities.
- Context Explorer: The LINCS Context Explorer is a browser-based tool that takes LINCS LOD to where users are on the web. Using natural language processing, the Context Explorer scans webpages for entities that appear in the LINCS triplestore; for every match, it provides additional information alongside the page’s original content.
In addition to these web interfaces, LINCS data—available directly from the LINCS triplestore and the Borealis long-term datastore—can also be used (usually in subsets) in other interfaces for analysis and visualization (e.g., Gephi, Cytoscape, Tableau, and RAWGraphs).
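As a rough illustration of how a SPARQL query can yield a subset of data for a network tool, the Python sketch below builds a query URL and flattens a SPARQL JSON result into an edge list. It is a minimal sketch under stated assumptions, not the LINCS workflow: the endpoint URL, the query, the sample response, and the function names are all hypothetical, and no network request is made. The `Source` and `Target` column names follow the convention used by Gephi's spreadsheet import.

```python
import csv
import io
from urllib.parse import urlencode

# Hypothetical endpoint URL, for illustration only (not the real LINCS endpoint).
ENDPOINT = "https://example.org/sparql"

# A deliberately generic query: triples whose object is an IRI.
QUERY = """
SELECT ?source ?predicate ?target WHERE {
  ?source ?predicate ?target .
  FILTER(isIRI(?target))
}
LIMIT 100
"""

# Hand-written stand-in for a response in the SPARQL 1.1 JSON results format.
SAMPLE_RESPONSE = {
    "head": {"vars": ["source", "predicate", "target"]},
    "results": {
        "bindings": [
            {
                "source": {"type": "uri", "value": "http://example.org/a"},
                "predicate": {"type": "uri", "value": "http://example.org/knows"},
                "target": {"type": "uri", "value": "http://example.org/b"},
            }
        ]
    },
}


def build_query_url(endpoint, query):
    """Return a GET URL that passes the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


def bindings_to_rows(results):
    """Flatten SPARQL JSON bindings into plain dicts of variable -> value."""
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in results["results"]["bindings"]
    ]


def rows_to_edge_csv(rows):
    """Serialize rows as a Source,Target,Label edge list in CSV form."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Label"])
    for row in rows:
        writer.writerow([row["source"], row["target"], row["predicate"]])
    return buffer.getvalue()


url = build_query_url(ENDPOINT, QUERY)
rows = bindings_to_rows(SAMPLE_RESPONSE)
edge_csv = rows_to_edge_csv(rows)
print(edge_csv)
```

In practice, the JSON would come from a request to a real endpoint with an `Accept: application/sparql-results+json` header, and the resulting CSV could be saved to a file and opened in a visualization tool.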
LINCS built a modular suite of tools rather than a single integrated platform as part of its sustainability strategy from inception, since platforms are, as Paul Edwards says, “infrastructures on fire”: they age quickly and are expensive to maintain. Of the open-source platforms for CIDOC CRM data available at the project’s inception, for instance, ResearchSpace was the best suited to LINCS’s needs. LINCS will continue to maintain our enhanced instance of ResearchSpace and evaluate emergent platforms as capacity permits.
Data and PIDs
Linked open data is designed to last. LINCS datasets are archived individually, post-publication, with appropriate metadata provided by project representatives, in Borealis, Canada’s national data archiving service, in partnership with the University of Victoria Library, and they are updated as needed. Archiving ensures long-term preservation, discoverability, access, and reusability, and it provides each project dataset with a Persistent Identifier (PID). The PID makes the dataset citable and ensures that its creators receive appropriate credit for their work. See the LINCS Data Management Plan for details.
Linked data depends upon PIDs, stable identifiers that consistently and unambiguously point to a single entity. For example, in the scholarly landscape, academics are often identified using ORCIDs, while publications are identified using DOIs. PIDs are essential to creating integrated, interoperable LOD. In keeping with the core principles of linked open data, where possible, LINCS uses existing PIDs created and maintained by authoritative, stable sources (e.g., VIAF, GeoNames), relying on their continued availability through the infrastructure of other knowledge environment stakeholders. However, existing PIDs are not available for every LINCS data point, and some existing sources for PIDs do not meet LINCS’s standards. Many existing terminologies related to Indigenous peoples, for instance, perpetuate harm.
There are many different types of URIs:
- Prominent entities, such as named persons, organizations, or cultural artifacts, which are often available in existing authorities that provide external PIDs.
- Vocabularies for concepts such as genres of texts, personal occupations, or categories of objects, for which there are sometimes external authorities.
- Dataset-specific identifiers that are essential to a particular dataset but are not likely to be extensively reused, for example a URI for the creation event for a particular recording or cultural object, or an identifier for an annotation that links an assertion to its source on the web.
LINCS has been consulting since the application stage with cultural heritage institutions and knowledge environment stakeholders on the recognized need for Canadian linked data authorities to support the emerging linked open data ecosystem. We are involved in discussions about the development of a National PID strategy being coordinated by CRKN. Progress is being made on various fronts, for instance, in initiatives such as the Respectful Terminologies Platform Project and specific advances such as the Canadian Heritage Information Network’s release of a SPARQL endpoint for the Nomenclature vocabulary. However, no national system has been established for creating, publishing, and maintaining the kinds of PIDs for historical entities, or the vocabularies, needed for the range of cultural heritage content and historical data represented in LINCS. LINCS is partnering with these and other organizations to decide on the wisest and most sustainable strategy with respect to the PIDs that it mints.
The highest priority with respect to PIDs is to keep serving them as dereferenceable URIs. This core sustainability consideration takes precedence over the maintenance of any particular interface for the data. Should that prove impossible in the long term, LINCS would seek to export our identifiers and their core properties such as labels to the care of another infrastructure such as CRKN or Wikidata (for which we are establishing a LINCS identifier property). Even when particular infrastructures disappear, their data and their PIDs can live on in a linked data world. For instance, much of Freebase has been migrated to Wikidata. Freebase was a community-sourced open data project that Google purchased, used to create the basis for its Knowledge Graph, and subsequently shut down. Although differing data models made seamless migration impossible, the Wikidata Freebase ID property has become one of many IDs that serve to link data within the linked open data ecosystem. To future-proof data as much as possible, LINCS asks data stewards, ahead of transforming and publishing their data, to indicate whether they are willing to migrate their data from a CC-BY license (the license used for data in the LINCS triplestore) to a CC0 license, since a CC0 license will best facilitate migration into other infrastructures down the line, should that become necessary.
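The two ideas above, dereferencing a URI and exporting identifiers with their labels, can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the entity URIs and labels are hypothetical, the choice of Turtle as the requested format is an assumption rather than a description of what LINCS serves, and the export uses plain N-Triples with `rdfs:label`, which may differ from whatever format a receiving infrastructure would require. No network request is made.

```python
from urllib.request import Request

RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Hypothetical identifiers and labels, for illustration only.
ENTITIES = {
    "https://example.org/entity/2": "Second example",
    "https://example.org/entity/1": 'An "example" label',
}


def dereference_request(uri, media_type="text/turtle"):
    """Build (but do not send) a request asking for an RDF view of a URI.

    Content negotiation via the Accept header is a common way to make one
    URI serve both web pages and machine-readable data.
    """
    return Request(uri, headers={"Accept": media_type})


def escape_literal(text):
    """Escape backslashes, quotes, and line breaks for an N-Triples literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def label_triple(uri, label, lang="en"):
    """Return one N-Triples line attaching a language-tagged label to a URI."""
    return f'<{uri}> <{RDFS_LABEL}> "{escape_literal(label)}"@{lang} .'


def export_labels(entities):
    """Serialize a {uri: label} mapping as N-Triples, sorted by URI."""
    lines = [label_triple(uri, label) for uri, label in sorted(entities.items())]
    return "\n".join(lines) + "\n"


request = dereference_request("https://example.org/entity/1")
exported = export_labels(ENTITIES)
print(exported)
```

A line-based format such as N-Triples is used here because each identifier and label stands alone on its own line, which keeps a minimal export easy to inspect and to load elsewhere.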
Linked open data is designed to travel. Through adherence to standards such as CIDOC CRM and the Web Annotation Data Model, through adoption of external authorities where possible, and through minting and maintaining URIs in standard ways, LINCS is positioning datasets for future use as FAIR (findable, accessible, interoperable, reusable) data that can migrate, if needed, to new homes.
Looking ahead
There is currently a large gap in the funding environment in Canada with respect to research software, which means there are no guarantees of sustainability. However, awareness and collaboration are growing in encouraging ways within the cultural data ecosystem. LINCS continues to engage in national conversations towards achieving stable and better resourced infrastructure for the humanities and social sciences, and for the cultural heritage and memory organizations on which our research relies, in multiple contexts including the Forward Linking partnership. Established in 2023 and supported with a SSHRC Partnership Development Grant, the Forward Linking partnership brings together partners from 14 institutions to strengthen the Canadian cultural data ecosystem, inform a sustainability strategy for a national investment in scholarly infrastructure, and demonstrate the value of data linking as a way forward for digital scholarship and cultural memory institutions in Canada. Among its many outputs, the Forward Linking partnership is working towards an analysis of the Canadian linked cultural data ecosystem, which will include an assessment of the landscape to date and policy and protocol recommendations for sustaining cultural and scholarly infrastructure.