World War I as Linked Open Data

Linked Data Finland

General Dataset Description

This dataset contains strictly quality-controlled rich information on events, actors and places related to the First World War. As such, it is meant to be used as a reference dataset to which other datasets (e.g. museum or library collections dealing with WW1 topics) can be linked.

The dataset is published under the CC-BY-SA 4.0 license and is based mainly on joint work by the Semantic Computing Research Group at Aalto University and the University of Colorado Boulder (CU). For a general introduction to the dataset, the reader is referred to this article.

The dataset itself can be browsed here. A visual statistical overview can be accessed at http://ldf.fi/ww1lod/void/. Textual schema documentation generated from the dataset, on the other hand, can be viewed here. The dataset can be queried using SPARQL at http://ldf.fi/ww1lod/sparql.
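As a minimal sketch of querying the SPARQL endpoint mentioned above, the following Python uses only the standard library. The sample query is deliberately schema-agnostic (any ten triples). The function names are hypothetical, and it is an assumption that the endpoint accepts GET requests and returns the standard SPARQL JSON results format.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "http://ldf.fi/ww1lod/sparql"  # endpoint URL given in the text above

# Illustrative query: any ten triples, so nothing about the schema is assumed.
SAMPLE_QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"


def build_sparql_request(query, endpoint=ENDPOINT):
    """Build a SPARQL-protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def run_query(query, endpoint=ENDPOINT, timeout=30):
    """Send the query over the network and return the list of result bindings."""
    request = build_sparql_request(query, endpoint)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["results"]["bindings"]


# Usage (requires network access, so it is left commented out):
# for row in run_query(SAMPLE_QUERY):
#     print(row["s"]["value"], row["p"]["value"], row["o"]["value"])
```

Keeping request construction separate from the network call makes it possible to inspect the URL and headers without contacting the server.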


Sample Usage


Detailed Dataset Contents

Imperial War Museum (IWM) Toplevel Events

(View / Statistical Description / Schema Usage / Download)

To provide a useful common base, general events pertinent to the whole war were included. For this, an authoritative framework of 326 top-level wartime events was provided by the IWM’s First World War Centenary Partnership.

This event timeline was principally derived from the official British series on the history of the war, the History of the Great War Based on Official Documents, particularly:

Great Britain, Committee of Imperial Defence. Principal Events, 1914-1918. History of the Great War Based on Official Documents. London: HMSO, 1922.

Additional published works were used to verify dates and facts.

Information included: event name, description, date(s), and whether the event is military, naval, aviation, political, or social.

The primary schema used to model the data is the CIDOC-CRM.

Detailed source information:

Imperial War Museum

The IWM is often considered the premier cultural heritage institution in the English-speaking world relating to the war. Thus both historians and cultural heritage professionals consider the IWM’s vocabularies authoritative, and they are likely to be re-used by others who are preparing datasets in this subdomain.

Rich Events

(View / Statistical Description / Schema Usage / Download)

While authoritative, the IWM events did not contain place or actor information. To overcome this limitation, a separate catalogue of some 250 events selected by domain experts was built for richer description, including annotation of places, participating actors and temporal relationships.

The information entered is drawn from various sources, including approved terminologies from the Imperial War Museum and the British Army’s Battle Nomenclatures Committee, as well as a custom term list on Belgium and WW1.

These events were manually linked to the top-level events where appropriate, resulting in 46 owl:sameAs links. In addition, all events have been automatically linked to DBpedia, with a little over 100 owl:sameAs relationships. These latter links were validated by domain experts.
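The two kinds of owl:sameAs links described above could be separated as sketched below. This is an illustration under stated assumptions: the query and the helper `split_same_as` are hypothetical, the sample bindings are made up (in the SPARQL JSON results shape), and it is assumed that DBpedia targets use the usual `http://dbpedia.org/resource/` prefix.

```python
# Hypothetical query listing every owl:sameAs pair in the dataset.
SAME_AS_QUERY = """
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT ?event ?target WHERE { ?event owl:sameAs ?target }
"""

# Assumption: DBpedia resources are identified by this URI prefix.
DBPEDIA_PREFIX = "http://dbpedia.org/resource/"


def split_same_as(bindings, external_prefix=DBPEDIA_PREFIX):
    """Split (event, target) pairs into external links and all other links."""
    external, other = [], []
    for row in bindings:
        pair = (row["event"]["value"], row["target"]["value"])
        if pair[1].startswith(external_prefix):
            external.append(pair)
        else:
            other.append(pair)
    return external, other


# Made-up bindings for illustration only; these URIs are not from the dataset.
sample = [
    {"event": {"value": "http://example.org/event/1"},
     "target": {"value": "http://dbpedia.org/resource/Example_Battle"}},
    {"event": {"value": "http://example.org/event/1"},
     "target": {"value": "http://example.org/toplevel/7"}},
]
dbpedia_links, other_links = split_same_as(sample)
print(len(dbpedia_links), len(other_links))  # prints: 1 1
```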

Information included: name, alternate names, description, agent, time of action, place of action, is contained in, contains, cause, effect, same as

The primary schema used to model the data is the CIDOC-CRM.

Detailed source information:

Musée Royal de l'Armée et d’Histoire Militaire

The Musée Royal de l'Armée is considered the authority on matters relating to the war in Belgium. It published Patrick Lefevre's standard bibliography on this topic, from which a librarian and historian derived much of the term list on WWI Belgium. Historians specializing in WWI Belgium and France also reviewed and supplemented the term list.

Atrocity Events in Belgium

(View / Statistical Description / Schema Usage / Download)

WWI historians John Horne and Alan Kramer (Trinity College, Dublin) wrote the standard work on the “German atrocities” of 1914. With their permission, detailed data on these incidents in Belgium was sourced from Appendix 1 of their vast study:

Horne, John, and Alan Kramer. German Atrocities, 1914: A History of Denial. New Haven: Yale University Press, 2001.

Information included: name, agent, time of action, place of action, combat related, deportations, human shields used, panic, destroyed buildings, killing

The primary schema used to model the data is the CIDOC-CRM with additional properties created in the ww1lod-schema namespace.

German Army Structure

(View / Statistical Description / Schema Usage / Download)

Information on the naming and organization of Imperial German army units was derived from the following trusted reference source:

Tessin, Georg. Deutsche Verbände und Truppen. Osnabrück: Biblio-Verlag, 1974.

Information included: name, unit type, part of

The primary schema used to model the data is the CIDOC-CRM with additional properties created in the ww1lod-schema namespace.

Other Actors

(View / Statistical Description / Schema Usage / Download)

Other than the German Army structure, actor information has been input by CU domain specialists in conjunction with enriching the event network. During this work, actors have not only been linked to the events they participate in, but also to each other and the organizations they belong to.

Information included: name, alternate names, organizational information, relationship information

The primary schema used to model the data is the CIDOC-CRM. Additional vocabularies used for organizational and relationship information include the W3C organization ontology, the relationship ontology, FOAF and the schema.org vocabulary.

Geography of Belgium and France

(View / Statistical Description / Schema Usage / Download)

Information included: name, alternate names, part of, coordinates

The primary schema used to model the data is the CIDOC-CRM. Coordinate data is expressed using the W3C geo vocabulary.

Belgian Statistical Data for the War Years

(View / Statistical Description / Schema Usage / Download)

Statistics on the population of Belgian provinces during the war years were sourced from:

Belgium. Ministère de l’Intérieur et de l’Hygiène. Annuaire statistique de la Belgique et du Congo Belge, volume 46, Brussels, 1922

Information included: male and female population of each Belgian province for each of the war years

The statistics have been encoded using the W3C data cube vocabulary.

Polygons of Belgian Provinces in Wartime

(View / Statistical Description / Schema Usage / Download)

Wartime boundaries for Belgian provinces were obtained from HISSTAT, a collaborative project of the Universities of Ghent, Brussels, and Louvain-la-Neuve, and the State Archives of Belgium. HISSTAT is developing a research infrastructure that brings together Belgian digital statistics and enables the creation of historical maps. Their geographies are highly accurate and extend down to the municipal level.

Information included: name of region, part of, polygon

The schemas used to model the data are the GeoRSS vocabulary for polygons and the W3C geo vocabulary for points.
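For readers consuming the polygon data, the sketch below parses a polygon string in the GeoRSS Simple encoding (a space-separated list of latitude/longitude pairs whose first and last points coincide) and derives a bounding box. It is an assumption that the dataset stores its polygons in exactly this encoding. The helper names are hypothetical and the sample ring is made up, not a real provincial boundary.

```python
def parse_georss_polygon(text):
    """Parse a GeoRSS Simple polygon string into a list of (lat, lon) tuples."""
    numbers = [float(token) for token in text.split()]
    if len(numbers) % 2 != 0:
        raise ValueError("expected an even number of coordinate values")
    points = list(zip(numbers[0::2], numbers[1::2]))
    # A valid ring has at least four points and ends where it starts.
    if len(points) < 4 or points[0] != points[-1]:
        raise ValueError("polygon must be closed and have at least four points")
    return points


def bounding_box(points):
    """Return (min_lat, min_lon, max_lat, max_lon) for a list of points."""
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    return min(lats), min(lons), max(lats), max(lons)


# Made-up rectangular ring for illustration only.
ring = parse_georss_polygon("50.0 4.0 50.0 5.0 51.0 5.0 51.0 4.0 50.0 4.0")
print(bounding_box(ring))  # prints: (50.0, 4.0, 51.0, 5.0)
```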

Wikipedia Event Timeline

(View / Statistical Description / Schema Usage / Download)

Wikipedia World War I event timeline as extracted by Jon Voss.

Information included: name of event, time, Wikipedia link

The primary schema used to model the data is the CIDOC-CRM.

Principal Events

(View / Statistical Description / Schema Usage / Download)

Timeline of principal events of the war, as automatically extracted from an OCR’d version of:

Great Britain, Committee of Imperial Defence. Principal Events, 1914-1918. History of the Great War Based on Official Documents. London: HMSO, 1922.

Note: this timeline contains many errors due to the automatic extraction.

Information included: name of event, time

The primary schema used to model the data is the CIDOC-CRM.


Dataset Links

In summary, the dataset contains the following links to external resources:


Programmatic Use

Most of the access mechanisms to the dataset provide data in RDF given suitable Accept headers. In particular:
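As one illustration of such content negotiation, the sketch below requests an RDF serialization by setting the Accept header. The helper names and the example URI in the usage comment are hypothetical, and it is an assumption that the server offers Turtle (`text/turtle`) and encodes it as UTF-8; other RDF media types can be passed in the same way.

```python
import urllib.request


def build_rdf_request(resource_uri, media_type="text/turtle"):
    """Build a GET request that asks for an RDF serialization of a resource."""
    return urllib.request.Request(resource_uri, headers={"Accept": media_type})


def fetch_rdf(resource_uri, media_type="text/turtle", timeout=30):
    """Fetch the resource; return (content type sent by the server, body text)."""
    request = build_rdf_request(resource_uri, media_type)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Assumption: the body is UTF-8 encoded.
        return response.headers.get_content_type(), response.read().decode("utf-8")


# Usage (requires network access; the URI below is a placeholder, not a real resource):
# content_type, body = fetch_rdf("http://ldf.fi/ww1lod/EXAMPLE")
# print(content_type)
```

Checking the returned content type is worthwhile, since a server that does not support the requested format may fall back to HTML.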


Semantic Computing Research Group, Aalto University, University of Helsinki, Tekes