Increasing discoverability: the Wikidata thesis project at York

Ruth Elder (Collections Management Specialist) outlines how the Library has developed a methodology to promote University of York research theses to a global audience through the use of Wikidata.

Wikidata and White Rose eTheses Online logos

Wikidata is part of the Wikimedia family and acts as a central store for the shared data of Wikimedia projects, including Wikipedia. It can be read and edited by both humans and machines, can be interlinked with other open data sets on the linked data web, and (in part) populates Google knowledge graphs, digital assistants and Wikipedia information boxes. University of York doctoral dissertations reflect the original, independent research of the institution, and are available digitally through White Rose eTheses Online (WREO).

Creating entries on Wikidata with unique identifiers for each thesis title, author and associated doctoral supervisor(s) makes this information available as part of the linked open data ecosystem. Links are created between individual entries to express relationships and connections, and the author and title entries signpost directly to the WREO digital entry.

Diagram showing connection between doctoral thesis, author/doctoral student and supervisor/doctoral advisor records on Wikidata

(Links in diagram: WREO entry; Teaching by example Q115006944; Emmie Rose Price-Goodfellow Q114664536; Sethina Watson Q57179761)
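The links in the diagram can be pictured as simple subject, property, object statements. The following is a minimal sketch in Python, not the project's own tooling. The three item identifiers come from the diagram above; the property identifiers (P50, P184, P185, P1026) are assumptions about which Wikidata properties express these relationships and should be checked against Wikidata before use.

```python
# Items named in the diagram (identifiers taken from the article).
THESIS = "Q115006944"      # Teaching by example
AUTHOR = "Q114664536"      # Emmie Rose Price-Goodfellow
SUPERVISOR = "Q57179761"   # Sethina Watson

# Property identifiers below are assumptions, not confirmed by the article.
P_AUTHOR = "P50"       # author (thesis -> person)
P_ADVISOR = "P184"     # doctoral advisor (student -> supervisor)
P_STUDENT = "P185"     # doctoral student (supervisor -> student)
P_THESIS = "P1026"     # academic thesis (person -> thesis)

# Each statement is a (subject, property, object) triple.
statements = [
    (THESIS, P_AUTHOR, AUTHOR),
    (AUTHOR, P_THESIS, THESIS),
    (AUTHOR, P_ADVISOR, SUPERVISOR),
    (SUPERVISOR, P_STUDENT, AUTHOR),
]


def linked_from(subject, triples):
    """Return the (property, object) pairs stated on the given item."""
    return [(prop, obj) for subj, prop, obj in triples if subj == subject]


# Everything stated on the author's entry.
author_links = linked_from(AUTHOR, statements)
```

Because each entry points to the others by identifier, a person or a program can start at any one of the three entries and reach the other two.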

In addition to displaying the “internal” relationships, the data within Wikidata is available to inform a range of external knowledge systems. One example is Scholia, a tool that draws on bibliographic and other information in Wikidata to create visual scholarly profiles for topics, people and organizations (example: Matthew Collins Q17386345).

A second example is EntiTree, whose tagline is “bringing the data closer to the users.” It acts as an “academic family tree”, visualising the relationships between generations of doctoral supervisors and students (example: Richard F Paige Q102303862).

Screenshot of results for query “which University of York theses have been cited by others?”

The structured data format means the data can be queried in much the same way as a database. Questions such as “which University of York theses have been cited by others?”, “which of our authors or supervisors have a Wikipedia page?”, or “which authors or supervisors have won awards?” can all be explored. (For all these examples click on the link, and then on the blue button at the bottom left to run the query.)
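To give a flavour of what such a query looks like, here is a hedged sketch in Python that builds a SPARQL query for the first question and sends it to the public Wikidata Query Service. It is an illustration, not the query behind the links above. The function names are hypothetical, the property and class identifiers (P31, P4101, P2860, Q187685) are assumptions about how the theses are modelled, and the institution's own identifier is left as a parameter to be looked up on Wikidata.

```python
import json
import re
import urllib.parse
import urllib.request

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def cited_theses_query(institution_qid, limit=50):
    """Build a SPARQL query listing theses that other works cite.

    Assumed modelling (not confirmed by the article):
      P31 = instance of, Q187685 = doctoral thesis,
      P4101 = dissertation submitted to, P2860 = cites work.
    """
    if not re.fullmatch(r"Q\d+", institution_qid):
        raise ValueError("expected a Wikidata item identifier such as 'Q123'")
    return f"""
SELECT ?thesis ?thesisLabel (COUNT(DISTINCT ?citing) AS ?citations) WHERE {{
  ?thesis wdt:P31 wd:Q187685 ;
          wdt:P4101 wd:{institution_qid} .
  ?citing wdt:P2860 ?thesis .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
GROUP BY ?thesis ?thesisLabel
ORDER BY DESC(?citations)
LIMIT {int(limit)}
""".strip()


def run_query(sparql):
    """Send the query to the Wikidata Query Service and return result rows."""
    url = WDQS_ENDPOINT + "?" + urllib.parse.urlencode(
        {"query": sparql, "format": "json"}
    )
    request = urllib.request.Request(
        url, headers={"User-Agent": "thesis-query-sketch/0.1 (example script)"}
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)["results"]["bindings"]
```

Calling `run_query(cited_theses_query(qid))` with the university's item identifier would return one row per cited thesis. As the next paragraph notes, a thesis only appears if both it and the citing work are recorded in Wikidata.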

Of course, the information has to be listed in Wikidata to be included in the results. More than 3,600 doctoral theses awarded between 1966 and 2025 are currently listed on Wikidata, and the total is increasing all the time. Analysis indicates that adding doctoral information to Wikidata is directing users to WREO, supporting the library objective of making our content as open as possible and the commitment of the University of York as an institution for the public good.

For more information about this project, contact Ruth Elder
