Commons:Structured data/Archive/2014/Rationale

Rationale

Today, multimedia file information on Wikimedia sites is stored using an aging software system that is hard to use and support. The current Commons database relies on wikitext, which is not suited to storing structured data; this makes the current system very hard to search, confusing to users, and impractical for new feature development. Instead of using machine-readable data as most modern sites do, Wikimedia Commons relies on a cumbersome patchwork of plain text data embedded in a range of overlapping templates, with a set of English-only categories that are often incompatible with other sites or tools.

Wikibase now offers a practical way to maintain structured data in MediaWiki, and is widely considered to be a useful tool to support the growth of the free knowledge movement. As a result, many community members have proposed that we use this mechanism to store and retrieve media metadata on Wikimedia Commons. This would provide a wide range of benefits to all users of Wikimedia Commons, as well as to the many sites which rely on its multimedia content. Associating files with items on Wikidata, or even with geolocation data, would support more effective ways to search and browse files on Commons. It would also make it much easier to show the appropriate attribution and license information when re-using a file. More benefits are listed below.

The Wikimedia Foundation's multimedia team hosted a number of roundtable discussions with community members last year to ask what it should focus on in coming years. In every roundtable, the top request from participants was to implement structured data on Commons, even when this topic was not on the agenda to begin with. Community members pointed out that search does not work well on Commons, making it hard for users to find what they are looking for. Others pointed out that categories are now mostly in English, making it difficult for non-English speakers to contribute on Commons. Many have also suggested that categories be complemented with more granular topics that could be linked to Wikidata's knowledge base in the user's own language, and that could be intersected to provide better search results.

Plan

The proposed plan for the next few months is to:

  • Plan and discuss this proposal (with the engineering team and the rest of the community)
  • Design the data structure and user interface
  • Develop the code and tools needed for this project
  • Migrate unstructured data to the new format
  • Test, measure and adjust as needed