User:FlickypediaBackfillrBot

This is a bot run by the Flickr Foundation.

This bot is a project spun out of Commons:Flickypedia, a tool for uploading Flickr photos to Wikimedia Commons. When somebody uploads a photo using Flickypedia, it adds structured data fields to the file on Commons.

For a full description of the structured data mapping, see Commons:Flickypedia/Data Modeling.

Flickypedia Backfillr Bot will apply this same structured data mapping to existing Flickr photos on Commons.

Key info

  • Operator: User:Alexwlchan, working for the Flickr Foundation
  • Tasks: Update structured data for Flickr photos that have been uploaded to Wikimedia Commons
  • Operation: Automatically
  • When: Manually triggered, by paging through a snapshot of the structured data on Commons (SDC)
  • Maximum edit rate: To be confirmed, probably 5–10 edits per second
  • Language: Python

How it works

Running the bot will be a manual process with the following steps:

  1. Download one of the structured data dumps from Commons.
  2. Go through every entry in that snapshot (≈ every file on Wikimedia Commons), looking for Flickr photos. This will likely involve looking at either the source URL or the Flickr photo ID properties.
  3. Check whether the photo has all the structured data fields that can be populated from the Flickr metadata.
  4. If any fields are missing, get the photo metadata from the Flickr API, then add the necessary fields to the file on Commons.
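
Steps 1 and 2 could be sketched roughly as follows. This is an illustration only, not the bot's actual code. It assumes the dump is a JSON array with one entity per line, that each entity keeps its statements under a `statements` key, and that the Flickr photo ID property is P12120. The helper names `iter_entities` and `find_flickr_photo_id` are hypothetical.

```python
import json
import re

# Matches photo page URLs of the form https://www.flickr.com/photos/<user>/<photo id>
FLICKR_URL_RE = re.compile(r"https?://(?:www\.)?flickr\.com/photos/[^/]+/(\d+)")


def iter_entities(lines):
    """Yield one entity per line of the dump.

    Assumption: the dump is a JSON array written with one entity per line,
    each line ending in a comma except the last.
    """
    for line in lines:
        line = line.strip().rstrip(",")
        if line in ("", "[", "]"):
            continue
        yield json.loads(line)


def _string_snaks(statement):
    """Yield (property, value) for every string-valued snak in a statement,
    covering both the main snak and any qualifiers."""
    snaks = [statement.get("mainsnak", {})]
    for qualifier_snaks in statement.get("qualifiers", {}).values():
        snaks.extend(qualifier_snaks)
    for snak in snaks:
        value = snak.get("datavalue", {}).get("value")
        if isinstance(value, str):
            yield snak.get("property"), value


def find_flickr_photo_id(entity, photo_id_property="P12120"):
    """Return the Flickr photo ID for an entity, or None if it has none.

    Assumption: P12120 is the Flickr photo ID property. If that property is
    absent, fall back to the first Flickr photo URL found in any statement.
    """
    statements = entity.get("statements")
    if not isinstance(statements, dict):
        # Files without structured data may carry an empty list here.
        return None

    id_from_url = None
    for statement_list in statements.values():
        for statement in statement_list:
            for prop, value in _string_snaks(statement):
                if prop == photo_id_property:
                    return value
                match = FLICKR_URL_RE.match(value)
                if match and id_from_url is None:
                    id_from_url = match.group(1)
    return id_from_url
```

A compressed dump could be streamed by opening it with `bz2.open(path, "rt")` and passing the file object to `iter_entities`, so the whole snapshot never has to fit in memory.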

Eventually every Flickr photo the bot can find on Wikimedia Commons will have a consistent set of structured data, similar to that set by Flickypedia.

FAQs

How does it handle existing/conflicting statements?

Backfillr Bot is purely additive – it will create new statements, or add qualifiers to existing statements, but it will never remove useful information.

If it wants to write a statement that conflicts with an existing statement, it will flag it for manual attention, and not write anything to Commons.

(There are a few cases where it will remove existing statements, but these are places where the existing data is obviously wrong – e.g. creators who have author name string (P2093) = null.)
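
The additive policy described above could be expressed as a small decision function. This is a simplified sketch, not the bot's real logic. It assumes each statement is reduced to a plain dict with a `value` and optional `qualifiers`, and that the property is single-valued, so any different existing value counts as a conflict. The names `Action` and `plan_action` are hypothetical.

```python
from enum import Enum


class Action(Enum):
    CREATE = "create"                  # no existing statement: write a new one
    ADD_QUALIFIERS = "add_qualifiers"  # same value, some qualifiers missing
    NO_CHANGE = "no_change"            # already matches: nothing to do
    FLAG_CONFLICT = "flag_conflict"    # disagreement: leave for a human


def plan_action(existing, candidate):
    """Decide what to do with one candidate statement for one property.

    `existing` is a list of simplified statements already on the file;
    `candidate` is the statement derived from the Flickr metadata.
    Nothing is ever removed or overwritten by this function.
    """
    if not existing:
        return Action.CREATE

    wanted = candidate.get("qualifiers", {})
    for statement in existing:
        if statement["value"] != candidate["value"]:
            continue
        current = statement.get("qualifiers", {})
        # A qualifier present on both sides with different values is a conflict.
        if any(key in current and current[key] != value
               for key, value in wanted.items()):
            return Action.FLAG_CONFLICT
        if all(key in current for key in wanted):
            return Action.NO_CHANGE
        return Action.ADD_QUALIFIERS

    # Statements exist, but none agrees with the candidate value.
    return Action.FLAG_CONFLICT
```

Under these assumptions, only `CREATE` and `ADD_QUALIFIERS` lead to an edit on Commons. `FLAG_CONFLICT` writes nothing and leaves the file for manual review.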

What sort of data is it copying to Commons?

Backfillr Bot only uses public information which is available in the Flickr API – the same information you could see if you open the photo on Flickr.com.

Suggesting improvements

For now, to suggest an update to the SDC mapping, please leave a comment on Commons:Flickypedia/Data Modeling.

At some point there may be a dedicated page for the standalone library, but this is yet to be confirmed.