Commons:Flickypedia/Data Modeling
Modeling the Data Connection
A document to outline Flickypedia’s alignment with Structured Data on Commons
Hi. This is a work in progress. If you have comments, please add them to the Discussion tab!
For any given image uploaded using Flickypedia or backfilled with Flickypedia, we automatically map the following data:
The Flickr photo ID is meant to be an unambiguous, canonical property for finding existing copies of a Flickr photo on Wikimedia Commons (WMC). It avoids the hassle of dealing with the different varieties of Flickr URL. For more information and examples, see the property proposal.
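To illustrate why a bare photo ID is easier to match on than a URL, here is a minimal sketch that pulls the numeric ID out of one common shape of Flickr photo page URL. The regular expression and the function name `flickr_photo_id` are assumptions made for this example, not Flickypedia's actual parser, and other URL varieties (such as short links) are deliberately not handled.

```python
import re
from typing import Optional

# Illustrative pattern only: matches URLs shaped like
# https://www.flickr.com/photos/<user>/<numeric photo id>/...
_PHOTO_PAGE = re.compile(
    r"^https?://(?:www\.)?flickr\.com/photos/[^/]+/(\d+)(?:/.*)?$"
)


def flickr_photo_id(url: str) -> Optional[str]:
    """Return the numeric photo ID from a photo page URL, or None."""
    match = _PHOTO_PAGE.match(url.strip())
    return match.group(1) if match else None


# Two different-looking URLs for the same (made-up) photo give the same ID.
a = flickr_photo_id("https://www.flickr.com/photos/example_user/1234567890/")
b = flickr_photo_id("http://flickr.com/photos/example_user/1234567890/in/album-1/")
print(a == b)
```

Comparing the extracted IDs rather than the raw URLs is what makes duplicate detection straightforward.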
This is the Flickr photographer, i.e. the user who uploaded the photo to Flickr. We record them in a creator (P170) statement with qualifiers.
We link to the original photo on Flickr using source of file (P7482) → file available on the internet (Q74228490), with qualifiers.
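As a rough sketch of what such a statement could look like, the snippet below builds an item-valued statement in the general shape of Wikibase's JSON serialization. The helper name `entity_statement` is hypothetical, and the qualifiers are left empty here because this sketch does not assume which qualifier properties are used.

```python
from typing import Optional


def entity_statement(property_id: str, item_id: str,
                     qualifiers: Optional[dict] = None) -> dict:
    """Build a statement whose value is a Wikibase item (hypothetical helper)."""
    return {
        "mainsnak": {
            "snaktype": "value",
            "property": property_id,
            "datavalue": {
                "type": "wikibase-entityid",
                "value": {
                    "entity-type": "item",
                    "numeric-id": int(item_id[1:]),  # "Q74228490" -> 74228490
                    "id": item_id,
                },
            },
        },
        # Qualifiers would be keyed by property ID; none are assumed here.
        "qualifiers": qualifiers or {},
        "type": "statement",
        "rank": "normal",
    }


# source of file (P7482) -> file available on the internet (Q74228490)
source = entity_statement("P7482", "Q74228490")
```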
We copy the license currently stated on Flickr.
We populate the copyright status based on the license currently stated on Flickr.
We save the date a photo was posted to Flickr as a qualifier on published in (P1433) → Flickr (Q103204).
We save the date the photo was taken in inception (P571). Not all photos on Flickr have a public "date taken" value, e.g. if the uploader doesn't know when the photo was taken. In that case, we won't save a date taken on WMC.
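The "only save it if Flickr has it" rule can be sketched as follows. This is an illustration under stated assumptions, not Flickypedia's code: the function name `date_taken_statement` is hypothetical, the date is assumed to be known to the day (Wikibase precision 11), and the dict follows the general shape of Wikibase's JSON time values.

```python
from datetime import date
from typing import Optional


def date_taken_statement(taken: Optional[date]) -> Optional[dict]:
    """Return an inception (P571) statement, or None if no date is public."""
    if taken is None:
        # No public "date taken" on Flickr: write nothing to WMC.
        return None
    return {
        "mainsnak": {
            "snaktype": "value",
            "property": "P571",
            "datavalue": {
                "type": "time",
                "value": {
                    "time": f"+{taken.isoformat()}T00:00:00Z",
                    "timezone": 0,
                    "before": 0,
                    "after": 0,
                    "precision": 11,  # assumption: day-level precision
                    # Proleptic Gregorian calendar
                    "calendarmodel": "http://www.wikidata.org/entity/Q1985727",
                },
            },
        },
        "type": "statement",
        "rank": "normal",
    }


with_date = date_taken_statement(date(2023, 10, 11))
without_date = date_taken_statement(None)
```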
We save the location where the photo was taken in coordinates of the point of view (P1259). Unlike Wikimedia, Flickr doesn't distinguish between the location of the camera and the location of the subject, so we chose P1259 as the best match for how Flickr photographers use the field. Not all photos on Flickr have location information, e.g. if the photographer has made it private. If there isn't public location data, we won't save a location on WMC. In other words, we only copy across location information that you could see by visiting the photographer's page on Flickr, because we use the location information returned by Flickr's public API.
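A minimal sketch of the same rule for locations, again under assumptions: the input is a hypothetical dict with `latitude` and `longitude` keys (or `None` when the public API returns no location), the function name `location_statement` is made up for this example, and the precision value is an arbitrary placeholder.

```python
from typing import Optional


def location_statement(location: Optional[dict]) -> Optional[dict]:
    """Return a coordinates of the point of view (P1259) statement, or None."""
    if not location:
        # No public location data: nothing is saved on WMC.
        return None
    return {
        "mainsnak": {
            "snaktype": "value",
            "property": "P1259",
            "datavalue": {
                "type": "globecoordinate",
                "value": {
                    "latitude": float(location["latitude"]),
                    "longitude": float(location["longitude"]),
                    "precision": 1e-05,  # assumption: placeholder precision
                    "globe": "http://www.wikidata.org/entity/Q2",  # Earth
                },
            },
        },
        "type": "statement",
        "rank": "normal",
    }


public = location_statement({"latitude": "51.5", "longitude": "-0.12"})
private = location_statement(None)
```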
For Flickr images from the Biodiversity Heritage Library (BHL) account, we extract the BHL Page ID from the Flickr machine tags and store it as a structured data property. This links the original image on the BHL website, the copy on Flickr, and the copy on Wikimedia Commons.
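The extraction step might look something like the sketch below. Flickr machine tags have the form `namespace:predicate=value`; the specific tag name `bhl:page` used here is an assumption for illustration, as is the function name `bhl_page_id`, and the ID in the example is made up.

```python
from typing import Iterable, Optional


def bhl_page_id(tags: Iterable[str]) -> Optional[str]:
    """Return the first numeric value of a bhl:page machine tag, if any."""
    for tag in tags:
        name, sep, value = tag.partition("=")
        # Assumption: the page ID lives in a tag named "bhl:page".
        if sep and name.strip().lower() == "bhl:page" and value.isdigit():
            return value
    return None


found = bhl_page_id(["botany", "bhl:page=12345678", "taxonomy:genus=Rosa"])
missing = bhl_page_id(["botany", "flowers"])
```

Photos without a matching machine tag simply get no BHL Page ID statement.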
| Flickr data | Why we don't copy it into SDC |
| Flickr title and description | We've decided not to bring these across: this text may not be appropriate for Wikimedia Commons (e.g. more informal/personal, not descriptive), and we may not be able to license it as CC0. |
| Flickr tags | Flickr tags are more folksonomic than the structured data definitions used in Commons, e.g. they're not as specific as Depicts. We store them in the Wikitext with links back to Flickr, but not in the SDC. |
| EXIF metadata, e.g. captured with | There are already bots that fill in these properties with data from Flickr, and we didn't see a need to duplicate their work. |