Commons:Flickypedia/Data Modeling

From Wikimedia Commons, the free media repository

Modeling the Data Connection

A document to outline Flickypedia’s alignment with Structured Data on Commons

Hi. This is a work in progress. If you have comments, please add them to the Discussion tab!

For any given image uploaded using Flickypedia or backfilled with Flickypedia, we automatically map the following data:

Always mapped

Flickr photo ID (P12120)

This is meant to be an unambiguous, canonical property for finding existing copies of a Flickr photo on Wikimedia Commons (WMC). It avoids the hassle of dealing with the different varieties of Flickr URL. For more info/examples, see the property proposal.

Example:

P12120 → 326048874
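
To illustrate why a canonical ID is easier to work with than URLs, here is a minimal sketch that reduces two URL shapes (a photo page and a static image file) to the same photo ID, then builds a search string for existing copies. The function names are hypothetical and this is not Flickypedia's actual code; it only handles the two URL shapes shown on this page. `haswbstatement` is the search keyword for structured data statements.

```python
import re
from typing import Optional
from urllib.parse import urlparse


def flickr_photo_id(url: str) -> Optional[str]:
    """Return the numeric photo ID from a Flickr URL, or None if unrecognised.

    Hypothetical helper: real Flickr URLs come in more varieties than the
    two shapes handled here.
    """
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    parts = [p for p in parsed.path.split("/") if p]

    # Photo page: /photos/<user>/<photo id>/
    if host in {"www.flickr.com", "flickr.com"}:
        if len(parts) >= 3 and parts[0] == "photos" and parts[2].isdigit():
            return parts[2]

    # Static image file: /<server>/<photo id>_<secret>_<size>.jpg
    if host == "live.staticflickr.com" and parts:
        match = re.match(r"(\d+)_", parts[-1])
        if match:
            return match.group(1)

    return None


def existing_copy_search(photo_id: str) -> str:
    """Build a search string that matches files carrying this P12120 value."""
    return f"haswbstatement:P12120={photo_id}"
```

Both URL shapes resolve to the same ID, so a single search on P12120 is enough to find a copy regardless of which URL the uploader originally recorded.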

creator (P170)

This is the Flickr photographer, i.e. the user who uploaded the photo to Flickr. We create a creator (P170) statement with qualifiers for the Flickr user ID (P3267), author name string (P2093), and URL (P2699):

Example:

From https://commons.wikimedia.org/wiki/File:Clethra acuminata.jpg#P170
P170 →
    P3267: 54289096@N00
    P2093: zen Sutherland
    P2699: https://www.flickr.com/people/zen/

source of file (P7482)

We link to the original photo on Flickr using source of file (P7482) → file available on the internet (Q74228490), with qualifiers for described at URL (P973), operator (P137), and URL (P2699):

Example:

P7482 → Q74228490
    P973: https://www.flickr.com/photos/merula/7226882/
    P137: Q103204
    P2699: https://live.staticflickr.com/5/7226882_1b236da680_o.jpg
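
As a rough sketch of what this statement could look like in the Wikibase JSON format, the following builds the main value and the three qualifiers. The helper names are hypothetical, and this is an illustration of the data shape rather than Flickypedia's actual implementation.

```python
def item_value(qid: str) -> dict:
    """Datavalue pointing at an item such as Q103204 (Flickr)."""
    return {
        "type": "wikibase-entityid",
        "value": {"entity-type": "item", "numeric-id": int(qid[1:]), "id": qid},
    }


def string_value(text: str) -> dict:
    """Datavalue for string-like properties (URLs are stored as strings)."""
    return {"type": "string", "value": text}


def snak(prop: str, datavalue: dict) -> dict:
    return {"snaktype": "value", "property": prop, "datavalue": datavalue}


def source_of_file_statement(photo_page_url: str, original_file_url: str) -> dict:
    """P7482 -> Q74228490, qualified with P973, P137 and P2699."""
    qualifiers = {
        "P973": [snak("P973", string_value(photo_page_url))],
        "P137": [snak("P137", item_value("Q103204"))],
        "P2699": [snak("P2699", string_value(original_file_url))],
    }
    return {
        "type": "statement",
        "rank": "normal",
        "mainsnak": snak("P7482", item_value("Q74228490")),
        "qualifiers": qualifiers,
        "qualifiers-order": ["P973", "P137", "P2699"],
    }


statement = source_of_file_statement(
    "https://www.flickr.com/photos/merula/7226882/",
    "https://live.staticflickr.com/5/7226882_1b236da680_o.jpg",
)
```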

copyright license (P275)

We copy the currently-stated license from Flickr.

Example:

CC BY 2.0, from https://commons.wikimedia.org/wiki/File:Músicos tradicionales de Tenejapa (Chiapas).jpg#P275
P275 → Q19125117

copyright status (P6216)

We populate based on the currently-stated license on Flickr.

Example:

CC BY-SA 2.0 = copyrighted, from https://commons.wikimedia.org/wiki/File:Solstice Parade 2013 - 022 (9146560158).jpg#P6216
P6216 → Q50423863
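
The two license-related properties can be thought of as a lookup from the license stated on Flickr. The sketch below is a simplified illustration: the license key `"cc-by-2.0"` is a hypothetical identifier, the table contains only the item ID shown in the example above, and it covers only Creative Commons licenses, which are recorded as copyrighted.

```python
# Item IDs taken from the examples on this page.
CC_BY_2_0 = "Q19125117"
COPYRIGHTED = "Q50423863"

# Hypothetical license keys -> copyright license (P275) items.
# Only the entry attested on this page is filled in.
LICENSE_ITEMS = {
    "cc-by-2.0": CC_BY_2_0,
}


def license_claims(flickr_license: str) -> dict:
    """Return the P275 and P6216 values for a Creative Commons license.

    Raises KeyError for a license that has no mapping in this sketch.
    """
    license_item = LICENSE_ITEMS[flickr_license]
    return {"P275": license_item, "P6216": COPYRIGHTED}
```

Failing loudly on an unmapped license, rather than guessing, keeps incorrect license data out of the structured data.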

published in (P1433)

We save the date a photo was posted to Flickr in published in (P1433) → Flickr (Q103204), with a publication date (P577) qualifier:

Example:

From https://commons.wikimedia.org/wiki/File:"A Beatles Fan" - Goodwood Revival 2019 - The Fashion (2019-09-13 10.40.00 by David Merrett - 48749708457).jpg#P1433
P1433 → Q103204
    P577: 17 September 2019

Sometimes mapped

inception (P571)

We save the date the photo was taken in inception (P571).

Not all photos on Flickr have a public "date taken" value, e.g. if the uploader doesn’t know when the photo was taken. In this case, we won’t save a date taken on WMC.

Examples:

Taken 18 April 2008, from https://commons.wikimedia.org/wiki/File:Northern Spotted Owl.USFWS.jpg#P1433
P571 → 18 April 2008
Taken circa 1930, from https://commons.wikimedia.org/wiki/File:Murilla Shire Hall, circa 1930.jpg#P1433
P571 → 1930
    P1480: Q5727902
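
The two examples above differ in precision: one is an exact day, the other is a year marked as approximate with P1480 → Q5727902. A minimal sketch of how both could be expressed in the Wikibase time format follows. The function names are hypothetical, and writing year-precision dates with zeroed month and day is an assumed convention, not something confirmed by this page.

```python
from datetime import date

GREGORIAN = "http://www.wikidata.org/entity/Q1985727"
PRECISION_YEAR = 9   # Wikibase precision code for "year"
PRECISION_DAY = 11   # Wikibase precision code for "day"


def time_value(d: date, precision: int) -> dict:
    """Build a Wikibase time datavalue at day or year precision."""
    if precision == PRECISION_DAY:
        timestamp = f"+{d.year:04d}-{d.month:02d}-{d.day:02d}T00:00:00Z"
    elif precision == PRECISION_YEAR:
        # Assumed convention: month and day are zeroed for a bare year.
        timestamp = f"+{d.year:04d}-00-00T00:00:00Z"
    else:
        raise ValueError(f"unsupported precision: {precision}")
    return {
        "type": "time",
        "value": {
            "time": timestamp,
            "timezone": 0,
            "before": 0,
            "after": 0,
            "precision": precision,
            "calendarmodel": GREGORIAN,
        },
    }


def inception_statement(d: date, precision: int, circa: bool = False) -> dict:
    """P571 statement; adds the P1480 -> Q5727902 qualifier when circa."""
    statement = {
        "type": "statement",
        "rank": "normal",
        "mainsnak": {
            "snaktype": "value",
            "property": "P571",
            "datavalue": time_value(d, precision),
        },
    }
    if circa:
        statement["qualifiers"] = {
            "P1480": [{
                "snaktype": "value",
                "property": "P1480",
                "datavalue": {
                    "type": "wikibase-entityid",
                    "value": {"entity-type": "item", "id": "Q5727902"},
                },
            }]
        }
    return statement


exact = inception_statement(date(2008, 4, 18), PRECISION_DAY)
approximate = inception_statement(date(1930, 1, 1), PRECISION_YEAR, circa=True)
```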

coordinates of the point of view (P1259)

We save the location the photo was taken in coordinates of the point of view (P1259).

Unlike Wikimedia, Flickr doesn't distinguish between the location of the camera and the subject. We chose P1259 as the best match for how Flickr photographers use the field.

Not all photos on Flickr have location information, e.g. if the photographer has made it private. If there isn't public location data, we won't save a location on WMC.

In other words, we only copy across location information that you could see by visiting the photographer's page on Flickr, because we use the location information returned by Flickr’s public API.

Example:

P1259 → 56°22'34"N, 62°9'10"E
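
For readers who want to relate the displayed value to stored data, here is a small sketch that converts the degrees/minutes/seconds notation above into decimal degrees and wraps it in a Wikibase globe coordinate value. The function names are hypothetical, and the one-arcsecond precision is an assumption chosen to match the notation in the example, not a documented Flickypedia setting.

```python
import re

# Matches values such as 56°22'34"N
DMS = re.compile(r"(\d+)°(\d+)'(\d+)\"([NSEW])")


def dms_to_decimal(text: str) -> float:
    """Convert a degrees/minutes/seconds string to signed decimal degrees."""
    match = DMS.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognised coordinate: {text!r}")
    degrees, minutes, seconds, hemisphere = match.groups()
    value = int(degrees) + int(minutes) / 60 + int(seconds) / 3600
    # South and west are negative by convention.
    return -value if hemisphere in "SW" else value


def point_of_view_value(latitude: float, longitude: float) -> dict:
    """Build a globe coordinate datavalue for P1259 (on Earth, Q2)."""
    return {
        "type": "globecoordinate",
        "value": {
            "latitude": latitude,
            "longitude": longitude,
            "altitude": None,
            "precision": 1 / 3600,  # assumed: one arcsecond
            "globe": "http://www.wikidata.org/entity/Q2",
        },
    }


value = point_of_view_value(
    dms_to_decimal("56°22'34\"N"),
    dms_to_decimal("62°9'10\"E"),
)
```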

BHL page ID (P687)

For Flickr images from the Biodiversity Heritage Library account, we extract the BHL Page ID from the Flickr machine tags and store that as a structured data property.

This is to create links between the original image on the BHL website, the image copied to Flickr, and the image copied to Wikimedia Commons.

Example:

From https://commons.wikimedia.org/wiki/File:Illustrations of ornithology (Color Plate 84) (7748014230).jpg#P687
P687 → 39769992
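
The extraction step can be sketched as a scan over a photo's machine tags. This sketch assumes the tag takes the form `bhl:page=<number>`; that format, the function name, and the second tag in the usage line are illustrative assumptions rather than details stated on this page.

```python
import re
from typing import Iterable, Optional

# Assumed machine tag shape: namespace "bhl", predicate "page", numeric value.
BHL_PAGE_TAG = re.compile(r"bhl:page=(\d+)")


def bhl_page_id(machine_tags: Iterable[str]) -> Optional[str]:
    """Return the first BHL page ID found in the machine tags, if any."""
    for tag in machine_tags:
        match = BHL_PAGE_TAG.fullmatch(tag.strip().lower())
        if match:
            return match.group(1)
    return None


page_id = bhl_page_id(["taxonomy:genus=example", "bhl:page=39769992"])
```

Returning `None` when no matching tag is present fits the "sometimes mapped" behaviour: the property is simply left out.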

Never mapped

Flickr title and description

We’ve decided not to bring these across – this text may not be appropriate for Wikimedia Commons (e.g. more informal/personal, not descriptive), and we may not be able to license this text as CC0.

Flickr tags

Flickr tags are more folksonomic than the structured data definitions used in Commons, e.g. they're not as specific as Depicts.

We store them in the Wikitext with links back to Flickr, but not in the SDC.

EXIF metadata, e.g. captured with

There are already bots that fill in these properties with data from Flickr, and we didn't see a need to duplicate their work.