User:Fæ/Project list/United Kingdom Parliamentary photographs

Full-length official portrait of Rt Hon Margaret Beckett MP, 2020. At the time of the photograph the longest-serving woman MP.

Introduction

This is the batch upload project for portrait photographs published on https://members.parliament.uk with a CC-BY licence.

All uploaded photographs are found under Category:Official United Kingdom Parliamentary photographs.

You can discover how well used and viewed these photographs are with the GLAMorgan tool; see the live usage report (may be slow to load). Some of these photographs have had over 800,000 views and are actively used in many different languages. After the English Wikipedia, the most use is reported for Wikidata, then for encyclopaedia articles in the German, French, Chinese and Arabic Wikipedias. In the single month of January 2020 there were 21,928,395 views of photographs in this collection.

Photographs have been mass released on Wikimedia Commons each year since 2017, when the Parliamentary photographs were a beta test project. The process is not fully automatic, and probably never can be because the Parliament website changes layout; it relies on customized Pywikibot upload scripts and on requests from fellow volunteers. Example code is on GitHub; see #Technical_matters below.

Related project: User:Fæ/Project list/UK legislation

Background

Example searches

Categories and options

All files are uploaded with the bucket category of

Category:Official United Kingdom Parliamentary photographs <year>

File names are formatted as:

Official portrait of <member name> <year> <crop type>.jpg

In the 2019/2020 catalogue "MP" was added to the web page title, which by chance distinguished potential duplicates from the 2017 photograph titles. Future batch uploads may need to add <year> in the file name to ensure they are unique. Official photographs will not be overwritten with later photographs or deleted as being out of date, as all have educational value.

For members of the House of Lords, some names in the 2020 batch upload duplicated names in the 2017 batch, so the photograph year is added to those file names to distinguish them, but only when necessary. For example:

Photographs taken after the December 2019 election date from both December 2019 and January 2020.

There may be MPs that had photographs taken in 2017 and 2020, and there may be MPs that are shown with the 2017 photograph as their current official photograph in 2020, an example being Official portrait of Ms Diane Abbott.jpg.

As each portrait is supplied by Parliament in four different "official" crop sizes, all four are uploaded, with the smaller crops indicated in the title as "crop 1", "crop 2" and "crop 3". Of these, crop 2, the 3:4 format photograph, is the one most often used in Wikipedia articles and on Wikidata. Each official photograph image page is uploaded with a <gallery> table providing thumbnails of the other three versions, making navigation easy.
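The naming and gallery conventions above can be sketched in a few lines of Python. This is only an illustration, not the project's actual script: the function names, the `CROP_SUFFIXES` list and the member name "Jane Example MP" are hypothetical, and the sketch assumes the full portrait carries no crop suffix.

```python
# Hypothetical helper names; "" stands for the full-length portrait.
CROP_SUFFIXES = ["", "crop 1", "crop 2", "crop 3"]


def build_filename(member_name, year, crop=""):
    """Return 'Official portrait of <member name> <year> <crop type>.jpg'."""
    parts = ["Official portrait of", member_name, str(year)]
    if crop:
        parts.append(crop)
    return " ".join(parts) + ".jpg"


def build_gallery(member_name, year, current_crop=""):
    """Return <gallery> wikitext listing the other three versions."""
    lines = ["<gallery>"]
    for crop in CROP_SUFFIXES:
        if crop == current_crop:
            continue  # an image page does not list itself
        caption = crop or "full portrait"
        lines.append(f"File:{build_filename(member_name, year, crop)}|{caption}")
    lines.append("</gallery>")
    return "\n".join(lines)
```

Calling `build_gallery("Jane Example MP", 2020, "crop 2")` would produce a gallery with three entries: the full portrait, crop 1 and crop 3.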

Dates

A key element of the design is the 'year' value. This is not the current year; it is read from the locally cached copy of the photograph by examining the EXIF data, in particular the XMP values XMP:DateCreated or XMP:MetadataDate. This means that if a photograph taken in a previous year is newly released as the current photograph for a Member, the photograph will be categorized by that previous year and given a filename based on that year value, even if the Member was not actually a Member in the year displayed.

Consequences:

  1. If Parliament changes technology and releases photographs that omit this EXIF data, the files will be skipped and a code change would be required to work around it.
  2. If a Member has a second series of photographs released in the same year, such as in a year there are sessions ending and starting, then the second series will not be uploaded as the generated filename already exists for that year. This is thought to be unlikely.
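A minimal sketch of the year lookup described above, assuming the metadata has already been read into a dictionary keyed by tag names such as `XMP:DateCreated`. The function name `photo_year` and the choice to return `None` for "skip this file" are assumptions made for illustration, not taken from the project's scripts.

```python
import re

# Tags are tried in this order; the order itself is an assumption.
DATE_TAGS = ("XMP:DateCreated", "XMP:MetadataDate")


def photo_year(metadata):
    """Return the four-digit year from the first usable date tag, else None."""
    for tag in DATE_TAGS:
        value = metadata.get(tag)
        if not value:
            continue
        # Accepts both '2020:01:14 10:22:33' and '2020-01-14T10:22:33Z'.
        match = re.match(r"\s*(\d{4})[:-]", str(value))
        if match:
            return int(match.group(1))
    return None  # no date found: the caller would skip the file
```

Returning `None` rather than falling back to the current year mirrors the first consequence listed above: a photograph without these values is skipped instead of being filed under the wrong year.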

Housekeeping

Some automated tidying up may be done post-upload, such as things noticed during initial uploads and left for fixing once all the uploads are completed. This includes reformatting dates from EXIF timestamps to a more usable yyyy-mm-dd style; correcting photographer names, which may be in all caps in the EXIF or have other text that can be trimmed down; and adding templates useful for Commons maintenance, such as {{Personality rights}}. When stable, these tasks may be passed to Faebot as they are unlikely to need monitoring.
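The three tidying tasks can be illustrated with plain string handling. This is a sketch under stated assumptions: the helper names are hypothetical, the EXIF timestamp is assumed to start with the usual `YYYY:MM:DD HH:MM:SS` form, and the all-caps fix uses simple title-casing, which will get names such as "MCDONALD" wrong.

```python
from datetime import datetime


def exif_to_iso_date(timestamp):
    """Convert an EXIF 'YYYY:MM:DD HH:MM:SS' timestamp to 'yyyy-mm-dd'."""
    # Only the first 19 characters are parsed, so a trailing time zone is ignored.
    parsed = datetime.strptime(timestamp.strip()[:19], "%Y:%m:%d %H:%M:%S")
    return parsed.strftime("%Y-%m-%d")


def tidy_photographer(name):
    """Collapse stray whitespace and title-case names written in all caps."""
    cleaned = " ".join(name.split())
    if cleaned.isupper():
        cleaned = cleaned.title()
    return cleaned


def add_template(wikitext, template="{{Personality rights}}"):
    """Append a maintenance template unless the page already carries it."""
    if template.lower() in wikitext.lower():
        return wikitext
    return wikitext.rstrip("\n") + "\n" + template + "\n"
```

For example, `exif_to_iso_date("2020:01:14 10:22:33")` gives `"2020-01-14"`, and `tidy_photographer("JANE DOE")` gives `"Jane Doe"`.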

Poor automatic crops

Several of the official crops have poor placement, with some being unusably bad. This is probably a consequence of automated mass crops to generate the Parliament.uk pages. This is something that Wikimedia Commons contributors can improve by making their own versions using the standard Commons:CropTool, which will automatically add links from the source to the derived crop. New crops are best created from the full official portrait image.

Technical matters

The original photographs published on parliament.uk have slightly variable EXIF styles; for example, some photographs have the subject named in the EXIF data while others do not. The main catalogue does not list the photographer or photograph date. To distinguish a 2017 photograph from a newly taken portrait of the same politician in 2019/2020, the creation date is extracted from the EXIF data at the time of upload. This requires the local file to be parsed using ExifTool as a Python module.
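As an illustration of parsing a local file, the sketch below calls the `exiftool` command-line program through `subprocess` rather than through a Python wrapper module, so it is a stand-in for the approach described above and not the project's own code. It assumes `exiftool` is installed and on the PATH; the function name `read_metadata` and the injectable `run` parameter are hypothetical.

```python
import json
import subprocess


def read_metadata(path, run=subprocess.run):
    """Return one dict of tags for a local file, keyed like 'XMP:DateCreated'."""
    # -j asks exiftool for JSON output; -G prefixes each tag with its group name.
    result = run(
        ["exiftool", "-j", "-G", path],
        capture_output=True,
        text=True,
        check=True,
    )
    records = json.loads(result.stdout)
    # exiftool returns a list with one object per input file.
    return records[0] if records else {}
```

Because some photographs name the subject in their metadata and others do not, callers should treat every tag as optional and use `dict.get` rather than indexing.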

The source code is available at GitHub, but it requires local customization to be used. In 2020 the upload run took around 2 days. If time were an issue, then whitelisting parliament.uk for direct URL uploads would be an order of magnitude faster, but alternatives to scraping EXIF data from local storage would be needed. A small part of the processing time is down to using an old laptop; see the discussion at Hardware donation program/Fæ.

There is no automated run for this project, so refresh runs have to be requested and started manually. As of October 2020, the script is more 're-run friendly': it more defensively avoids creating duplicates or uploads that might conflict with uploads by others. Reminder to self: these scripts are upload_house_of_commons.py and upload_lords.py.

Known errors, bugs

  • Where the main photograph or one of the crops has previously been uploaded, this will result in all or some of the photographs in a member set not uploading. This is because as soon as a duplicate is flagged on the upload attempts, the uploading for that Member is cancelled. This may result in blank entries in the gallery section of an image page.
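The behaviour described above can be modelled as follows. This is a simplified sketch, not the project's code: `DuplicateError`, `upload_member_set` and the `upload` callable are hypothetical stand-ins for whatever the real script and Pywikibot report when a file already exists.

```python
class DuplicateError(Exception):
    """Hypothetical signal that a file already exists on Commons."""


def upload_member_set(filenames, upload):
    """Upload a member's files in order, cancelling at the first duplicate.

    Returns (uploaded, duplicate, not_attempted). Files in not_attempted are
    the ones that can leave blank entries in the gallery of an image page.
    """
    uploaded = []
    for index, name in enumerate(filenames):
        try:
            upload(name)
        except DuplicateError:
            # Stop here: nothing after the duplicate is tried.
            return uploaded, name, filenames[index + 1:]
        uploaded.append(name)
    return uploaded, None, []
```

If crop 1 of a set already exists, only the full portrait is uploaded; crop 2 and crop 3 are never attempted, which is how a gallery can end up pointing at files that are not there.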

Log

  • 2020-10-23 Rerun to refresh, after minor changes to always use the year in filenames and to not ignore duplicate errors. The latest photographs discovered during this refresh appear to have been taken in Feb/Mar 2020, before COVID-19 restrictions would probably have made photo sessions too difficult to arrange.

Copyright

All photographs are published with a CC-BY licence.

As of 2020, the release statement is given as: "All these photos are released under an Attribution 3.0 Unported (CC BY 3.0) licence. This means that you can use, share and adapt it, as long as you give proper attribution, provide a link to the licence, and indicate if changes were made to the image." To validate, please refer to the 2020 archive example from members.parliament.uk.

As a courtesy, the photographer, rather than the website or Parliament, is named on the image page whenever the name is made available.

Other versions of these photographs such as desired crops, or digitally enhanced versions, or similar appearing photographs may be added by volunteers to the categories created for this project. In some cases the EXIF data may be missing, and the Commons image page may not contain a working source link or link back to one of the photographs of this project as a source. Copyright for these added photographs must be determined on a case by case basis, and the information about this batch upload project cannot be presumed to apply.