Commons:Bots/Requests/JhealdBot (6)

From Wikimedia Commons, the free media repository

JhealdBot (talk · contribs) (6)

Operator: Jheald

Bot's tasks for which permission is being sought:

Batch uploading for an image release by the British Library. Mostly 18th-century maps and engravings. Pilot uploads in Category:BL18C_pilot. Potential for about 20,000 images if everybody is happy.

Automatic or manually assisted:

Automatic, but with extensive manual pre-preparation and post-upload review.

Edit type (e.g. Continuous, daily, one time run):

Batches of up to a couple of hundred images at a time

Maximum edit rate (e.g. edits per minute):

Pilot images are taking about a minute and a half each to upload via my home broadband connection, which will probably also be used for the production run. Production images may be about five times larger (less compressed). The script pauses for about 8 seconds before starting each new image.
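As a rough illustration of what these figures imply for the edit rate, here is a back-of-envelope sketch in Python. It is not part of the bot. The batch size of 200 stands in for "a couple of hundred", and the assumption that upload time scales linearly with file size is mine, not a measurement.

```python
# Back-of-envelope throughput estimate; all inputs are rough figures, not measurements.

PILOT_UPLOAD_SECONDS = 90.0   # "about a minute and a half" per pilot image
PAUSE_SECONDS = 8.0           # pause before starting each new image
SIZE_FACTOR = 5.0             # production images perhaps five times larger


def seconds_per_image(upload_seconds, pause_seconds=PAUSE_SECONDS):
    """Wall-clock time consumed by one image: pause plus upload."""
    return upload_seconds + pause_seconds


def edits_per_minute(upload_seconds, pause_seconds=PAUSE_SECONDS):
    """Average uploads per minute at a steady rate."""
    return 60.0 / seconds_per_image(upload_seconds, pause_seconds)


def batch_hours(n_images, upload_seconds, pause_seconds=PAUSE_SECONDS):
    """Hours needed to upload a batch of n_images one after another."""
    return n_images * seconds_per_image(upload_seconds, pause_seconds) / 3600.0


if __name__ == "__main__":
    # Assumption: upload time scales linearly with file size.
    production_upload = PILOT_UPLOAD_SECONDS * SIZE_FACTOR
    for label, upload in (("pilot", PILOT_UPLOAD_SECONDS),
                          ("production", production_upload)):
        print(f"{label}: {edits_per_minute(upload):.2f} uploads/min, "
              f"{batch_hours(200, upload):.1f} h per batch of 200")
```

On these assumptions the rate stays well under one upload per minute in either case, and a batch of 200 pilot-sized images takes roughly five and a half hours.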

Bot flag requested: (Y/N):

Bot already has it.

Programming language(s):

Perl is used to create a file of description pages, based on various fields of MARC records extracted from the library catalogue, plus semi-manual matching of people and places to items known to Wikidata using OpenRefine. The upload itself is then executed using the Pywikibot upload.py script, driven by a further Perl script.
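For readers who want a picture of the pipeline, the following is a minimal sketch in Python rather than the actual Perl scripts. The record layout (a dict keyed by MARC tag and subfield, such as `245a` for the title), the use of `{{Information}}`, the `|source=` text, and the helper names are all assumptions made for illustration; licensing and Wikidata-matched fields are omitted. The `-keep`, `-noverify` and `-filename:` options are taken from Pywikibot's upload script, but the exact invocation used for this task may well differ.

```python
import time


def build_description(record):
    """Render a minimal {{Information}} page from MARC-style fields.

    `record` is a hypothetical dict keyed by MARC tag + subfield code.
    """
    lines = [
        "{{Information",
        "|description={{en|1=" + record["245a"] + "}}",  # 245$a: title
        "|date=" + record.get("260c", ""),               # 260$c: publication date
        "|source=British Library",                       # placeholder text
        "|author=" + record.get("100a", ""),             # 100$a: personal name
        "}}",
    ]
    return "\n".join(lines)


def build_upload_command(path, filename, description):
    """Build an argument list for Pywikibot's upload script (not executed here)."""
    return ["python", "pwb.py", "upload", "-keep", "-noverify",
            "-filename:" + filename, path, description]


def run_batch(items, runner, pause_seconds=8.0, sleep=time.sleep):
    """Pause, then hand each upload command to `runner`, one image at a time.

    `items` is a list of (path, filename, record) tuples. In real use
    `runner` could be subprocess.run; it is injected so the loop can be
    exercised without touching the network.
    """
    for path, filename, record in items:
        sleep(pause_seconds)
        runner(build_upload_command(path, filename, build_description(record)))
```

Keeping the page-building step separate from the upload step mirrors the split described above: the description pages can be generated and reviewed in bulk before any upload is attempted.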

Additional comments:

Pilot images have been extracted from existing released images using a dezoomify-type script. This is the reason for their lack of EXIF information. Production images will come more directly from the BL's own internal systems, and may be somewhat less compressed. Initially JPG versions will be uploaded; TIFF versions may be added at a later date.
Permission is also sought, as JhealdBot (6A), to make minor post-upload edits to the pages, similar to those already approved for JhealdBot (4): for example, any post-upload corrections that may be needed, additional categorisation, additional information that may become available (such as scan resolution), and updates to the pages to reflect updates in the BL catalogue. The scripts for this would be very similar to those already used as JhealdBot (4) for a different set of images.

Jheald (talk) 06:36, 22 September 2018 (UTC)

Discussion

@EugeneZelenko: Hmmm... {{Occupation}} gives quite a nice, visually unobtrusive way to internationalise this (e.g. [1]), albeit with maybe a bit less coverage than Wikidata. Would that be acceptable? Jheald (talk) 21:59, 22 September 2018 (UTC)
I'm not sure this will always work, since a person could have multiple occupations and the file needs only one of them. Could we instead query the translations from the item for Draftsman? --EugeneZelenko (talk) 13:59, 23 September 2018 (UTC)
That's not a problem. The {{Occupation}} template simply translates the text it is given, and when I create the page I will give it only one occupation. On the minus side, the range of translations the template provides is (I think) not quite as extensive as Wikidata's. With luck somebody will fix that eventually, and adjust the template to use Wikidata as a fallback source for translations. On the plus side, the translations are specialised to Commons, and are sometimes more accurate than the more generic translations given by Wikidata labels. Jheald (talk) 14:23, 23 September 2018 (UTC)

Approved. --Krd 07:05, 17 October 2018 (UTC)