Commons:Bots/Requests/Riksbot

From Wikimedia Commons, the free media repository

Riksbot (talk · contribs)

Operator: Profoss (talk · contributions · Statistics · Recent activity · block log · User rights log · uploads · Global account information)

Bot's tasks for which permission is being sought: Upload ~4800 public domain photos from the image database of the Norwegian Directorate for Cultural Heritage (Riksantikvaren).

Automatic or manually assisted: Supervised automatic

Edit type (e.g. Continuous, daily, one time run): One time run for ~4800 photos. Possibly more depending on the Photo team.

Maximum edit rate (e.g. edits per minute): 2-3 uploads per minute

Bot flag requested: (Y/N): N

Programming language(s): Python:

import csv
import datetime
import getpass
import time
import urllib.parse

import mwclient

TIMESTAMP_FORMAT = '%d-%m-%Y %H:%M:%S'

# Wikitext for the file description page; filled in from one CSV row.
DESCRIPTION_TEMPLATE = (
    '=={{int:filedesc}}==\n'
    '{{Information\n'
    '|description={{nb|1=%(place)s, %(county)s<br> %(description)s}} {{Monument Norge|%(kultid)s}}\n'
    '|date=%(date)s\n'
    '|source=[http://kulturminnebilder.ra.no/fotoweb/default.fwx?search=%(urlplace)s %(place)s]'
    ' / [http://kulturminnebilder.ra.no Kulturminnebilder] {{Institution:Riksantikvaren}}\n'
    '|author={{Creator:%(author)s}}\n'
    '|permission=\n'
    '|other_versions=\n'
    '|other_fields=\n'
    '}}\n\n'
    '=={{int:license-header}}==\n'
    '{{PD-Norway50}}\n\n'
    '[[Category:Images from The Norwegian Directorate for Cultural Heritage]]\n'
    '[[Category:Cultural heritage monuments in %(kommune)s]]\n'
    '[[Category:Files uploaded by Riksbot]]'
)

username = 'Riksbot'
password = getpass.getpass()
site = mwclient.Site('commons.wikimedia.org')
site.login(username, password)
lopenummer = 0  # running number of the row being processed

# The CSV is read as UTF-8 text; newline='' is what the csv module expects.
with open('testrun.csv', newline='', encoding='utf-8') as f:
    mycsv = csv.reader(f)
    # next(mycsv)  # uncomment to skip a header row
    for line in mycsv:
        filename = line[0]
        newfilename = '%s, %s - Riksantikvaren-%s' % (line[4], line[9], filename)
        lopenummer += 1
        args = {
            'description': line[2],
            'kultid': line[5],
            'date': line[8],
            'place': line[4],
            'urlplace': urllib.parse.quote(line[10]),  # URL-encode the search term
            'author': line[3],
            'kommune': line[7],
            'county': line[9],
        }
        now = datetime.datetime.now().strftime(TIMESTAMP_FORMAT)
        print('%s | %s | %s | %s' % (now, lopenummer, filename, newfilename))
        description = DESCRIPTION_TEMPLATE % args
        # Open the image in binary mode and close it once the upload finishes.
        with open(filename, 'rb') as image:
            site.upload(image, newfilename, description,
                        comment='Testrun for bot see [[Commons:Bots/Requests/Riksbot]]')
        print('%s | OK' % datetime.datetime.now().strftime(TIMESTAMP_FORMAT))
        time.sleep(20)  # 20 s pause, i.e. at most 3 uploads per minute

Profoss (talk) 11:47, 9 October 2013 (UTC)
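The fixed 20-second pause in the script keeps the bot at or below the requested rate of 2-3 uploads per minute. The same limit could be expressed as a small reusable helper that also accounts for the time the upload itself takes. This is only a sketch: the `Throttle` class is hypothetical (it is not part of mwclient), and the `clock` and `sleep` parameters are assumptions added so the behaviour can be checked without real waiting.

```python
import time


class Throttle:
    """Enforce a minimum interval between consecutive actions."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous action; None before the first

    def wait(self):
        """Block until min_interval seconds have passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

With `throttle = Throttle(20)`, calling `throttle.wait()` before each upload would space uploads at least 20 seconds apart, so a slow upload shortens the following pause instead of adding to it.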

Discussion

I've done a test run of about 50 images. There are a couple of issues that I'm going to work on, for example missing creator biographies and including the OTRS ticket under permission. Profoss (talk) 22:45, 14 October 2013 (UTC)
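One way to find the missing creator pages before a run is to collect the author names from the CSV and check which `Creator:` pages do not exist yet. The sketch below is an illustration, not the bot's actual code: `missing_creator_pages` is a hypothetical helper, and the existence check is passed in as a function so that it can be tried without network access. With mwclient, that function could plausibly be `lambda title: site.pages[title].exists`.

```python
def missing_creator_pages(authors, page_exists):
    """Return the sorted 'Creator:' page titles that do not exist yet.

    authors     -- iterable of author names (e.g. column 3 of the CSV)
    page_exists -- callable taking a page title and returning True/False
    """
    # Deduplicate names and ignore empty cells before checking.
    titles = {'Creator:%s' % name.strip() for name in authors if name.strip()}
    return sorted(title for title in titles if not page_exists(title))
```

The result is a list of titles that would need a creator template before `{{Creator:...}}` in the description renders correctly.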
Is it possible to generate better file names? It would also be a good idea to add a category such as "Images from The Norwegian Directorate for Cultural Heritage needing categories". --EugeneZelenko (talk) 14:38, 15 October 2013 (UTC)
Hi! Well, I could use a different scheme, for example "Riksantikvaren (placename) (original filename).jpg". The categories are definitely something that needs to be done. Profoss (talk) 17:00, 15 October 2013 (UTC)
I'm currently trying to get higher-resolution versions of the photos we have, but I was wondering whether there are any other stumbling blocks for this bot request. Profoss (talk) 20:41, 31 October 2013 (UTC)
My recommendation would be to generate a filename of the style "Description - Riksantikvaren - original filename.jpg". So, for example, File:Riksantikvaren - T040 01 0060.jpg would become "File:Hovin kirke, Akershus - Riksantikvaren - T040 01 0060.jpg". Currently the link back to the original image goes via the search engine, so for the Hovin kirke image the link is to http://kulturminnebilder.ra.no/fotoweb/default.fwx?search=Hovin. It would be better if the link could instead point directly to the image page using the permalink http://kulturminnebilder.ra.no/fotoweb/archives/5001-Kulturminnebilder/RA1_INDEKS/RA1/Topnummer/T040_01/T040_01_0060.tif. Is it possible to get that link out of the available metadata? /Lokal_Profil 19:52, 30 November 2013 (UTC)
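The suggested naming scheme can be sketched as a small function. The name `build_filename` and the set of characters it replaces are assumptions made for illustration: `# < > [ ] | { }` are not allowed in MediaWiki page titles, and `:` and `/` are replaced as well because they tend to cause trouble in file names.

```python
import re

# Characters replaced with "-" (assumed list, see the note above).
_FORBIDDEN = re.compile(r'[#<>\[\]|{}:/]')


def build_filename(place, county, original):
    """Return 'Place, County - Riksantikvaren - original filename'."""
    description = ', '.join(p.strip() for p in (place, county) if p.strip())
    parts = [description, 'Riksantikvaren', original.strip()]
    # Drop the description part entirely when place and county are both empty.
    name = ' - '.join(part for part in parts if part)
    name = _FORBIDDEN.sub('-', name)
    return re.sub(r'\s+', ' ', name).strip()
```

For the example above, `build_filename('Hovin kirke', 'Akershus', 'T040 01 0060.jpg')` gives "Hovin kirke, Akershus - Riksantikvaren - T040 01 0060.jpg".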
Please repeat the test run once you have made the bot code changes. --EugeneZelenko (talk) 15:21, 1 December 2013 (UTC)
EugeneZelenko: Lovely, I'll do that early next week when I get my hands on the full-quality images rather than the scaled-down ones I got first. /Lokal: hmm, apparently the permalinks don't work that well; they don't follow the same pattern and are not defined in the metadata. I'm taking the filename suggestion on board. Regards, Profoss (talk) 17:56, 3 December 2013 (UTC)
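Since the permalinks cannot be derived from the metadata, the source link stays a search link. A minimal sketch of building that link with a properly URL-encoded search term follows; `build_search_url` is a hypothetical helper name, and the base URL is the one quoted in the discussion above.

```python
from urllib.parse import quote

SEARCH_BASE = 'http://kulturminnebilder.ra.no/fotoweb/default.fwx?search='


def build_search_url(term):
    """Return the search URL for a place name, percent-encoding the term."""
    # safe='' also encodes "/" so the whole term stays inside the query value.
    return SEARCH_BASE + quote(term.strip(), safe='')
```

Non-ASCII letters such as "ø" are encoded as UTF-8 percent escapes, so place names with Norwegian characters still produce a valid link.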
@EugeneZelenko: OK, I've started the second test run with the new high-resolution images. The new file name structure is a modification of what /Lokal suggested: "Place - Riksantikvaren-original filename". The images are also automatically categorised in "Cultural heritage monuments in [Municipality]". I'll do a separate cleanup of the last test run when all images have been uploaded. Profoss (talk) 00:46, 10 January 2014 (UTC)
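The automatic categorisation could be combined with the earlier suggestion of a maintenance category for images that cannot be placed. In the sketch below, `build_categories` is a hypothetical helper, and the "needing categories" category is the one proposed in this discussion; whether it exists on Commons is an assumption.

```python
BASE_CATEGORY = 'Images from The Norwegian Directorate for Cultural Heritage'


def build_categories(kommune):
    """Return the category wikitext for one image."""
    categories = [BASE_CATEGORY, 'Files uploaded by Riksbot']
    kommune = kommune.strip()
    if kommune:
        categories.append('Cultural heritage monuments in %s' % kommune)
    else:
        # No municipality in the metadata: flag the file for manual sorting.
        categories.append(BASE_CATEGORY + ' needing categories')
    return '\n'.join('[[Category:%s]]' % c for c in categories)
```

This avoids emitting a broken "Cultural heritage monuments in" category when the municipality column is empty.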
Looks OK to me. --EugeneZelenko (talk) 15:46, 10 January 2014 (UTC)
@EugeneZelenko: Lovely, does that mean I can start full scale uploading? Profoss (talk) 16:57, 10 January 2014 (UTC)
I think it would be a good idea to wait a couple of days for other comments. --EugeneZelenko (talk) 15:33, 11 January 2014 (UTC)
Cool, I'll give it a week, and in the meantime start creating some of the missing creator templates. Profoss (talk) 16:01, 11 January 2014 (UTC)

If there are no objections, I think the task should be approved. --EugeneZelenko (talk) 15:33, 11 January 2014 (UTC)