Commons:Bots/Requests/Picasa Review Bot


Picasa Review Bot (talk · contribs)

Operator: Dcoetzee (talk)

Bot's tasks for which permission is being sought: Much like User:FlickreviewR, this bot reviews files uploaded from Picasa. It marks a file as reviewed if the image and license match the source and the license is valid; otherwise it marks the file as requiring human review. Here's an example of a passed review, and an example of a failed review.

Automatic or manually assisted: The bot is fully automated. To avoid any mistakes, the full-resolution file at the source must be an exact match, as must the license. Detecting thumbnails would be a future extension. Update: The bot now detects downscaled images using a simple mean-squared difference test and uploads the original full-resolution version. See e.g. File:Bychkov igor.jpg. It skips images that have been modified from the original. Another new feature: People who upload Picasa images often just supply the album or even just the user. Picasa Review Bot will now search Picasa for the right image, add its URL to the image description page, and then proceed to review it normally. It also now gives the reason for rejection in the review failed tag.
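
The downscale check described above can be illustrated with a minimal sketch. This is not the bot's actual code (the bot is written in C#); it assumes images are already decoded into grayscale rows of numbers, and the helper names, the box-average resize, and the threshold value are all hypothetical choices made for illustration.

```python
def resize_box(pixels, new_w, new_h):
    """Shrink a grayscale image (list of rows) by averaging source blocks."""
    old_h, old_w = len(pixels), len(pixels[0])
    out = []
    for y in range(new_h):
        y0 = y * old_h // new_h
        y1 = max((y + 1) * old_h // new_h, y0 + 1)
        row = []
        for x in range(new_w):
            x0 = x * old_w // new_w
            x1 = max((x + 1) * old_w // new_w, x0 + 1)
            block = [pixels[j][i] for j in range(y0, y1) for i in range(x0, x1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def mean_squared_difference(a, b):
    """Average of squared per-pixel differences between two same-sized images."""
    if len(a) != len(b) or len(a[0]) != len(b[0]):
        raise ValueError("images must have the same dimensions")
    total = sum((pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return total / (len(a) * len(a[0]))


def is_downscaled_copy(original, candidate, threshold=25.0):
    """Return True if candidate looks like a smaller version of original.

    threshold is an assumed tolerance on the mean-squared difference;
    a real value would have to be tuned on actual uploads.
    """
    oh, ow = len(original), len(original[0])
    ch, cw = len(candidate), len(candidate[0])
    if cw >= ow or ch >= oh:
        return False  # not smaller, so not a downscaled copy
    if abs(cw * oh - ch * ow) > max(ow, oh):
        return False  # aspect ratio differs beyond rounding error
    shrunk = resize_box(original, cw, ch)
    return mean_squared_difference(shrunk, candidate) <= threshold
```

A file that has been cropped or retouched would produce a large difference and be left alone, which corresponds to the bot skipping images modified from the original.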

Edit type (e.g. Continuous, daily, one time run): Run on-demand from time to time by its owner. I might move it to a cron job on Toolserver in the future.

Maximum edit rate (eg edits per minute): No more than 6-8 edits per minute due to the time needed to download and compare the image files.
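
The rate above is a side effect of download and comparison time rather than a deliberate limit. If an explicit cap were ever wanted, one possible approach is a minimum interval between edits, sketched below. The class name and its parameters are hypothetical and not part of the bot or of DotNetWikiBot.

```python
import time


class EditThrottle:
    """Enforce a minimum delay between consecutive edits."""

    def __init__(self, max_per_minute, clock=time.monotonic, sleep=time.sleep):
        if max_per_minute <= 0:
            raise ValueError("max_per_minute must be positive")
        self.min_interval = 60.0 / max_per_minute
        self._clock = clock    # injectable so the class can be tested
        self._sleep = sleep
        self._last = None      # time of the previous edit, if any

    def wait(self):
        """Block until enough time has passed since the previous edit."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `wait()` before each edit with `EditThrottle(8)` would keep the bot at or below 8 edits per minute even when a review finishes quickly.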

Bot flag requested: (Y/N): Y

Programming language(s): C#, DotNetWikiBot (may translate to something else later)

See also: Commons:Picasa Web Albums files

(Note: I accidentally made some normal edits while logged on as this bot. Oops.)

Dcoetzee (talk) 11:19, 16 February 2010 (UTC)

Discussion

Looks OK to me. Thank you for taking care of review automation! Is it possible to do the same for Panoramio (if their API allows fetching license information)? --EugeneZelenko (talk) 15:45, 16 February 2010 (UTC)
Glad to help. :-) If you can see the license on the webpage, it can be scraped - but do we have a lot of Panoramio files? Do they have much of a license-changing problem? As far as I can tell they don't seem to provide a convenient way to search for images by license in the first place. Regardless, this would end up having to have its own page and set of categories, since it's a good idea to keep image sources separated. Dcoetzee (talk) 20:23, 16 February 2010 (UTC)
I suggested a long time ago that Panoramio add an API to filter images by license :-)
The main problem with Panoramio images is copyvio (in my experience). So it would be a good idea for such a bot to check all images where a Panoramio URL is the source. A change of license is also possible.
EugeneZelenko (talk) 15:46, 17 February 2010 (UTC)
Unfortunately bots are particularly bad at detecting copyvios where the user who first uploaded it to Flickr/Picasa/etc. was the one committing the copyvio; see Commons:Flickr washing. I have ideas for this in the future but right now it's just checking that the license at the source matches the license on Commons. Dcoetzee (talk) 23:20, 17 February 2010 (UTC)
I meant tagging images that give a Panoramio URL as source but had a non-free license on Panoramio. --EugeneZelenko (talk) 15:38, 18 February 2010 (UTC)
Oh, my mistake. I'm sure it's possible. Dcoetzee (talk) 17:38, 18 February 2010 (UTC)
  •  Comment It looks ok to me. But I wondered about images that fail review. Does the bot give the reason? Missing link, image deleted, possible copyvio, different size etc.? --MGA73 (talk) 15:26, 21 February 2010 (UTC)
  • The new feature that searches the gallery for the right image, plus the reason for a failed review, is good. If it used separate templates for "image not found", "unfree license" etc., it would be possible to put images in different categories. If an image fails review the same day it was uploaded, we could delete it. If the source is not found, we (or the bot?) should ask the uploader for a better source. So perhaps it would be nice to separate the images.
If the bot succeeds in finding the right image, it would be nice if it could do the same with Flickr images. But that is another story :-)
This edit [1] says "image license on Picasa is invalid". However, it is not the same image, so I would prefer that it said "image is not the same" or something like that. --MGA73 (talk) 11:50, 27 February 2010 (UTC)
Great point - I've been assuming that if a source URL is given, it's correct, but as this case demonstrates it may refer to a different image. This is worth detecting. Dcoetzee (talk) 22:44, 3 March 2010 (UTC)
Were you going to change anything to take this into account? Looks good otherwise. ++Lar: t/c 17:07, 27 March 2010 (UTC)
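
The distinction raised in this thread (reporting "image is not the same" rather than an invalid license when the source URL points at a different image) could look roughly like the sketch below. It is only an illustration, not the bot's code: the enum members, the function name, and the `allowed_licenses` parameter are hypothetical, and the exact-match test is modelled here with a SHA-256 comparison of the file bytes.

```python
import hashlib
from enum import Enum


class ReviewResult(Enum):
    PASSED = "passed"
    SOURCE_NOT_FOUND = "source image not found"
    IMAGE_MISMATCH = "image is not the same"
    LICENSE_NOT_ALLOWED = "license at source is not allowed"
    LICENSE_MISMATCH = "license differs from source"


def review(commons_bytes, commons_license, source_bytes, source_license,
           allowed_licenses):
    """Return a distinct result for each way a review can fail.

    source_bytes is None when no image could be located at the source.
    allowed_licenses is supplied by the caller; no license names are
    assumed here.
    """
    if source_bytes is None:
        return ReviewResult.SOURCE_NOT_FOUND
    # Compare the images first, so a wrong source URL is not reported
    # as a license problem.
    same_image = (hashlib.sha256(commons_bytes).digest()
                  == hashlib.sha256(source_bytes).digest())
    if not same_image:
        return ReviewResult.IMAGE_MISMATCH
    if source_license not in allowed_licenses:
        return ReviewResult.LICENSE_NOT_ALLOWED
    if commons_license != source_license:
        return ReviewResult.LICENSE_MISMATCH
    return ReviewResult.PASSED
```

Each result could then map to its own template or category, as suggested above, so that "image not found" and "unfree license" cases can be handled separately.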

Question, are you sure you need a bot flag? I'm fine with no flag if you wish, given the edit rate and that there may be some benefit from human review. This looks close to closable though, either way. ++Lar: t/c 17:07, 27 March 2010 (UTC)

✓ Approved; bot flag given per precedent. –Juliancolton | Talk 14:32, 24 May 2010 (UTC)