User:Faebot/old speedies

From Wikimedia Commons, the free media repository
A very early 2004 upload of an 1896 photograph. A template places it in Category:Images without source, though the history shows a dead source link was removed in 2011. Files in this maintenance category, which is a sub-category of the main speedy deletion category, have a history of being speedy deleted without discussion.

Introduction

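The Faebot Pywikibot task speedies_maintenance.py (source code) populates maintenance categories of the form Category:Files uploaded over <3 or 10> years ago in a speedy deletion subcategory <upload year>.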

Discussion about the proposal Commons:Village pump/Proposals#Apply a default of Good Faith for very old files resulted in more interest in ensuring that files hosted for several years and marked for speedy deletion were treated more carefully than new uploads. These categories make that task easier, and also highlight a number of maintenance categories, like Media without a source, which are also used as a rationale for speedy deletion, sometimes controversially.

The categories may be quite large, so it can be useful to search on keywords or links that you have an interest in or experience with and would like to 'repair', for example this search for "flickr.com", this search for 'map', Copyvios 3-10 years old, or 3 years and not in Images without source.

Technical

Files are found using selected subcategories under Category:Candidates for speedy deletion.

The speedy deletion parent category is shown in the edit comment (example):

User:Faebot/old speedies | (<speedy deletion parent category>) add Category:Files uploaded over <3 or 10> years ago in a speedy deletion subcategory <upload year>

When the file is removed from the speedy deletion subcategory, or a category is dropped from the set the script tracks, the maintenance category is removed with an edit comment making that clear (example):

User:Faebot/old speedies | Removed from tracked speedy deletion subcategory
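
The two edit comment formats above amount to simple string templates. The following is a minimal sketch only: the helper names `add_comment` and `removal_comment` are hypothetical and not taken from speedies_maintenance.py, and it assumes the placeholders are filled in as plain text.

```python
PREFIX = "User:Faebot/old speedies"


def add_comment(parent_category, years, upload_year):
    """Build the edit comment used when the maintenance category is added.

    Hypothetical helper: the wording follows the template on this page,
    but the real script may assemble the string differently.
    """
    if years not in (3, 10):
        raise ValueError("years must be 3 or 10")
    return (
        f"{PREFIX} | ({parent_category}) add Category:Files uploaded over "
        f"{years} years ago in a speedy deletion subcategory {upload_year}"
    )


def removal_comment():
    """Build the edit comment used when the maintenance category is removed."""
    return f"{PREFIX} | Removed from tracked speedy deletion subcategory"
```

Keeping the shared prefix in one constant makes it easy to find all of the bot's edits by searching edit summaries for that string.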

As some of the subcategories are not obvious speedies, a more restricted set is used. Categories marked with "+" include all subcategories automatically.

Copyright violations
Non-free logos
Copyright violations (OTRS)
Unfree Flickr files
Flickr files from bad authors
No OTRS permission
Personal files for speedy deletion
Duplicate
Image crop missing parent page
Other speedy deletions
Pending fair use deletes
Flickr images from bad authors
Advertisements for speedy deletion
Images without source
Media missing permission+
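
The "+" convention in the list above could be turned into a per-category recursion flag along these lines. This is an illustrative sketch, not the script's actual configuration: `parse_tracked` is a hypothetical name, and only a few entries from the list are repeated here.

```python
# A few entries copied from the list above; "+" marks "include subcategories".
TRACKED = [
    "Copyright violations",
    "Images without source",
    "Media missing permission+",
]


def parse_tracked(entries):
    """Split each entry into (category title, recurse flag).

    Hypothetical helper: a trailing "+" means subcategories are included.
    """
    parsed = []
    for entry in entries:
        recurse = entry.endswith("+")
        name = entry.rstrip("+").strip()
        parsed.append(("Category:" + name, recurse))
    return parsed
```

The resulting flag could then be passed as the `recurse` argument when listing a category's members.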

The age of the file in years is computed from its oldest upload revision:

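from datetime import datetime

rev = f.oldest_file_info  # oldest upload revision of the FilePage f
age = (datetime.now() - rev.timestamp).days / 365  # approximate age in years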
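
As an illustration of how the computed age might map onto the 3-year and 10-year categories, here is a small standalone sketch. The function names are hypothetical, and reading "over N years" as strictly greater than N is an assumption; the real script may draw the boundaries differently.

```python
from datetime import datetime


def age_in_years(uploaded, now=None):
    """Approximate age in years, using 365-day years as on this page."""
    now = now or datetime.now()
    return (now - uploaded).days / 365


def age_bucket(age):
    """Return 10 or 3 for the matching threshold, or None if under 3 years.

    Assumption: "over N years" means strictly greater than N.
    """
    if age > 10:
        return 10
    if age > 3:
        return 3
    return None
```

Dividing by 365 ignores leap days, so a file appears to cross each threshold a few days before its calendar anniversary. For a non-urgent maintenance task that imprecision is probably harmless.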

Automation

This is being run as a non-urgent informal maintenance task. It is launched hourly on a laptop and may become more formalized once established as stable.

The effective throttle is a cap of under 100 changed files per run.
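
One simple way to get a per-run cap like this is to truncate the stream of candidate files before editing. The sketch below is an assumption about how such a limit could be written, not a description of the actual script. `throttled` and `MAX_CHANGES_PER_RUN` are hypothetical names, and it assumes every candidate yielded results in exactly one change.

```python
from itertools import islice

MAX_CHANGES_PER_RUN = 99  # assumption: "under 100" read as at most 99


def throttled(candidates, limit=MAX_CHANGES_PER_RUN):
    """Yield at most `limit` items from an iterable of candidate files."""
    return islice(candidates, limit)
```

Because `islice` is lazy, candidates beyond the limit are never requested from the underlying generator, so a capped run also makes fewer API queries.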

Bugs

The search of each category's content relies on a generator like:

from datetime import datetime, timedelta
import pywikibot
ts3years = datetime.now() - timedelta(days=3 * 365)
pywikibot.Category(site, "Category:Images without source").articles(recurse=False, namespaces=6, sortby="timestamp", endtime=ts3years)

Due to a rare filehistory bug for very old files, some files may have incomplete records for the upload revision. Whenever the generator encounters such a file it fails fatally with the error "'FileInfo' object has no attribute 'timestamp'", and the file search halts. Consequently only a partial search of Category:Images without source is performed at the time of writing.
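
The behaviour described above follows from how Python generators work: once a generator raises, it is finished and cannot be resumed past the failing item. The sketch below reproduces this with a stand-in generator and shows one way to at least keep the files seen before the failure. `collect_until_failure` and `fake_category_members` are hypothetical names, and this does not fix the underlying bug.

```python
def collect_until_failure(generator):
    """Drain a generator, keeping what was yielded before any AttributeError.

    Returns (items, error); error is None if the generator finished cleanly.
    """
    items = []
    error = None
    try:
        for item in generator:
            items.append(item)
    except AttributeError as exc:
        error = exc
    return items, error


def fake_category_members():
    """Stand-in for the category generator; fails on the third file."""
    yield "File:A.jpg"
    yield "File:B.jpg"
    raise AttributeError("'FileInfo' object has no attribute 'timestamp'")
    yield "File:C.jpg"  # never reached: a generator cannot resume after raising
```

Calling `collect_until_failure(fake_category_members())` returns the first two titles together with the exception. That is the same partial result the text describes for Category:Images without source.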

Further research via SQL against the commonswiki database pinned down the following affected files (with upload year):

  1. File:Potenzmenge von A.png 2007
  2. File:Mosfet linear.svg 2006
  3. File:Yogya Symbol.jpg 2007
  4. File:Froeb.jpg 2006

All of these have corrupted file histories, as can be seen on the image pages as blanks in the revisions section.