Commons:Bots/Requests/GreenC bot

From Wikimedia Commons, the free media repository

GreenC bot (talk · contribs)

Operator: GreenC (talk · contributions · Statistics · Recent activity · block log · User rights log · uploads · Global account information)

Bot's tasks for which permission is being sought: Fix dead links. The bot is called WaybackMedic (w:WP:WAYBACKMEDIC) and has operated on Enwiki since 2015. This is an experiment to see whether it can be ported to Commons. It will initially run only on domain-specific requests at Wikipedia:Link_rot/URL_change_requests, i.e. when the original domain is still live but certain paths within that domain no longer work and need to be changed to a new URL. For example, changing some instances of https://www.bbc.co.uk/arts/yourpaintings/paintings --> https://www.artuk.org/discover/artworks/ as seen in this diff. This task was completed successfully on Enwiki and will be the testbed for porting WaybackMedic to Commons.
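As a rough illustration of the first step of such a domain-specific run, the sketch below picks out the links on a page that fall under the old BBC path. It is written in Python for readability and is not WaybackMedic's code (the bot itself is written in Nim and GNU Awk). The function name `find_candidate_links` and the URL pattern are assumptions made for this example only.

```python
import re

# Old path prefix quoted in the request; only URLs under it are candidates.
OLD_PREFIX = "bbc.co.uk/arts/yourpaintings/paintings"

# Assumed pattern: an http(s) URL that runs until whitespace or a common
# wikitext delimiter. Real wikitext parsing has more edge cases than this.
URL_RE = re.compile(r'https?://[^\s\[\]<>|{}"]+')


def find_candidate_links(wikitext):
    """Return the URLs in wikitext that fall under the old BBC path."""
    found = []
    for match in URL_RE.finditer(wikitext):
        url = match.group(0)
        # Drop the scheme and an optional "www." before comparing prefixes.
        bare = re.sub(r"^https?://(www\.)?", "", url)
        if bare.startswith(OLD_PREFIX):
            found.append(url)
    return found
```

Links elsewhere on the BBC domain are left alone, which matches the request's point that the domain is still live and only certain paths have moved.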

Automatic or manually assisted: Automatic

Edit type (e.g. Continuous, daily, one time run): As needed

Maximum edit rate (e.g. edits per minute): 5-10

Bot flag requested: (Y/N): Y

Programming language(s): Nim and GNU Awk

GreenC bot (talk) 15:29, 31 March 2019 (UTC)

Discussion

EugeneZelenko: Sure. Do I need a bot flag for Commons, or does the flag on Enwiki carry over? -- GreenC (talk) 15:03, 1 April 2019 (UTC)
Your bot flag does not carry over. On Commons we expect a short test run (approx. 50 edits) as part of the request, so these edits are expected to be performed without a bot flag. --Schlurcher (talk) 22:56, 1 April 2019 (UTC)
Sorry, I was logged into the bot account. -- GreenC (talk) 15:03, 1 April 2019 (UTC)
You say the edits happen automatically, so why did you edit from the bot account at all? Do you intend to make more manual edits with the bot account? --Krd 04:52, 2 April 2019 (UTC)
I was logged into the bot account to create a user page with the {{bot}} template. In retrospect that might have been done with the GreenC account, but I also wanted to make sure the GreenC bot account was working on Commons so I happened to be logged into it when creating the bot request. -- GreenC (talk) 13:40, 2 April 2019 (UTC)
BTW, if the concern is the edits from February, those were made at Enwiki, but the edit history was carried over when the page was moved to Commons. I run a bot process on Enwiki that tags files as being "shadowed" at Commons, so there will likely be a bunch of shadow edits like that in the bot's edit history. -- GreenC (talk) 13:45, 2 April 2019 (UTC)
  • Trial 50 complete.
  • Types of changes:
  1. BBC link dead, no archive available. Example
  2. BBC link dead, archive available. Example
  3. BBC link dead and redirected to Art UK. Example (plain URL). Example (title conversion). Example (title conversion).
  • Comments:
    • The {{cbignore}} seen in #1 is copied from Enwiki. It is a voluntary system that archive bots use to intercommunicate. It can also be used by editors to keep bots off a particular link. In this case it is being used because WaybackMedic is more accurate than InternetArchiveBot, which is slated to run on Commons sometime in the future; the tag prevents bot edit wars.
    • The title conversions in #3 are best effort. Because titles vary so much in wording and placement, I will try to find rules, but there are no guarantees.
    • It was unable to process File:The Surprise PMA(08) (15772335313).jpg because of the non-standard HTML formatting.
    • There are apparently some cases where the bot applies #1 or #2 even though a live Art UK URL is actually available. However, the bot has no way of determining what that URL is, because the BBC didn't leave a redirect. It would require manual research to make the conversion. Nevertheless, the BBC link is in fact dead, so the process is moved forward by marking it dead and archiving.
-- GreenC (talk) 15:51, 5 April 2019 (UTC)
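The three types of change listed in the trial report can be read as a small decision procedure. The Python sketch below is one hedged way to express that reading; it is not taken from WaybackMedic. The function `classify_dead_link`, its two inputs, and the returned labels are hypothetical names introduced for illustration.

```python
from typing import Optional
from urllib.parse import urlparse


def classify_dead_link(redirect_target: Optional[str],
                       archive_url: Optional[str]) -> str:
    """Pick one of the three outcomes described in the trial report.

    redirect_target: where the dead BBC URL redirects, or None (assumed input).
    archive_url: an archived copy of the page, or None (assumed input).
    """
    # Type 3: the BBC link redirects to Art UK, so the URL can be replaced.
    if redirect_target:
        host = urlparse(redirect_target).hostname or ""
        if host == "artuk.org" or host.endswith(".artuk.org"):
            return "replace_with_artuk"
    # Type 2: no usable redirect, but an archived copy exists.
    if archive_url:
        return "add_archive"
    # Type 1: no redirect and no archive; the link can only be marked dead.
    return "mark_dead"
```

This ordering also reflects the limitation noted in the comments: without a redirect from the BBC, the function cannot discover a live Art UK URL and falls back to archiving or marking the link dead.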

Approved. Please start slowly, just in case anything goes wrong. --Krd 07:36, 22 April 2019 (UTC)