Commons:Bots/Requests/InternetArchiveBot

From Wikimedia Commons, the free media repository

InternetArchiveBot (talk · contribs)

Operator: Cyberpower678 and Harej

Bot's tasks for which permission is being sought: Tagging of dead links in the File namespace and adding archival URLs to them to mitigate link rot.

Automatic or manually assisted: Automatic

Edit type (e.g. Continuous, daily, one time run): Continuous

Maximum edit rate (e.g. edits per minute): approximately 60 edits per minute

Bot flag requested: (Y/N): Y

Programming language(s): PHP. Source code.

Harej (talk) 00:47, 19 November 2020 (UTC)

Discussion
  • The configuration for the different messages the bot uses is available on Toolforge. Feel free to make adjustments as necessary for use on Commons. Harej (talk) 00:55, 19 November 2020 (UTC)
  • This seems to have a very wide scope and may have policy implications: in future years it may become normal to delete files whose links were never archived at IA, and it ties the management and curation of Commons' collections to whether IA can continue to exist online, though the WMF has never specifically included IA in its corporate sustainability strategy. As this has the potential to result in changes for all Commons images with links (all links must go dead eventually), is there an existing consensus for how this project should work? Writing as someone who has mass-added IA links due to link rot, but with a narrow scope limited to certain link types. -- Fæ (talk) 12:18, 20 November 2020 (UTC)
    Per Fæ's comment I have the feeling this should be discussed at another venue where more community feedback can be gained. Can anybody arrange that? --Krd 18:41, 22 November 2020 (UTC)
FYI @Kaldari: -- Fæ (talk) 22:32, 22 November 2020 (UTC)
  • By default, the bot would only replace links that are already dead. Irrespective of the longevity of the Internet Archive, replacing a link that is currently dead with an archive that is currently available seems reasonable to me. Further, the bot is programmed to operate independently of any particular archival provider, and indeed has linked to other archives. So even if something happened to the Internet Archive, the bot could simply continue with a replacement archival service. I am not sure I understand the connection to Commons' own curatorial practices. In any case, the bot currently runs without issue on over 40 Wikimedia wikis, and it is highly configurable as well. Cyberpower678 and I are happy to work with the community on making sure the bot is a constructive participant. Harej (talk) 23:23, 24 November 2020 (UTC)
This all seems very good. In addition, it would be useful if a process using these methods were to identify, say, in-use media or media from highly used sources, and ensure those external links are available on backup archives somewhere in case they are needed in the future.
My comment is not that there should be a proposal, but it would be wise to post an announcement on the Village Pump to see what questions the Commons community may have. Bots/Requests is a technical area and very, very few contributors follow these request pages. As the scope of this project could include tens of millions of image pages and their associated template use, some additional feedback as early as possible could raise questions that we can't think of on our own. -- Fæ (talk) 10:20, 25 November 2020 (UTC)
I do think there should be a proposal or request for feedback, not just an announcement, for exactly the reasons you outlined. As an example question, if there is a dead link somewhere which is part of the file attribution, should it be replaced or should it remain as initially set by the copyright holder? --Krd 10:28, 25 November 2020 (UTC)
Thanks for the distinction. I was thinking of a word other than 'proposal', as the bot is already well defined. Requesting feedback makes more sense, and could be done in a way that is less strict than a formal RfC. -- Fæ (talk) 12:24, 25 November 2020 (UTC)
I left a message on the village pump to solicit input on this page. As for preserving attribution, the bot is configured (by default) to append archival links to broken links, rather than replace them outright. So the archival link would be an annotation on top of the original link. Harej (talk) 23:22, 25 November 2020 (UTC)
@Krd and Harej: Input isn’t needed. This already has consensus per this existing thread. -- CYBERPOWER (Chat) 19:23, 6 December 2020 (UTC)
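
For readers unfamiliar with the behaviour described in the thread above (appending an archival link next to a dead link instead of replacing it, and not depending on a single archive service), the following is a minimal illustrative sketch in Python. It is not the bot's actual code, which is written in PHP. The function names, the `(archived copy: …)` annotation format, the URL pattern, and the idea of modelling each archive service as a callable are all assumptions made for illustration.

```python
import re
from typing import Callable, Iterable, List, Optional, Set

# Hypothetical provider type: takes a dead URL, returns an archive URL or None.
ArchiveProvider = Callable[[str], Optional[str]]

# Assumed annotation format; the real bot uses configurable templates.
ANNOTATION_PREFIX = " (archived copy: "

# Simplified URL pattern for illustration only.
URL_PATTERN = re.compile(r"https?://[^\s\]|}<>]+")


def find_archive(url: str, providers: Iterable[ArchiveProvider]) -> Optional[str]:
    """Return the first archive URL offered, trying providers in order."""
    for provider in providers:
        archived = provider(url)
        if archived:
            return archived
    return None


def annotate_dead_links(
    text: str, dead_urls: Set[str], providers: List[ArchiveProvider]
) -> str:
    """Append an archive annotation after each dead URL, keeping the original."""

    def annotate(match: "re.Match[str]") -> str:
        url = match.group(0)
        if url not in dead_urls:
            return url  # live or unknown links are left untouched
        if match.string[match.end():].startswith(ANNOTATION_PREFIX):
            return url  # already annotated on an earlier run
        archived = find_archive(url, providers)
        if archived is None:
            return url  # no provider has a copy; leave the link as it is
        return f"{url}{ANNOTATION_PREFIX}{archived})"

    return URL_PATTERN.sub(annotate, text)
```

Because the original URL is never removed, attribution text set by the copyright holder stays intact, and swapping or adding an archive service only means changing the list of providers passed in.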

As far as I understand all issues have been addressed, and the test run looks good, so this should be called approved. --Krd 08:17, 19 January 2021 (UTC)