Commons:Bots/Requests/SchlurcherBot2
Operator: Schlurcher (talk · contributions · Statistics · Recent activity · block log · User rights log · uploads · Global account information)
Bot's tasks for which permission is being sought: Change selected URLs from using HTTP protocol to HTTPS to improve security. KolbertBot is currently approved to operate on this task.
Automatic or manually assisted: Automatic
Edit type (e.g. Continuous, daily, one time run): As needed
Maximum edit rate (e.g. edits per minute): 30
Bot flag requested: (Y/N): N (Bot has flag already)
Programming language(s): Pywikibot
Schlurcher (talk) 21:19, 22 November 2017 (UTC)
Discussion
To generate a synergy with the bot's Internationalization task (recent database dump already downloaded, system set up, etc.) and per here, I am requesting approval for the above-mentioned task. For a test run and a sample of edits, please see the bot's contributions. --Schlurcher (talk) 21:19, 22 November 2017 (UTC)
- The bot run LGTM. Just curious: does it run on a pre-defined list of domains, or does it determine whether to convert dynamically during the bot run? --Zhuyifei1999 (talk) 07:18, 23 November 2017 (UTC)
- The bot currently runs on a pre-defined list of domains (for Flickr and Geograph). The list might be expanded (short term), and a dynamic conversion might be added (long term). --Schlurcher (talk) 17:12, 23 November 2017 (UTC)
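A pre-defined-domain replacement of this kind can be sketched as follows. This is an illustrative minimal version, not the bot's actual code; the domain whitelist here is an assumption, as the real list is not published in this request:

```python
import re

# Illustrative whitelist of domains known to support HTTPS;
# the bot's real list is maintained by the operator.
SECURE_DOMAINS = ["www.flickr.com", "www.geograph.org.uk"]

def upgrade_http(text):
    """Switch bracketed external links from http:// to https://
    for whitelisted domains only; all other links stay untouched."""
    for domain in SECURE_DOMAINS:
        text = re.sub(r"\[http://" + re.escape(domain),
                      "[https://" + domain, text)
    return text
```

Restricting the substitution to a whitelist avoids converting sites that do not serve HTTPS at all.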
- Looks OK for me, but why do we need two bots which do same thing? @Jon Kolbert: . --EugeneZelenko (talk) 14:46, 23 November 2017 (UTC)
- In my case, it is primarily to have a synergy with the bot's other task (the Internationalization task). It currently runs around 600 regex replacements for that, which are in any case run against a recent database dump and generate a high CPU load on my small server. Several other bots have included basic internationalization tasks to gain the same synergy in the other direction. --Schlurcher (talk) 17:12, 23 November 2017 (UTC)
- I think it's fine to mix HTTP/HTTPS with internationalization changes, but please reflect kind(s) of changes in edit summaries. --EugeneZelenko (talk) 15:31, 24 November 2017 (UTC)
- The edit summaries will always reflect what kind(s) of changes have been performed. --Schlurcher (talk) 03:56, 26 November 2017 (UTC)
- @Schlurcher: Would you be able to send the internationalization changes regex list to me so I can implement it in KolbertBot? Might as well avoid making unnecessary edits and perform both tasks simultaneously. Naturally I will submit a relevant bot request for approval of this task for KolbertBot. Thank you. Jon Kolbert (talk) 04:07, 26 November 2017 (UTC)
- The best location for Localization/Internationalization changes (regex) is this page: Commons:File description page regular expressions --Schlurcher (talk) 19:07, 7 December 2017 (UTC)
Thanks for the comments so far; please let me know if it is safe to proceed. --Schlurcher (talk) 17:12, 23 November 2017 (UTC)
The current bot scope is open enough under "internationalise" to include these trivial improvements. I hope this overly literal case does not lead to unnecessary requests, or put off bot operators from adding obvious maintenance tasks because of a mentality of "bureaucracy". --Fæ (talk) 04:37, 24 November 2017 (UTC)
I made some more test edits. Please let me know, if anything more is needed. --Schlurcher (talk) 07:39, 29 November 2017 (UTC)
- There have been no further suggestions regarding this bot for over a week. In fact, while checking my bot's edits, I was able to suggest an improvement to KolbertBot. This demonstrates that there are benefits to having more than one bot for a specific task. May I ask someone to approve the task and close the discussion? --Schlurcher (talk) 04:36, 4 December 2017 (UTC)
As there were no objections, I have restarted my bot. --Schlurcher (talk) 17:07, 7 December 2017 (UTC)
- You failed to address my request even though I mentioned you twice. Our bureaucrats are active, I would suggest you wait until they approve the request, instead of assuming approval. Jon Kolbert (talk) 17:31, 7 December 2017 (UTC)
- Hi, I respond to every comment on my discussion page, so I do not understand the implication that I am unresponsive. I normally do not monitor other users' discussion pages, though, so I had indeed overlooked the comment on your talk page. I will reply now. I am still closely monitoring my bot's changes to learn whether it can be improved (edits that are reverted, and pages that receive additional edits after my bot). --Schlurcher (talk) 19:03, 7 December 2017 (UTC)
- @Schlurcher: Thank you for your response. One further question, how would your bot handle URLs in this format? https://web.archive.org/web/20170727004838/http://www.geograph.org.uk/
- Thank you. Jon Kolbert (talk) 19:13, 7 December 2017 (UTC)
- With some surprise, I noticed that both https://web.archive.org/web/20170727151857/http://www.geograph.org.uk/ and https://web.archive.org/web/20170727151857/https://www.geograph.org.uk/ (with the additional s in between) are valid URLs. The bot will only transform URLs in the wiki format [http://***] (including the brackets), so it would not touch these URLs. It also does not touch URLs inside nowiki statements, as these are often used to reference the original upload log. These are all precautions to eliminate false replacements. --Schlurcher (talk) 22:36, 7 December 2017 (UTC)
- If I understand correctly, the bot wouldn't edit a page with only [https://web.archive.org/web/20170727004838/http://www.geograph.org.uk/ ], correct? Jon Kolbert (talk) 23:03, 7 December 2017 (UTC)
- Correct, as the wiki link would start with [https://***] not [http://***]. Once home, I will make a sandbox example and verify. I will report the result shortly. --Schlurcher (talk) 23:47, 7 December 2017 (UTC)
- I can now confirm that the bot works in these cases as expected: [1] (although it is a bit difficult to see in the diff). Only the main URL is updated, not the URL referenced in the web.archive.org link. --Schlurcher (talk) 04:53, 8 December 2017 (UTC)
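The bracketed-link restriction described above can be sketched with a simplified, single-domain regex (this is an illustration of the stated behaviour, not the bot's actual code):

```python
import re

# Only links that *begin* with [http:// are rewritten. An http:// URL
# embedded inside another URL (e.g. a web.archive.org snapshot path) is
# not preceded by "[", so the pattern never matches it.
PATTERN = re.compile(r"\[http://(www\.geograph\.org\.uk[^\s\]]*)")

def upgrade(text):
    """Rewrite bracketed http:// Geograph links to https://."""
    return PATTERN.sub(r"[https://\1", text)
```

Because the archive link itself already starts with [https://, the whole string is left unchanged, exactly as confirmed in the sandbox test.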
@Schlurcher: Okay! The behaviour in that scenario looks satisfactory to me. How does the bot behave when a link is in a template like this example? Jon Kolbert (talk) 08:02, 10 December 2017 (UTC)
- @Jon Kolbert: In general, the text of a template is processed separately from the image description. Template arguments are processed together with the image. A URL is only processed if it makes up the full argument of the template. I processed the page you highlighted; please have a look. --Schlurcher (talk) 15:24, 10 December 2017 (UTC)
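The full-argument rule could look roughly like this for a single argument value (a hypothetical helper sketching the stated rule, not the bot's actual template handling):

```python
import re

def upgrade_if_full_url(value):
    """Upgrade a template argument only when its entire value is a
    single http:// URL; mixed text containing a URL is left untouched."""
    stripped = value.strip()
    if re.fullmatch(r"http://\S+", stripped):
        return "https://" + stripped[len("http://"):]
    return value
```

Requiring the whole argument to match keeps the bot from rewriting URLs that are merely quoted inside free-text parameter values.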
- Looks good to me. Jon Kolbert (talk) 15:32, 10 December 2017 (UTC)
I'd say all questions have been answered sufficiently, so we can call this approved. --Krd 16:05, 10 December 2017 (UTC)