Commons:Bots/Requests/AAlertBot

AAlertBot (talk · contribs)

Operator: Hellknowz (talk · contributions · Statistics · Recent activity · block log · User rights log · uploads · Global account information)

Bot's tasks for which permission is being sought:

  • This bot works on English Wikipedia to deliver Article alert report pages to subscribed projects and task forces. Article alerts is a project that delivers reports to internal topic-based projects about which pages enter and leave certain Wikipedia maintenance workflows, such as deletion discussions, requests for comment, and featured content processes.
  • On Commons, the bot reads and parses workflows that may be relevant to report delivery, namely the deletion processes concerning Commons media that is used on English Wikipedia.
  • Specific needs: read-only API access -- for higher bot limits when reading and parsing categories/templates, relevant discussion pages, and relevant file histories and usages (a sketch of such a query follows below). Possibly an occasional edit to the bot's own userspace sandbox (such as a run problem report or test results).
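
A minimal sketch, in C# (the bot's language), of the kind of read-only query involved; the category, user-agent string, and overall shape are illustrative assumptions rather than the bot's actual code:

    using System;
    using System.Net.Http;
    using System.Threading.Tasks;

    class CategoryReader
    {
        static async Task Main()
        {
            using var client = new HttpClient();
            // Identify the client per API etiquette; the UA string is a placeholder.
            client.DefaultRequestHeaders.UserAgent.ParseAdd("AAlertBot-sketch/0.1");

            // "cmlimit=max" resolves to 500 for accounts with apihighlimits
            // (e.g. a bot flag) and to 50 for a regular account.
            var url = "https://commons.wikimedia.org/w/api.php"
                    + "?action=query&list=categorymembers"
                    + "&cmtitle=Category:Deletion_requests"
                    + "&cmlimit=max&format=json";

            string json = await client.GetStringAsync(url);
            Console.WriteLine(json); // parse with a JSON library as needed
        }
    }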

Automatic or manually assisted: Automatic

Edit type (e.g. Continuous, daily, one time run): Daily run

Maximum edit rate (e.g. edits per minute): N/A

Bot flag requested: (Y/N): Y

Programming language(s): C#

—  HELLKNOWZ  ▎TALK  ▎enWiki 09:57, 9 August 2018 (UTC)

Discussion

  • We usually don't approve read-only bots. --Krd 12:15, 9 August 2018 (UTC)
    • I would prefer to have the flag so I can get the 50→500 API request limit and no individual rate limits, especially since the bot runs individual queries in parallel (sketched below). I have not investigated the local workflows in enough detail to give exact numbers. I'm some ways off from actually implementing this.
    • That said, I don't really need the flag, since the bot won't edit anything that needs flagging besides maybe its sandbox, the actual query count is low, and I can run everything as a regular account. In that case, I would like the approval filed just "for the record", so it's clear what the account is for in case anyone asks later why the bot is doing anything at all on Commons. —  HELLKNOWZ  ▎TALK  ▎enWiki 13:52, 9 August 2018 (UTC)
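
    For illustration only, a hedged sketch of the parallel reads mentioned above, throttled by a self-imposed concurrency cap; the cap of 4 and the helper names are assumptions, not the bot's actual design:

        using System;
        using System.Collections.Generic;
        using System.Linq;
        using System.Net.Http;
        using System.Threading;
        using System.Threading.Tasks;

        class ThrottledReads
        {
            // Cap concurrent API requests; "4" is an illustrative number,
            // not a documented Commons limit.
            static readonly SemaphoreSlim Gate = new SemaphoreSlim(4);
            static readonly HttpClient Client = new HttpClient();

            static async Task<string> FetchAsync(string url)
            {
                await Gate.WaitAsync();
                try { return await Client.GetStringAsync(url); }
                finally { Gate.Release(); }
            }

            static async Task Main()
            {
                var urls = new List<string>(); // query URLs built elsewhere
                string[] results = await Task.WhenAll(urls.Select(FetchAsync));
                Console.WriteLine($"Fetched {results.Length} responses");
            }
        }
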
      Could you please clarify what you'll do with the data? How will Commons or other WMF projects benefit from this bot? Could you use Commons dumps instead? --EugeneZelenko (talk) 13:58, 9 August 2018 (UTC)
      The bot is part of English Wikipedia's Article alerts internal project. In short, the bot delivers organized daily reports about pages that are under discussion for deletion, review, featuring, etc., plus any additional information. Here's an example. I also left a brief description on User:AAlertBot as per COM:BOTS. I cannot use dumps because the bot needs live data, as per its purpose. —  HELLKNOWZ  ▎TALK  ▎enWiki 14:14, 9 August 2018 (UTC)
      So the bot will add deletion/permission/etc. requests from Commons? --EugeneZelenko (talk) 13:45, 11 August 2018 (UTC)
      The current goal is to list images up for deletion on Commons in the English Wikipedia report (alongside the locally nominated images that are already reported). The reason is that these don't get reported at present: the deletion process happens on Commons, not English Wikipedia, so the images occasionally just "disappear", usually because they are incompatibly or badly licensed. For this, I need to read the deletion request page, its dated subpages, possibly deletion categories, every file under deletion, likely part of each file's immediate history, and likely discussion page history (see the sketch below). I haven't fully investigated the exact pages I need to read, though, and I'm going off what I have to do on English Wikipedia.
      In the future, I may add other processes as appropriate, though none come to mind immediately that happen on a per-file basis. I doubt I will add something that is only on Commons and not related to English Wikipedia, like licensing issues. I don't know what a permission request is. —  HELLKNOWZ  ▎TALK  ▎enWiki 15:07, 11 August 2018 (UTC)
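
      As a concrete, purely illustrative example of the per-file reads described above: a sketch of checking whether a Commons file is used on English Wikipedia via prop=globalusage (the file name is a placeholder):

          using System;
          using System.Linq;
          using System.Net.Http;
          using System.Text.Json;
          using System.Threading.Tasks;

          class UsageCheck
          {
              static async Task Main()
              {
                  using var client = new HttpClient();
                  var url = "https://commons.wikimedia.org/w/api.php"
                          + "?action=query&prop=globalusage"
                          + "&titles=File:Example.jpg" // placeholder file name
                          + "&gulimit=max&format=json";

                  using var doc = JsonDocument.Parse(await client.GetStringAsync(url));
                  var pages = doc.RootElement.GetProperty("query").GetProperty("pages");
                  foreach (var page in pages.EnumerateObject())
                  {
                      if (!page.Value.TryGetProperty("globalusage", out var usage))
                          continue;
                      // Keep only English Wikipedia usages, which is what the
                      // Article alerts report cares about.
                      foreach (var u in usage.EnumerateArray().Where(
                                   e => e.GetProperty("wiki").GetString() == "en.wikipedia.org"))
                          Console.WriteLine(u.GetProperty("title").GetString());
                  }
              }
          }
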
      If you cache the results of previous queries, this should be easily achievable with 50 results per request. If you disagree, please elaborate. --Krd 15:16, 11 August 2018 (UTC)
      I cache what I can, but I can't cache queries whose results change between runs: category members, template transclusions, discussion page contents. And yes, it will likely be <50 items per query for most queries, although Commons has a long backlog and categories like Category:Deletion requests contain thousands of pages (the continuation loop is sketched below). But as I said, I don't need the flag if you decide it's not necessary -- a run would just take marginally longer. As long as this request exists for future reference. —  HELLKNOWZ  ▎TALK  ▎enWiki 15:57, 11 August 2018 (UTC)
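
      A sketch of the query-continue loop under discussion: page through Category:Deletion requests by echoing back the continuation token until the API stops returning one. This illustrates the standard MediaWiki mechanism, not the bot's actual implementation:

          using System;
          using System.Net.Http;
          using System.Text.Json;
          using System.Threading.Tasks;

          class ContinueLoop
          {
              static async Task Main()
              {
                  using var client = new HttpClient();
                  string cmcontinue = null;
                  do
                  {
                      var url = "https://commons.wikimedia.org/w/api.php"
                              + "?action=query&list=categorymembers"
                              + "&cmtitle=Category:Deletion_requests"
                              + "&cmlimit=max&format=json";
                      if (cmcontinue != null)
                          url += "&cmcontinue=" + Uri.EscapeDataString(cmcontinue);

                      using var doc = JsonDocument.Parse(await client.GetStringAsync(url));
                      foreach (var m in doc.RootElement.GetProperty("query")
                                           .GetProperty("categorymembers").EnumerateArray())
                          Console.WriteLine(m.GetProperty("title").GetString());

                      // The API includes a "continue" block while more results remain;
                      // a fully robust client echoes back every key in that block.
                      cmcontinue = doc.RootElement.TryGetProperty("continue", out var c)
                          ? c.GetProperty("cmcontinue").GetString()
                          : null;
                  } while (cmcontinue != null);
              }
          }
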
  • Is this bot doing the same thing as phab:T167614? If not, could you clarify the difference? --Zhuyifei1999 (talk) 13:18, 13 August 2018 (UTC)
    • Likely similar internally. AAB would read and parse more, but only daily. Here's a discussion on this. Those notices don't contain a lot of the information that AAB reports. For one, they never go away, and I cannot "close" the entries or report what happened to the file. Here's an example of an AAB report and the sort of info it contains, including keeping closed entries visible for a while. —  HELLKNOWZ  ▎TALK  ▎enWiki 14:12, 13 August 2018 (UTC)
  • The request appears reasonable to me, but I'd still prefer to have this implemented via a query-continue approach rather than creating a precedent for a read-only bot. Additional opinions? --Krd 15:03, 14 September 2018 (UTC)
    • PS: If the approach were from the other side, i.e. not looking for issues on Commons from the enwiki view, but gathering issue lists at Commons for all projects, this could change my mind. But I understand that this is a totally different project than the one you actually run. --Krd 15:06, 14 September 2018 (UTC)
      • I am using the API, and I am using query-continue when applicable. Sorry if I was unclear on this (although I did mention that I am using the API). Just to be clear, I'm not screen-scraping or anything like that. The bot will continue its queries using the expected Commons API syntax, as per the MediaWiki manual, until it gets all the results (regardless of whether the limit is 50 or 500).
      • I say "read-only" because I don't plan to make any non-test edits on Commons. The bot only queries the Commons API to find the most recent data it needs. All the output and extra edits will appear on English Wikipedia. Hence "read-only" for this request, as far as Commons space is concerned. I realize this isn't the usual request and doesn't directly benefit Commons. I hope it will benefit Commons indirectly with more user participation in deletion requests for images used on English Wikipedia.
      • I admit I'm not quite sure what you mean by "[..] gathering issue lists at Commons for all projects". If you mean that ideally the bot would run on Commons and build reports sorted by project for Commons editors to use -- yes, that is unfortunately out of scope for me. Honestly, I just don't have the time. The community has expressed interest in this, and a solution is being investigated by the WMF; I think it might eventually work for Commons if they make it compatible. But it's out of scope for this project/bot at the moment. Even this request is quite the task. —  HELLKNOWZ  ▎TALK  ▎enWiki 18:39, 14 September 2018 (UTC)
        Understood and agreed. I'd still say this should run without a flag unless any showstopper arises, just because we're saying no to all other read-only requesters. --Krd 06:54, 15 September 2018 (UTC)

I'm going to close this as withdrawn per above. --Krd 14:45, 21 September 2018 (UTC)[reply]