Commons:Bots/Requests

From Wikimedia Commons, the free media repository

Shortcut: COM:BRFA

If you want to run a bot on Commons, you must get permission first. To do so, file a request following the instructions below.

Please read Commons:Bots before making a request for bot permission.

Requests made on this page are automatically transcluded in Commons:Requests and votes for wider comment.

Requests for permission to run a bot

Before making a bot request, please read the new version of the Commons:Bots page. Read Commons:Bots#Information on bots and make sure you have added the required details to the bot's page. A good example can be found here.

When complete, pages listed here should be archived to Commons:Bots/Archive.

Any user may comment on the merits of the request to run a bot. Please give reasons, as that makes it easier for the closing bureaucrat. Read Commons:Bots before commenting.

Operator: Ammarpad (talk · contributions · Statistics · Recent activity · block log · User rights log · uploads · Global account information)

Bot's tasks for which permission is being sought: File description cleanup and categorization for files uploaded with the Reworkhelper tool, per this request.

Automatic or manually assisted: Automatic

Edit type (e.g. Continuous, daily, one time run): One time run

Maximum edit rate (e.g. edits per minute):

Bot flag requested: (Y/N): N (the bot already has a bot flag)

Programming language(s): Python

Ammarpad (talk) 18:25, 9 September 2024 (UTC)

Discussion

Operator: Fl.schmitt (talk · contributions · Statistics · Recent activity · block log · User rights log · uploads · Global account information)

Bot's tasks for which permission is being sought: Add {{Information}} to Media missing infobox template. See the exhaustive preparatory discussion at Commons:Bots/Work_requests#Media_missing_infobox_template. The bot tries to put as much information as possible into SDC fields (author, source, captions, date), since {{Information}} uses that data by default.

Automatic or manually assisted: Manually assisted. The bot follows a "divide and conquer" approach. Since it seems impossible to apply one solution to the more than 300,000 media files lacking an infobox template, it will work on sets of files, usually defined by a common author or creator (on the assumption that such files are sufficiently similar). The bot will be run several times on each set in different modes. First, it analyzes the file page content and tries to categorize each of its components, without modifying any content on Commons. This step is repeated (manually) as often as needed to adapt the categorization patterns, until a pattern set has been found that fits all file pages of the current set. Next, a "dry run" ("simulation") generates an overview of the planned modifications (see txt and SQLite analysis and simulation results for Category:Media missing infobox template (maps t1)). Only if the simulation result seems acceptable will the bot run in "doit" mode to apply the proposed edits.
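The three modes described above (analyze, simulate, "doit") can be illustrated with a minimal sketch. This is not the operator's actual code: the regular expressions in `PATTERNS`, the `Analysis` class, and the `save` callback are hypothetical stand-ins, and the sketch writes field values into the template directly rather than into SDC. It only shows the idea that nothing is saved until every component of a page has been classified and the mode is explicitly "doit".

```python
import re
from dataclasses import dataclass, field

# Hypothetical patterns for one set of files; a real set would need its own.
PATTERNS = {
    "author": re.compile(r"^Author:\s*(?P<value>.+)$", re.MULTILINE),
    "source": re.compile(r"^Source:\s*(?P<value>.+)$", re.MULTILINE),
    "date": re.compile(r"^Date:\s*(?P<value>.+)$", re.MULTILINE),
}


@dataclass
class Analysis:
    title: str
    fields: dict = field(default_factory=dict)
    unmatched: list = field(default_factory=list)


def analyze(title, wikitext, patterns=PATTERNS):
    """Classify the components of a file page without modifying anything."""
    result = Analysis(title)
    remaining = wikitext
    for name, pattern in patterns.items():
        match = pattern.search(remaining)
        if match:
            result.fields[name] = match.group("value").strip()
            remaining = remaining.replace(match.group(0), "", 1)
    # Anything left over is content that no pattern could classify.
    result.unmatched = [ln for ln in remaining.splitlines() if ln.strip()]
    return result


def simulate(analysis):
    """Dry run: return the proposed wikitext, or None if parts are unclassified."""
    if analysis.unmatched:
        return None
    lines = ["{{Information"]
    lines += [f"|{key}={value}" for key, value in sorted(analysis.fields.items())]
    lines.append("}}")
    return "\n".join(lines)


def run(pages, mode, save=None):
    """pages maps title -> wikitext; mode is 'analyze', 'simulate' or 'doit'."""
    if mode not in ("analyze", "simulate", "doit"):
        raise ValueError(f"unknown mode: {mode}")
    report = {}
    for title, text in pages.items():
        analysis = analyze(title, text)
        if mode == "analyze":
            report[title] = analysis
            continue
        proposal = simulate(analysis)
        report[title] = proposal
        # Only 'doit' mode writes, and only for fully understood pages.
        if mode == "doit" and proposal is not None:
            save(title, proposal)
    return report
```

A page with unclassified leftovers yields `None` in the simulation report, which signals that the pattern set still needs adapting before any edit is made.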

Edit type (e.g. Continuous, daily, one time run): Multiple times a week, but not daily.

Maximum edit rate (e.g. edits per minute): Maybe 5-6 per minute?

Bot flag requested: (Y/N): Y

Programming language(s): pywikibot

Fl.schmitt (talk) 21:56, 6 September 2024 (UTC)

Discussion

Operator: DaxServer (talk · contributions · Statistics · Recent activity · block log · User rights log · uploads · Global account information)

Bot's tasks for which permission is being sought: Commons:Batch uploading/U.S. Army Corps of Engineers Digital Visual Library

Automatic or manually assisted: Automatic

Edit type (e.g. Continuous, daily, one time run): One-time

Maximum edit rate (e.g. edits per minute): 10

Bot flag requested: (Y/N): N (see #2)

Programming language(s): OpenRefine

-- DaxServer (talk) 20:58, 1 September 2024 (UTC)

Discussion

Operator: トトト (talk · contributions · Statistics · Recent activity · block log · User rights log · uploads · Global account information)

Bot's tasks for which permission is being sought: Simple text replacement

Automatic or manually assisted: Automatic manually assisted

Edit type (e.g. Continuous, daily, one time run): One time run

Maximum edit rate (e.g. edits per minute): 6 edits per minute

Bot flag requested: (Y/N): Y

Programming language(s): Python, Pywikibot

トトト (talk) 12:40, 30 August 2024 (UTC)

Discussion

Operator: Leaderboard (talk · contributions · Statistics · Recent activity · block log · User rights log · uploads · Global account information)

Bot's tasks for which permission is being sought: meta:Global_reminder_bot for Commons

Automatic or manually assisted: Automatic

Edit type (e.g. Continuous, daily, one time run): Daily

Maximum edit rate (e.g. edits per minute): Roughly expected to be a maximum of ~2-3 edits per day

Bot flag requested: (Y/N): N (the bot already has a bot flag for another task, but the bot flag won't be used for this task)

Programming language(s): Python

A test edit is available at testwiki:User talk:Leaderbot demo.

Leaderboard (talk) 09:16, 22 August 2024 (UTC)

Discussion

I think this shouldn't be done. We don't make much use of temporary user rights at Commons, and I don't remember even a single case where rights expired without the user noticing and this led to disruption. If a temporary right expires, it will be extended on request, and it makes no difference whether renewal happens before or after it expires. If there are precedents which suggest otherwise, please advise. --Krd 07:36, 29 August 2024 (UTC)

Hi @Krd, my experience (at least on Meta) is a bit different: I've seen cases like metawiki:Talk:Steward_requests/Global_permissions/2024#Question happen, which is what motivated me to write this bot. Commons might be a bit different in that respect, but I see this as a "no loss" situation: while the user can indeed renew the rights after expiry, the point is to avoid this disruption entirely, especially if it's a right that requires some sort of discussion. In the worst case, the user ignores the notification. And Commons does appear to make use of temporary rights, at least according to the user rights log (primarily IPBE and account creator, plus trials for rights such as autopatroller).
(Note that if the decision is to not approve this bot, which I would understand, Commons will be placed in the bot's opt-out list, which means that no user can opt in and the bot will be disabled entirely on the wiki. Users can easily opt out of the bot if they want, however.) Leaderboard (talk) 09:02, 29 August 2024 (UTC)
I think the worst case is that the notifications prompt users to request renewal of rights they actually don't need, just because they can. But, I'm just providing feedback and will make any decision here. Krd 06:26, 30 August 2024 (UTC)
"users may request renewal for rights they actually don't need" - can happen even without a reminder, right? I do appreciate your feedback in any case BTW - this helps when thinking about it for other wikis as well. Leaderboard (talk) 06:37, 30 August 2024 (UTC)
It can happen anyway, but in my experience the reminders will significantly raise the number of cases. If you look at the Commons:Administrators/Inactivity section, the share of users who signed to keep their rights and were still inactive half a year later has always been significant, and has only dropped a bit in the last few years. Krd 02:04, 2 September 2024 (UTC)

Operator: MFossati (WMF) (talk · contributions · Statistics · Recent activity · block log · User rights log · uploads · Global account information)

Bot's tasks for which permission is being sought: add the following structured data statement and qualifier to the file page of a new upload that is detected as a logo by this tool.

Automatic or manually assisted: automatic, supervised

Edit type (e.g. Continuous, daily, one time run): continuous

Maximum edit rate (e.g. edits per minute): it depends on the amount of image uploads and on the amount of images detected as a logo. Hard to tell for now

Bot flag requested: (Y/N): Y

Programming language(s): Python, Pywikibot

Source code: https://gitlab.wikimedia.org/toolforge-repos/gogologo

MFossati (WMF) (talk) 12:19, 24 July 2024 (UTC)

Discussion
  • I think it would be a much better application for the bot if it could detect non-trivial logos or logos already deleted. --EugeneZelenko (talk) 14:41, 24 July 2024 (UTC)
  • Wouldn't it be better to add them with a separate property? While I'm in favor of adding more such ways to identify images, I don't think it mixes well with other statements. This was attempted and finally discarded with "depicts" statement a while back. Please make sure these statements can also be searched with Special:Search. Enhancing999 (talk) 14:53, 1 August 2024 (UTC)
    Hey Enhancing999, thanks for your comment. Could you please provide any specific pointers to the previous attempt you mentioned? MFossati (WMF) (talk) 11:29, 12 August 2024 (UTC)
    here Enhancing999 (talk) 11:31, 12 August 2024 (UTC)
  • Is this bot going to be used as "act once on new uploads", "act once on all existing files", "potentially act more than once on the same file", or what? Unless it only acts exactly once on any given file, what is to prevent it getting into an edit war if its edit is reverted or otherwise changed? - Jmabel ! talk 18:11, 1 August 2024 (UTC)
    Hi Jmabel, thanks for your question. The bot is expected to act once on new uploads. MFossati (WMF) (talk) 11:31, 12 August 2024 (UTC)
    • Good. Is there any chance that the bot could also look at the wikitext for {{Own work}} and add a maintenance category (call it Category:Own work logo to be checked) if it appears to be a logo and is claimed as "own work"? We see that combination a lot, and it is almost never true. And possibly something similar for a logo + any CC license, because that's usually false as well: we very rarely get a license for any logo that is above the threshold of originality. - Jmabel ! talk 15:15, 12 August 2024 (UTC)
      I agree that the ability to search for logos plus own work and/or CC licenses would make a lot of sense. I think this is something we can do by querying structured data. For instance, we can already run a query like this to look for own work files with CC BY-SA 4.0. As soon as the proposed logo statements get added, we can then insert a wdt:P31 wd:Q1886349 constraint in the query. MFossati (WMF) (talk) 09:50, 14 August 2024 (UTC)
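The linked query is not reproduced on this page, so the following is only a rough sketch of what adding the wdt:P31 wd:Q1886349 constraint could look like. The function `build_query` is hypothetical, and the identifiers used for "own work" (P7482 with Q66458942) and for CC BY-SA 4.0 (P275 with Q18199165) are assumptions about what such a query would use; only the P31/Q1886349 pair comes from the comment above.

```python
def build_query(extra_constraints=(), limit=100):
    """Build a SPARQL query string for own-work CC BY-SA 4.0 files.

    extra_constraints is a sequence of 'predicate object' strings that are
    appended as additional triple patterns on ?file.
    """
    lines = [
        "SELECT ?file WHERE {",
        # Assumed: source of file = original creation by uploader
        "  ?file wdt:P7482 wd:Q66458942 .",
        # Assumed: copyright license = CC BY-SA 4.0
        "  ?file wdt:P275 wd:Q18199165 .",
    ]
    lines += [f"  ?file {constraint} ." for constraint in extra_constraints]
    lines.append("}")
    lines.append(f"LIMIT {limit}")
    return "\n".join(lines)


# Same query, narrowed to files tagged as instance of (P31): logo (Q1886349).
logo_query = build_query(["wdt:P31 wd:Q1886349"])
```

The base query stays unchanged, so the logo constraint can be switched on once the statements exist without touching the rest.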
  •  Comment As requested by the rules, we've test-run the bot on 100 uploads randomly sampled from uploads made between Aug 21 and today, and here are the results:
    • 4 media files had been deleted beforehand, so no edit
    • 1 media file was skipped (maximum retries attempted due to maxlag without success), so no edit
    • 95 media files were successfully edited
It seems to have worked successfully, but we'll wait for community review. Sannita (WMF) (talk) 15:34, 30 August 2024 (UTC)
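The skipped file in the results above is the kind of outcome a bounded retry loop produces when the server keeps reporting replication lag. Below is a minimal sketch of that behaviour, not the bot's actual code (Pywikibot has its own maxlag handling): `MaxlagError`, `edit_with_retries`, and the `do_edit` callback are hypothetical names introduced for illustration.

```python
import time


class MaxlagError(Exception):
    """Hypothetical error raised when the API rejects an edit due to maxlag."""

    def __init__(self, retry_after=5):
        super().__init__(f"maxlag exceeded, retry after {retry_after}s")
        self.retry_after = retry_after


def edit_with_retries(do_edit, max_retries=5, sleep=time.sleep):
    """Call do_edit() up to max_retries times.

    Returns the result of do_edit() on success, or None if every attempt
    hit maxlag, in which case the file is skipped and left unedited.
    """
    for attempt in range(max_retries):
        try:
            return do_edit()
        except MaxlagError as err:
            # Wait as long as the server asked, except after the last attempt.
            if attempt < max_retries - 1:
                sleep(err.retry_after)
    return None
```

Returning `None` instead of raising keeps a single lagged file from aborting the whole run; the caller can log it as skipped.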
It appears each file is edited twice. Is that for a technical reason, or can the edits be combined in any way? Krd 17:36, 30 August 2024 (UTC)
Great point, Krd! It made me realize that the current code first adds the claim, then adds the qualifier, thus producing two edits. I've just checked that we can do it the other way around. So - yes - we can indeed combine them into a single edit. I've updated the code accordingly. Thanks a lot, this is really helpful. MFossati (WMF) (talk) 14:16, 9 September 2024 (UTC)
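One way to get a single edit is to build the statement together with its qualifier before saving anything, and then submit both in one request. The sketch below only assembles the data in the Wikibase JSON layout as I understand it; it is not taken from the bot's repository. The helper names are hypothetical, and because this page does not name the qualifier, `P9999` and `Q9999` are placeholders rather than the bot's real qualifier.

```python
import json


def item_snak(prop, item_id):
    """Build a snak whose value is a Wikibase item, e.g. P31 -> Q1886349."""
    return {
        "snaktype": "value",
        "property": prop,
        "datavalue": {
            "type": "wikibase-entityid",
            "value": {
                "entity-type": "item",
                "numeric-id": int(item_id[1:]),
                "id": item_id,
            },
        },
    }


def statement_with_qualifier(prop, item_id, qual_prop, qual_item_id):
    """Build one statement that already carries its qualifier."""
    return {
        "type": "statement",
        "rank": "normal",
        "mainsnak": item_snak(prop, item_id),
        "qualifiers": {qual_prop: [item_snak(qual_prop, qual_item_id)]},
        "qualifiers-order": [qual_prop],
    }


def single_edit_payload(prop, item_id, qual_prop, qual_item_id):
    """Serialize claim and qualifier together, to be sent in one request."""
    statement = statement_with_qualifier(prop, item_id, qual_prop, qual_item_id)
    return json.dumps({"claims": [statement]})


# P31/Q1886349 come from the discussion; P9999/Q9999 are placeholders.
payload = single_edit_payload("P31", "Q1886349", "P9999", "Q9999")
```

Because the qualifier is nested inside the statement, the page history would show one revision per file instead of two.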
Can you use another property than P31 as suggested above? I think we should avoid a re-run of c-a t where WMF mostly ignored community input.
 ∞∞ Enhancing999 (talk) 17:52, 30 August 2024 (UTC)
Hi @Krd and @Enhancing999, thanks for your feedback and sorry for the late reply; for some reason your replies did not appear in my notifications.
While we wait for @MFossati (WMF) to be back in the office to answer the first question, we are open to suggestions as to which property to use. @Enhancing999 do you already have one in mind? Sannita (WMF) (talk) 16:11, 5 September 2024 (UTC)
You can create one ad hoc.
 ∞∞ Enhancing999 (talk) 17:16, 5 September 2024 (UTC)
@Enhancing999 Sorry for the long answer, but I felt the need to clarify some things about the request.
We need to start somewhere to see if the experiment is of some value to the moderators. This is an experiment within the first quarter OKR work for FY24/25 (WE2.3.1). We don't think a new property would work, especially because the property proposal would likely be considered too specific in scope to be accepted by the Wikidata community, and not without reason.
We can quickly and easily use an existing property, and see if it's valuable. If not, we will roll back just as quickly and easily. The property instance of (P31) seems like the best fit, because we think it's specific and meaningful. More importantly, the property is indexed, thus enabling search queries both in Special:Search and in Special:MediaSearch. Furthermore, qualifiers are also indexed, so it will be possible for moderators to find media classified as a logo by this bot. You can either use a search query (example with Special:Search, example with Special:MediaSearch) or a SPARQL one to achieve it.
If detecting and tagging incoming logos does not help with easier logo moderation, then our plan is to roll back our own edits at the end of the experiment. If it does help, then we're planning to investigate other ways to store and query such data, as we are considering other experiments in the near future as suggested by the community. Sannita (WMF) (talk) 15:09, 9 September 2024 (UTC)
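The linked search examples are not reproduced here, so this is only a sketch of how such a search string could be assembled. It assumes the `haswbstatement:` search keyword and its bracketed qualifier form; `logo_search_query` is a hypothetical helper, and `P9999`/`Q9999` are placeholders because the qualifier used by the bot is not named on this page.

```python
def logo_search_query(qualifier=None):
    """Build a search string for files with instance of (P31): logo (Q1886349).

    qualifier is an optional (property, value) pair that narrows the search
    to statements carrying that qualifier.
    """
    query = "haswbstatement:P31=Q1886349"
    if qualifier is not None:
        qual_prop, qual_value = qualifier
        query += f"[{qual_prop}={qual_value}]"
    return query


# All files tagged as a logo, by anyone.
any_logo = logo_search_query()

# Only statements with a given qualifier (placeholder IDs), which is how
# bot-made statements could be told apart from manual ones.
bot_logo = logo_search_query(("P9999", "Q9999"))
```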
Wikidata easily creates properties that are just meant to be used for Commons. This shouldn't take much time, and compared to the working speed of WMF (it's been seven weeks since you asked for input), this shouldn't be an issue. Nothing prevents you from indexing this property as well.
If you think a separate property won't work, it means that ultimately this wouldn't work using instance of (P31) either. I think such implementations need more attention than once every month.
Given the massive community backlash WMF got from an ill-prepared, hastily implemented, not community-feedback-driven, likely costly previous experiment mixing machine contributions with our highly valued volunteer contributors, I think it's good to take good care this time, especially as a simple way was already suggested seven weeks ago.
 ∞∞ Enhancing999 (talk) 15:43, 9 September 2024 (UTC)
@Enhancing999: unless there are a lot of false positives (and I don't think there are), the tagging of these as instance of (P31) : logo (Q1886349) seems at worst harmless. What would be the advantage of a distinct property? - Jmabel ! talk 04:45, 10 September 2024 (UTC)
There are likely few false positives in the first test set as it's still being followed, but last time, it became problematic when the person at WMF developing it moved on to something else.
Based on past experience, I guess you know what happens afterwards: you will have to wait 7 weeks for an acknowledgment, then you will be told to ask for a change in the next wishlist, and, even if everybody agrees with it, you will have to wait for the next annual plan to have it scheduled. Possibly somebody will then throw it out entirely, because they don't know how to fix it.
In any case, the idea is to also classify images where there is lower confidence in the automatism, so review is necessary.
Using two different properties allows users to easily switch between volunteer assessment and machine assessment, and to focus on volunteer assessment while excluding machine assessment if they happen to agree.
 ∞∞ Enhancing999 (talk) 11:12, 13 September 2024 (UTC)
Is a coat of arms or a military unit insignia or a sports uniform a logo per the definition of a "logo"? --Krd 07:29, 13 September 2024 (UTC)
@Krd: we're targeting images similar to Category:Logos, thus making a distinction between other classes such as Category:Coats_of_arms or Category:Sports_kit_templates. MFossati (WMF) (talk) 13:45, 13 September 2024 (UTC)
In my personal opinion there are too many false positives. Krd 13:52, 13 September 2024 (UTC)
Special:Permalink/923690458 has a gallery of images edited by the bot. Personally, I don't think false positives are an issue as such, at least when they are clearly distinguished from manual edits (see separate property above).
 ∞∞ Enhancing999 (talk) 14:08, 13 September 2024 (UTC)
I agree that most of them are some kind of symbols or graphics, but I'd guess a third of them would not be put under Category:Logos, so "instance of logo" doesn't make much sense then. Am I mistaken? Krd 14:16, 13 September 2024 (UTC)
It really depends what the logo people want to do with it. Today it's "logos", but it could be just any image type or topic. The confidence level of the classification can also evolve or be changed.
 ∞∞ Enhancing999 (talk) 14:29, 13 September 2024 (UTC)