Commons:Bots/Requests/SchlurcherBot11

From Wikimedia Commons, the free media repository

SchlurcherBot (talk · contribs)

Operator: Schlurcher (talk · contributions · Statistics · Recent activity · block log · User rights log · uploads · Global account information)

Bot's tasks for which permission is being sought: Remove duplicate information from file description templates when it is already stored in structured data and is automatically recovered by the corresponding template from the associated structured data

Examples:
Process:
  • To minimize errors, the script does not cross-check information. Instead, it constructs wiki template code like {{Location|50.106395|8.705226}} based solely on structured data and then performs a string match against the file description. Only if it finds an exact match there, and thus duplicate information, does it replace the template with a version without parameters (which then recovers the identical information from structured data).
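The process described above can be sketched roughly as follows. This is an illustrative Python sketch only: the bot itself is written in C#, and its actual code is not shown in this request. The function name strip_duplicate_location is hypothetical, and the sketch assumes the coordinates arrive as strings already formatted exactly as they would appear in the wikitext; how the bot reads and formats structured data values is not specified here.

```python
from typing import Optional


def strip_duplicate_location(wikitext: str, lat: str, lon: str) -> Optional[str]:
    """Return the edited wikitext, or None if no exact duplicate is found.

    Hypothetical helper: lat and lon are assumed to be the structured data
    values, already formatted as strings.
    """
    # Build the expected template purely from structured data.
    expected = "{{Location|" + lat + "|" + lon + "}}"

    # Exact string match only; no parsing or cross-checking of the wikitext.
    if expected not in wikitext:
        return None

    # Swap in the parameterless template, which reads from structured data.
    return wikitext.replace(expected, "{{Location}}")


page = (
    "== Summary ==\n"
    "{{Information|description=Test}}\n"
    "{{Location|50.106395|8.705226}}\n"
)
edited = strip_duplicate_location(page, "50.106395", "8.705226")
skipped = strip_duplicate_location(page, "50.1064", "8.7052")
```

Because the match is exact, any difference in precision, spacing, or template variant leaves the page untouched, which is the error-minimizing behaviour the request describes.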
Benefits:
  • No visual change to displayed information on page
  • Removal of duplicate information

Automatic or manually assisted: Automatic

Edit type (e.g. Continuous, daily, one time run): Batches based on prepared lists, starting with files in Category:Pages with maps

Maximum edit rate (e.g. edits per minute): 30

Bot flag requested: (Y/N): N (Bot has flag already)

Programming language(s): C#

Schlurcher (talk) 22:04, 1 November 2022 (UTC)

Discussion

Thanks. Approval/signoff for further changes is fine and I've updated the proposal accordingly. --Schlurcher (talk) 08:09, 2 November 2022 (UTC)
  • Absolutely not. This has zero benefit and substantial downsides for Commons users.
  • You have not indicated how this would provide any actual benefit to Commons users. "No visual change" is no more a benefit than making no edits at all. I fail to see any benefit to "removal of duplicate information", given that the duplicate info was already created by this bot and does not affect the user's view on the file page. Duplicate information is only an issue when the wikitext coordinates and the SDC coordinates don't match - and that almost always occurs when someone corrects incorrect coordinates in the wikitext and the bad coords remain in the SDC.
  • Every discussion about SDC on Commons has had the clear consensus that wikitext should remain the primary source of information about a file, and that SDC should not involve removal of information from wikitext. Removal of coordinate information from the wikitext is a clear violation of that.
  • Wikitext is the easiest way to add and edit coordinates - including removing them. (There have been several mass uploads with bad coords, such as files being given the coordinates of the nearest town, so this is a substantial issue.) I can add/remove/edit wikitext coords on multiple pages with VFC; SDC coords require manual edits or using even-less-user-friendly-than-VFC queries.
  • Wikitext coordinates can always be added or modified in a single edit along with other changes (such as uploading using the basic upload form, or editing categories or description). SDC coordinates require a separate edit. How does it benefit users to have a bot make a cosmetic edit on a significant fraction of files that are uploaded, just so that coordinates are no longer in the wikitext?
  • This is a massive number of edits - there are currently about 25 million files in that category. SDC bots have already been clogging up watchlists for several years now by making multiple edits per file. Why should the community allow this to increase?
I am also concerned that the most-seen structured data bots on Commons are operated by users who do not actively upload files - you have uploaded two files in the last three years, and Multichill only a handful. This means you are rather divorced from the experience of users who are actively uploading files, and the issues they face. Pi.1415926535 (talk) 00:15, 2 November 2022 (UTC)
@Pi.1415926535: Thanks for your thoughts. The idea of this request was to challenge the consensus (that I am aware of) that wikitext should remain the primary source of information about a file. I have to agree that editing wikitext is the easiest way to add or edit coordinates, but it's also the most unstructured way. While maintaining SchlurcherBot, I've come across too many variations in how coordinates are described in wikitext or included through various templates and sub-routines. Thus, systematically parsing this information is almost impossible. With SDC, on the other hand, it's easy to make a map view of all your contributions, and there is so much more potential: [2]. Are there any other next steps that you think will help adoption of SDC? --Schlurcher (talk) 08:09, 2 November 2022 (UTC)
Re: clogged watchlists: A way to unwatch only SchlurcherBot and BotMultichill is needed ABSOLUTELY, URGENTLY, AT ONCE. There are a thousand ways to filter the watchlist for this and that, but there is NO, NONE, NOT ANY way to filter out only SchlurcherBot and BotMultichill. While I now see 1000+ edits by one of these bots within less than 24 hours (actually less than 24 minutes) less often than in the past, it still happens from time to time, and with this proposal (especially if only one field at a time is processed) it will become worse than ever. C.Suthorn (talk) 09:53, 2 November 2022 (UTC)
  • Oppose, per Pi.1415926535. -- Tuválkin 00:49, 2 November 2022 (UTC)
  • Oppose. In my experience as an uploader, it is easier to describe files through the wikitext/Information fields than through structured data. Of course the structured data fields can only be edited by the bot, but what if the fields contain inaccurate information, like wrong dates or wrong coordinates? It will be very difficult for the majority of users like me to make corrections, forcing us to file requests that needlessly take our time. This burden will be doubled for new users, who may not immediately know where or how to file requests for correcting incorrect data (maybe some will use the file talk page, others the technical requests sub-forum of COM:Village pump, still others COM:AN or a certain admin's talk page). To be clear, I do not oppose what SchlurcherBot is already doing; rather, proposals must take into account the convenience of all users too. Seconding Pi.1415926535's input. JWilz12345 (Talk|Contrib's.) 04:21, 2 November 2022 (UTC)
  • I agree with Jmabel, please don't take the outcome of this request as a green pass for all parameters/fields. In principle I support the idea of having singular data to minimise the amount of editing needed in case of change. Certain counter arguments above boil down to "I don't like it / it has always been like that", but I do share the concerns about making large-scale editing more difficult for non-bot users. There are tools for SDC-editing multiple files, but they are principally aimed at adding new data, not changing/removing existing SDC. I think these tools should be updated first, before this bot request can be authorised. --HyperGaruda (talk) 05:55, 2 November 2022 (UTC)
  • Strong Oppose. I think the timing is clearly too early. At present, it would only bring chaos and strife. In fact, I think it's questionable whether it should happen at all. There is so much information that cannot even be transferred into the structured data yet. Only when that is consolidated and an overview site that is acceptable and practicable for third parties is created can one think about deletion. First and foremost, the following steps should be completed:
  1. Transfer of all necessary and useful data into the structured data.
  2. Adding to the guidelines for Featured images, Quality images and Valued images the requirement that the structured data is maintained.
  3. Structuring and addition of the {{Information}} template on the pages that do not already have this.
  4. Creation of a compact overview page based on the structured data.
  5. Change of presentation: structured data as primary view (overview page), old presentation as secondary view.
Please no actionism. --XRay 💬 06:27, 2 November 2022 (UTC)
There is no intention of actionism; this is more about a discussion of how we should move forward in the future. So far, I've applied this only to files that I uploaded myself, to the extent that most of those files now rely solely on structured data, like File:Vanillekipferl-Nahaufnahme.jpg. Still, I also have a list of files on my userpage where this was not possible yet. I appreciate your thoughts and good ideas on how to eventually reach a state where we rely (more) on structured data. This proposal was also meant as part of this. Currently, my bot is only working on the first item. --Schlurcher (talk) 08:09, 2 November 2022 (UTC)

 Info I had cross-posted this on Commons:Village pump, as I was aware that this request would indeed change the current philosophy that the wikipage is the primary source of information. Some further comments from Commons:Village pump: --Schlurcher (talk) 08:09, 2 November 2022 (UTC)


  • Oppose. The action would tickle an enormous number of watchlists for no visual change. SchlurcherBot already pounds my watchlist with edits such as "hey, this .svg file is an SVG file!" (something the MW API already knows). Furthermore, magic fields confuse users. Say a user wants to edit the text filled in by structured data. The user clicks edit and searches for the text, but the text is nowhere to be found on the page. Glrx (talk) 03:13, 2 November 2022 (UTC)
redundancy (engineering) is good. RZuo (talk) 04:22, 2 November 2022 (UTC)

So to summarize this discussion: Thanks everyone for your contributions. As of now, there is no interest in removing any information from the file description pages (not even, or specifically not, if only in the code). Users generally seem to edit/correct the wikitext only (often leaving the structured data out of sync/uncorrected). There even seems to be the perception that structured data can only be edited by bots (which is not the case, but mass editing structured data is indeed tedious, and no adequate tools exist as of now). I'll consider this request rejected. I'll keep my bot's focus on adding structured data and will rather consider extending its functionality in the direction of syncing changes made to wikitext with structured data. I'll make a corresponding request once ready. So, I guess this can be closed/archived. --Schlurcher (talk) 19:03, 4 November 2022 (UTC)


Withdrawn. --Krd 05:13, 6 November 2022 (UTC)