Commons:Village pump/Proposals/Archive/2020/02


Hello, Wikipedia.

I am still confused about how to upload photos. — Preceding unsigned comment added by Scenedry (talk • contribs) 12:10, 13 February 2020 (UTC)

@Scenedry: Category:Commons help/id. —Justin (koavf)TCM 13:38, 13 February 2020 (UTC)
This section was archived on a request by: Donald Trung 『徵國單』 (No Fake News 💬) (WikiProject Numismatics 💴) (Articles 📚) 08:32, 18 February 2020 (UTC)

Restrict Computer-aided tagging to autopatrol users until the major problems are solved

The Computer-aided tagging tool is producing many bad edits, especially by new users. (Discussions at: Village pump & Commons talk:Structured data/Computer-aided tagging) There are many feature requests to solve these problems, and the AI has to become better too. Because of these problems, this tool should be restricted to trusted and experienced users with autopatrol rights for now. When the tool works very well, it could be opened to everyone, even IP users. --GPSLeo (talk) 19:30, 15 February 2020 (UTC)

I agree, though this doesn't go far enough. I have an RfC in draft. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 09:50, 16 February 2020 (UTC)
AI has to become better - We will be making it better by doing the verification job for Google (it's not bad). With reinforcement-based learning, rejected tags will not be shown again. How do you think they improve these results? -- Eatcha (talk) 10:46, 16 February 2020 (UTC)
I agree with @Pigsonthewing: that this tool is a nightmare. Tags are of no use, we need precise depict statements only. Please could you ping me Andy once the RfC is open? :-) --Vojtěch Dostál (talk) 13:06, 17 February 2020 (UTC)

Our video cut trim tool is working fairly well (still improvements needed of course).

I am wondering what people think about adding this to the left sidebar for video files, similar to how we have the "CropTool" for images. Doc James (talk · contribs · email) 03:18, 10 February 2020 (UTC)

I would love to have a direct link to this tool from the target video. Having this on the side bar would be great. Thanks -- Eatcha (talk) 03:25, 10 February 2020 (UTC)
Usually to alter the sidebar one edits MediaWiki:Sidebar. But I do not see the tools listed. Doc James (talk · contribs · email) 04:16, 10 February 2020 (UTC)
How many times has it been used?
Independently of whether this gets accepted or not, would it be an idea to bundle some helpful video tools together, and offer it as optional? Then people who want to work with video can just turn them on at once. Effeietsanders (talk) 04:47, 10 February 2020 (UTC)
Probably only been used a dozen times. It is just getting functional. What other video tools do we have to bundle? Doc James (talk · contribs · email) 05:45, 10 February 2020 (UTC)

I would love to have a video editing tool as part of Commons, shown under or next to videos, and I am happy to help with things if needed to make it easily available to people. I have no opinion as to whether it should go in the sidebar or underneath the video itself, as long as it's obvious the tool is available (I am also not sure if there are different requirements for either). John Cummings (talk) 10:00, 10 February 2020 (UTC)

Does anyone know how to do this? User:Steinsplitter? Doc James (talk · contribs · email) 04:45, 20 February 2020 (UTC)

Abbreviated Dutch names

Dutch names that are suffixed with "szoon" ("son of") are very often abbreviated to "sz". The most famous example probably being Rembrandt Harmenszoon van Rijn, often rendered as "Rembrandt Harmensz. van Rijn" or "Rembrandt Harmensz van Rijn". In a large number of these cases, the abbreviated version has become the most common rendering of the name. On Commons, we sometimes write these names with the period (Dirck Jacobsz., Jan Ewoutsz., Lubbert Gerritsz.), and sometimes don't (Leendert Claesz, Arend Fokke Simonsz, Ernst Jansz). In English language sources, it seems that the period is generally left off of these abbreviated names, while Dutch sources more commonly use the periods. Would it make sense for us to try to implement a standard practice for these? Here are the options:

  • Option A: Always expand abbreviated Dutch names
  • Option B: Always use a period when abbreviated
  • Option C: Never use a period when abbreviated
  • Option D: Handle on a case-by-case basis (status quo)

Kaldari (talk) 19:32, 28 February 2020 (UTC)
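
For illustration, options A to C above could each be expressed as a simple text rule. The following is a minimal Python sketch, not an agreed implementation: the function names are hypothetical, and the pattern assumes that any word ending in "sz" is an abbreviated patronymic, which will not hold for every name.

```python
import re

# Assumption: a word ending in "sz", optionally followed by a period,
# is an abbreviated "-szoon" patronymic. Real names may need exceptions.
_ABBREVIATED = re.compile(r"\b(\w+sz)\b\.?")


def expand(name: str) -> str:
    """Option A: expand the abbreviation to the full "-szoon" form."""
    return _ABBREVIATED.sub(r"\1oon", name)


def with_period(name: str) -> str:
    """Option B: always write the abbreviation with a period."""
    return _ABBREVIATED.sub(r"\1.", name)


def without_period(name: str) -> str:
    """Option C: always write the abbreviation without a period."""
    return _ABBREVIATED.sub(r"\1", name)


expanded = expand("Rembrandt Harmensz. van Rijn")
dotted = with_period("Leendert Claesz")
undotted = without_period("Jan Ewoutsz.")
```

Names that are already written out in full, such as "Rembrandt Harmenszoon van Rijn", are left unchanged by all three functions because the pattern only matches words that end in "sz".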

I never actually realized it, but category names are almost always English here. See Category:Artists from Japan, nothing in the Japanese script. So I guess we should follow the English sources. - Alexis Jazz ping plz 09:01, 29 February 2020 (UTC)
Option C would be best IMO, but I could be persuaded by arguments to use Option A since this is a multilingual project. Abzeronow (talk) 18:40, 29 February 2020 (UTC)

Proposal to implement blocking by abuse filters

Following the scoping discussion #Time for abuse filters to block (temporary and permanent) (permalink), a formal proposal for consideration.

One of the standard abilities of abuse filters in MediaWiki is to block accounts or IP addresses (Block the user and/or IP address from editing) based on criteria in a filter. It has not been something that we have typically needed in earlier years, as we haven't had persistent vandalism or spam. Things have changed, and it is time for us to move to having blocking functionality available.

[Technical detail: see https://noc.wikimedia.org/conf/highlight.php?file=abusefilter.php and the setting $wgAbuseFilterActions['block'] = true;]

If that occurs, we also need to define a default period for blocks. I suggest that the default should at least demonstrate that we are looking for a minimal approach, so let that be the most gentle setting. Note that this would just be a default, and a dropdown with other values will always be present for selection.

To have this change made at Commons, we would need to demonstrate a consensus of the community, and lodge a phabricator site request. Noting that this is a technical change, not a policy change to what we block, or to the blocking policy. Accordingly I propose:

  • Wikimedia Commons moves to have enabled the ability to block through its abuse filters.
  • Default periods for blocks to be 2 hours for user accounts, and 2 hours for IP addresses.

I also note that, if consensus is reached, Commons administrators will need to work to operational guidance. That guidance is being developed in a separate section, is outside the scope of this technical request, and will have a separate consensus.  — billinghurst sDrewth 12:58, 15 February 2020 (UTC)

Support

  •  Support as proposer  — billinghurst sDrewth 12:59, 15 February 2020 (UTC)
  •  Support --Herby talk thyme 13:30, 15 February 2020 (UTC)
  •  Support Christian Ferrer (talk) 19:27, 16 February 2020 (UTC)
  •  Support Kaldari (talk) 00:30, 17 February 2020 (UTC)
  •  Support unfortunately because there's not enough admins.--BevinKacon (talk) 12:43, 22 February 2020 (UTC)
  •  Support But don't allow non technical users to touch the filters Eatcha (talk) 14:57, 22 February 2020 (UTC)
  •  Support I've actually thought about making this very proposal before. This can be extremely useful in dealing with LTAs. In fact, my relatively recent dealings with an LTA were what made me think about wanting this. However, there would have to be some restrictions. My original filter has been modified to be much broader at this point to capture more of the LTA's attacks. Currently, there are false positives. Not many. But even a few being caught by a blocking filter would be too many in my mind. For that reason, admins who use this functionality need to be aware of what they are doing and be willing to monitor their filters extremely closely to ensure that any false positives are dealt with quickly and the filter is modified to preclude them. The filter debugging page can be used to quickly undo errant blocks under such a scenario. A blocking filter is a nuclear option and it should be treated as such. I do believe that it is an option that we should have, and I do believe that in very limited circumstances it would be a massive benefit to be able to do, but the ramifications of its use need to be fully understood by those that use it and the consequences for misuse should amount to admin abuse. --Majora (talk) 17:33, 23 February 2020 (UTC)
  •  Support --pandakekok9 01:48, 14 March 2020 (UTC)

Oppose

  1. The referenced consensus is weak; two supports and general discussion are not convincing. This type of systems decision can and should be made on convincing reports and analysis. We do not have to implement the filter in order to do testing; we can simply run a test of the proposed filter against past contributions and analyse what the impact would be, both the positive impact of reducing disruption to this project and the negative impact on possible good-faith contributors. Without this, it is unclear what a "minimal approach" is, or how it would be measured. So, let's have some test reports so the community can vote against more than hypotheticals. -- Fæ (talk) 13:13, 15 February 2020 (UTC)
  2.  Oppose I have had bad personal experience with vandalism-detecting filters on en.wiki. I do not know what kinds of edits are considered vandalism by the bot. I have seen no analysis yet of the proposed filters. What if a quarter of the blocks are false positives? I do not know, and I feel that nobody knows. After a test run and analysis, my vote can change. Taivo (talk) 19:00, 15 February 2020 (UTC)
    @Taivo: Every edit you have been making is already going through every active abuse filter, there is no change involved here. The suggested change is an action that comes from an abuse filter. From your 167k edits, maybe you can explain and relate on your experiences with abuse filters affecting your editing here, I can see about 42 interactions in the logs.

    With regard to the processes, I covered that separately below, and our process would not be getting that criteria, that is why we test and manage. We already know what is happening here. I gave specific links to meta's logs (abuse and block) where there is the process in place and it can be demonstrated what is happening. I perfectly understand a cautious approach, and that is what is being proposed.  — billinghurst sDrewth 09:41, 16 February 2020 (UTC)

    "Every edit you have been making is already going through every active abuse filter"
    Technically that's true, but many filters exclude administrators and even more exclude patrollers/autopatrollers. So an admin never tripping an abuse filter doesn't mean much. - Alexis Jazz ping plz 12:36, 18 February 2020 (UTC)
     Oppose on procedural grounds. This should go at VP and not here. VP has twice the page watchers and is the appropriate place for seeking community consensus. GMGtalk 02:44, 16 February 2020 (UTC)
    @GreenMeansGo: This seems to be resolved now. Kaldari (talk) 00:30, 17 February 2020 (UTC)
  3.  Oppose, abuse filters themselves are open to abuse: if an admin doesn't like certain behaviour, it can be blocked, and having an automated blocking process will only create a less collegial working environment for everyone. Imagine being a newbie making your first edits and then getting permanently blocked because you triggered some filter. You probably won't even know what filter you triggered, and your only impression of this website is that you're not welcome (for whatever reason). This is something that already happens on other Wikimedia websites. What's worse is when we allow some users (sysops, rollbackers, autopatrolled, etc.) to do some edits but block others for the exact same edits; this just creates an even more unfair system where some users are more disadvantaged than others. We need more human eyes on admin actions, not fewer. Countless free files already get deleted because someone who doesn't properly understand "Commons:licensing" tags some image as "unsourced" or "speedy" and then an admin just deletes it. So why would we make an already imperfect system even more imperfect? Also, unblocking is a nightmare. It's already difficult to get unblocked now, let alone if you're blocked by an automated system and don't even know what you did wrong. Saying that you understand why you've been blocked and won't do it again is a prerequisite for unblocking, but sysops can still decide to leave the block in place. This will just create a whole lot more problems than it solves. --Donald Trung 『徵國單』 (No Fake News 💬) (WikiProject Numismatics 💴) (Articles 📚) 10:58, 23 February 2020 (UTC)
  4.  Oppose The possibility of false positives makes this a non-starter for me. I also agree with a lot of what Donald Trung said. Abzeronow (talk) 18:23, 27 February 2020 (UTC)
  5.  Oppose I don't oppose the essence of the idea, but I do oppose this form. For starters, the duration should be reduced to 1 hour and only increased to 2 hours if that is proven to make a real difference in practice. AF blocks should be partial blocks unless there is absolutely no other way. User talk (the namespace in its entirety, not just the user's talk page) should not be blocked unless talk page abuse is what the filter targets. Equally, if talk pages are the target, File: should not be affected. And last but definitely not least: we will require more transparency. While the filters themselves can't be made public as this would explain exactly how they can be evaded, there should be a public list of all current targets. Basically a list of LTAs, spammers with description and the like. If it's not on the list, there shouldn't be an AF block. And when any entry on that list or any particular filter is disputed by the community, that filter should be disabled until the dispute is resolved in discussion. And by that I mean actually resolved in discussion, so not what Jameslwoodward did. - Alexis Jazz ping plz 08:34, 29 February 2020 (UTC)
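
The test run against past contributions that the first two oppose comments ask for could look roughly like the sketch below. This is only an illustration in Python under stated assumptions: the PastEdit record, the was_abuse label (a human reviewer's judgement), the example rule, and the sample data are all hypothetical and are not taken from any real Commons filter or log.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class PastEdit:
    user: str
    summary: str
    was_abuse: bool  # assumed: a human reviewer's judgement of the edit


def dry_run(rule: Callable[[PastEdit], bool], edits: Iterable[PastEdit]) -> dict:
    """Apply a candidate blocking rule to past edits without blocking anyone."""
    edits = list(edits)
    hits = [e for e in edits if rule(e)]
    return {
        "checked": len(edits),
        "would_block": len(hits),
        # Good-faith edits the rule would have blocked.
        "false_positives": sum(1 for e in hits if not e.was_abuse),
        # Abusive edits the rule would have let through.
        "missed_abuse": sum(1 for e in edits if e.was_abuse and not rule(e)),
    }


def example_rule(edit: PastEdit) -> bool:
    """Hypothetical rule, for illustration only."""
    return "buy cheap" in edit.summary.lower()


# Made-up sample data, not real contributions.
sample = [
    PastEdit("SpamBot1", "Buy cheap watches here", True),
    PastEdit("NewUser", "Where to buy cheap film for my camera?", False),
    PastEdit("Vandal", "blanked the page", True),
    PastEdit("Regular", "Added category", False),
]

report = dry_run(example_rule, sample)
```

A report of this kind would give the community counts of would-be blocks and false positives to discuss, rather than hypotheticals.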

Neutral

Comment

In response to Fæ. Umm, I referenced no consensus; this is the discussion for consensus. I mentioned a scoping discussion.

With regard to your request for analysis, there is plenty of evidence of spambots active here, and of those attempting to be active here. We have been manually blocking these for years, and this is to stop having to do this manually. This proposal does not change what we are blocking; the change is in the processing, from manual to automated. This becomes about ensuring that the filters are targeted appropriately, tuned appropriately for their use, and held to agreed measures. Some filters here are close, though they would need tuning to go the next steps. I linked to some of the active blocking filters at Meta, which would be similar, though not identical; those were performing well on the 700+ wikis covered by global abuse filters.  — billinghurst sDrewth 14:27, 15 February 2020 (UTC)

Suggested guidance follows at #Draft of operational guidance for use of blocking by abuse filters. Feel welcome to make suggestions, or ask for clarifications to be made.  — billinghurst sDrewth 14:31, 15 February 2020 (UTC)
@Billinghurst: As I indicated previously, if you are seeking broad community consensus for site-wide changes, you need to transfer these discussions to the village pump. AN is a place for requesting administrator assistance, not a place for building community consensus, and having this discussion here instead of there is out of order. GMGtalk 14:34, 15 February 2020 (UTC)
When I scanned the referenced discussion, it read as a proposal with votes. You mention "general agreement" a few lines in, but the title "Time for abuse filters to block" I read literally. If you want to discount that discussion as no evidence of consensus, that's fine.
However in line with GMG's point, the history here is (1) run a proposal for "general agreement" that people vote on, (2) run a proposal to "implement" that is laid out as a vote, (3) run a proposal for "we would need to demonstrate a consensus of the community", which this presumably is not.
That's 2 votes more than we actually need and seems exhausting for the limited number of volunteers who will be interested and know what we are talking about. -- Fæ (talk) 20:09, 15 February 2020 (UTC)

Fæ, I wrote the following to the subject line "Time_for_abuse_filters_to_block_(temporary_and_permanent)"

Hi. To me we have some persistent LTAs and enough spambots getting caught against filters, that I think that it is time that we consider the ability to apply blocks with spam filters, either short term application or permanent. The blocking ability is now in place in numbers of wikis, and has been for a number of years and it is not seen as problematic or out of control. If we did go down the path, we would want to look at some concepts and practice around what would, and how would we apply temporary or permanent blocks, though as we already have a good blocking policy and application of that, then it is not about novel concepts of why we are blocking. If there is a general feeling of agreement, then I will put forward a more specific plan. …

So please don't selectively quote or misrepresent what has been said. I said that I would come forward with a proposal, and I have done so. I did not call for votes, and nobody counted votes; people expressed opinions as guidance in response to my opening statement. I would also like to address the contradiction in some of the argument. It is indicated that this is a limited-scope discussion for a limited set of people who are interested and know what we are talking about. Yet it is also argued that the conversation should be at another forum, where it would be of less interest and less relevance, with a smaller knowledge base, so how does that work? This is an administrator-only action, and there are a number of administrators who keep away from the area, so how is that going to progress in a more inexperienced forum?  — billinghurst sDrewth 10:08, 16 February 2020 (UTC)

The contributors affected by this change are not limited to administrators. It is weird to limit the discussion or consensus to those in the sysop group, when it is everyone that will be affected by it. From what you are saying here, I don't understand why you are replying to me, because I am not an administrator, so by the logic above, I have no say here on what happens. -- Fæ (talk) 13:09, 16 February 2020 (UTC)
 Info Comment made before the move from COM:AN -- Fæ (talk) 20:02, 16 February 2020 (UTC)
 Comment Hard to reply to this. I think the time period should be reduced to 1 hour, that's typically enough for a human to look at it. In my experience even humans have a hard time correctly identifying abuse. Also, these automated blocks should virtually always be partial blocks: user talk should never be blocked, unless user talk page abuse is specifically the target of the filter. I've experienced more than once the situation where I got blocked on a site by some automated process and as a result I also couldn't complain about my block, because I was blocked! And of course, how do we decide what kind of thing will be eligible for an automatic block? - Alexis Jazz ping plz 12:36, 18 February 2020 (UTC)
@Alexis Jazz: To my best knowledge, abuse filters can't partially block users, but blocking talk page access can be modified. I think talk page access should never be blocked using abuse filter.
We will most probably hide these filters so that people won't be able to bypass them. That's especially needed for LTAs. When we're sure that the filter is working properly, an administrator can bring it up on the administrators' noticeboard. If there is at least one support and no oppose, or there is general consensus, we can enable the blocking feature. These blocks should surely be monitored routinely. I can't think of any other proper way, because the logs and code of a hidden filter are not available to non-admins. Maybe we should create an "edit filter managers" user group here as well so that skilled non-admins can work on filters too. Ahmadtalk 15:42, 18 February 2020 (UTC)
@Ahmad252: well I do remember one time (can't remember the exact details now, doesn't really matter) when I wasn't able to contribute to a WMF project. Somebody had misconfigured the abuse filter and basically everybody who wasn't a trusted user was blocked from editing. They hadn't gotten any complaints, because doh, nobody who was affected could file any. That's something we should really avoid. - Alexis Jazz ping plz 15:46, 18 February 2020 (UTC)
@Alexis Jazz: While I remember the incident you are talking about, such a thing is no longer possible. The abuse filter now has an emergency shutoff function that would automatically turn the filter off if it is hitting too many people all at once. So while it is possible to design a filter to turn off editing for everyone, in practice the shutoff would stop that pretty quickly. As for your other comment that these should be partial blocks, I don't think that is possible. In any case, these types of filters, as I mentioned above, should really only be used as a nuclear option, and such a thing is really only useful for persistent LTAs who repeatedly test multiple edits very, very quickly to try to find a way around our current filter system. With blocking filters, their first hit would short-circuit their attempts immediately. The time frame for blocks can always be discussed and changed or tweaked. And if I recall correctly from the documentation, the time frame set is only a default anyway. It can be modified during the creation of the filter. --Majora (talk) 17:46, 29 February 2020 (UTC)
@Majora: Emergency throttling looks neat. Is it known whether it works well in practice? And I know partial blocks are currently not possible with AF, but that could be requested if it hasn't been requested already. That does leave one thing which I actually do find important and probably also should exist for the current abuse filters: a public description of what it is targeting, if at all possible with a link to some related discussion or checkuser request. - Alexis Jazz ping plz 18:05, 29 February 2020 (UTC)
@Alexis Jazz: I don't know how well it works in practice. I've never designed a filter bad enough to have triggered it (luckily). I also don't know what Commons's settings are. The emergency shutoff is hard-coded into each project's settings, and those settings are not viewable by normal people. So we would have to ask what our settings are. All abuse filters have a public title. For most of these cases where a blocking abuse filter would be warranted, I'm not sure a detailed description would be appropriate. Enwiki has an LTA page and I've actually heard of long term abusers trying to tailor their attacks to get a page there. They see it as a mark of pride. I'm not entirely sure naming an abuse filter directly after a specific LTA wouldn't have the same effect. That's why, at least in my experience, such filters have much vaguer names. --Majora (talk) 18:09, 29 February 2020 (UTC)
@Majora: it's a balance between being effective and being open. What about a revision ID for the related discussion? - Alexis Jazz ping plz 18:17, 29 February 2020 (UTC)
If there is a specific revision ID or a specific discussion about a LTA I wouldn't mind putting that in somewhere. But for some LTAs I'm not aware of any community discussion regarding their persona non grata status here. They just are vandals. Pure and simple vandals, so discussions may not have occurred. If we are requiring a community discussion for a blocking filter to be used I wouldn't oppose including a link to that in the filter title. --Majora (talk) 18:21, 29 February 2020 (UTC)
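
The emergency shutoff mentioned earlier in this thread can be pictured with a small Python sketch. Everything here is an assumption made for illustration: the FilterStats record, the threshold values, and the exact rule are hypothetical, and they are neither the AbuseFilter extension's code nor Commons's actual settings (which, as noted above, are not publicly viewable).

```python
from dataclasses import dataclass


@dataclass
class FilterStats:
    recent_actions: int  # actions checked since the filter was last changed
    recent_hits: int     # how many of those actions the filter matched
    enabled: bool = True


def apply_emergency_shutoff(
    stats: FilterStats,
    max_hit_ratio: float = 0.05,  # hypothetical threshold, not a real setting
    min_hits: int = 2,            # hypothetical threshold, not a real setting
) -> bool:
    """Disable a filter that matches too large a share of recent actions.

    Returns whether the filter is still enabled afterwards.
    """
    if not stats.enabled or stats.recent_actions == 0:
        return stats.enabled
    ratio = stats.recent_hits / stats.recent_actions
    if stats.recent_hits > min_hits and ratio > max_hit_ratio:
        stats.enabled = False
    return stats.enabled


narrow = FilterStats(recent_actions=1000, recent_hits=8)
misconfigured = FilterStats(recent_actions=1000, recent_hits=400)

narrow_still_on = apply_emergency_shutoff(narrow)
misconfigured_still_on = apply_emergency_shutoff(misconfigured)
```

In this sketch a narrowly targeted filter stays enabled, while a filter that suddenly matches a large share of all actions is switched off before it can block many people.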
  •  Comment - If abuse filters are going to block users, I think there must be a bot that notifies users of their blocks properly so that they will know how to appeal their block. Abuse filter doesn't do that, so the blocked user wouldn't know how they can appeal. Ahmadtalk 15:44, 23 February 2020 (UTC)
  •  Question Since the contents of abuse filters can be hidden to non-administrators, are there any plans to use blocks in those hidden filters? I don't think that's a good idea. pandakekok9 02:08, 12 March 2020 (UTC)
    @Pandakekok9: Filters for LTAs are generally always hidden. This is so they can't just look at the filter and figure out, immediately, how to get around it. As this feature would really only be used to block long term, highly active abusers, I would imagine that all blocking filters would be hidden by necessity. For example, one of the main filters that I monitor, tracking a very long term abuser, is generally tripped half a dozen times or more before they either give up for the time being, get through and are blocked that way, or get blocked by a watchful admin before they can actually make an edit (we have filter logs displayed in real time on IRC). Recently, on the 11th, they were blocked by the filter 8 times in the span of 5 minutes. With a blocking filter this would have been stopped at the first filter trip. --Majora (talk) 00:26, 14 March 2020 (UTC)
    @Majora: Okay, seems fair enough pandakekok9 01:47, 14 March 2020 (UTC)

To those that have opposed this proposal, Fæ, Taivo, GreenMeansGo, Donald Trung, Abzeronow, and Alexis Jazz. In the past few hours I've blocked 4 IPs and 2 accounts that belong to a long term, extremely active abuser. I happened to be connected to the abuse filter log channel on IRC, which allowed me to quickly notice and block them before any actual edit could be made, but that is often not the case. It is situations like this that a blocking filter would be used for. There are no false positives (in the spirit of honesty, there were some in the very early stages when I had a logic error in the code that has since been fixed). I can't really stay awake forever monitoring the IRC channel, and having this type of option available in our toolkit would certainly help stop the disruption in these cases. --Majora (talk) 01:11, 18 March 2020 (UTC)

I only opposed on procedural grounds. Looks like that's been resolved. Stricken. GMGtalk 10:13, 18 March 2020 (UTC)
@Majora: alternative set of requirements for me:
  • A phabricator ticket is opened to request partial blocking with abuse filters if this doesn't exist already.
  • Every time an account or IP is wrongfully blocked, a post on the village pump must be made. For the sake of scale, this post may also include the number of correctly blocked IPs/accounts since the last report.
  • Knowingly omitting that report is ground for launching a desysop procedure without prior discussion.
  • If a filter or edit to a filter is responsible for 3 wrongful blocks on 3 different days within 3 months, this will be ground for launching a desysop procedure without prior discussion.
Too harsh? Note that I trust you, but this proposal doesn't restrict AF blocking to you. - Alexis Jazz ping plz 17:42, 18 March 2020 (UTC)
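
Read literally, the fourth requirement above (3 wrongful blocks on 3 different days within 3 months) is a small date check. The Python sketch below is one possible reading, offered only as an illustration: the function name is hypothetical, and treating "3 months" as a 90-day window is an assumption.

```python
from datetime import date, timedelta
from typing import Iterable


def crosses_threshold(
    wrongful_block_dates: Iterable[date],
    min_days: int = 3,
    window: timedelta = timedelta(days=90),  # assumption: "3 months" as 90 days
) -> bool:
    """Return True if wrongful blocks fall on at least `min_days` different
    days that all lie within one `window`."""
    days = sorted(set(wrongful_block_dates))  # several blocks on one day count once
    for i in range(len(days) - min_days + 1):
        # days[i] and days[i + min_days - 1] bound a run of min_days distinct days.
        if days[i + min_days - 1] - days[i] <= window:
            return True
    return False


same_day_burst = [date(2020, 1, 1), date(2020, 1, 1), date(2020, 1, 1)]
spread_out = [date(2020, 1, 1), date(2020, 2, 1), date(2020, 3, 1)]
too_far_apart = [date(2020, 1, 1), date(2020, 2, 1), date(2020, 6, 1)]

burst_result = crosses_threshold(same_day_burst)
spread_result = crosses_threshold(spread_out)
far_result = crosses_threshold(too_far_apart)
```

Under this reading, a single logic error that hits several people on the same day counts as one day and does not reach the threshold on its own.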
@Alexis Jazz: The first bullet point is not really a requirement, more of a statement. The second is fine, but I don't see why you chose the village pump for such a notification. The third shouldn't really happen, as admins should be monitoring their own filters, but I guess, ok. The fourth is a tad much, as we are humans and mistakes do happen. One error in logic can hit a few people before it is caught and corrected. That is grounds for a reevaluation of the filter. Not for a public hanging. Limiting it to multiple occurrences would be fine, as the literal interpretation of your fourth bullet point implies, but you should still remember that we are not automatons. Mistakes do and will occur. As a side note, this proposal is not about the operational requirements of operating a blocking abuse filter. It is about whether or not to turn the feature on or off. Nothing more. The details were always going to come as a secondary proposal. --Majora (talk) 21:20, 18 March 2020 (UTC)
@Majora: "The first bullet point is not really a requirement more of a statement."
It's a requirement for my support. I think such a request should be made by someone who actually works with the stuff, in other words, not me.
"The second is fine but I don't see why you chose the village pump for such a notification."
Other suggestions? AN could perhaps be acceptable, but it has to be public and not hidden.
"The third shouldn't really happen"
It shouldn't, but unfortunately we have a history of admins and bureaucrats doing all kinds of things they're not supposed to be doing. Look at this and then look at this. Some people just don't seem to care much.
"The fourth is a tad much as we are humans and mistakes do happen. One error in logic can hit a few people before it is caught and corrected. That is grounds for a reevaluation of the filter. Not for a public hanging."
Fair enough, but my fear is this: some admin who configures AF blocks and simply doesn't care about 10% false positives and continues to configure AF blocks like that, fully knowing they are causing damage but simply not giving a rat's ass. Something must be defined that allows a desysop to be started without being shut down by a bureaucrat because those bureaucrats can't interpret the rules as they were originally intended. I know, it's a different problem, but AF blocks are very powerful and in the wrong hands can cause a lot of damage.
"It is about whether or not to turn the feature on or off. Nothing more."
If conditions can't be negotiated beforehand, I'd have to oppose. Turning it on depends on conditions, at least for me. - Alexis Jazz ping plz 01:05, 19 March 2020 (UTC)
@Alexis Jazz: Oh! I thought you meant that you had already opened one. Not that you require one. There is phab:T201815. It is a part of a much, much larger "Enhance blocking in AbuseFilter" request. Not exactly a promising ticket, as the main ticket has been open since 2017, but there is a request there. AN should probably be fine. We don't have a centralized abuse filter noticeboard, and AN would be both public and likely to get the attention of others who know how to configure the filter. More opinions are always better when trying to track down why something hit when it shouldn't have. I understand your desire to have conditions discussed beforehand, I really do. But that is kinda beyond the scope of this proposal and makes it harder for others to properly voice their opinion on the main topic. I'm fairly confident that we can refrain from using the blocking feature until the exact circumstances of its use are reached by community consensus. Even if it is turned on. And even if someone does overstep, abuse filter blocks are logged, just like any other. The block itself is performed by User:Abuse filter, so its block log will be able to show you every block that was done, making even private filters partially trackable by everyone. --Majora (talk) 01:46, 19 March 2020 (UTC)

Category overdiffusion by date

Seeing the overdiffusion discussed in Commons:Categories for discussion/2019/06/Category:Mausoleum of Queen Arwa bint Ahmad Al-Suhayli by year made me cringe. Overdiffusion has been discussed previously (2018, 2019), but there has never been a follow-up. My main issue is the overdiffusion by date: a building will rarely look any different a year later, so I really do not see the point of grouping pictures based on when they were taken. Yet there are still proponents of highly specific by-date categorisation, and it has already rooted itself firmly in our category structures, so it will be very difficult to fix. As a compromise, can we please at least codify that a by-date category should always be accompanied by its date-free counterpart or a date-free subcategory thereof? --HyperGaruda (talk) 20:36, 29 February 2020 (UTC)

Overdiffusion votes

  •  Support as proposer. --HyperGaruda (talk) 20:36, 29 February 2020 (UTC)
  •  Support I'm not happy to compromise on such an obvious issue, but if there is no way of getting rid of the unwarranted by-date categories, it's a helpful initiative. --Jonund (talk) 15:49, 1 March 2020 (UTC)
  •  Support I don't find these "buildings by year" categories helpful at all. -- Deadstar (msg) 16:50, 1 March 2020 (UTC)
  •  Oppose: Certainly there’s many “cringe”-worthy cases of bad categorization one could find (lovely collegial tone, too), but as it is this proposal is one more deletionist wikilawyering attempt. Define “overdiffusion” first, in a clear manner (akin to that of overcategorization), one that can be approved by wide community consensus. (Question number 1: What is overdiffusion, as opposed to regular diffusion?) Only after that can the problem, if there’s one, be addressed. -- Tuválkin 17:04, 1 March 2020 (UTC)
    • (oh, this again) “overdiffusion” is not defined. Files pertaining to a given year should be categorized as such. If other characteristics of the file or its contents, other than date, get lost by categorizing by year, then create other categorization subtrees, instead of undoing the work already done. If someone wants an overview, create a gallery. If this proposal is meant to mean «use common sense and do not create date-cats prematurely», then word it as such instead of allowing that sensible original motivation to be hijacked by those who think that the ideal number of categories is zero because wikidata and intersection and deep search and other bogus reasons. -- Tuválkin 01:55, 24 March 2020 (UTC)
  •  Support Such over-categorization makes it impossible to find anything on Commons. Kaldari (talk) 02:54, 2 March 2020 (UTC)
  •  Neutral These categories might be sometimes useful (for example, if I know I have taken a picture of the building on a certain day I can relatively easily find it), but I do not see any reason why these categories should not be hidden.--Ymblanter (talk) 06:45, 2 March 2020 (UTC)
  •  Oppose, while I can understand that not every building changes every year, major renovations and remodeling are a thing and buildings that retain the same address can change ownership and both exterior and interior decorative styles over the years. If a re-user is looking for images of a certain building when it was owned by Company X and not Company Y, or because a large part of it was damaged by some natural disaster and had later been remodeled, then these categories are very helpful. Images which basically contain the exact same building but with little difference 5 (five) years later or earlier can already be deleted if they don't add anything of educational value on their own. These categories furthermore prevent the overpopulation of certain categories like "[Year] in Chicago" by narrowing it down and therefore make it easier for re-users to find images. --Donald Trung 『徵國單』 (No Fake News 💬) (WikiProject Numismatics 💴) (Articles 📚) 11:26, 6 March 2020 (UTC)
You make it sound as if I was proposing to entirely do away with date-based categories, which I am not. --HyperGaruda (talk) 06:58, 7 March 2020 (UTC)
  •  Support except in cases where we have literally 100+ images in many of the resulting categories. And even then this should typically not be the only axis of diffusion, and it shouldn't be intersected with the others. -- Jmabel ! talk 16:58, 18 March 2020 (UTC)
  •  Support Most cases I've seen like this (particularly with people by year) are ridiculous. Keep it simple please, and avoid this type of category unless it's *really* necessary. Thanks. Mike Peel (talk) 10:28, 24 March 2020 (UTC)
  • (Oh, the word "ridiculous", yet another token of collegiality and good faith.) Categories that categorize people by year intersect with Category:nn-year old humans; it’s better than categorization of individual files when there is enough of them — which is the case. -- Tuválkin 14:16, 28 April 2020 (UTC)
  •  Support: In general, intersection is a thing that can and should be done by software. If one day the UI will provide this feature in a reasonable manner we will get rid of millions of obsolete categories. However, I'm not sure if that will happen during my lifetime... --Achim (talk) 19:50, 25 March 2020 (UTC)
  •  Support but this doesn't solve the underlying issue, which is categories existing when they shouldn't. We should prohibit by-date subcategories of static subjects unless they reach a certain size. -- King of 05:06, 21 April 2020 (UTC)
    "Prohibit" is another warm and fuzzy word. And what would the penalty be against such prohibition? -- Tuválkin 14:16, 28 April 2020 (UTC)
    Actually, I  Oppose this proposal, even though I agree with the general sentiment that overdiffusion is a problem. I do not believe that by-date categories should be always accompanied by date-free categories, as in my opinion for a building that doesn't change over time, the sole reason for by-date categories to exist is for the purpose of diffusion. Either a building is the Eiffel Tower and we diffuse completely (i.e. the main category should ideally contain no images), or the building is not so popular and we don't diffuse by date at all. In response to your question: The "penalty" would be that overdiffused subcategories get nominated for deletion. -- King of ♥ 22:09, 23 May 2020 (UTC)
  • Oppose The general proposal. I think the system worked in that people saw it as overly categorized, took it for deletion and discussed it there. I don't think a general rule is warranted. If there was a general rule, then I would say we revise Commons policy so that it's something like "by year if you have more than 10 or else by decade or by century" or something general. It's the same with the maps discussion below. -- Ricky81682 (talk) 01:57, 22 May 2020 (UTC)

Overdiffusion discussion

Case in point: Category:Jacob Riis Park by year. I can see separating out the year there were 450 photos (though I bet something else about them characterized WHY there were 450 photos that year), but one of these categories has all of 5 photos. - Jmabel ! talk 04:00, 28 March 2020 (UTC)

One person took a lot of photos. We should be happy we have that. Otherwise, we would have all 450 photos in both Category:2018 in New York City (they are already in the month subcategories) and in Category:Jacob Riis Park. Separating out that year means that both parent categories are a little more organized. The five photographs may be too small but I think we need a broader discussion on categorization in general. -- Ricky81682 (talk) 02:03, 22 May 2020 (UTC)

Where to codify this

Overdiffusion wording

Problem with that is that it would invoke COM:OVERCAT, of which I am no great fan, and someone WILL come along and remove it from Category:Great Mosque of Mecca, citing that policy. Rodhullandemu (talk) 12:27, 1 March 2020 (UTC)
As you put it, I guess my proposal is indeed a request for adding an exception to the overcategorisation rule. I believe, however, it is more annoying having to browse through Category:Great Mosque of Mecca by year and all its sub(sub)categories (53 pages in total!) in an attempt to find pictures of, say, its minarets. --HyperGaruda (talk) 13:31, 1 March 2020 (UTC)
  • Per what I've said in the "old maps" section immediately below, how about:

Specific "... by year" categories bring a risk of overdiffusion, making images on a particular subject harder to find and harder to view together. Such categories should not normally be created, unless they bring together images of more than one different subject, that each have their own time-independent category.

I think this wording would help promote the existence of time-independent narrower subject categories, which I agree are to be encouraged (when there are enough images to fill them). Jheald (talk) 18:24, 1 March 2020 (UTC)
  • @HyperGaruda: Like Themightyquill, I'm undecided. Indeed, hiding content into thousands of small pigeonholes (like, say, Category:June 2016 in Moscow with >1000 files) has more flaws than benefits. But taking your proposal literally, all files in this category should be piled into Category:Moscow, which then will become impractically huge. So what works for a smaller (speaking only of file count) subject scope will not work for larger items. There should be some line drawn, but where and how? Retired electrician (talk) 08:53, 28 April 2020 (UTC)
Or, if the problem is the fact that some files might be categorized only with something like Category:June 2016 in Moscow and no other cat, then the solution is not to attack that single categorization, which is valid, to make it less useful, but to complement it with more categorization, as in the stork example above.
The solution is not to combat “overdiffusion”, which is a false problem — the real problem is a relative lack of categorization depth, when a file was subjected to more detailed categorization concerning one of its categorizable aspects and less so or none at all concerning others. That problem needs to be addressed — which is mostly done by curation work, not by wikilawyering proposals of new rules and their wording.
Overdiffusion is, at best, a red herring, and those even considering it are, again at best, misinformed about Commons categorization. -- Tuválkin 20:21, 29 April 2020 (UTC)
@Retired electrician: the problem is indeed as Tuválkin described, where some images are only categorised by date. The example file however, already has other categories and subcategories of Category:Moscow, so there is no need for it to be in Category:Moscow as well. What I try to achieve with this proposal is to "force", if you will, people who add only a date category to also include something directly related to what is depicted. --HyperGaruda (talk) 18:47, 2 May 2020 (UTC)

Old maps of ... by year

  • I certainly see this issue with Old Maps. In most cases the most useful grouping for old maps is to bring together maps that all cover the same area, regardless of the exact year when they may have been created -- so one can see at a glance how the portrayal developed of the same subject, often over centuries, and so that one can find all those maps together, in the one category, ideally sorted into date order. Breaking out the maps on a single subject into a myriad multitude of subcategories "by year" (cf: Category:Maps of Hamburg by year) is a menace, and should be banned.
There may be a case for a separate category showing all the maps together from across a number of subjects that were created in a single year. I am not 100% convinced by this -- I suspect that kind of search will soon be handled by an SDC search, rather than a purpose-built category.
I would back a rule to say that "by year" categories should only be allowed if they bring together images of multiple different subjects taken in a particular year (where the different subjects each have their own specific categories, not "by year"). Images of a single subject -- eg a particular building, or maps of a particular defined region -- should have a specific time-independent category, which should not be split down by year unless it has become very full. Jheald (talk) 17:54, 1 March 2020 (UTC)
However, pinging @AnRo0002: , who has created a number of these categories for "Old maps of ... by year", and may take a different view. For myself I would burn almost all of these "by year" categories down. Jheald (talk) 17:57, 1 March 2020 (UTC)
Also pinging @Aeroid: who's made some of these. Jheald (talk) 18:16, 1 March 2020 (UTC)
New to this topic, sorry. Yes, categorizing the Hamburg Maps was quite some work. The problem was that after an import the Hamburg Maps category had hundreds of maps, most of them in two versions and many also in more than these two. This was the easiest way to bring these groups of the same map together and I hoped this would help also in the future. I just followed some pre-existing categories and nav-bars which looked to me like a good idea. If this is problematic, I would like to understand what you would suggest to not have hundreds of maps of Hamburg filling up one category. --Aeroid (talk) 20:10, 1 March 2020 (UTC)
@Aeroid: Where we have hundreds of maps of a place like Hamburg, I would first split by what they show -- do they show the whole of Hamburg, or only specific parts? Where they show eg the whole of Hamburg, I would order them by date created (I have a script that can help to do this), but leave them in the one category, so that they can be browsed and the sequence can be understood. So long as they show the same subject and are ordered in the category in sequence, a category size of two to three hundred maps is quite readily navigable, and far preferable to two to three hundred different categories. If the category still contains too many maps, then I would consider splitting by century. But I would be very reluctant to split more by time than absolutely necessary. To me splitting by subject makes more sense (while at the same time also having categories for specific atlases or map series that belong together). Jheald (talk) 20:36, 1 March 2020 (UTC)

Jheald describes one use-case. But there are others, orthogonal to it - for instance, someone may be researching "Hamburg in the 18th century". A solution may be to categorise by date and to have an "all old maps of Hamburg" category.

Or we could rely on well-applied structured data ;-) Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:23, 1 March 2020 (UTC)

@Pigsonthewing: I think the "Hamburg in the 18th century" case is quite well achieved when the maps are ordered by date, so that someone can just scroll through the category to then find all of the 18th century maps together. (Note that per the usual rubric, one would expect to find later maps depicting Hamburg as it was at an earlier date in c:Old maps of the history of Hamburg, etc.) What I don't think helps someone researching "Hamburg in the 18th century" is if they have to separately work through c:Category:1702 maps of Hamburg, c:Category:1705 maps of Hamburg, c:Category:1716 maps of Hamburg, c:Category:1720 maps of Hamburg, etc, each one only containing a single or a couple of images.
As for SDC, SDC searches will come in time. But I for one don't think they will ever fully replace the ease of clicking and browsing pre-curated categories. SDC may turn out to help a lot in making categories more fully comprehensive though. Jheald (talk) 21:37, 1 March 2020 (UTC)
I value the additional discoverability quite highly. Note the additional use cases supported by the higher level categories like c:Category:1858 maps of Germany and c:Category:1858 maps of Europe inherited from categorizing it as c:Category:1858 maps of Hamburg and the different nav-bars that make this image discoverable. It becomes easy to find an adjacent map of SH in the same year.
Your "browsing" use case could well work if the UI had a capability to expand categories to images, similar to the way subcategories unfold on a category page.--Aeroid (talk) 09:01, 2 March 2020 (UTC)