This is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
Change all instances (6 now) to .[A-z]{ . a-z only matches lowercase file extensions. A-z will match any mix of lowercase and capital letters.--Roy17 (talk) 11:16, 12 July 2019 (UTC)
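A quick check of what the `A-z` range covers may be useful here. The sketch below uses Python's `re` module rather than the engine behind the title blacklist, on the assumption that plain ASCII ranges behave the same way in both; the names `in_range`, `letters` and `others` are made up for this illustration.

```python
import re

# Every ASCII character accepted by the class [A-z] (code points 0x41 to 0x7A).
in_range = [chr(cp) for cp in range(128) if re.fullmatch(r"[A-z]", chr(cp))]

letters = [c for c in in_range if c.isalpha()]
others = [c for c in in_range if not c.isalpha()]

print(len(letters))  # number of ASCII letters inside the range
print(others)        # the non-letter characters that sit between 'Z' and 'a'
```

The range therefore admits six punctuation characters in addition to the letters; `[A-Za-z]` is the class that accepts letters only.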
File:(Screenshot|SCREENSHOT|Screencap|SCREENCAP|螢幕截圖|截圖|截图|写真)[Ss]*[\d\s\.\,\-\_\'\~]*[A-z]?\.[A-z]{2,} <reupload|errmsg=titleblacklist-custom-filename> I think this will catch them all. Anything that (1) starts with screenshot or synonyms in English, Chinese or Japanese (feel free to add other languages) (2) followed by a string of digits, whitespaces, dots, commas, minus signs, underscores, apostrophes and tildes (3) followed by at most one single letter (4) of any extension, will be blocked.
I was just testing filenames using UploadWizard. Strangely, "Screenshot 2019-06-09 【一二三四五六七八九十是好的】赵钱孙李" would be blocked. I can't find anything in the Commons blacklist that would match this. It seems to be a global blacklist that matches everything like Screenshot 20XX-XX-XX, whatever follows. Yet Screenshot 2019 0706 150131.png would escape the list. I believe an upgrade to my version should block all such names. --Roy17 (talk) 11:16, 12 July 2019 (UTC)
Please add parentheses as well. Change \d\s\.\,\-\_\'\~ to \d\s\.\,\-\_\'\~\(\). I was uploading Windows screenshots whose names had the format screenshot (99).png. This was not caught because of the parentheses.--Roy17 (talk) 18:41, 24 September 2019 (UTC)
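The effect of the two proposals above can be tried out with a small sketch. It assumes that blacklist entries are matched against the whole title and case-insensitively, which is approximated here with `re.fullmatch` and `re.IGNORECASE` in Python; the helper `rule` and the third sample filename are hypothetical.

```python
import re

SYNONYMS = "Screenshot|SCREENSHOT|Screencap|SCREENCAP|螢幕截圖|截圖|截图|写真"

def rule(char_class):
    # Build the proposed entry around a given "digits and punctuation" class.
    pattern = "File:(" + SYNONYMS + ")[Ss]*" + char_class + r"*[A-z]?\.[A-z]{2,}"
    return re.compile(pattern, re.IGNORECASE)

proposed = rule(r"[\d\s\.\,\-\_\'\~]")      # as suggested on 12 July 2019
extended = rule(r"[\d\s\.\,\-\_\'\~\(\)]")  # with parentheses added

samples = [
    "File:Screenshot 2019 0706 150131.png",
    "File:screenshot (99).png",
    "File:Screenshots of the game.png",  # hypothetical descriptive name
]
for title in samples:
    print(title, bool(proposed.fullmatch(title)), bool(extended.fullmatch(title)))
```

Under these assumptions only the extended class catches screenshot (99).png, and neither version catches a name that continues with ordinary words.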
A few characters outside the Basic Multilingual Plane are useful in titles
I want to upload a file titled "CADAL02069441_文𤪪清娛(八).djvu" but this file name is not allowed. The character "𤪪" is in CJK Unified Ideographs Extension B, which is not allowed in Wikimedia Commons because "Very few characters outside the Basic Multilingual Plane are useful in titles". I think they are useful because not only this book, but also at least a hundred files for other Chinese books contain such characters.
I think this line can be changed from
.*[^\0-\x{FFFF}].* <casesensitive|errmsg=titleblacklist-custom-hidden-char> # Very few characters outside the Basic Multilingual Plane are useful in titles
to
.*[^\0-\x{3134F}].* <casesensitive|errmsg=titleblacklist-custom-hidden-char> # Very few characters outside the Basic Multilingual Plane and CJK Unified Ideographs are useful in titles
CJK Unified Ideographs Extension G (not used yet): 30000-3134F
CJK Compatibility Ideographs: F900–FAFF
Since the original rule has covered three of them, the following may be a more precise rule:
.*[^\0-\x{FFFF}\x{20000}-\x{2A6DF}\x{2A700}-\x{2B73F}\x{2B740}-\x{2B81F}\x{2CEB0}-\x{2EBEF}].* <casesensitive|errmsg=titleblacklist-custom-hidden-char> # Very few characters outside the Basic Multilingual Plane and CJK Unified Ideographs are useful in titles
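The difference between the current rule and the more precise one can be checked with a short sketch. Python's `re` has no `\x{...}` syntax, so the classes are rewritten with `\u` and `\U` escapes, which is assumed to be equivalent; the sample titles and the helper `blocked` are hypothetical, and U+20000 stands in for any Extension B character.

```python
import re

# Current rule: anything outside the Basic Multilingual Plane is blocked.
current = re.compile(r".*[^\0-\uFFFF].*")

# More precise rule from the proposal, with the same four supplementary ranges.
precise = re.compile(
    r".*[^\0-\uFFFF"
    r"\U00020000-\U0002A6DF\U0002A700-\U0002B73F"
    r"\U0002B740-\U0002B81F\U0002CEB0-\U0002EBEF].*"
)

def blocked(rule, title):
    return rule.fullmatch(title) is not None

ext_b = "Book " + chr(0x20000) + ".djvu"  # first code point of Extension B
emoji = "Book " + chr(0x1F600) + ".djvu"  # outside every listed range
gap = "Book " + chr(0x2B820) + ".djvu"    # between the third and fourth ranges

for title in (ext_b, emoji, gap):
    print(hex(ord(title[5])), blocked(current, title), blocked(precise, title))
```

One thing the sketch makes visible: the class as written leaves U+2B820 to U+2CEAF (CJK Unified Ideographs Extension E) blocked. Whether that omission is intended is not stated in the proposal.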
Search intitle:/[0-9]+[0-9\ \_\-]+[Ii][Oo][Ss]\W*\.[A-z]+/ to see bad examples.
I suggest adding File:[0-9]+[0-9\ \_\-]+[Ii][Oo][Ss]\W*\.[A-z]+ <reupload|errmsg=titleblacklist-custom-filename> # iOS.--Roy17 (talk) 23:26, 7 March 2020 (UTC)
Done, I tried it out in the UploadWizard and the warning is shown now. (I slightly simplified the regex because it’s case-insensitive by default.) --Lucas Werkmeister (talk) 22:00, 8 April 2020 (UTC)
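The exact simplified regex is not quoted above. The sketch below shows one possible simplification, not necessarily the one that was applied: with case-insensitive matching, `[Ii][Oo][Ss]` can be written as `ios`. It uses Python's `re` with `re.IGNORECASE` and whole-title matching as stand-ins for the blacklist's behaviour, and the sample filenames are hypothetical.

```python
import re

as_proposed = re.compile(r"File:[0-9]+[0-9\ \_\-]+[Ii][Oo][Ss]\W*\.[A-z]+", re.IGNORECASE)
simplified = re.compile(r"File:[0-9]+[0-9\ \_\-]+ios\W*\.[A-z]+", re.IGNORECASE)

samples = [
    "File:20200307 123456 iOS.jpg",
    "File:20200307_123456_IOS.PNG",
    "File:Sunset over the bay.jpg",
]
results = [
    (bool(as_proposed.fullmatch(t)), bool(simplified.fullmatch(t))) for t in samples
]
print(results)
```

Both forms agree on every sample, which is the property a simplification of this kind needs to preserve.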
I noticed that several of the regular expressions contain capturing groups even though they don’t use them to refer back to the captured part (with \1 etc.) – for example: File:(照|图)片.*, File:.*(UNADJUSTEDNONRAW).*, (Template|Translations:Template):Welcome\/i18n\/.*. Are those capturing groups used anywhere, or can they be turned into non-capturing groups ((?:pattern)) or removed? I suspected they might be used in the message, but as far as I can tell that’s not the case. --Lucas Werkmeister (talk) 20:40, 1 July 2020 (UTC)
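For what it is worth, switching to a non-capturing group does not change which titles match; it only stops the engine from recording the group. A minimal sketch in Python's `re`, using one of the entries quoted above and hypothetical sample titles:

```python
import re

capturing = re.compile(r"File:(照|图)片.*")
non_capturing = re.compile(r"File:(?:照|图)片.*")

titles = ["File:照片 001.jpg", "File:图片.png", "File:风景.jpg"]
same_verdict = all(
    bool(capturing.fullmatch(t)) == bool(non_capturing.fullmatch(t)) for t in titles
)

print(same_verdict)
print(capturing.groups, non_capturing.groups)  # number of capture groups in each
```

This says nothing about whether the captured text is read anywhere else, which is the open question in the comment above.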
Blocking files with double extensions?
Would it make sense to have a blacklist rule to block filenames like File:Eurovision Song Contest 2021 logo.svg.png and File:2021 Avroviziya Mahnı Müsabiqəsi loqo.svg.png? On enwiki we use File:.*(\,|\.)(png|gif|jpe?g|tiff?|xcf|pdf|mid|ogg|ogv|svg|djvu|oga|flac|opus|wav|webm)(\ |\,)?\.(png|gif|jpe?g|tiff?|xcf|pdf|mid|ogg|ogv|svg|djvu|oga|flac|opus|wav|webm) <reupload | errmsg=titleblacklist-custom-file-extension> for this purpose; it would presumably work on Commons as well if you change errmsg=titleblacklist-custom-file-extension to errmsg=titleblacklist-custom-filename. Jo-Jo Eumerus (talk) 17:00, 23 November 2020 (UTC)
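The enwiki rule can be tried against the two filenames mentioned. This is a sketch in Python's `re`, assuming whole-title, case-insensitive matching; the constant `EXT` is introduced here only to avoid repeating the extension list, and the single-extension title is a hypothetical counter-example.

```python
import re

EXT = "png|gif|jpe?g|tiff?|xcf|pdf|mid|ogg|ogv|svg|djvu|oga|flac|opus|wav|webm"
double_ext = re.compile(rf"File:.*(\,|\.)({EXT})(\ |\,)?\.({EXT})", re.IGNORECASE)

titles = [
    "File:Eurovision Song Contest 2021 logo.svg.png",
    "File:2021 Avroviziya Mahnı Müsabiqəsi loqo.svg.png",
    "File:Eurovision Song Contest 2021 logo.svg",
]
verdicts = [double_ext.fullmatch(t) is not None for t in titles]
print(verdicts)
```

Both reported filenames are caught, while a title with a single extension passes.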
Not seeing which rule it's hitting even after a closer examination. Nor can I find it in the global blacklist. --Xover (talk) 20:44, 5 March 2021 (UTC)
@AntiCompositeNumber: Ah, thanks. My eyes didn't notice the \P in there. Funny, I used the same approach to killfile newsgroup subject lines back when those particular dinosaurs still roamed and nobody much had heard of Unicode (Quoted-Printable, ISO 8859-1 vs. Windows 1252, and whether to permit HTML in netnews posts and email were the hot topics).
File:\P{L}*\.[^.]+ is a pretty blunt weapon in the title blacklist. "Easiest thing to do […]" Sure. But this is the standard naming schema for books, so this is an entirely valid filename even if it is an edge case (there's going to be a handful of others where the title contains no letters qua letters). And adding random strings to it does have some consequences, not just for the humans trying to use it directly with MW image syntax ([[File:… ]]) but also indirectly through Proofread Page on the Wikisourcen (where the Index: and all the proofread Page:-namespace pages must have the same basepagename since that's how PRP connects them; that's 200+ wikipages with a random string in the name just to circumvent the blacklist, for this work alone). There's enough trouble in disambiguating works with actually identical titles if we're not going to add artificially imposed constraints like this to the mix.
Could we at least add \p{N} to the set of valid characters? Or explicitly whitelist anything that matches the book naming schema (roughly File:.+ \(\d{1,4}\)\.[^.]+; any number of printable characters - including symbols, emoji, and numbers - and literal space characters, followed by any valid year in parentheses)? And perhaps in either case give this case a custom message that describes a process of requesting temporary whitelisting for the specific filename, or of uploading to a temporary name and requesting a +sysop (can filemovers do this?) to rename the file to the final name. --Xover (talk) 08:34, 6 March 2021 (UTC)
@Xover: Looks like it's been in place for a decade. I don't think allowing filenames with only numbers and punctuation is a great idea either, and the whitelist you suggested is a little overbroad. It would allow things like File:Untitled document (1).pdf, which we don't want. Allowing 4-digit years only might be better, and you could probably convince me to whitelist File:\d{4} \(\d{4}\)\.[^.]+. The existing message does already say to ask for help here or at AN. Only template editors and administrators can override the title blacklist on Commons, normal file movers can't. I can temporarily whitelist this filename or move a file to it, whatever you prefer. --AntiCompositeNumber (talk) 00:18, 10 March 2021 (UTC)
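The trade-off between the two whitelist candidates can be made concrete with a sketch. Python's `re` does not support `\P{L}`, so the existing rule is emulated with `unicodedata` in a hypothetical helper, `blunt_rule_hits`; the title File:1984 (1949).pdf is an invented example of a letterless book title, not the file under discussion.

```python
import re
import unicodedata

def blunt_rule_hits(title):
    # Emulates the existing rule: no letter (Unicode category L*) anywhere
    # between the "File:" prefix and the final extension.
    if not title.startswith("File:"):
        return False
    stem, dot, ext = title[len("File:"):].rpartition(".")
    if not dot or not ext:
        return False
    return not any(unicodedata.category(ch).startswith("L") for ch in stem)

broad = re.compile(r"File:.+ \(\d{1,4}\)\.[^.]+")    # the rough book schema
narrow = re.compile(r"File:\d{4} \(\d{4}\)\.[^.]+")  # the four-digit variant

for title in ["File:1984 (1949).pdf", "File:Untitled document (1).pdf"]:
    print(
        title,
        blunt_rule_hits(title),
        bool(broad.fullmatch(title)),
        bool(narrow.fullmatch(title)),
    )
```

In this sketch the broad schema whitelists both titles, including the unwanted Untitled document (1).pdf, while the four-digit variant whitelists only the letterless one.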
One of the entries in the forbidden-characters list covers the wrong codepoint range
Currently, the sixth of the seven entries forbidding certain characters in titles reads thus:
.*[\x{E000}-\x{F8FF}\x{FFF0}-\x{FFFF}].* <casesensitive|errmsg=titleblacklist-custom-hidden-char> # Surrogates, Private Use Area and Specials, including the Replacement Character U+FFFD
However, the ranges given cover only the PUA (U+E000 through U+F8FF) and the Specials block (U+FFF0 through U+FFFF), and not the surrogate range (U+D800 through U+DFFF). Thus, if my interpretation is correct, this entry (which claims to block surrogates as well as PUA and Specials characters) would not actually prevent someone from uploading a file with a title containing unpaired surrogates. I would like to suggest either modifying it to actually block surrogates as well - for instance,
.*[\x{D800}-\x{F8FF}\x{FFF0}-\x{FFFF}].* <casesensitive|errmsg=titleblacklist-custom-hidden-char> # Surrogates, Private Use Area and Specials, including the Replacement Character U+FFFD
or else modifying the description so that it no longer falsely claims to block surrogates. (Also, one might want to look at adding the noncharacter range from U+FDD0 through U+FDEF to the character - or "character" - blacklist, given that, A, there is very little reason why one would want one of those in an image title, and, B, they aren't actually title-blacklisted globally [living proof here]!) Whoop whoop pull up Bitching Betty | Averted crashes 00:30, 3 August 2021 (UTC)
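The coverage claim can be checked range by range. The sketch below rewrites only the character classes with Python `\u` escapes (assumed equivalent to the `\x{...}` form) and tests single code points; `hits` is a hypothetical helper. Python strings can hold an unpaired surrogate, so whether MediaWiki would ever pass such a title to the blacklist is a separate question that this does not settle.

```python
import re

current = re.compile(r"[\uE000-\uF8FF\uFFF0-\uFFFF]")
suggested = re.compile(r"[\uD800-\uF8FF\uFFF0-\uFFFF]")

def hits(rule, code_point):
    return rule.search(chr(code_point)) is not None

checks = {
    "surrogate U+D800": (hits(current, 0xD800), hits(suggested, 0xD800)),
    "private use U+E000": (hits(current, 0xE000), hits(suggested, 0xE000)),
    "replacement U+FFFD": (hits(current, 0xFFFD), hits(suggested, 0xFFFD)),
    "noncharacter U+FDD0": (hits(current, 0xFDD0), hits(suggested, 0xFDD0)),
}
for label, verdict in checks.items():
    print(label, verdict)
```

Only the suggested class reaches the surrogate range, and neither class covers U+FDD0 through U+FDEF.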
After several unsuccessful attempts to upload the video of Erotikon (1920 film) directly, I tried video2commons and got the following error message: "An exception occurred: TaskError: pywikibot.Error: APIError: titleblacklist-custom-filename: ⧼titleblacklist-custom-filename⧽ [help:See https://commons.wikimedia.org/w/api.php for API usage. Subscribe to the mediawiki-api-announce mailing list at <https://lists.wikimedia.org/postorius/lists/mediawiki-api-announce.lists.wikimedia.org/> for notice of API deprecations and breaking changes.; invalidparameter:filename; stasherrors:[{u'type': u'error', u'message': u'uploadstash-exception', u'code': u'uploadstash-exception', u'params': [u'UploadStashBadPathException', u"Path doesn't exist."]}]]". Any idea? -- Racconish 💬 13:06, 26 August 2021 (UTC)
LANGCODE comment
Looking at this edit (and I really do not mean this as a criticism of NguoiDungKhongDinhDanh in any way), can someone put a better comment around the LANGCODE section explaining why it's on the blacklist? The comment "people love to create this subpage" is cutesy but not actually explanatory. Pinging @Yann: who seems to actually know a thing about this. I would honestly do the exact same thing, and it's worth saying something like "need to use the actual LANGCODE". Most of the sections other than Miscellaneous have headers that are explanatory. -- Ricky81682 (talk) 08:14, 2 February 2022 (UTC)
@Ricky81682: For legacy autotranslated templates (i.e. autotranslated templates not using the Translate extension), {{TemplateBox}} prefills the translation creation form with Template:templatename/LANGCODE, where it replaces LANGCODE with the user’s user interface language only if it doesn’t exist (as it makes no sense to create the translation if it already exists). If the translation exists, the form is prefilled with literally LANGCODE—but that’s not an actual language code and it makes no sense to create a translation with this title, this is why it’s disabled by the title blacklist. See Template:PermissionTicket for an example for this translation creation form: if you open it with Colognian interface language, it prefills the box with the language code ksh, as it’s not yet translated into Colognian, but if you use Chinese, LANGCODE appears in the box. —Tacsipacsi (talk) 23:05, 8 February 2022 (UTC)
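The prefill behaviour described above can be paraphrased as a small sketch. This is illustrative Python, not the {{TemplateBox}} implementation: the functions `prefill_title` and `blacklisted`, the set of existing translations, and the suffix check standing in for the blacklist entry are all assumptions made for the example.

```python
def prefill_title(template, ui_lang, existing_translations):
    """Return the title the translation creation form would be prefilled with."""
    if ui_lang in existing_translations:
        # The translation already exists, so the placeholder is left untouched.
        return f"Template:{template}/LANGCODE"
    return f"Template:{template}/{ui_lang}"

def blacklisted(title):
    # Stand-in for the blacklist entry: the literal placeholder is not a language code.
    return title.endswith("/LANGCODE")

existing = {"zh"}  # assumed: a Chinese translation exists, a Colognian one does not

colognian = prefill_title("PermissionTicket", "ksh", existing)
chinese = prefill_title("PermissionTicket", "zh", existing)
print(colognian, blacklisted(colognian))
print(chinese, blacklisted(chinese))
```

The second case is the one the blacklist entry guards against: saving the form unchanged would create a page whose title ends in the literal placeholder.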