Commons:Village pump/Proposals/Archive/2012/07

From Wikimedia Commons, the free media repository

ZoomViewer enabled for all users

I would like to propose enabling Help:Gadget-ZoomViewer by default for all users, including anonymous ones. I believe the script has clearly shown its stability and its usefulness to users.

Thoughts?

Jean-Fred (talk) 21:40, 4 June 2012 (UTC)


✓ Done -- RE rillke questions? 11:33, 30 June 2012 (UTC)

I do not see "ZoomViewer" link under many large images as it is shown here: File:ZoomViewerShot1.png. ZoomViewer is enabled in gadgets in my preferences. But many multi-megabyte images do not have the ZoomViewer link. Is there a minimum image size before it shows up? If so, what is it? I see it linked under this image: File:Seattle 7.jpg. But not on this one: File:Seattle 4.jpg --Timeshifter (talk) 08:01, 1 July 2012 (UTC)
20MP. See MediaWiki:Gadget-ZoomViewer.js. This code was also linked from Help:Gadget-ZoomViewer. (Thanks to Rd232 for the nice gadget-template that users could use to easily find the code location.) -- RE rillke questions? 21:04, 1 July 2012 (UTC)
OK. I added that info to Help:Gadget-ZoomViewer. That is what is linked from the gadget tab of my preferences. I used "20 megabytes" because I only see "megabytes" used in the image description page. Unless the MediaWiki software actually multiplies the 2 pixel numbers for width and height? --Timeshifter (talk) 05:15, 2 July 2012 (UTC)
MediaWiki doesn't but the gadget does. -- RE rillke questions? 08:09, 2 July 2012 (UTC)
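To illustrate the distinction between megabytes and megapixels discussed above, here is a minimal Python sketch of a pixel-count check. The 20 MP figure comes from the reply above; the function names and the choice of an inclusive comparison are assumptions for illustration and are not taken from MediaWiki:Gadget-ZoomViewer.js.

```python
# Threshold quoted in the thread ("20MP"); whether the gadget's comparison
# is inclusive or strict is an assumption here.
ZOOMVIEWER_MIN_MEGAPIXELS = 20


def megapixels(width: int, height: int) -> float:
    """Pixel count in millions: width times height, not file size."""
    return width * height / 1_000_000


def shows_zoomviewer_link(width: int, height: int,
                          minimum: float = ZOOMVIEWER_MIN_MEGAPIXELS) -> bool:
    """Hypothetical check: True when the image is large enough for the link."""
    return megapixels(width, height) >= minimum
```

A 6000 x 4000 image has 24 megapixels regardless of how many megabytes the file occupies, so a heavily compressed large image would qualify while a small, lightly compressed one would not.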

Filenames with invisible characters

Hi! As I pointed out in Commons talk:File renaming, I noticed that a lot of filenames (in particular among the images from the Geograph British Isles project) contain invalid Unicode characters in the range U+0080 to U+009F. For example, File:St Ethelreda’s Church Horley Oxfordshire - geograph.org.uk - 1771691.jpg has a U+0092 character in Ethelreda’s between a and s. Please note that these characters are invisible on many systems. They create confusion (eg. File:St Ethelredas Church Horley Oxfordshire - geograph.org.uk - 1771691.jpg), it is almost impossible to create a link or use the image without copying and pasting the filename, and I suspect they are not allowed in valid XHTML. I made a complete list of filenames that contain one or more characters between U+0080 and U+009F (about 2200 images). I'm running a bot and I'm able and ready to perform the task, so if there is consensus I will request the filemover flag for my bot in order to perform the moves directly. If that is not possible, I will ask other bot operators at COM:BWR. In your opinion, are there other invalid characters I should consider for this mass renaming? What do you think? -- Basilicofresco (msg) 05:58, 23 June 2012 (UTC)

I trust that despite the 'invalid' characters a redirect will be left behind at the original name. --Tony Wills (talk) 11:36, 23 June 2012 (UTC)
Of course! -- Basilicofresco (msg) 12:06, 23 June 2012 (UTC)
By the way, these are CP1252 characters... AnonMoos (talk) 21:34, 23 June 2012 (UTC)
You mean these are the "blue dot" control characters there? Rd232 (talk) 22:54, 23 June 2012 (UTC)
Have no idea what that means, but Windows-1252 character 92 hex (146 decimal) corresponds to Unicode character U+2019 (curly apostrophe, HTML &rsquo;)... AnonMoos (talk) 02:44, 24 June 2012 (UTC)
"blue dot" is the representation for invisible characters in the image in the Wikipedia article. Are you sure about the 92 hex? Because that sounds like it should be transcoded to Unicode, rather than blocked or ignored, as I'd assumed. Rd232 (talk) 14:33, 24 June 2012 (UTC)
Well, your question was asked in a rather obscure way, and doesn't seem to make all that much sense when understood, because those character values are quite unlikely to be introduced into filenames on a Windows-1252 system. As for "transcoding into Unicode", that's supposed to be done by the web browser, as it interfaces between names in the local operating system's filesystem and the Unicode used in Wikimedia Commons. If the Commons software fails to check whether the web browser has done its job correctly, this can allow problems to be created, as seen... AnonMoos (talk) 04:30, 25 June 2012 (UTC)
What are these invisible characters for (are they always invisible?), and is there any legitimate use? The example you give certainly needs fixing (going to the file and typing out "Ethelredas" in the URL over the existing "Ethelredas" should get you back to the file, but it doesn't), but I'd like to understand these characters better before supporting mass replacement. There's also the issue that if there really is no legitimate use for these mystifying invisible characters, we should file a bug to prevent them being used in new filenames. Rd232 (talk) 22:51, 23 June 2012 (UTC)
Right, per Bugzilla:5732 (MediaWiki allows characters in the U+0080 to U+009F range) we should do the mass rename, and we should bump the bug to ask for the characters to be disallowed in pagenames at least, because of the problems caused. Rd232 (talk) 23:00, 23 June 2012 (UTC)
They come from an incorrect character encoding (Windows-1252 characters presented as Latin-1, where the range U+0080 to U+009F is reserved for control codes) and, as far as I know, they should always be avoided. Thank you for finding Bugzilla:5732, pretty interesting. -- Basilicofresco (msg) 06:59, 24 June 2012 (UTC)
To be precise, they seem to be the C1 control codes (en:C0 and C1 control codes). Rd232 (talk) 12:40, 25 June 2012 (UTC)
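For readers who want to check a title for the C1 range mentioned above, a minimal Python sketch follows. The function name is hypothetical, and this is not the script used to build the list in this thread.

```python
import re

# C1 control codes occupy U+0080 to U+009F.
C1_CONTROLS = re.compile(r"[\u0080-\u009f]")


def find_c1_controls(title: str) -> list[tuple[int, str]]:
    """Return (position, code point) pairs for every C1 control in the title."""
    return [(m.start(), f"U+{ord(m.group()):04X}")
            for m in C1_CONTROLS.finditer(title)]
```

Reporting the position and code point, rather than only a yes/no answer, makes the otherwise invisible character easy to locate when reviewing a title by hand.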

We have some of these chars already on our title blacklist. Should we add some more for file names (<reupload>)? -- RE rillke questions? 07:55, 24 June 2012 (UTC)

That would be helpful. My only concern would be that the resulting error messages (when users try to create filenames with such characters, blocked by the blacklist) are non-specific and a bit confusing as a result - users won't know why they can't upload. Some more specific error handling via Bug 5732 would be better in the long run. Rd232 (talk) 14:33, 24 June 2012 (UTC)
One can set the error message (a MediaWiki page; there are examples) but I am not sure what UpWiz will display. This will be something to try. Please note that admins silently override the title blacklist while uploading. -- RE rillke questions? 16:30, 24 June 2012 (UTC)
OK, let's do it and see. Rd232 (talk) 16:50, 24 June 2012 (UTC)
I'd like to point out that many of these characters aren't invisible per se: rather, they're the result of improper transcoding from CP1252 to UTF-8, and represent things like curly quotation marks, the em and en dashes, and the ellipsis. Instead of dropping the characters, the bot should rename the files to the correct UTF-8 representation of those characters (eg. File:St Ethelreda’s Church Horley Oxfordshire - geograph.org.uk - 1771691.jpg to File:St Ethelreda’s Church Horley Oxfordshire - geograph.org.uk - 1771691.jpg rather than File:St Ethelredas Church Horley Oxfordshire - geograph.org.uk - 1771691.jpg). --Carnildo (talk) 23:14, 25 June 2012 (UTC)
Actually, the best of all would be to "de-smartquote" and rename to File:St Ethelreda's Church Horley Oxfordshire - geograph.org.uk - 1771691.jpg ... -- AnonMoos (talk) 09:05, 26 June 2012 (UTC)
Well that explains how the characters got into filenames in the first place. Is there an authoritative transcoding list from CP1252 to UTF-8, so the bot can fix it? Rd232 (talk) 00:10, 26 June 2012 (UTC)
The authoritative list is at http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1252.TXT , but we may want to make some adjustments, such as de-"smartquoting"... AnonMoos (talk) 09:05, 26 June 2012 (UTC)
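The transcoding described above can be sketched in Python using the built-in cp1252 codec, which is assumed here to follow the published mapping. The function name is hypothetical. Note that five byte values (0x81, 0x8D, 0x8F, 0x90, 0x9D) have no printable character in Python's codec, so this sketch leaves them untouched for manual review rather than guessing.

```python
def repair_cp1252(text: str) -> str:
    """Reinterpret stray C1 code points (U+0080 to U+009F) as Windows-1252 bytes."""
    out = []
    for ch in text:
        if "\x80" <= ch <= "\x9f":
            try:
                # The code point value is the original CP1252 byte value.
                out.append(bytes([ord(ch)]).decode("cp1252"))
            except UnicodeDecodeError:
                # Byte is undefined in Python's cp1252 codec; keep it as-is.
                out.append(ch)
        else:
            out.append(ch)
    return "".join(out)
```

This keeps the typographic characters (curly quotes, dashes, ellipsis); any "de-smartquoting" to plain ASCII would be a separate step applied afterwards.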

I'm not going to drop them, here is the conversion table:

  • 0x0080 -> €
  • 0x0081 -> space
  • 0x0082 -> '
  • 0x0083 -> f
  • 0x0084 -> "
  • 0x0085 -> ...
  • 0x0086 -> +
  • 0x0087 -> ++
  • 0x0088 -> ^
  • 0x0089 -> ‰
  • 0x008A -> S
  • 0x008B -> <
  • 0x008C -> Œ
  • 0x008D -> space
  • 0x008E -> Z
  • 0x008F -> space
  • 0x0090 -> space
  • 0x0091 -> '
  • 0x0092 -> '
  • 0x0093 -> "
  • 0x0094 -> "
  • 0x0095 -> -
  • 0x0096 -> -
  • 0x0097 -> -
  • 0x0098 -> ~
  • 0x0099 -> ™
  • 0x009A -> s
  • 0x009B -> >
  • 0x009C -> œ
  • 0x009D -> space
  • 0x009E -> z
  • 0x009F -> Y

Basilicofresco (msg) 11:58, 26 June 2012 (UTC)
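The table above could be applied with a simple translation map, as in the following Python sketch. The mapping is transcribed from the list in the post; the names CONVERSIONS and convert_title are hypothetical and this is not FrescoBot's actual code.

```python
# Code point -> replacement text, transcribed from the conversion table above.
CONVERSIONS = {
    0x80: "€", 0x81: " ", 0x82: "'", 0x83: "f", 0x84: '"', 0x85: "...",
    0x86: "+", 0x87: "++", 0x88: "^", 0x89: "‰", 0x8A: "S", 0x8B: "<",
    0x8C: "Œ", 0x8D: " ", 0x8E: "Z", 0x8F: " ", 0x90: " ", 0x91: "'",
    0x92: "'", 0x93: '"', 0x94: '"', 0x95: "-", 0x96: "-", 0x97: "-",
    0x98: "~", 0x99: "™", 0x9A: "s", 0x9B: ">", 0x9C: "œ", 0x9D: " ",
    0x9E: "z", 0x9F: "Y",
}


def convert_title(title: str) -> str:
    """Replace each C1 code point with its entry from the table."""
    return title.translate(CONVERSIONS)
```

For example, a title containing U+0092 between "Ethelreda" and "s" comes out with a plain ASCII apostrophe in that position.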

OK. What about doing a dry run that produces a list showing the results (Filename A -> B)? We can then review the list to manually rename any where the auto-conversion can be improved on, and then the bot can run again to rename those that are left (taking care to not rename ones already handled...). Rd232 (talk) 12:40, 26 June 2012 (UTC)
Sure, here is the list. -- Basilicofresco (msg) 23:07, 26 June 2012 (UTC)
Thanks. I got as far as File:Bishops... without finding any problems. It's mostly adding missing apostrophes and dashes/hyphens. It doesn't look like any weird things would happen (would be nice to skim the entire list though). Rd232 (talk) 09:08, 27 June 2012 (UTC)

In the meantime I also noticed filenames with broken HTML entities (eg. "and^8470," instead of "№"). As you can see in the previous list, I'm also going to fix this problem while moving files for invisible characters. Nevertheless, what do you think? Should we also move all these files per criterion #5? Here is the list (converted name included): User:FrescoBot/List of filenames with broken html entities. -- Basilicofresco (msg) 12:21, 27 June 2012 (UTC)

Yes, absolutely. However, a lot of files in that list (even a majority?) are files with and^8470, which is supposed to be №. But having a character in a filename which no-one can type is foolish; I think we should convert that to No. instead of №. Rd232 (talk) 17:28, 27 June 2012 (UTC)
It is really just the same thing as having file names in different scripts, in my opinion. I am not able to type Arabic or Cyrillic file names, but I don't mind having files with such file names. --Stefan4 (talk) 19:05, 27 June 2012 (UTC)
No, it's different (and why I highlighted that as an exception from the list!) - because no-one can type that symbol. See also Commons:File naming (Avoid "funny" symbols). Rd232 (talk) 19:40, 27 June 2012 (UTC)
Please note: we are not talking about converting any "№" within filenames into "No.", but about converting "and^8470," into "No.": it is not a different script and in my opinion it is a good choice. -- Basilicofresco (msg) 21:39, 27 June 2012 (UTC)
"and^8470," -> "№" is what happens in your list User:FrescoBot/List of filenames with broken html entities. This is what you'd expect with a simple fixing conversion, and avoiding that result (producing "No." instead) takes an extra decision. I'm suggesting we do that. I'm not suggesting we go around renaming files that have "№" in them already, just that we avoid making more, when we're renaming anyway. Rd232 (talk) 01:20, 28 June 2012 (UTC)

Right, I've been through both file lists, and as far as I'm concerned we're ready to go (ideally with the and^8470, exception mentioned above). Please note that I've done some manual renaming of files on the list to improve on the planned automated conversion - so please be sure to exclude redirects from the process. Rd232 (talk) 20:19, 29 June 2012 (UTC)
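The entity fix with the "No." exception discussed above might look like the following Python sketch. It assumes the broken pattern is always "and^" followed by a decimal code point and a comma, as in the one example given in the thread; the names BROKEN_ENTITY, OVERRIDES and fix_broken_entities are hypothetical.

```python
import re

# Assumed shape of a broken numeric entity, based on the "and^8470," example.
BROKEN_ENTITY = re.compile(r"and\^(\d+),")

# Code points that get a typeable replacement instead of the literal character.
OVERRIDES = {8470: "No."}


def fix_broken_entities(title: str) -> str:
    """Replace each broken entity with its character, or with an override."""
    def replace(match: re.Match) -> str:
        code = int(match.group(1))
        if code > 0x10FFFF:
            # Not a valid Unicode code point; leave the text untouched.
            return match.group(0)
        return OVERRIDES.get(code, chr(code))

    return BROKEN_ENTITY.sub(replace, title)
```

Keeping the exception in a separate table means further typeable substitutions can be added after reviewing a dry-run list, without changing the general conversion.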

Sure, I'm skipping any redirect. -- Basilicofresco (msg) 05:16, 30 June 2012 (UTC)
Since my original bot authorization did not include file moving, I have requested permission to run this new task: Commons:Bots/Requests/FrescoBot 2 -- Basilicofresco (msg) 06:26, 30 June 2012 (UTC)

Category names with invisible characters

I recently ran into some problems with invisible characters in category names. See Help_talk:Gadget-Cat-a-lot#Invisible_characters_in_Category_names. More specifically, I seem to be occasionally producing categories with an invisible character #26 (0x001A) on the end of the category name (not sure how it happens). The character seems to be ignored by the MediaWiki software, but confuses the Cat-a-lot gadget and AWB-based bots, and can possibly cause other mischief. Maybe we can scan for those too? --Jarekt (talk) 12:34, 27 June 2012 (UTC)

The invisible character in your example is U+200E (left-to-right mark). It is not allowed in filenames, so there are no files with that character and therefore no files to move. Nevertheless, since it causes problems, I can add a fix for it within my link syntax fixing script. -- Basilicofresco (msg) 21:23, 27 June 2012 (UTC)
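A scan for this kind of character could be sketched in Python with the standard unicodedata module, which classifies U+200E as a format character (category Cf) and U+001A as a control character (category Cc). The function names are hypothetical, and this is not the link syntax fixing script mentioned above.

```python
import unicodedata

# Unicode general categories: Cc = control, Cf = format (includes U+200E).
INVISIBLE_CATEGORIES = {"Cc", "Cf"}


def invisible_characters(title: str) -> list[tuple[int, str]]:
    """Return (position, code point) pairs for control and format characters."""
    return [(i, f"U+{ord(ch):04X}")
            for i, ch in enumerate(title)
            if unicodedata.category(ch) in INVISIBLE_CATEGORIES]


def strip_invisible(title: str) -> str:
    """Remove all control and format characters from the title."""
    return "".join(ch for ch in title
                   if unicodedata.category(ch) not in INVISIBLE_CATEGORIES)
```

Some format characters, such as the zero-width non-joiner, are legitimate in certain scripts, so reporting matches for review is likely safer than stripping them blindly.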

Invisible characters within the file descriptions

A related job: Most of the Geograph files with this issue also include the same characters within the file's description (the title is generally repeated as the first part of the description). For example, File:Clayhidon, St Andrew's church - geograph.org.uk - 128206.jpg. It would make sense to fix these at the same time.--Nilfanion (talk) 07:17, 8 July 2012 (UTC)

I know. During the link fixing task I was also fixing these characters within the description (example). However, it is not possible to edit the page at the same time as moving the file. These characters in the description are a low priority problem, so I don't know if there is consensus for a specific fixing run. -- Basilicofresco (msg) 08:21, 8 July 2012 (UTC)

I thought it might be a nice idea if we maintained some thematic portals which highlighted some of the great work our contributors achieve in individual areas - i.e. animation, wildlife photography, musical performances. Any thoughts ? --Claritas (talk) 20:51, 1 July 2012 (UTC)

Highlighting the best works in a particular area is what galleries are for. I agree that it would be nice if we maintained them. :) LX (talk, contribs) 22:28, 1 July 2012 (UTC)
Well, a portal would then be a gallery of galleries, perhaps :) That could actually be a useful way to navigate galleries which might help promote them. I've no idea if such a thing exists (I don't deal with galleries at all, really), but it sounds sensible to me. Rd232 (talk) 07:40, 2 July 2012 (UTC)
It could be a very nice idea. I suspect that quite a few galleries would have to be created before you could use them in your super-gallery, though. --Philosopher Let us reason together. 07:20, 3 July 2012 (UTC)
Portals might encourage that. Maybe a "portal" space on the Commons could be created. The portal namespace exists on Wikipedia and Wikia. So portal pages can be searched in advanced search on Wikipedia. --Timeshifter (talk) 10:51, 3 July 2012 (UTC)
  • Galleries tend to have limited focus, I like the idea of having a wide-range metagallery. Creating a higher level structure might encourage editors to get more involved in the editing of galleries. --Claritas (talk) 13:29, 3 July 2012 (UTC)
  • I think the best way to move this forward would be for someone to create a good example. Then we can say "we should have more things like this". Once we have an example, we can also add a section to Commons:Galleries explaining what a "meta-gallery" is, and create a custom header tag for it. I think that's a good way to go - a special form of gallery. It's basically what is called a "portal" elsewhere, but I don't like the name and most portals are unimpressive. (And besides, creating a new namespace now would mean an almost empty namespace.) Anyway, it would be a start and the concept could evolve from there. Rd232 (talk) 09:37, 5 July 2012 (UTC)
    +1 with Rd232. Jean-Fred (talk) 11:06, 5 July 2012 (UTC)
One thing that would help start things moving is to categorize galleries. I just started this:
Category:Gallery pages
There are an amazing number of galleries on the Commons. I did not realize this. See
The Wikipedia portal index could be a model for a gallery index on the Commons.
What is needed is a way to browse galleries. It would be nice if one could browse categories on the Commons, but exclude everything except galleries. It would be even better if categories without gallery pages in them did not show up. So "empty" categories would not show up while browsing for galleries. --Timeshifter (talk) 09:45, 6 July 2012 (UTC)
This Category:Gallery pages is completely ridiculous. The gallery main space contains 106000 galleries (and 60000 redirects). See discussion in Commons:Categories_for_discussion/2012/07/Category:Gallery_pages. Please use Special:AllPages or just type a name in the search box. --Foroa (talk) 09:56, 6 July 2012 (UTC)
Another in a long list of flat categories. Such as the ones you so love. --Timeshifter (talk) 10:11, 6 July 2012 (UTC)

Deletion policy and promotional or out of project scope files

Lately, I have been patrolling new uploads and watching some deletion requests. Most of the files nominated for deletion are unused, private photos of irrelevant persons, but another class of files is uploaded for promotional purposes by single-purpose accounts. All of these files are outside the project scope.

The Criteria for speedy deletion do not explicitly forbid promotional uses of files (only of pages in the main (gallery) namespace). I tag most of the obviously promotional files (mostly intended for use in promotional pages at Wikipedia) using a custom template I made, which is based on {{Speedy}}. So my question is whether the deletion policies need to be reviewed or updated to explicitly forbid promotional (and other out-of-scope) files through the Criteria for speedy deletion, as is done for promotional pages on the Wikipedias.

Thanks for discussing this and clearing up my doubts. Amitie 10g (talk) 02:46, 5 July 2012 (UTC)

There has been a fair amount of discussion about this at Commons talk:Criteria for speedy deletion, and the decision to leave promotional content out of the speedy deletion criteria seems to have been a deliberate one. Remember that this is quite a subjective topic – what may appear promotional to you may actually be educational, and it largely depends on the context in which the images are used. You should not be using an official-looking template that misrepresents current policy to get files speedily deleted when they're not eligible for speedy deletion. LX (talk, contribs) 08:24, 5 July 2012 (UTC)
Well, I refer to the obvious and undoubted cases of files for promotional-purpose (most of them Copyvio), uploaded generally by single-purposes account (these are really easy to identify), which I think should be speedily deleted, just like promotional pages at Wikipedia. Sometimes it is difficult to find the uses of the file if the promotional articles at Wikipedia were deleted; then, the file appears as unused in Commons, but it was used before, with obvious purposes. I always research the uses of the file and the uploader's behavior, warnings and blocks in Wikipedia.
Also, I have seen that administrators (and the patrollers who tag files) do not apply the same criteria to the same kind of files, so I think it is important to clarify the deletion policies further in order to avoid doubt and ambiguity in the obvious cases I mentioned.
Thanks in advance. Amitie 10g (talk) 18:13, 5 July 2012 (UTC)
obvious and undoubted cases of files for promotional-purpose (most of them Copyvio): If they are copyvios, they should be tagged accordingly.
uploaded generally by single-purposes account: If this account was created by someone of the company that is promoted, they are likely not copyvios, right?
but it was used before, with obvious purposes: If a file is useful for something else, it would be stupid to delete it because it was used for promotional purposes before. I am always happy when we get OTRS permission for professional product shots: recently Steinway & Sons, and Hammelmann (deleted Wikipedia article) some time ago. Note that CC-BY-SA allows editing these files.
I would not oppose if we are notified in case you discover promotional material on Wikipedia with files on it. Sometimes I have to go to Wikipedia to get the promotional files out of not-accepted article drafts. Otherwise they are "in use" and we are not allowed to delete them (we would be accused that we misused our power as Commons serves for all WMF wikis). -- RE rillke questions? 11:42, 6 July 2012 (UTC)
An obvious promotional file is this, which I tagged for speedy deletion because it is clearly unusable for any purpose other than self-promotion. Also, the user violates several policies by creating a promotional page on his user page at Commons (Commons is not Wikipedia). Is this tag (Speedy) correct, or is a DR better for this case of obvious self-promotion? Amitie 10g (talk) 00:41, 8 July 2012 (UTC)

Deletion request and non-autoconfirmed users (or IPs)

I'm watching the deletion requests (just out of curiosity), and I found some derisive ones opened by IPs, like this. Therefore, I think it is necessary to limit the opening of DRs (use of the {{Delete}} template) to autoconfirmed users only.

Has this been discussed before? Was an abuse filter for these edits approved or rejected? Thanks for clearing up my doubts. Amitie 10g (talk) 04:10, 9 July 2012 (UTC)

Please read Commons:Village pump#abusing anonymous ip, especially the reply by LX (without numbers). There were several discussions before. -- RE rillke questions? 07:16, 9 July 2012 (UTC)
OK, reading that thread. Amitie 10g (talk) 19:29, 9 July 2012 (UTC)

Clarify search options in preferences.

Concerning: "Enable enhanced search suggestions (Vector skin only)" in the search options tab in preferences (Special:Preferences#mw-prefsection-searchoptions). It took me a while to figure out how to disable automatic redirects from search to a page. I hate those automatic redirects when I am actually trying to do a search. I am talking about searches from the search form, not the search suggestions in the dropdown.

Is it possible for the Commons to add a description link in preferences that links to meta:Help:Preferences#Search options, or to some specific help page about this on the Commons? It is a very confusing option. I tried to clarify it at the meta help page:

Enable enhanced search suggestions (Vector skin only)
If this option is turned off, the Search box reverts to its traditional look, with separate "Go" and "Search" buttons. Search no longer automatically redirects sometimes to a page. "Go" is the way to do that. When this option is turned on there is only a text-less search button. One is automatically taken to a page in some cases. Also, the search suggestions in the dropdown list are different (on the Commons, for example) than the ones presented when this option is turned off.

It may need further clarification. In some ways I think this option is a detriment. It is "Enhanced" in some ways, and a detriment in other ways. It would be more accurate to call it "Enable different search suggestions (Vector skin only)". At least on the Commons. --Timeshifter (talk) 20:15, 22 July 2012 (UTC)