Commons:Village pump/Proposals/Archive/2017/08

From Wikimedia Commons, the free media repository

Extension of Abuse Filter to combat WP0 abuses

The Abuse Filter should be extended to provide more granular control of WP0 traffic, for instance by partner, country, city, and IP range, for both uploads and downloads.   — Jeff G. ツ 03:52, 3 August 2017 (UTC)

See above #Partnerships & Global Reach team proposal: Targeted Video upload restriction. I don't think such parameters will be provided to AF due to privacy concerns. --Zhuyifei1999 (talk) 11:51, 3 August 2017 (UTC)
Automated blocking (for clear cases, SHA1 matches for example) would be useful. But as far as I know we need community consensus to enable such a feature :/. --Steinsplitter (talk) 11:56, 3 August 2017 (UTC)
@Zhuyifei1999: Can we amend the privacy policy to no longer protect the privacy of pirates?   — Jeff G. ツ 16:55, 3 August 2017 (UTC)
I can't answer this. Please ask WMF Legal. --Zhuyifei1999 (talk) 17:26, 3 August 2017 (UTC)
@Jeff G.: I'd have serious qualms because such information could be used by copyright trolls. I'd be fine with this being handled semi-privately through CheckUsers. Guanaco (talk) 09:44, 4 August 2017 (UTC)

Use of deep learning and AI on Wikimedia Commons

Deep learning says there is a fountain in this image with a probability of 98.8%!

I've developed a service that uses open-source deep learning tools and models (deep learning being a recent machine learning trend) to tag images and to estimate how likely an image is to be NSFW, and I believe the results are interesting. For example, it tags today's featured picture as lycaenid: 63.7% and rates this image as 90% likely to be NSFW.

So far I've made a gadget, called "DeepLearningServices", that can be enabled from the Gadgets tab of Preferences. A bot that follows Wikimedia Commons recent changes also lists newly uploaded images with an NSFW score above 0.5 at User:Ebrambot/deep-learning report.

I was thinking about extending the newly added gadget and services to suggest categories for images that have none, or even to suggest categories intelligently while images are being uploaded. What do you think of these proposals, and what else could be done with these capabilities? −ebrahimtalk 13:23, 3 August 2017 (UTC)

@Ebrahim: It would help to flesh out the descriptions of what the links are and what they do at MediaWiki talk:Gadget-DeepLearningServices.js.   — Jeff G. ツ 17:02, 3 August 2017 (UTC)
Jeff: Done. Is it clearer now what the tool does? −ebrahimtalk 17:39, 3 August 2017 (UTC)
@Ebrahim: Yes, but that page does not mention that the gadget is just an AI tool for advising the user of the probabilities of the presence of attributes like NSFW, not for actually adding file description page tags or categories (possibly in the future?), nor does it mention that the user has to click to activate the tool (possibly automatic in the future?). To be clear, it only recognizes porn as NSFW, and does not address violent or pro-hate types of images as NSFW, right?   — Jeff G. ツ 18:03, 3 August 2017 (UTC)
Yes, I followed the original NSFW terminology of the Yahoo project, but it would be better to make that clearer. Please apply these changes, as they definitely help describe the tool's current functionality. Thanks :) −ebrahimtalk 18:24, 3 August 2017 (UTC)
Can we drop the NSFW title in exchange for a "nudity" title? What's not safe for work depends on location and context; a lot of workplaces would consider that Adam & Eve painting acceptable.--Prosfilaes (talk) 21:07, 4 August 2017 (UTC)
I'd say let's stick with the original terminology of the project. The Adam and Eve case is something of a false alarm; I guess they tried to train the model not to rate it even higher. −ebrahimtalk 23:39, 5 August 2017 (UTC)
I don't see why; they didn't invent the word NSFW, nor is their use so dominant. We shouldn't use misleading, inaccurate descriptions instead of neutral clearer descriptions, even if the tools we're using do. That Adam & Eve painting may be NSFW in many places, as well as levels of nudity generally considered okay in the West.--Prosfilaes (talk) 01:43, 8 August 2017 (UTC)
The term NSFW is controversial and will remain so as it is highly subjective. No, a painting with a biblical depiction of Adam and Eve should never be labelled as NSFW. -- (talk) 08:57, 8 August 2017 (UTC)
No actual labelling is going to happen, this is just a patrolling tool and the listed images will be removed from the NSFW page each week. −ebrahimtalk 10:16, 8 August 2017 (UTC)
The results look meaningless. In this case the images were examined for 'fleshiness', which is a very poor indicator of much on Commons, especially if photographs of bats or parts of ships are going to be labelled as NSFW. If more interesting features were detected, such as images with no categories that are highly likely to be simple passages of text, or that are almost featureless apart from borders, then the results would be much more likely to have value for maintenance. Experiments like this were done during the summer of code last year; I suggest those past experiments are reviewed before trying more, as they could save a lot of volunteer time. -- (talk) 08:51, 8 August 2017 (UTC)
No automatic labeling is going to happen, and images listed on the page will simply be removed after a while. A good share of the images archived here have since been deleted, so I believe this can help with patrolling new images. −ebrahimtalk 10:06, 8 August 2017 (UTC)
Sure, however this is effectively a maintenance category and so the quality of the results should be kept under review. Marking, say, 19th century paintings and drawings in a way that even implies they are NSFW is inappropriate. Regardless of the background off-wiki, I support renaming the report. Oh, BTW, don't get me wrong, I strongly encourage further experiments, and even WMF funded projects, that will result in automated categorization, even if this is just along the lines of "appears blank", or "looks like text". Thanks -- (talk) 10:17, 8 August 2017 (UTC)
Great, I also wanted to get some feedback from the community and move on to even better experiments. So what about "Possible NSFW report" or something like that? −ebrahimtalk 11:40, 8 August 2017 (UTC)
Sorry, I think the NSFW focus of the report is actually bad for these experiments as it's the wrong starting point. For Wikimedia Commons, a generic NSFW report simply fuels the sort of misplaced porn panic that is a priority for other sites and communities. The priority for Commons should be copyright violations, useless selfies, duplicates and generic out of scope material. For these reasons identifying simple modern text documents and images with several indicators that they have been ripped off from other websites and likely to be copyvios is much more useful as a starting point. Frankly identifying porn happens pretty easily without bots, just give a patroller a screen with 500 tiny thumbnails per page, and they'll spot any real nudity straight away, and be able to do much better than a bot at deciding when photographs with nudity or lots of skin in them are still likely to be within project scope.
Think about retuning and drop the word NSFW altogether. -- (talk) 11:54, 8 August 2017 (UTC)
No problem, do you suggest stopping the current report or using some other wording, maybe "images to keep an eye on"? −ebrahimtalk 12:11, 8 August 2017 (UTC)
@Ebrahim: Since the detection appears to be of skin, how about instead of "NSFW" using the word "Skin"? Also, a supplemental gallery format would ease patrolling.   — Jeff G. ツ 12:37, 8 August 2017 (UTC)
Jeff: You are right about the current state of the report :) I moved the page to a less attention-grabbing title. Feel free to move it anywhere else you think fits better. −ebrahimtalk 13:08, 8 August 2017 (UTC)
@Ebrahim: Thanks. You have some images there that scored 0.01, I thought your cutoff was 0.50. Also, could you reformat such numbers as 1% and 50%?
Fixed the formatting and removed the low-ranked ones that were added for testing. Thanks :) −ebrahimtalk 14:01, 8 August 2017 (UTC)
There are some other ready-to-use models here; for example, the flower classifier available there could help identify unidentified flowers on Commons. What else do you think could be done with the models there? −ebrahimtalk 14:09, 8 August 2017 (UTC)
The flower classifier was created for flowers commonly occurring in the United Kingdom - 102 of them. Applying that as is on a global data set would probably do more harm than good even if it's just for suggesting names. Also, while some plants are easily determined down to species level from a single photograph of a flower, in most cases that's just not enough information even for experts. --El Grafo (talk) 12:38, 17 August 2017 (UTC)
El Grafo: Thanks for checking out its detail.
There are some open trained models here in addition to the ones here. I guess some of them could be useful to the Commons community as a helper or service, like "Image Colorization" (for adding artificial colors to images), "Image Captioning", "Image Denoising" (it tries to upscale images) and "Image Inpainting" (for watermark removal and the like). What do you guys think? −ebrahimtalk 18:50, 24 August 2017 (UTC)
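
The report conventions agreed in the exchange above (a 0.5 cutoff, scores shown as percentages such as 1% and 50%) can be sketched in Python as follows. This is only an illustration: the function names, the input dictionary, and the bullet layout are assumptions, not the bot's actual code.

```python
# Sketch only: names and data layout are hypothetical, not taken from the bot.
THRESHOLD = 0.5  # cutoff mentioned in the thread ("more than 0.5")


def format_score(score):
    """Render a 0..1 score as a whole-number percentage, e.g. 0.5 -> '50%'."""
    return f"{round(score * 100)}%"


def report_lines(scores, threshold=THRESHOLD):
    """Return wikitext bullets for files scoring above the threshold, highest first.

    `scores` maps a file name to its score between 0 and 1.
    """
    flagged = [(name, s) for name, s in scores.items() if s > threshold]
    flagged.sort(key=lambda item: item[1], reverse=True)
    return [f"* [[:File:{name}]] ({format_score(s)})" for name, s in flagged]
```

With this filter, a test entry scored 0.01 would never reach the page, which matches the cleanup described above.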

I think the NSFW/nudity issue is small compared to the fact we have over two million media needing categories. A more valuable use of this technology would be to create a "Commons game" similar to Wikidata: The Game. The software would propose categories for these backlogged files, and human reviewers would choose yes/no. It also could identify likely copyvios from these images, and their sources, allowing semi-automated deletion nominations. Guanaco (talk) 20:47, 24 August 2017 (UTC)

Great! That makes a whole lot of sense! −ebrahimtalk 11:53, 25 August 2017 (UTC)
Guanaco: Just started to work on it here: MediaWiki:CommonsGame.js. It is at a very early stage, but some tangible results should be visible soon :) −ebrahimtalk 12:43, 25 August 2017 (UTC)
Guanaco, Yann, Jeff, : Very early, but worth checking I guess: CommonsGame. The "Add" button is not implemented yet (now it is, but given how inaccurate the suggestions are it shouldn't be used anyway). Besides the image deep-learning tags, other sources such as the file title, file description and usage can and should be used, but I think this is at least a proof of concept of some sort. There are also other models available, perhaps of better quality, and the file name itself may be more reliable (as far as I can see), but I believe this is better than nothing at this point. −ebrahimtalk 19:40, 25 August 2017 (UTC)
@Ebrahim: Good proof of concept. Unfortunately it seems not to identify anything correctly and has a very dark sense of humor, suggesting punching bag Category:Punching bags for File:面內抱 vs 面內前揹法.png! I think this will need a better model for matching, possibly using the file name as you say. This only solves the first half of the problem. It also has a hard time translating matches to relevant categories. For File:Chuck kosak.jpg (suit: 70%, Windsor_tie: 20%, bolo_tie: 1%, groom: 1%, Loafer: 0%) it suggests Category:Suita, Osaka, Category:Suiten, and Category:Suite Noa Noa. I expect this will need development on a much larger scale, with awareness of the category tree and learning based on typed user input. Guanaco (talk) 20:24, 25 August 2017 (UTC)
Guanaco: Thanks for the feedback! In the "suit" case, the blame goes to Wikimedia's search, since the script uses its API search results (of course I don't expect much more given the data available on Commons; I just wanted to show how the script currently works). As a workaround, Wikidata or English Wikipedia search could be used instead, and the related Commons category then found from the search result pages. −ebrahimtalk 20:36, 25 August 2017 (UTC)
And of course this work has only just started; others are welcome to participate if interested. If someone could write a Python script that takes an image as input (using whatever logic they prefer, even pywikibot plus the current deep learning results if they see fit), I can turn that script into a web service and use it here, since more of this logic would be easier to run on a powerful backend, I guess. −ebrahimtalk 20:47, 25 August 2017 (UTC)
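
The "suit" mismatch reported above (a tag resolving to Category:Suita, Osaka and similar prefix hits) suggests one cheap guard: keep a search result only if its title equals the tag or its simple plural. The Python sketch below illustrates that idea under stated assumptions. The function names are hypothetical, `candidates` stands in for whatever the search API returned, and "Suits" is an invented example title, not a checked category.

```python
# Sketch only: an illustrative filter, not the CommonsGame script's logic.

def normalize(text):
    """Lower-case a tag or category title and treat underscores as spaces."""
    return text.replace("_", " ").strip().lower()


def pick_categories(tag, candidates):
    """Keep only candidates whose title equals the tag or its simple plural.

    Prefix matches such as "Suita, Osaka" for the tag "suit" are dropped.
    """
    wanted = {normalize(tag), normalize(tag) + "s"}
    return [title for title in candidates if normalize(title) in wanted]
```

A filter like this only removes obviously unrelated hits; the surviving suggestions would still go to a human reviewer for the yes/no decision proposed earlier in the thread.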

"In other projects" section in side bar

Hello, the side bar has sections such as "Participate", "Tools", "In other projects", and "In Wikipedia". However, the "In other projects" and "In Wikipedia" sections are not next to each other, but "Tools" is between them. Can you switch the position of "In other projects" and "Tools", so the sections appear in the order I first mentioned? --167.57.216.201 14:22, 28 August 2017 (UTC)

Add Copyright owner and Uploaders info on Mainpage under MOTD and POTD sections.

The Wikimedia Commons Main Page averages more than 75,000 page views a day, and MOTD and POTD are its most prominent features. I would like to propose that the copyright owner's name and the uploader's name be added below the media, which would give a clear view of the people who contributed to it. Because Commons follows the Attribution-ShareAlike 3.0 Unported {{CC BY-SA 3.0}} license, this step would also satisfy the attribution requirement. Since POTD and MOTD act as featured content for Wikimedia Commons, this step could encourage many others to contribute more freely licensed educational media.

  • Why should the copyright owner be featured?
The copyright owner has released their work under a Creative Commons license rather than keeping all rights reserved, and that should be respected.
  • Why uploaders?
Uploaders are the ones who made the effort to make the media available on Commons. Being featured on the Main Page would inspire them further. (I am sure they would like it.)
  • What if the uploader and the copyright owner are the same?
No issue: they can be credited under a single heading.

--✝iѵɛɳ२२४०†ลℓк †๏ мэ 07:00, 23 August 2017 (UTC)


Feature request - upload files in the order specified by user

When I upload a batch of images, they appear in my uploads list in a different order than the order I specified. For example I select Image 1 then Image 2 then Image 3, but they upload in a different order: 2, 3, 1 or another random order. Please give the user the option to upload the images one by one, in the order they specify, even if that means they have to wait longer (for example the first image might be very big so you have to wait until it finishes uploading). Thanks. Fructibus (talk) 08:30, 29 August 2017 (UTC)

@Fructibus: This could be implemented client-side. What upload client(s) do you use for batch uploads?   — Jeff G. ツ 10:27, 11 September 2017 (UTC)
@Jeff G.: I am using the only upload client I know - the Special:UploadWizard - because there is a link for it on the left side of the window, saying "Upload file". Fructibus (talk) 11:35, 11 September 2017 (UTC)
@Fructibus: Did you "Select media files to share" or "Share images from Flickr"?   — Jeff G. ツ 00:38, 12 September 2017 (UTC)
@Jeff G.: I always just click the blue button "Select media files to share". I don't see any Flickr option. Fructibus (talk) 02:12, 12 September 2017 (UTC)
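
The requested behaviour, uploading strictly one file at a time in the order the user picked, amounts to a plain sequential loop on the client side. The Python sketch below shows the idea. `upload_one` is a hypothetical blocking callable supplied by the caller, not an UploadWizard or MediaWiki API.

```python
# Sketch only: `upload_one` is a placeholder for whatever performs a single upload.

def upload_in_order(paths, upload_one):
    """Upload files strictly one at a time, in the order given.

    `upload_one(path)` must block until that file has finished uploading,
    so a large first file delays the rest instead of being overtaken.
    """
    finished = []
    for path in paths:
        upload_one(path)       # wait for this file before starting the next
        finished.append(path)
    return finished
```

The trade-off is the one named in the request: total upload time is longer than with parallel uploads, but the resulting order matches the selection order.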

Limit GIF to 100 MB

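-- List GIF files of at least 100 MB (10e7 bytes) as wikitext bullets, largest first.
SELECT CONCAT("* [[:File:", REPLACE(img_name, "_", " "),
       "]] (", ROUND(img_size/1024/1024)," MB)") AS Files
FROM image
-- Keep only files that have a page in the File: namespace (6).
JOIN page ON page_namespace=6 AND page_title=img_name
WHERE img_media_type="BITMAP"
AND img_major_mime="image" AND img_minor_mime="gif"
AND img_size >= 10e7
ORDER BY img_size DESC;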

I propose capping GIF uploads to 100 MB. Animations at these sizes will not animate (see $wgMaxAnimatedGifArea and noc) as the server usage is too great. A proper video codec should be used for animations and archival TIFF/PNG for still images. The abuse filter should direct users to a WebM converter. We should somehow exempt weather maps (video GIFs seem to be uploaded by newbies).

Other services such as Imgur limit files to 200 MB, but transcode to MP4 on upload (previously 50 MB and 5 MB for untranscoded). —Dispenser (talk) 22:58, 16 August 2017 (UTC)
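
A rough Python illustration of why very large GIFs end up with frozen thumbnails, assuming the limit is compared against the pixel area summed over all frames (width × height × frame count). The 100-megapixel figure is the one quoted later in this discussion for Commons. It is a site configuration value, and the function names here are hypothetical.

```python
# Sketch only: the limit below is an assumed value for illustration.
MAX_ANIMATED_GIF_AREA = 100_000_000  # pixels summed over all frames


def total_animated_area(width, height, frames):
    """Pixels in one frame multiplied by the number of frames."""
    return width * height * frames


def thumbnail_animates(width, height, frames, limit=MAX_ANIMATED_GIF_AREA):
    """True if the animation is small enough to get an animated thumbnail."""
    return total_animated_area(width, height, frames) <= limit
```

Under this assumption a 500 × 400 animation with 100 frames (20 megapixels in total) still animates, while 300 frames at 1920 × 1080 (about 622 megapixels) do not.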

  •  Support The sensible thing to do. The first one mentioned above is probably a copyvio. Yann (talk) 23:18, 16 August 2017 (UTC)
  •  Oppose High resolution photographs are encouraged and get promoted, but high resolution GIFs should be banned? Considering that GIFs are really only a minimal percentage of the files on this server I really do not see the logic behind this. Do we want high resolution or not? If storage is the problem, limiting JPGs to 20MB would help a thousand times more. --Jahobr (talk) 23:59, 16 August 2017 (UTC)
    High filesize is not always high resolution. If really needed, WebM (for animations) or PNG (for stills) can be used, which compress more efficiently. Poyekhali 12:33, 17 August 2017 (UTC)
  • I have to  Oppose at the moment if for no other reason than the fact that I am loading these and they are in fact animated. —Justin (koavf)TCM 00:07, 17 August 2017 (UTC)
    @Koavf and Jahobr: I am not sure where to put the limit, but the issue is that on 99% of computers, these files won't play. I have a recent PC with ADSL and 8 GB RAM, and the first 2 files do not play for me. Regards, Yann (talk) 00:16, 17 August 2017 (UTC)
    @Yann: I don't think we need one in principle: it's fine for Commons to host files that are maybe impossible to play on any system as long as they are free and educational. —Justin (koavf)TCM 00:29, 17 August 2017 (UTC)
    IMHO "impossible to play" is not compatible with "educational". Regards, Yann (talk) 00:33, 17 August 2017 (UTC)
    @Yann: But that is only temporary. Someone can and will make a player. Remember how inaccessible .ogg was in 2002? The solution wasn't to avoid all audio/video nor to use a non-free format but to wait and make the tools to play it. What is impossible to render or play now will not necessarily be in the near future but if we prohibit an upload because of a file type restriction that is impractical in 2017, we could stop many from having access to it in the future. This is why I am entirely in favor of 3-D models, fonts, etc. being uploaded to Commons ASAP, even if we don't have in-browser support (yet). —Justin (koavf)TCM 00:55, 17 August 2017 (UTC)
    No. Nobody will make a player for an outdated format. See Zhuyifei's message below. Regards, Yann (talk) 00:57, 17 August 2017 (UTC)
    3-D models and fonts aren't outdated tho. Nor are GIFs. And even if something can't play in-browser, it's still worthwhile for us to serve it up here (assuming that it fits our scope, obviously). —Justin (koavf)TCM 01:05, 17 August 2017 (UTC)
    (Edit conflict)Well, you can play GIFs with a proper player like ffplay or possibly VLC, but that is not what GIFs are designed for, simple animations, not videos. You are forced to have low fps, resolution, and/or duration, and a super bad 256 colour space. --Zhuyifei1999 (talk) 01:09, 17 August 2017 (UTC)
    @Zhuyifei1999: Certainly, there are a lot of limitations which can and have been overcome but they are also not widely adopted. Your argument is just as sensible for refusing GIFs carte blanche. —Justin (koavf)TCM 01:17, 17 August 2017 (UTC)
    @Koavf: Well, you can certainly create a motion picture film for a video, and try to scan the film itself into an image; everyone who wishes to watch the video would have to reconstruct the video from the movie tape scan frame by frame. Can it work? Yes. Can automated support to reconstruct the video be added? Yes. Does it exist already? Yes, movie projectors. Then why do we not adopt it? Because an image container format, including GIF, that simply stores many images, only to recreate some sense of an actual video, is horribly inefficient. Yes, it may work for smaller animations, in which playback support is usually not needed; but for large (in size) videos with large fps (anywhere above a dozen frames per second), duration (anywhere above a dozen seconds), and/or resolution (anywhere above 240p), GIF will not suffice, nor will support ever be added. Developers would do better to spend time working on a "next-gen open codec", rather than hacking into the browser and adding playback support to GIFs. --Zhuyifei1999 (talk) 03:39, 17 August 2017 (UTC)
    BTW, I have no idea what you mean by "Your argument is just as sensible for refusing GIFs carte blanche". Please explain it in plain English. --Zhuyifei1999 (talk) 03:41, 17 August 2017 (UTC)
    Clarification: technically, in how GIF works (stores animations), it is just as bad as a scan of a motion picture film. --Zhuyifei1999 (talk) 03:47, 17 August 2017 (UTC)
    @Zhuyifei1999: Sorry if I was unclear: Do you think we should get rid of all GIFs then? If not, why just the ones below a certain arbitrary number? —Justin (koavf)TCM 04:01, 17 August 2017 (UTC)
    @Koavf: Small animations are okay, where playback controls are not necessary. GIFs larger than 100MB are definitely not animations. BTW: 100MB is not an "arbitrary" number; it's the threshold before chunked uploading is required to upload the file. --Zhuyifei1999 (talk) 05:25, 17 August 2017 (UTC)
    I'm not sure if the "on 99% of computers, these files won't play" estimate is accurate. The full size "original file" version of all of these files loads and plays on my two year old phone (it has 2 GB RAM), so I suspect many computers will play them just fine. —RP88 (talk) 05:11, 17 August 2017 (UTC)
  •  Strong support IMO GIF is a super bad format. It can't even do 24-bit RGB, but limits you to 256 colours. What's the point of such lossiness when you have other formats (e.g. VPx) where quality loss is barely visible? Yes, GIFs are easier to play (automatically) in browsers, but browsers do not usually offer seeking/pausing/resuming/rewinding on GIFs. Even if these large GIFs were possible to play, for a large animation, proper video files with proper playback support are more suitable. --Zhuyifei1999 (talk) 00:44, 17 August 2017 (UTC)
    256 colors is more than sufficient for many animations, animations where GIF will reproduce perfectly the sharp lines and edges that VPx will mangle. There are many old games (some of which have been released freely, like w:Flight of the Amazon Queen) for which GIF would be the only common format that can losslessly reproduce a playthrough.--Prosfilaes (talk) 04:58, 17 August 2017 (UTC)
    Lossless = the format produces no quality loss whatsoever when being transcoded from raw video. GIF will be lossy in colour when transcoded from raw video.
    The question is not whether the format is lossy or lossless (yes, VPx can be much more lossy), but whether the loss is visible. VPx is able to produce a much less lossy video in a much smaller file size. --Zhuyifei1999 (talk) 05:25, 17 August 2017 (UTC)
    That's not what lossless means; it means without loss. In processing certain sources of video, that is, video produced by playing old-school games or animations produced with the constraints of GIFs in mind, GIFs will reproduce the original video without loss, whereas VP9 won't. GIFs are not a good tool for video of real life, but many things aren't video of real life or simulation thereof.--Prosfilaes (talk) 06:18, 17 August 2017 (UTC)
    Old video game replays may be watchable in low FPS / resolution, but real life recordings? Certainly not. Face it: none of the GIFs in the list of over 100MB are old video games. If they are "produced with the constraints of GIFs in mind", the recordings are unlikely to exceed 100MB, as it takes around 20 GIFs at 99% mark (5MiB per Dispenser) to produce such a large 100MB. This proposal does not concern those GIFs under 100MB, which video game replays should be.
    Also, colour loss is loss. It is approximation of the original colour. If you transcode a colourful PNG into GIF and back it will be different, just as if you transcode a FLAC into OGG then back. Similarly, just because you can use jpegtran to losslessly crop a JPEG using chunk/block mechanisms does not mean JPEG the format itself is lossless. --Zhuyifei1999 (talk) 06:42, 17 August 2017 (UTC)
    I don't know what you mean by 20 GIFs at 99% mark. 30 FPS for an hour is 108,000 frames, which aren't going to be under 1k per frame, so one hour of old-school game footage is going to be over 100 MB. Yes, color loss is loss. But if you transcode old-school graphics into GIF, you can reproduce them exactly. That's not true for VPx.
    Nothing is lossless; all image and video formats have limits on the range of colors they reproduce. It's all about reproducing the source data correctly, and GIF does that for certain datasets.--Prosfilaes (talk) 10:12, 17 August 2017 (UTC)
    I don't think it's worth considering a hypothetical "30 FPS for an hour" ... "old-school graphics"/"certain datasets" case. The list of currently known violations does not contain any such datasets, at all. If someone really knowledgeable wants to override this, of course we may consider giving 'autopatrolled' a possibility; but for newbies, no, they are almost never using the appropriate technology for their not-suitable-for-GIF dataset.
    Regarding 20 GIFs at 99% mark, see Dispenser's comment at 02:09, 17 August 2017 below. --Zhuyifei1999 (talk) 10:32, 17 August 2017 (UTC)
    BTW: does footage of old games really need 30 FPS? In my experience, old games almost always ran at low FPS because of the slow processing speeds of the time --Zhuyifei1999 (talk) 10:38, 17 August 2017 (UTC)
    The current list of large GIFs is nine files; that doesn't justify an "at all". My statistics is a little weak, but our sample is entirely consistent with 10% of the large GIFs uploaded being old-school playthroughs. Putting up blocks to the rare uploader who uploads large GIFs seems unnecessary.
    Download an old speed run from Youtube and step through it. They're clearly at 25 or 30 FPS. Even the ones that aren't consistently at that rate have elements that update every frame change. Many games, no matter what the system, hit a stable 25/30 FPS; it's a requirement to look half-decent and if you're using a TV as a monitor, you have to match screen refresh rate.--Prosfilaes (talk) 11:18, 17 August 2017 (UTC)
    Well, maybe 10% of large GIFs are indeed playthroughs, and the chance of sampling 9 consecutive non-playthroughs from an infinite population of 10% playthrough and 90% non-playthrough is around 38.7%, which isn't too bad. But what do you think is the chance of a newbie-uploaded GIF not being immediately speedied / tagged no-* / filed DR as suspected copyvio, given #Restrict_Video_Uploading? About none, sorry. And no, a newbie is much more likely to upload in a proper video format due to the massive size reduction. --Zhuyifei1999 (talk) 11:57, 17 August 2017 (UTC)
  •  Comment
    GIF: 78 MB, framerate judder (only 25 or 33.3 fps timings), color dithering or banding, autoplay, no browser playback controls (T85838)
    Ogg Theora: 19 MB, 30 fps, supports audio, no autoplay (T116501)
    99.0% of GIFs on commons are under 5 MiB (see table). I've included an example on the side with a 100 megapixel GIF and its video version (please enable Media Viewer [use incognito mode]). So another possibility is to only warn users on large GIF uploads that we have video tools and video tools work better than GIF for certain things. —Dispenser (talk) 02:09, 17 August 2017 (UTC)
  •  Support Even the largest animated GIF from the list above runs without problems in full resolution on my notebook (macbook with 16 GB memory). But I agree that GIFs are not the appropriate format for such movies. They should be converted to one of our supported video formats, i.e. Ogg Theora or WebM, which compress much better than GIFs. The autoplay feature is also a nuisance in case of these monster GIFs. --AFBorchert (talk) 06:49, 17 August 2017 (UTC)
  • Neutral, but... among the European Space Agency uploads there have been several GIFs. These are especially useful to animate a series of large satellite images (or digitally enhanced images) that show sequences of weather or other long timescale events. Fortunately they have been of modest size so far (~10 MB), however were I forced to transcode desirable files like this, I probably would not bother. Providing in-built transcoding facilities for video files should remain a high priority if Commons is ever going to be a serious host for multi-media, rather than just a poor cousin to Flickr. -- (talk) 06:52, 17 August 2017 (UTC)
  •  Support Large GIFs are huge pains for old computers and phones. By allowing them to be used on sister wikis like Wikipedia, we are not making Wikimedia an all device-friendly website. Also, those with data plans would be surprised to see that they have to pay 10x more than normal just because a large GIF has been loaded. If really needed, WebM VP9 or VP8 only can be used instead for animations, or PNGs for still (and also JPEG, which would be used in sister wikis). GIF has long been superseded by PNG, not only in bit depth but also in compression and animation (although APNGs are not yet supported by most browsers). --Poyekhali 12:27, 17 August 2017 (UTC)
  •  Oppose It's a free format with some potential uses; like I mentioned above, 256 color games can be losslessly filmed in GIF, whereas we support no format that can losslessly store it. It's suboptimal for many of the files above, but the fact that someone has uploaded a handful of files in a suboptimal format does not make it worth hardcoding a ban on the format. It would have taken a lot less time to delete those files than to have this discussion, and banning the format is not going to magically make those files get uploaded in a useful format.--Prosfilaes (talk) 00:25, 19 August 2017 (UTC)
    Considering that an abuse filter warning message can, in fact, link to a help page, yes, banning the format is able to magically make those files get uploaded in a useful format. --Zhuyifei1999 (talk) 22:54, 19 August 2017 (UTC)
  • Comment/question: is this about GIFs in general or strictly about animated GIFs? Most GIFs are not animated, but some of the discussion above seems to presume animation. - Jmabel ! talk 15:32, 19 August 2017 (UTC)
    The edit filter can only target GIF as a whole, but people here seem to have forgotten what a megabyte can hold. For reference: w:VLC media player with every codec compiled in — 31 MB, w:Mozilla Firefox with a codebase that broke the VS compiler — 43 MB, w:Blender 3D full fledged 3D modeling, animation, and game engine — 86 MB, MacOS 8 Operating system (about 160 MB, release 1997), Audio CD or CD-ROM 650 MB. Also, Animated PNG are generally more efficient. Dispenser (talk) 16:35, 19 August 2017 (UTC)
    A megabyte can't hold anything worth talking about. Code is much more concise than pictures or video. That CD-ROM, in VCD format, can hold 80 minutes of video, approaching the lousiest quality of any standard home video format. Animated PNG may be more efficient, but it's not standard.--Prosfilaes (talk) 16:47, 19 August 2017 (UTC)
    10,000 by 10,000 pixels @ one byte per pixel would hit the 100MB limit, provided the GIF is uncompressed. Maybe there's some 30,000 x 30,000 pixel GIF that proper LZW compression doesn't bring under the 100MB limit, but it would be an incredibly unusual beast, and almost certainly better in PNG.--Prosfilaes (talk) 16:47, 19 August 2017 (UTC)
    Exactly. Why would anyone store a gigantic still image in GIF instead of PNG? --Zhuyifei1999 (talk) 22:56, 19 August 2017 (UTC)
  •  Support per Zhuyifei1999. --Steinsplitter (talk) 17:04, 19 August 2017 (UTC)
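
Several figures quoted in the discussion above can be re-derived with a few lines of Python. The variable names are illustrative only. The last part relies on the fact that GIF stores each frame's delay in whole hundredths of a second, which is why frame rates near 30 fps come out as 25 or 33.3 fps.

```python
from fractions import Fraction

# One hour of footage at 30 frames per second.
frames_per_hour = 30 * 60 * 60                 # 108000 frames

# At 1 kB (1000 bytes) per frame, an hour already passes 100 MB.
bytes_per_hour = frames_per_hour * 1000        # 108000000 bytes

# Chance that nine GIFs drawn from a population with 10% playthroughs
# contain no playthrough at all.
p_no_playthrough = 0.9 ** 9                    # about 0.387

# A 10,000 x 10,000 still image at one byte per pixel, uncompressed.
uncompressed_bytes = 10_000 * 10_000 * 1       # 100000000 bytes


def gif_fps(delay_cs):
    """Frame rate for a GIF frame delay given in hundredths of a second."""
    return Fraction(100, delay_cs)


# Delays of 4 and 3 hundredths are the nearest choices around 30 fps.
nearest_rates = [float(gif_fps(4)), float(gif_fps(3))]   # 25.0 and 33.33...
```

The numbers agree with the thread: 108,000 frames per hour, a 38.7% chance of nine non-playthroughs in a row, and 100 MB for an uncompressed 10,000 × 10,000 image at one byte per pixel.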


Only warn uploaders

GIF size distribution

Filesize      % of GIF files
<  0.3 MB     84.1%
<  2.6 MB     97.7%
<  4.9 MB     99.0%
< 23.0 MB     99.9%
< 32.2 MB     99.95%

Since the animation size restriction is broken ($wgMaxAnimatedGifArea), the 100 MB limit (which guaranteed no animation when downscaled) only made sense in that context. Since we don't have a transcode-on-upload facility like Imgur, I suggest creating an edit filter to warn/remind uploaders that video content is better compressed/transmitted/viewed in WebM. I think a good trigger for it would be 5 MB (Imgur's old limit) or 25 MB (< 0.1% of GIF uploads). —Dispenser (talk) 23:17, 23 August 2017 (UTC)

The Phabricator task has been resolved; all thumbnails of GIFs > 100 animated megapixels are frozen now. Dispenser (talk) 16:08, 12 September 2017 (UTC)

Finalize Commons:Creator and approve as policy

I would like to finalize Commons:Creator and approve it as policy. One of the controversies around Creator templates is the scope of the template. The "clearly yes" and "clearly no" sections were never controversial, but many of the following "gray area" cases still cause conflicts, like here or here. I would like to clarify what the consensus is about each type of use, so I do not have to have discussions like this. --Jarekt (talk) 04:21, 28 August 2017 (UTC)

Commons users

A few dozen Commons users created vanity creator templates for themselves. Since we have no policy against it, such templates are usually not deleted; however, they are usually tagged with "type=commons user" so they do not clog maintenance categories. There are at least 36 of them, and they can be seen in Category:User creator templates.

If they meet them, they'd be eligible for a creator template, just not only for being Commons contributors – as stated above. (Same for the Flickr photographers.)
What clarity specifically do you miss from the Wikidata notability criteria? (They are very permissive of course.) --Marsupium (talk) 21:16, 28 August 2017 (UTC)
Yes, they are very permissive. I mean that many Commons contributors are eligible for an entry there, and that's not what most people expect here. Regards, Yann (talk) 22:27, 28 August 2017 (UTC)
Marsupium, I think what Yann is saying is that Wikidata Notability criteria are broad and can be hard to interpret in this context. For example, a person is notable if the item has a sitelink to Commons (or other projects), so if you create a one-image gallery on Commons you can create an item on Wikidata, which allows you to have a creator template. That creates a kind of circular dependency. Maybe we should limit Notability to Wikipedia and Wikisource. Or maybe we should just accept that there is no simple policy that works for 100% of cases and deal with edge cases on a page-by-page basis. --Jarekt (talk) 00:33, 29 August 2017 (UTC)
OK, I see. I think you're both right. So IMHO:
  • Creator templates should only be used for the authors of (so to say primary) media that are themselves in the project scope, not of media that merely depict/describe/illustrate something else that is in the project scope. If there is no media that should get a creator template, the template shouldn't be created either.
Additionally:
Thus, Category:User creator templates should (as of a spot check I just did) get at least almost emptied. --Marsupium (talk) 08:15, 29 August 2017 (UTC)
To be clearer, I wasn't talking about circular dependency. To be eligible for an entry in Wikidata, you only need an external reference. Anyone who is acting somewhere in an official capacity, and anyone who publishes something somewhere may get one. That's quite large... (I meet the criteria ;oD). Regards, Yann (talk) 08:42, 29 August 2017 (UTC)
  •  Oppose: same opinion. --Marsupium (talk) 21:16, 28 August 2017 (UTC)
  •  Oppose. User templates as creators should be in user namespace. So they cannot have notability in wikidata. -- Geagea (talk) 08:41, 29 August 2017 (UTC)
  •  Strong support I would like to one day see creator templates for most of our content. If a file is in scope, then we want complete information about it. Creator data is part of this. Also, this is an inexpensive way of providing recognition to our contributors, encouraging them to upload more original content. Guanaco (talk) 08:51, 29 August 2017 (UTC)
  • Comment Category:User creator templates is used on templates about people (including me) who also meet the Wikipedia or Wikidata notability criteria. Also, this whole debate may be rendered moot by the coming of structured data - we will need some way of holding data about every creator. Hence @SandraF (WMF): for info. And please don't accuse good-faith contributors of acting out of "vanity". Andy Mabbett (talk) 11:53, 29 August 2017 (UTC)
    • P.S. The template's documentation says that |type=commons user is for "For people directly contributing to Commons, aka Commons users." It says nothing about such use being for people who are only Commons users - there are many Commons users who are also known more widely. Andy Mabbett (talk) 12:26, 29 August 2017 (UTC)
Andy, I crossed out the word "vanity" used above, even if many templates in Category:User creator templates seem to meet that label. I totally agree with you that structured data should render this discussion moot. As I said in the opening paragraph, I do not want to penalize otherwise notable people for also contributing to Commons, so this discussion does not apply to otherwise notable individuals. --Jarekt (talk) 16:59, 29 August 2017 (UTC)
  •  Comment Wherever creators of media here are stored, at least I don't see a point in maintaining the (not sourced) date of birth of Creator:Alain Meier (sorry, he simply is the first one in the category) here or at Wikidata. (A line should be drawn somewhere between Leonardo da Vinci and IPs perhaps.) --Marsupium (talk) 12:51, 29 August 2017 (UTC)
  •  Support I stand with Guanaco here; it's somewhat useful, and it's a cheap way of supporting our contributors. As for Marsupium's comment, I see good reason for maintaining the date of death of Creator:William Starner; he has a couple photos on Commons, and it's possible I may find many other Commons-worthy shots as I go through his photos; and at some point, 48 years in the future, his photos will be out of copyright in Canada and other places and his date of death will let people know that.--Prosfilaes (talk) 16:40, 29 August 2017 (UTC)
  •  Oppose only for creators notable enough to have an article in Wikipedia Christian Ferrer (talk) 18:35, 29 August 2017 (UTC)
  •  Support creator templates (for now). Photographers don't need to be notable themselves (Wikipedia notability is irrelevant in this discussion; we also have lots of photographs by long dead, non-notable photographers), it's about structured information. Currently this is done in a rather primitive way (by using these templates); in the future there will be other ways (as already mentioned), and in the meantime we can leave things as they are. Gestumblindi (talk) 20:10, 22 September 2017 (UTC)
  •  Support I was against or indifferent until I read the comment by Prosfilaes. That convinced me that there is a useful purpose to having this. Thanks, Amqui (talk) 13:08, 27 September 2017 (UTC)
  •  Support Per Prosfilaes and Guanaco. In addition to photographers, there are users like Sodacan who contribute extremely high-quality original vector images to Commons, such as this featured one: File:Royal Coat of Arms of the United Kingdom.svg. There are others who create original maps and other very valuable files used throughout the Wikipedia projects. Some of the original work here is superior to what you find on commercial clipart CDs and websites, yet they choose to contribute here. I don't see why they should not be allowed to use a creator template if they so desire. The quality of the work is more important than the notability of the creator. Wikimandia (talk) 23:02, 19 October 2017 (UTC)
  • Indifferent. Harmless, IMO. If commons users can have personalized templates anyway, what difference does it make? Wikipedia (which wikipedia?) notability threshold is too vague - how can you decide a gray-area case without creating an article and testing DR consensus in wikipedia? Retired electrician (talk) 00:27, 22 October 2017 (UTC)

Flickr photographers

Some Flickr photographers have many thousands of images on Commons, and some of them have creator templates.

For most Flickr photographers, all we know is on their Flickr author page. A link to that page should be sufficient. Templates and Wikidata entries are cheap, but data about most Flickr authors cannot be sourced properly. --Jarekt (talk) 16:41, 29 August 2017 (UTC)

Non-individual Creators

We have many creator templates for institutions, projects, factories, manufacturers, multi-generation photo studios, newspapers, corporations, etc. Some of them are in Category:Group creator templates or Category:Corporate creator templates categories. Many fields of creator templates, which were created with people in mind, are not relevant for them.

  • I  Weak oppose such templates, especially new ones. I do not mind creating other templates for them, like Institution templates, but creator templates are a bad fit. --Jarekt (talk) 04:21, 28 August 2017 (UTC)
  •  Oppose: unfit for groups. Nomen ad hoc (talk) 15:25, 28 August 2017 (UTC).
  •  Oppose Better to use Institution templates for these. Regards, Yann (talk) 18:23, 28 August 2017 (UTC)
    •  Weak support Dunno. Some of those are for two brothers, where it's impossible to tell which one took it -- or a father / son combination, etc. Those seem to be within the spirit. Some of those may be a group of authors, whose members are anonymous, which also seems like it fits. If something fits better as an institution template, by all means use that instead, but there are likely some oddball situations which are not individuals, but where an institution template is not appropriate. That template does not have fields for work period, etc., but are more designed as a marker of where a work of art is located today, which is not the same thing as a creator template. It is going to be hard in some situations to define a line where a creator template goes outside the lines, but there will definitely be cases where non-individual templates make some sense. The idea is to collect information about the creators to aid identification, etc. If there is nowhere else for that information to go, then barring them on creator templates is only hurting. Carl Lindberg (talk) 19:33, 28 August 2017 (UTC)
      • Creator templates are expected to be for individuals. So we either expand that, or we create another template for organisations and groups. This needs a major change anyway. Regards, Yann (talk) 22:31, 28 August 2017 (UTC)
        • Given that the template's documentation includes |type=corporation and |type=group, this claim would appear to be false. Andy Mabbett (talk) 12:23, 29 August 2017 (UTC)
          • Early on we identified that non-individual Creator templates cause issues in our maintenance categories and |type= is used to keep track of non-individual creators and exclude them from most maintenance categories. Since template writers were asked not to add categories to the pages, we use |type= to add pages to Category:Creator templates by type subcategories. --Jarekt (talk) 16:28, 29 August 2017 (UTC)
    •  Comment I don't mind whether to store them in one template or another. One could improve the support by {{Creator}} for them or spin them off to a new template. As to institution templates I'm sceptical if they are suitable currently, they are for institutions collecting objects, not creating them, currently used 'work location' and 'work period' make more or less sense for these, not for the others. I think the decision about those non-individuals needs some more investigation on the alternatives. The information should be stored somewhere, thus the most practical way might be the best whatever that is. --Marsupium (talk) 21:16, 28 August 2017 (UTC)
      •  Comment Marsupium, template:Creator is becoming too complicated to add proper support for Non-individual Creators, but I think spinning them off to a new template would be a good idea. We could store such templates in the creator namespace or keep them in the template namespace. --Jarekt (talk) 16:28, 29 August 2017 (UTC)
  •  Comment. As Yann suggested, Institution templates are the proper template for them (which should be improved as well). They are not creators. Only photography companies can be considered as creators.-- Geagea (talk) 08:41, 29 August 2017 (UTC)
  •  Oppose Christian Ferrer (talk) 18:35, 29 August 2017 (UTC)

New idea about Non-individual Creators

@Marsupium, Pigsonthewing, Yann, Clindberg, Geagea, and Christian Ferrer: Maybe the best solution would be for pages in Creator namespace to come in 2 flavors:

  1. one for people - served by current Creator template
  2. one for non-person Creators - served by some new template in order to simplify code and documentation. I do not have a good idea for a name, maybe template:Non-individual Creator, template:Non-person Creator, template:maker, template:Creator entity? Other ideas would be appreciated. It would have the following fields:

Such fields would provide a minimum of information for a wide range of groups, studios, manufacturers, newspapers, corporations, etc. that can be referred to as "authors" of some work. --Jarekt (talk) 13:27, 19 September 2017 (UTC)

A "GroupCreator" template like that could work, sure. Could also have a field of "members" if we do know of related individuals (if the creator is a father/son combination who only used the last name or company name to mark works, etc., things like that). Work period start / end may be enough -- not sure we need a flourished field. A group could have multiple locations of course (as can individuals). Carl Lindberg (talk) 16:05, 19 September 2017 (UTC)
I like the "GroupCreator" term; it could be a good template name. Or maybe template:Creators. A clarification about "Work period": on Wikidata there are work period (start) (P2031) and work period (end) (P2032) for describing the work period in the form of 2 dates, and floruit (P1317) to describe it in the form of a single date, like 19th century. --Jarekt (talk) 03:24, 20 September 2017 (UTC)
No, we don't need two, near-identical templates. As noted above, we already have |type=group to distinguish from templates about individuals. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:59, 20 September 2017 (UTC)
The big issue with that is it has a rather large dependency on birth date and death date, as those are displayed in the header, and you have to jump through a couple of extra hoops to avoid that. Granted, the template coding could also check type=group to fix that as well. I'm not sure how much complication to the rest of the creator system that a second template would add, or if it's better to have conditional behavior in one template. Carl Lindberg (talk) 22:17, 20 September 2017 (UTC)
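
The "conditional behavior in one template" option mentioned above could look roughly like the following Python sketch. It does not reflect how {{Creator}} is actually coded; the function name and the header format are invented for illustration, and only the type values "group" and "corporation" come from the template documentation cited in the discussion.

```python
# Hypothetical sketch: one header routine that skips birth/death dates
# for non-individual creators. Not the actual {{Creator}} implementation.
NON_INDIVIDUAL_TYPES = {"group", "corporation"}


def creator_header(name, creator_type="person", birth=None, death=None):
    """Build a header line, omitting life dates for groups and corporations."""
    if creator_type in NON_INDIVIDUAL_TYPES:
        return name
    if birth is None and death is None:
        return name
    return f"{name} ({birth if birth is not None else '?'}-{death if death is not None else ''})"


print(creator_header("Example Studio", "group", birth=1850, death=1900))  # Example Studio
print(creator_header("Jane Doe", birth=1850, death=1900))                 # Jane Doe (1850-1900)
```

The trade-off raised in the thread stays the same either way: a single routine needs a type check at every place that assumes a person, while a second template moves that complexity into the rest of the creator system.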