User talk:Keegan (WMF)

Welcome to Wikimedia Commons, Keegan (WMF)!

-- Wikimedia Commons Welcome (talk) 04:15, 24 June 2013 (UTC)[reply]

Fall

Re "Structured Data on Commons Newsletter - Fall 2018 edition" - please avoid using seasons as date indicators; it is spring, not fall, in half of the world right now. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:18, 9 December 2018 (UTC)[reply]

You're absolutely correct, I do know better. Thank you for bringing it to my attention. Keegan (WMF) (talk) 18:14, 10 December 2018 (UTC)[reply]

Potential RfC about copying descriptions to captions?

With regard to [1], what would you think about an RfC about bot-copying descriptions to captions? If it's now accessible through the API (if there's documentation for that somewhere?), then I could probably code up a bot to do that now, so we can see if the community wants to do that. Thanks. Mike Peel (talk) 22:20, 22 January 2019 (UTC)[reply]

By all means, have a go. The start of the documentation is a simple sentence, but the setLabel link leads to more information. Implementation is up to the community of course. Keegan (WMF) (talk) 22:27, 22 January 2019 (UTC)[reply]
Just curious, but isn't there a character limit on "captions" which is lower than the one on "file descriptions"? Ideally all important information from descriptions should be copied, but could this be done with a lower character limit? Maybe it would be possible to also list a higher character limit for file captions in that RfC for optimal information preservation. --Donald Trung 『徵國單』 (No Fake News 💬) (WikiProject Numismatics 💴) (Articles 📚) 22:45, 22 January 2019 (UTC)[reply]
It may be technically feasible to extend the character limit if the community would like. However, since it's not currently in the plans to do so and the team has the rest of the SDC features to develop, I have no idea how long it would reasonably take to implement such a change. Keegan (WMF) (talk) 18:14, 23 January 2019 (UTC)[reply]
Right now, I'm stuck with this. I can code up reading a caption OK, but if I try to write a new caption back then I get an error message saying "pywikibot.exceptions.Error: API write action attempted without userinfo". I suspect that's a problem with my understanding of how the pywikibot code that accesses the api works, rather than a problem with structured commons, but I can't find a way around it right now. It seems to be quite different from hooking into the Wikidata descriptions through the Wikibase interface. Thanks. Mike Peel (talk) 23:56, 28 January 2019 (UTC)[reply]
Still stuck. I've now posted it on phabricator: phab:T214987. Thanks. Mike Peel (talk) 20:50, 30 January 2019 (UTC)[reply]
Hey @Mike Peel: sorry, I've been offline the past few days. I'll ask around and see what I can find out. Keegan (WMF) (talk) 06:40, 31 January 2019 (UTC)[reply]
@Mike Peel: still waiting to hear but relatedly, this week I'll have the licensing designs for captions. The community may want to talk about how to apply licensing to something like caption migration from descriptions, so maybe we can hang on until we see what happens there. Keegan (WMF) (talk) 17:55, 4 February 2019 (UTC)[reply]
Thanks - I found the bug in my code, so I can now bot-edit captions (example)! I'm happy to wait until after the licensing designs for captions announcement before drafting an RfC, as that does sound relevant here. Thanks. Mike Peel (talk) 20:32, 4 February 2019 (UTC)[reply]
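
For readers following the technical side of this thread, here is a minimal sketch of what a caption write through the setLabel API might look like. It only builds the request parameters and sends nothing. It assumes that captions are stored as labels on the file's MediaInfo entity, whose ID is "M" followed by the file's page ID, and that the write goes through action=wbsetlabel with a CSRF token from a logged-in session. The function name, the page ID and the token value are hypothetical, and this is not necessarily how the bot mentioned above was written, nor a diagnosis of the error reported earlier.

```python
# The CSRF token MediaWiki hands out to sessions that are not logged in.
ANONYMOUS_TOKEN = "+\\"


def build_caption_request(page_id: int, language: str, caption: str, csrf_token: str) -> dict:
    """Build the POST parameters for setting one caption (hypothetical helper).

    Assumes the caption is a label on the MediaInfo entity "M<page_id>".
    """
    if page_id <= 0:
        raise ValueError("page_id must be a positive integer")
    if not caption.strip():
        raise ValueError("caption must not be empty")
    if csrf_token == ANONYMOUS_TOKEN:
        # A write sent with the anonymous token would not be attributed to a bot account.
        raise ValueError("token belongs to an anonymous session; log in first")
    return {
        "action": "wbsetlabel",
        "id": f"M{page_id}",       # MediaInfo entity ID derived from the page ID
        "language": language,      # caption language code, e.g. "en"
        "value": caption.strip(),  # the caption text itself
        "token": csrf_token,
        "bot": 1,                  # mark the edit as a bot edit
        "format": "json",
    }


# Hypothetical values, for illustration only.
params = build_caption_request(12345, "en", "A short factual caption", "abc123+\\")
print(params["id"])
```

The resulting dictionary would be sent as a POST to the wiki's api.php by whatever HTTP client or bot framework holds the logged-in session.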

Was the licensing design announcement Commons_talk:Structured_data#CC0_licensing_mockups? If so, then it will help a lot with the licensing of captions for new files, but not what to do with the existing file descriptions.

In terms of the RfC, I'm thinking about asking a simple question at Commons:Village pump/Proposals and seeing how things go from there. How does this sound:

Do we want to bot-copy descriptions to captions?

Structured Data on Commons released its first feature last month: media files can now have captions in different languages. Captions are quite close to descriptions, except that they are structured by language. It is technically possible to bot-copy descriptions to captions (e.g., [2], [3] were copied using pywikibot), so my question is: do we want to do that for all files, and if so, under what constraints? If there are copyright issues, then how can we address them?

Thanks. Mike Peel (talk) 21:58, 5 February 2019 (UTC)[reply]

I would suggest going ahead and getting in front of the copyright issues: captions are CC0, so for a description to be migrated hypothetically it has to be a clearly non-creative statement of fact (to satisfy US law). I think you'd want, at least initially, some sort of project for this with a file set that has clearly defined, simple descriptions (whether imported that way, or manually curated) and set up a process around requesting a bot run for these criteria. I'd work up such a proposal similarly, keeping the scope narrow and seeing where it takes us. If we start out with "do we want to bot migrate all files? what could be a blocker or problem?" I see us having trouble even getting started. Just my thoughts, the WMF has no official opinion as the community controls content policy :) Keegan (WMF) (talk) 22:09, 5 February 2019 (UTC)[reply]
I am coming at this from a technical perspective: I can now copy descriptions to captions using bot code. What to copy, and why, is a community question, hence the RfC proposal. If it's going to come down to copyright, then this gets messy very quickly. Thanks. Mike Peel (talk) 23:10, 5 February 2019 (UTC)[reply]
Sure. I guess what I'm saying is that captions being CC0 is a built-in constraint to the proposal, so I personally would include it rather than leave it out. "I can copy descriptions that seem simple enough to fit CC0 into captions, is this something we should do?" seems like a simpler question for the community to decide in framing. Again, my two cents, write it however you think is best for the community, or ask some others who have been chipping in on the talk pages? Keegan (WMF) (talk) 00:10, 6 February 2019 (UTC)[reply]

I thought about this some more, and have written a v2. How does this sound to you?

Do we want to bot-copy descriptions to captions?
Structured Data on Commons released its first feature last month: media files can now have captions in different languages. Captions are quite close to descriptions, except that they are structured by language. It is technically possible to bot-copy descriptions to captions (e.g., [4], [5] were copied using pywikibot). There is a potential copyright issue here, in that captions are CC-0, which perhaps could be avoided by only copying short descriptions (say, under 200 characters) where they are sufficiently short/simple that they can't be copyrighted (as per WMF legal). Do we want to do that for all files, or are there other concerns that need addressing? Thanks. Mike Peel (talk) 20:24, 6 February 2019 (UTC)
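
As an illustration of the length constraint floated in the proposal above, here is a small sketch of how a bot might pre-select descriptions before copying anything. The 200-character threshold is the figure suggested in the proposal. The function name and the rule of skipping text that still contains wiki markup are assumptions added for the example, and passing this filter says nothing about whether a description is actually uncopyrightable.

```python
def is_copy_candidate(description: str, max_chars: int = 200) -> bool:
    """Return True if a description is short and plain enough to consider copying.

    Hypothetical filter: the length limit comes from the proposal; the markup
    check is an extra assumption to avoid copying templates or links verbatim.
    """
    text = " ".join(description.split())  # collapse runs of whitespace
    if not text or len(text) > max_chars:
        return False
    # Leftover wikitext or HTML suggests the description needs manual handling.
    return not any(marker in text for marker in ("{{", "[[", "<"))


descriptions = [
    "Victorian majolica vase",
    "{{en|1=A vase}}",
    "x" * 201,
]
candidates = [d for d in descriptions if is_copy_candidate(d)]
print(candidates)
```

A filter like this only narrows the file set; which of the remaining descriptions should be copied is still the community question the proposal asks.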

Seems okay to me. @Jarekt: Mike is looking at a proposal for a bot to copy descriptions into captions. Thoughts? Keegan (WMF) (talk) 21:31, 6 February 2019 (UTC)[reply]
I would make sure that it is clear that SDC is CC0 before any mass population of captions. I am afraid that the current captions are CC-BY-SA and we might have to treat them differently from future CC0 ones. I would prefer to have it all clearly spelled out before creating many new captions. Also I feel like other attributes, like day, author, camera type, etc. are much easier for a bot to do right, so if I were running such a bot I would start with easy cases before attempting more tricky ones like captions. But that said, if we concentrate on descriptions with no more than 4-5 words for images where the language is clearly marked, then that might be easy enough to have a very low percentage of issues. --Jarekt (talk) 03:11, 7 February 2019 (UTC)[reply]
I'm curious, why could current captions that satisfy the legal requirement of simple statements of fact not be moved to a CC0 license? Keegan (WMF) (talk) 18:00, 7 February 2019 (UTC)[reply]
Keegan (WMF), I (and likely you too) have witnessed many copyright food fights; some had valid points and some were ridiculous (in my opinion). I try to stay out of them as much as possible, but that often requires a mix of self-censorship and being devil's advocate to predict what might get me in trouble. I understand that the intention of captions is to provide simple statements of fact, which are likely not copyrightable (gray area); however, someone might argue that any free text field allows a person to input statements which are creative and copyrightable, so the process of changing the license on them might require individual inspection of statements, to verify that someone did not sneak in some en:Haiku there. I think that is a bit of splitting hairs, but having listened to people opposing harvesting of metadata from Commons or Wikipedia infoboxes, because that is CC-BY-SA text and anything derived from it is a derivative work covered by the same license, I would prefer not to have the same argument about captions. My preference would have been to clarify the SDC license before any rollout, but it is too late for that. So the second best would be to clarify it before a large bot run, before we have a large volume of captions. I might be paranoid, but I am just trying to avoid conflict. --Jarekt (talk) 19:15, 7 February 2019 (UTC)[reply]
Right, I understand, we're on the same page as far as concerns go. I also recommend proposals being narrow in scope and conservative in nature. Keegan (WMF) (talk) 19:27, 7 February 2019 (UTC)[reply]
I'm still inclined to post this proposal so that we can see what everyone thinks. If it comes back as a negative, then hopefully it will provide useful input for a second proposal that is more narrow/conservative. Unless you object, I'll post it later today. Thanks. Mike Peel (talk) 14:11, 8 February 2019 (UTC)[reply]
It's all in the hands of community process, y'all do as you see fit :) Keegan (WMF) (talk) 20:29, 8 February 2019 (UTC)[reply]
OK, it's now at Commons:Village_pump/Proposals#Do_we_want_to_bot-copy_descriptions_to_captions?. Thanks. Mike Peel (talk) 21:16, 8 February 2019 (UTC)[reply]
Well ... that didn't go quite as I expected it to. Sorry! Mike Peel (talk) 16:40, 11 February 2019 (UTC)[reply]
Well, you could have led with that... Mike Peel (talk) 20:38, 11 February 2019 (UTC)[reply]
@Mike Peel: my point there is to dispel the notion that how captions are populated comes from some sort of WMF mandate or expectation. I've been reading over the conversations and I'm getting the impression that some people are reading what I've written to say "The WMF says all descriptions can be captions because they're short and all the descriptions must be copied, but the WMF is WRONG and didn't consult ANYONE!" and that's not what I've said at all. Quite the contrary, we're providing the software and the licensing while the process of implementation is in the community's hands. Keegan (WMF) (talk) 21:02, 11 February 2019 (UTC)[reply]

Structured data - depict

Hello, for some time now, I have been seeing edits like this in my watchlist: https://commons.wikimedia.org/w/index.php?title=File:Dyke_March_Berlin_2019_030.jpg&curid=80869063&diff=367003720&oldid=366029827 "Dyke March Berlin 2019" is an unspecific but correct statement. However, it makes no sense to add this statement to only one file in the category. It should be all files of the category, or none. And mass edits that add depicts statements to many files (the same statement on many pictures, different statements on many files in the same category, or many statements on the same file) should either not show up in the watchlist at all, or show up in a way that is not disruptive. Maybe these edits could be marked as bot edits? --C.Suthorn (talk) 12:07, 19 September 2019 (UTC)[reply]

See User_talk:Reinheitsgebot. Keegan (WMF) (talk) 20:08, 20 September 2019 (UTC)[reply]

Whoops!

Hey Keegan, just FYI: my revert at Commons talk:Structured data was not intentional – my apologies. I guess I must have accidentally hit the rollback button on my watchlist without noticing. Thanks @Multichill: for cleaning up after me. --El Grafo (talk) 10:09, 9 October 2019 (UTC)[reply]

Testing the Computer-aided_tagging

I just saw this and that's how I got to know about this tool. I tried to go to Special:SuggestedTags but it is not allowing me. Not sure how to sign up for testing. So I was wondering if you could give me access to this tool. I am not very active with Structured Data but I do revert vandalism associated with SD and sometimes I add captions. Masum Reza📞 17:33, 28 November 2019 (UTC)[reply]

@Masumrezarock100: I think you should now be able to access Special:SuggestedTags. The tool will be released within a week or two. Keegan (WMF) (talk) 18:44, 2 December 2019 (UTC)[reply]
Can you add me as well please? --DannyS712 (talk) 02:30, 9 December 2019 (UTC)[reply]
@DannyS712: the tool will be live for all logged in, confirmed users in about 48 hours; I don't think any more testers are being added at this point. But your feedback about the tool is welcome when the tool is opened up on Thursday. Keegan (WMF) (talk) 17:28, 10 December 2019 (UTC)[reply]
Would you be willing to add me anyway? I know it's live for everyone, but there is a difference between the tester group rights and the normal rights, and I want to figure it out before filing a phab task. Thanks, --DannyS712 (talk) 05:19, 24 December 2019 (UTC)[reply]

SuggestedTags

I tried out Special:SuggestedTags. Mostly, I was dissatisfied with the vague suggestions (e.g., "Lady" when the subject's name is known). I'd like to see:

I don't know if these are common requests, but I think they'd be helpful to me. WhatamIdoing (talk) 20:23, 29 March 2020 (UTC)[reply]

OAuth against Commons SPARQL Query Service

Hi! I noticed that on Commons:SPARQL_query_service, you wrote that you can use OAuth with the Commons SPARQL query service. I have only been able to use it from a browser, with a session cookie. See https://stackoverflow.com/questions/65303450/how-to-authenticate-to-wikimedia-commons-query-service-using-oauth-in-python and https://stackoverflow.com/a/65424900/678387 . Is it possible to use OAuth? Is there any example code?

If it's not possible, should the documentation be updated to avoid saying it is, or is it an implementation bug? --FrankieRayRobertson (talk) 08:45, 25 December 2020 (UTC)[reply]

Do you know who updates the dump for the Commons Query Service?

Hi Keegan

I'm running an event this weekend as part of a hackathon and want to include new SDC we are adding this week. Do you know who updates the dump for the query service so I can request they press the button?

Thanks John Cummings (talk) 09:48, 15 April 2021 (UTC)[reply]

All Wikimedia dumps are owned by User:ArielGlenn. I'd suggest filing a Phabricator ticket and see if it can be done in time. Keegan (WMF) (talk) 15:34, 15 April 2021 (UTC)[reply]
Thanks very much. John Cummings (talk) 22:38, 15 April 2021 (UTC)[reply]

Select items for Add Tag from a table

I had 80 images to tag, all of them ceramic Victorian majolica, one at a time. It was very time consuming, and that is time I do not have. I confess I had to give up until I have more time. Most of the suggestions were inappropriate. Could a table be presented, with the option to select items, all of them to be given the same tag, in this case "ceramic"? Then repeat the table with the option to select items for which to add the tag "Victorian majolica"?