Commons talk:Structured data/Lua

From Wikimedia Commons, the free media repository

Getting started

@Mike Peel, Multichill, and Jarekt: I'm getting ready to let the community know about Lua, but having some documentation might be helpful. I have no idea what I'm doing or talking about here, so any assistance in drafting this would be appreciated. Keegan (WMF) (talk) 20:09, 4 November 2019 (UTC)

Keegan, I will start work on documentation. A year ago I was working on Module:Information. My plan was to replace {{Information}} with it (without SDC support) and, once we fix any issues that might come up, add SDC support to Module:Information. As I recall, either you or someone from the SDC team was opposed to a Lua rewrite of {{Information}}, since any updates to {{Information}} might be too heavy on the servers. I still think that is the best path forward, and was thinking about proposing it at Commons:Village pump/Proposals. Any thoughts? Also, I will be at WikiConference North America 2019; will you or other SDC team members be there? --Jarekt (talk) 20:24, 4 November 2019 (UTC)
I will not be attending this year, and I don't think anyone from the team is. @Jdforrester (WMF): had the performance concerns; perhaps he can still speak to them. Keegan (WMF) (talk) 21:57, 4 November 2019 (UTC)
@Keegan (WMF) and Jarekt: and all: Having mulled this over, I have a probably controversial suggestion. The situation here is different from Lua access to Wikidata: most Lua use here will be in templates that are only edited by a few people, rather than all of the different projects that need Lua access to Wikidata in different ways. You could almost argue that we just need {{Wikidata Infobox}} for categories, {{Structured Data}} for files (after further iteration), and something like {{Wikidata Gallery}} for galleries (although there should be an SDC-queried version of this at some point) - at least that could do 95% of the job. The exception is accessing SDC info from other wikis, like using captions attached to file thumbnails in articles or infoboxes, which I don't think is working yet - try running {{#invoke:WikidataIB | getLabel | M11010176 }} here vs. on enwp. So perhaps it would be worth holding off on this until Lua access from other wikis is available, so that we can announce the {{Structured Data}} template as something that's ready to use at the same time? That's not to say that documentation wouldn't be useful here, and please feel completely free to disagree! (Sorry for moving the goalposts...) Thanks. Mike Peel (talk) 21:03, 4 November 2019 (UTC)
Mike I think we are envisioning the same end, but my plan for {{Information}} template would be to:
  • deploy the current Module:Information without any SDC support and deal with any issues that might come up
  • add support for author, date, description, source and license to be provided from SDC if the corresponding local parameter is missing. This version would also tag cases where the same metadata is present redundantly in both wikitext and SDC
  • have bots remove redundant {{Information}} fields from file wikitext
  • in the end you might be left with just {{Information}}, or maybe at that point we call it {{Structured Data}}
This approach is similar to what was done with the {{Creator}} templates, where in many templates everything was stripped away except the item ID. --Jarekt (talk) 03:53, 5 November 2019 (UTC)
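The fallback and tagging steps in the plan above can be sketched as follows. This is an illustration in Python, not the Lua code of Module:Information; the field keys and the shape of the `local` and `sdc` dictionaries are assumptions made up for the example.

```python
# Hypothetical field keys; the real module is written in Lua and may name them differently.
FIELDS = ("author", "date", "description", "source", "license")


def merge_information(local, sdc):
    """Prefer local wikitext parameters, fall back to SDC, and flag duplicates.

    Returns (merged, redundant): the values to display, and the list of fields
    whose wikitext value merely repeats the SDC value (candidates for bot cleanup).
    """
    merged = {}
    redundant = []
    for field in FIELDS:
        local_value = (local.get(field) or "").strip()
        sdc_value = (sdc.get(field) or "").strip()
        if local_value:
            merged[field] = local_value
            # Same metadata stored twice: tag it so a bot can remove the wikitext copy.
            if sdc_value and sdc_value == local_value:
                redundant.append(field)
        elif sdc_value:
            # Local parameter missing or empty: use the SDC value instead.
            merged[field] = sdc_value
    return merged, redundant
```

Once every field in `FIELDS` can come from SDC and the redundant wikitext copies are removed, `local` would be empty, which corresponds to a bare {{Information}} in wikitext.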
I don't have a position on the actual implementation; I'm here to help the community get the support it needs, as far as I can, for however it chooses to use things. Keegan (WMF) (talk) 21:57, 4 November 2019 (UTC)
Keegan, you are right; I think it was Jdforrester who argued that we should postpone the Lua version of {{Information}}. As for Lua documentation, we could start with what we know; however, after a few tests I filed phabricator:T237107, since some of the functions in mw:Extension:Wikibase Client/Lua either do not work or work differently than documented. Some things, like access to labels, work fine: for example, {{#invoke:Wikidata label|getLabel |item=M4184419}} gives "A five year old hanging around bouldering wall in Sportrock climbing gym in Alexandria, Virginia, USA". But access to statements, or even access from a file to its own statements, still behaves differently. By the way, Wikidata's statements are in a table called "claims", while on SDC the same table is called "statements"; do we know why? --Jarekt (talk) 03:36, 5 November 2019 (UTC)
There's a Phabricator task with some great, but unresolved, discussion around that question: T149410 (For consistency MediaInfo serialization should use "claims" as key, rather than "statements"). Keegan (WMF) (talk) 18:46, 5 November 2019 (UTC)
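Until the key naming is settled, code that handles both Wikidata items and MediaInfo entities has to look under either key. A minimal sketch in Python, assuming the entity has already been loaded as a plain dictionary; the helper name `get_statements` is hypothetical and not part of any Wikibase API.

```python
def get_statements(entity):
    """Return the statement table of an entity, whichever key it is stored under."""
    # Wikidata items keep statements under "claims";
    # MediaInfo entities on Commons keep them under "statements".
    for key in ("statements", "claims"):
        value = entity.get(key)
        # Defensive assumption: treat anything that is not a dict
        # (for example an empty list) as "no statements under this key".
        if isinstance(value, dict):
            return value
    return {}
```

Callers can then iterate over the result in the same way for both entity types.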

@Mike Peel, Multichill, and Keegan (WMF): As you might know, I used last weekend to roll out Module:Information as the back end of the {{Information}} template. The rollout was relatively smooth, with the only hiccup discussed here. More issues might pop up, since at the moment Module:Information is used on 2M pages instead of the expected 51M, but it seems that so far most files were unaffected. The current version does not make any use of SDC, but now we can start adding support for fetching the date, author, description and source from SDC if they are missing in the {{Information}} template. In the future we might add camera location and license support. We probably should not roll out Lua code that uses SDC fields users cannot edit in the GUI, so at the moment only the description field can tap into SDC captions. Once we can read Information template fields from SDC, we should change the upload wizard so that it does not add redundant fields to the wikitext, so that at least new files will have minimal wikitext. --Jarekt (talk) 15:54, 13 November 2019 (UTC)

This is a great start, thanks! Keegan (WMF) (talk) 19:15, 14 November 2019 (UTC)
The deployed Module:Information had some unpleasant side effects which were breaking some tools, like the one described at phabricator:T238390. I corrected the issue and added some code so that, if the description field is missing but we have a caption in one of the languages on the fallback list, we display that caption as the description. So at the moment, if your {{Information}} has a description in English and the file has the same caption in English, then the description can be skipped. I cannot add {{en|1=...}} (or another similar template) yet due to the phabricator:T238484 error, but this is the first step of {{Information}} using SDC data. I guess that for descriptions coming from SDC we should give some clue about where they come from (maybe a small icon), or an edit icon that takes you to the caption section. I will start working on author/source/date once they have GUI support. --Jarekt (talk) 21:08, 17 November 2019 (UTC)
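The caption fallback described above amounts to walking a language fallback list until a caption is found. A rough sketch in Python, not the module's Lua; the function names and the `captions` dictionary (language code to caption text) are assumptions for illustration.

```python
def pick_caption(captions, fallback_chain):
    """Return (language, text) for the first language in the chain that has a caption."""
    for lang in fallback_chain:
        text = captions.get(lang)
        if text:
            return lang, text
    return None


def description_to_show(local_description, captions, fallback_chain):
    """Use the wikitext description if present, otherwise fall back to a caption."""
    if local_description and local_description.strip():
        return local_description.strip()
    found = pick_caption(captions, fallback_chain)
    return found[1] if found else None
```

Returning the language together with the text is what would later allow wrapping the result in a language template such as {{en|1=...}}.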

Issues

@Keegan (WMF) and SandraF (WMF): Now that I am working with SDC, I have noticed some serious issues and filed bug reports on Phabricator. Please see:

  • phabricator:T238484 - entities returned by the mw.wikibase.getEntity Lua function differ based on the language of the viewer (in some cases the data returned does not match the stored SDC data)
  • phabricator:T237899 - Wikidata item ID changes caused by merges do not update entities on Structured data on Commons (resulting in many links to redirects)
  • phabricator:T237991 - Changes to Structured Data on Commons should trigger page refresh (the way null edit to wikitext does)

Can your team have a look at those? --Jarekt (talk) 05:16, 17 November 2019 (UTC)

@Jarekt: Sorry I missed this here. You might have seen that work has been progressing on the "rest of" Lua support. You'll see progress throughout this month (and into next if needed, but it should be taken care of this month). Keegan (WMF) (talk) 21:32, 9 December 2019 (UTC)
@Keegan (WMF): The first ticket is related to Lua (maybe), but the other two are not. phabricator:T237899 relates to how we handle links to renamed/redirected Wikidata items. I thought it was done by the underlying software, but apparently it is done by a bot on Wikidata. We need something similar, but I am not sure how to get it started. See also Commons:Bots/Work_requests#update_redirected_wikidata_items_used_by_SDC. phabricator:T237991 is something we also need. --Jarekt (talk) 23:31, 9 December 2019 (UTC)
Ah, I see. Thanks for the details, I'll look into it. Keegan (WMF) (talk) 17:30, 10 December 2019 (UTC)
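The core of the bot task requested for T237899 is to replace item IDs that became redirects after a merge with their final targets. A sketch of that step in Python, under simplifying assumptions: statements are reduced to a mapping from property ID to a list of item IDs, and the `redirects` mapping (old ID to new ID) is assumed to have been collected beforehand. Both function names are hypothetical.

```python
def resolve_redirect(item_id, redirects):
    """Follow a chain of merge redirects to the final item ID, guarding against loops."""
    seen = set()
    while item_id in redirects and item_id not in seen:
        seen.add(item_id)
        item_id = redirects[item_id]
    return item_id


def update_statement_targets(statements, redirects):
    """Return (updated_statements, changed) with every target ID resolved."""
    updated = {}
    changed = False
    for prop, targets in statements.items():
        new_targets = [resolve_redirect(target, redirects) for target in targets]
        # Remember whether anything differs, so the bot only saves when needed.
        changed = changed or new_targets != targets
        updated[prop] = new_targets
    return updated, changed
```

The `changed` flag lets a bot skip files whose statements already point at current items.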

Blog post for Lua

@Jarekt, Mike Peel, and Multichill: I intended to write a blog post for Lua back when support was first enabled, but I put it on hold until the work was more developed. I'm getting to the blog now, and you're welcome to have a look at the draft * . I'm interested in suggested additions to content, and do note that I'm still writing out the end of the blog. I'm also looking for image suggestions, whether it's file pages, template screenshots, or whatever might best help illustrate the work. Thoughts are welcome here. Keegan (WMF) (talk) 21:36, 9 December 2019 (UTC)

  • You might have to OAuth into Wikimedia Space to preview the blog post, if you have not created an account there.
@Keegan (WMF): Something is wrong and I cannot get OAuth to work on Wikimedia Space, so I cannot log in and see the blog. However, you should check Commons:Structured_data/Lua#Lua_Modules_accessing_SDC. In the case of the {{Artwork}} template, thanks to suggestions by @Multichill:, we are at the stage where the content of the whole Artwork template can be stored in Wikidata and SDC. Most goes to Wikidata, but which Wikidata item to connect to and the source URL are stored in SDC, so, as can be seen in this file, the artwork template in wikitext just looks like: {{Artwork}}. Similarly, a missing description in {{Information}} is now filled in from the SDC caption, as seen here. I cannot detect the language yet due to phabricator:T238484, and I might want to add some symbol or icon to indicate that the info is stored in captions, perhaps linking to the SDC property used. Eventually I would like to store all the data currently stored in a typical {{Information}} template in SDC. --Jarekt (talk) 04:57, 10 December 2019 (UTC)
I've got a section in there about the new template modifications. Thanks! Keegan (WMF) (talk) 17:26, 10 December 2019 (UTC)
  • I don't use Wikimedia Space, perhaps you could post the draft on-wiki somewhere? I've been continuing to develop {{Structured Data}} - but the big thing is that there needs to be support in structured data for things like dates, strings, and coordinates before this template could be used more widely. Thanks. Mike Peel (talk) 15:00, 10 December 2019 (UTC)
@Mike Peel and Jarekt: draft, needs images: User:Keegan_(WMF)/Lua_blog. Keegan (WMF) (talk) 20:43, 10 December 2019 (UTC)
Thanks! If you want an example of my template in action, I've been using File:Jodrell Bank Mark II 5.jpg to develop it. Mike Peel (talk) 21:02, 10 December 2019 (UTC)
@Keegan (WMF): The blog seems fine. Some suggestions:
  • We might need to explain the different purposes of the {{Information}} template, which is mostly used by uploaders for photographs (and when no more specialized infobox exists), and the {{Artwork}}, {{Book}}, and {{Photograph}} templates, which are meant for artworks, books and historical photographs respectively. The majority of photographs using the {{Information}} template are not notable enough for inclusion in Wikidata, but most artworks, books and historical photographs are. As a result, most metadata about artworks, books or historical photographs can be stored on Wikidata, and the only info that needs to be stored locally in SDC is the source and the ID of the Wikidata item holding the rest of the metadata. Now {{Artwork}}, using Module:Artwork, can pull those two pieces of information from SDC and then pull all the rest of the metadata from Wikidata. For example, File:Arvid Frederick Nyholm - John Ericsson - NPG.66.54 - National Portrait Gallery.jpg is able to pull all its metadata from SDC and Wikidata, so the only thing left in wikitext is {{Artwork}} with no fields.
  • The majority of photographs using the {{Information}} template will not store any metadata on Wikidata, but we hope to store author, source, date of creation and license in SDC. So eventually the wikitext for a typical file might be just an empty {{Information}}. In order to get there, the {{Information}} template was recently replaced with new code written in Lua. Later, the ability to pull missing descriptions from captions was added. See for example File:Indoor Climbing Kid.jpg, which does not have a description in wikitext. In the future we hope to add support for other fields. --Jarekt (talk) 04:13, 11 December 2019 (UTC)
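The two-step lookup described for {{Artwork}} (read the item ID and source from SDC, then take everything else from the linked Wikidata item) can be sketched like this. It is an illustration in Python rather than the Lua of Module:Artwork; the dictionary keys `item_id`, `source` and `wikidata`, and the pre-loaded `wikidata` mapping, are assumptions for the example.

```python
def artwork_fields(sdc, wikidata):
    """Build template fields from two SDC values plus the linked Wikidata item.

    sdc:      values stored with the file, here only "item_id" and "source"
    wikidata: mapping of item ID to that item's metadata (assumed already loaded)
    """
    item_id = sdc.get("item_id")
    # Everything except the source comes from the linked Wikidata item.
    fields = dict(wikidata.get(item_id, {}))
    if item_id:
        fields["wikidata"] = item_id
    if sdc.get("source"):
        fields["source"] = sdc["source"]
    return fields
```

With both values present in SDC, nothing needs to be passed from wikitext, which matches the bare {{Artwork}} call mentioned above.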
Thanks all, the blog is up. Keegan (WMF) (talk) 20:23, 12 December 2019 (UTC)

Status of mw.wikibase.mediainfo?

I tried to test this and it looks like mw.wikibase.mediainfo is not currently enabled (example in Mmullie's sandbox). Is there any information on whether it will be enabled, and whether there will be remote access to SDC file info from other wikis? I would like to see something like template:FileSD supported on local wikis. --Zache (talk) 17:20, 21 September 2020 (UTC)

New Lua module to access a single statement or qualifier

Cross-posting for people who watch this talk page but not the general one: Commons talk:Structured data § New Lua module to access a single statement or qualifier :) Lucas Werkmeister (talk) 20:43, 23 March 2022 (UTC)