Module talk:Caller title

From Wikimedia Commons, the free media repository
Jump to navigation Jump to search

Edit request: use less expensive functions

[edit]

{{Editprotected}}

I've made (what I think are) improvements to the code in Module:Caller title/sandbox—specifically, making the code use native Lua functions instead of preprocessing wikitext—and would appreciate either these changes being implemented or feedback on why they shouldn't be. Pinging @Tacsipacsi and @CptViraj since you've edited this module. —CalendulaAsteraceae (talkcontribs) 11:40, 25 December 2023 (UTC)[reply]

@CalendulaAsteraceae: First of all, the part that uses wikitext processing should run only on pages directly transcluding the module and not on pages that transclude templates transcluding this module, so I wouldn’t care much about its performance. (I’m not against optimizing its performance, though, as long as the code doesn’t become less readable.)
However, your code is wrong at several places:
  • mw.language.getContentLanguage() returns the content language, not the page language. The content language is always English on Commons, the page language is usually English (except on translation pages, but those use the other code path), however not always. For example, try transcluding {{#invoke:Caller title|lang|base={{subst:FULLPAGENAME}}}} and the sandbox version on Kezdőlap – the live version correctly returns hu, your version incorrectly returns en. A correct solution would be using the function to be introduced phab:T161976.
  • mw.title.getCurrentTitle() returns the title of the page it displays on, not the title of the template that directly transcludes it. The whole point of using the module is to be able to query the language of the translated template itself rather than the language of the page the template is transcluded in. If we wanted the latter, we could just use {{SUBPAGENAME}}, without any Lua involved.
  • Calling mw.text.trim() is not wrong, but unnecessary, going against your goal of optimization. Values of named parameters are always trimmed before being passed to modules (or templates).
  • The base ~= '' check is also unnecessary: since the title of the current page will never be empty, the other condition will catch cases when |base= is not provided.
  • (Adding require('strict') also decreases performance, but it serves a purpose, so it may still be worth it.)
  • The extra checks on frame are unnecessary and can lead to silent breakage. It’s impossible not to provide a frame from wikitext, and this module is not designed to be called from other modules – if one does so without providing a frame, it’s their problem; garbage in, garbage out.
Tacsipacsi (talk) 20:34, 25 December 2023 (UTC)[reply]
@Tacsipacsi: Thank you for your feedback! Now that the pageLang function has been introduced, I have an updated proposal in Module:Caller title/sandbox. —CalendulaAsteraceae (talkcontribs) 22:47, 20 February 2024 (UTC)[reply]
@CalendulaAsteraceae: I am pleased to see the arrival of pageLang too (it was a long time coming because MW hooks were screwed up and that affected several MW extensions) but the problem with it is it stupidly always increases the "expensive" count, even when other "expensive" data about the same page has already been fetched. I find it amusing you propose that under this thread heading of "Edit request: use less expensive functions". That said, it does finally solve the issue of obtaining the language of any page in the system without depending on the subpage naming scheme used by some translation architectures. This means base is actually entirely unneeded now and there is no reason to depend on the subpage name at all (which has been pointed out might not be the page language code or any page language code at all). —Uzume (talk) 06:40, 27 April 2024 (UTC)[reply]
@Tacsipacsi: The current deployment could be optimized by changing line 11 to: local parts = mw.text.split(title, '/', true) since that was already done on line 9 above. Of course I have noticed how things that once depended on this have since mostly moved to {{TRANSLATIONLANGUAGE}} after your gerrit:992484. Someone probably ought to look into Special:WhatLinksHere/Template:TRANSLATIONLANGUAGE which looks like may have been caused by subst:. —Uzume (talk) 06:40, 27 April 2024 (UTC)[reply]

I find it amusing you propose that under this thread heading of "Edit request: use less expensive functions". […] This means base is actually entirely unneeded now […].

Actually, it’s probably still cheaper in terms of computing power, even if it increments the expensive parser function count. And it runs (or at least it’s designed to run) only on direct transclusion, not on pages transcluding the transcluding template, so it’s unlikely to hit the limits because of this. Except if you remove the |base= parameter: then, it would run on all 21M pages currently using this module, likely several times per page. (And would rightfully increment the expensive parser function count: every time a page’s language is queried, it means a DB query plus a few hook calls, of which at least Translate also does some DB queries.)

The current deployment could be optimized by changing line 11 to: local parts = mw.text.split(title, '/', true) since that was already done on line 9 above. Of course I have noticed how things that once depended on this have since mostly moved to {{TRANSLATIONLANGUAGE}} after your gerrit:992484.

Indeed, it could be optimized, but every optimization needs reparsing all 21M pages using the module, which is quite expensive, so many pages need to be edited and reparsed subsequently for it to be a net win. Instead of optimizing the module, its usages should be replaced. Maybe a few will remain for whatever reason: at that point, the module could be optimized to allow those to run more quickly, as the optimization won’t cause many reparses anymore (but it won’t actually matter much exactly because the low number of affected pages).

Someone probably ought to look into Special:WhatLinksHere/Template:TRANSLATIONLANGUAGE which looks like may have been caused by subst:.

I think it’s more likely to be copy-paste than subst:, and actually both pages long predate my patch (as my patch title notes, it makes {{TRANSLATIONLANGUAGE}} available outside of translate tags – they have already been usable within translate tags for some time, and these pages probably used them that way). Yes, they should be fixed, but I’m not sure if the fix is replacing the magic word with en or marking the pages for translation.
(CalendulaAsteraceae, sorry for not replying earlier. I was waiting with the answer for my patch to be merged, but forgot to do so by the time it was merged.) —Tacsipacsi (talk) 08:48, 27 April 2024 (UTC)[reply]
@Tacsipacsi, Uzume, and CalendulaAsteraceae: I am trying to work on the backlog of edit requests, but can not tell is this edit request is Done, Not Done or still being worked on. --Jarekt (talk) 15:41, 9 July 2024 (UTC)[reply]
I will just set it to  Not done --Jarekt (talk) 03:21, 12 July 2024 (UTC)[reply]

Invalid language code

[edit]

Tacsipacsi: Template:Statens Museum for Kunst collaboration project gives Category:Pages with invalid language codes due to dir parser function use, can you help me finding the issue? Thanks −Ebrahimtalk 11:09, 7 August 2024 (UTC)[reply]

User:Jarekt: Would you please help on this also, we now have to deal with this Category:Pages with invalid language codes 😊 some need minor fixes like this. It won't get trigged with any invalid language code but when totally unrelated characters are given to it that in noway can be a language code. −Ebrahimtalk
Ebrahim, sure I will work on them. By the way, Did you find any good way of tracking down {{Dir}} templates to replace with #dir? I see you changed some, I changed some as well, but since every page on Commons transcludes {{Dir}}, it is hard to find templates and pages that actually transclude them. --Jarekt (talk) 14:29, 8 August 2024 (UTC)[reply]
User:Jarekt: I used this search query https://commons.wikimedia.org/w/index.php?limit=1000&offset=0&profile=default&search=template%3A+insource%3A%2F%5C%7B%5C%7Bdir%2F+intitle%3Alayout&title=Special:Search and replaced many of the uses manually. After that I wanted to remove them from https://commons.wikimedia.org/w/index.php?limit=1000&offset=0&search=template%3A+insource%3A%2F%5C%7B%5C%7Bdir%2F+intitle%3Ai18n&title=Special:Search but that includes translated pages which aren't the source anyway which needs a 'grep' like thing to remove the translation pages. −Ebrahimtalk 14:36, 8 August 2024 (UTC)[reply]
Also whenever the second and third parameters are right and left and the value is used in text-align, start as an already old CSS3 feature can be used instead (and text-align: end if second and third parameters are left and right). There are mistakes like this which will be nice to fix and whenever possible dir attribute is preferred to direction style and mw-content-{ltr,rtl} class. −Ebrahimtalk 14:41, 8 August 2024 (UTC)[reply]
Ebrahim, thanks for the search query, I keep on forgetting about regular search as a tool. I wander if, pages in Category:Pages with invalid language codes are not due to my changes to Template:Dir as the old version was able to survive bad language codes. Might make sense to restore the original for time being. As for fixing other issues with HTML attributes, I am a bit afraid to touch them as I do not know much about them. --Jarekt (talk) 15:12, 8 August 2024 (UTC)[reply]
User:Jarekt: Oh no need to revert IMO and ignore attribute etc thing, if the category is added anywhere it already was broken and should be fixed and they shouldn't be a lot as I've changed most of main uses myself and at the last resort just send anything you were uncertain to me. −Ebrahimtalk 15:41, 8 August 2024 (UTC)[reply]
User:Jarekt: Oh, you are using a bot, that's fantastic, I'll review your changes, I think you are already skipping places second and third parameters are used? I'll review that anyway. Thanks −Ebrahimtalk 16:08, 8 August 2024 (UTC)[reply]
Also please consider that BCP47 to #bcp47 replacement also. −Ebrahimtalk 16:08, 8 August 2024 (UTC)[reply]
User:Jarekt: Ok, fixes like this is needed as #dir doesn't handle second and third parameter like that, it's ok, we can review them. −Ebrahimtalk 16:17, 8 August 2024 (UTC)[reply]
Ebrahim, Yes I was using pwb.py replace -file:page_list.txt "{{dir|{{{lang}}}}}" "{{#dir:{{{lang|}}}}}" -summary:"use #dir parser function", where page_list.txt is filled with results of your query. the replacement with the {{{lang}}} should skip pages with 2nd and 3rd parameter. The one you found, was done when replacing "{{dir|{{#invoke:Caller" string. I will check those. I will also look into {{BCP47}} replacements. --Jarekt (talk) 16:30, 8 August 2024 (UTC)[reply]
User:Jarekt: Fantastic, so I've already reviewed them. BCP47 changes should be easier to deal with as no complication AFAIK. That /i18n pages should be marked for translation manually. −Ebrahimtalk 16:40, 8 August 2024 (UTC)[reply]
User:Jarekt: So far this has went great, thanks, if possible please run on the entire template namespace using https://commons.wikimedia.org/w/index.php?title=Special:Search&limit=5000&search=template%3A+insource%3A%2F%5C%7B%5C%7BBCP47%2Fi and https://commons.wikimedia.org/w/index.php?title=Special:Search&limit=5000&search=template%3A+insource%3A%2F%5C%7B%5C%7Bdir%2Fi (matching both dir and Dir) if the number of changes are manageable I can review them and manually fix the remaining issues −Ebrahimtalk 17:26, 8 August 2024 (UTC)[reply]
Ebrahim, I think I might be done with BCP47 templates, except for marking for translation. I am working now on DIR template. --Jarekt (talk) 18:45, 8 August 2024 (UTC)[reply]
Jarekt: I've reviewed almost all your changes, the only thing to note which isn't related to your changes is I wish this could be applied automatically, it's so broken, direction is used outside any attribute, there isn't open quotation and only a close one, and it's copy pasted on many places so I wish it ends somewhere. −Ebrahimtalk 19:21, 8 August 2024 (UTC)[reply]
Ebrahim, thank you for reviews. I just finished a large batch where I was reviewing every change. There is a lot of bad code out there, and some of it I did not dare to touch. I created Category:Templates using DIR with extra parameters and Category:Templates using DIR with single parameter maintenance categories, and will try to review some of the templates there. Many like {{Dir|{{int:Lang}}|▶|◀}} and similar I think I will leave as is. I will be going offline for a while. --Jarekt (talk) 20:19, 8 August 2024 (UTC)[reply]
Jarekt: Understandable, thanks for all the help, these categories are great help btw. −Ebrahimtalk 20:44, 8 August 2024 (UTC)[reply]