User talk:MerlIwBot

From Wikimedia Commons, the free media repository

Whoa!

Seems like this bot went on a spree removing valid interwiki links from Commons cats to appropriate pages in national Wikipedias. E.g., this one: http://commons.wikimedia.org/w/index.php?title=Category:La_Selle-en-Cogl%C3%A8s&diff=prev&oldid=53042462 and dozens of previous edits. Any explanation? -- 15:06, 11 April 2011 (UTC) — Preceding unsigned comment added by Vmenkov (talk • contribs)

Why do you think this was a valid interwiki link? fr:La-Selle-en-Coglès does not exist on frwiki, so removing this interwiki was correct.
On frwiki there is a page titled fr:La Selle-en-Coglès (space instead of dash after "La"). But neither the bot nor any visitor can find it without a full-text search. The frwiki article also includes no commonscat template. So no bot can modify the interwiki instead of removing it. Merlissimo (talk) 16:46, 11 April 2011 (UTC)
Sorry, I gave a bad example above. How about this one though (which I un-did later)? http://commons.wikimedia.org/w/index.php?title=Category:Beijing_Railway_Museum&diff=53033532&oldid=51114454 The link from Category:Beijing Railway Museum to zh:京奉铁路正阳门东车站 (or zh:京奉鐵路正陽門東車站, for the traditional characters fans) is a valid link, and the latter page does have a "commonscat" template pointing back to Category:Beijing Railway Museum. -- Vmenkov (talk) 18:25, 11 April 2011 (UTC)
That is quite interesting, because my bot used the information from the API, which contained the missing attribute. Merlissimo (talk) 20:37, 11 April 2011 (UTC)
bugzilla:28500 Merlissimo (talk) 23:07, 11 April 2011 (UTC)
I added a conversion script to my bot, but this cannot be the best solution. Merlissimo (talk) 00:07, 12 April 2011 (UTC)
Thanks for your attention to this matter! -- Vmenkov (talk) 01:06, 12 April 2011 (UTC)

Additions

Hi Merlissimo,

Looks like your bot is starting to work. Good news! BTW at Category:Lago di Scanno, the bot might want to link en:Lago di Scanno. --  Docu  at 19:00, 11 April 2011 (UTC)

Besides your hint, I noticed that a template I did not know, called commonscat-inline, is used on enwiki. Before, I only knew of 'Commons category', 'Commoncat', 'Commons2', 'Cms-catlist-up', 'Catlst commons', 'Commonscategory', 'Commonscat' and 'Commons cat'.
Just some statistics: On dewiki there are more than 80000 commonscat links to Commons category pages that have no interwiki. A lot of work. Merlissimo (talk) 00:00, 12 April 2011 (UTC)
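For illustration, a minimal Python sketch of how the template names listed above could be detected in wikitext. It is not the bot's actual code: the function name, the regular expression, and the handling of only the simple positional form ({{Name|Target}}) are assumptions made for this example.

```python
import re

# Template names from the comment above, plus commonscat-inline.
TEMPLATE_NAMES = [
    "Commons category", "Commoncat", "Commons2", "Cms-catlist-up",
    "Catlst commons", "Commonscategory", "Commonscat", "Commons cat",
    "Commonscat-inline",
]


def _name_pattern(name):
    # MediaWiki ignores the case of the first letter of a template name
    # and treats spaces and underscores as equivalent.
    first, rest = name[0], name[1:]
    parts = [re.escape(part) for part in rest.split(" ")]
    return "(?i:" + re.escape(first) + ")" + "[ _]+".join(parts)


COMMONSCAT_RE = re.compile(
    r"\{\{\s*(?:" + "|".join(_name_pattern(n) for n in TEMPLATE_NAMES) + r")"
    # The lookahead makes sure the name ends here, so "Commonscat" does not
    # match the beginning of "Commonscat-inline" or "Commonscategory".
    r"\s*(?=\||\}\})(?:\|\s*([^|}]*))?"
)


def find_commonscat_target(wikitext):
    """Return the first positional parameter of the first commonscat-style
    template, "" if the template has no parameter, or None if there is none."""
    match = COMMONSCAT_RE.search(wikitext)
    if match is None:
        return None
    return (match.group(1) or "").strip()


print(find_commonscat_target("Intro {{Commonscat-inline|Lago di Scanno}} text"))
```

Named parameters (for example 1=Target) and nested templates are deliberately not handled here.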
Even without "commonscat-inline", it could be found through the link at it:Lago di Scanno. --  Docu  at 02:13, 12 April 2011 (UTC)

BTW will it add new links or just remove dead ones? --  Docu  at 18:40, 18 April 2011 (UTC)

The bot itself can add, modify, and remove interwikis. But at the moment I am using database scans of commonswiki to find pages with interwikis to nonexistent pages, and the bot then works on those pages. Because of this precondition, every edit will include a removed interwiki. But if the bot finds a new link, it can add that interwiki as well.
I have already written a database query that scans dewiki for pages with a commonscat link but no de interwiki on Commons. If I started the bot script based on this data, there would of course be many more additions. Merlissimo (talk) 21:12, 18 April 2011 (UTC)
Ok. I think there is quite a lot that could be done based on interwikis too, check e.g. Category:Stephen Collins. --  Docu  at 22:07, 18 April 2011 (UTC)

non-descriptive edit summary "robot Removing: "

Hi Merl, currently your bot removes invalid interwikis with "robot Removing: xxxxxxx". Is it possible to add why the iw is removed? This would help to understand, check and fix. Apparently it removes them because the iw target does not exist. I have checked some of your edits removing interwikis to dewp. In most cases it was a typo, which I then fixed by simply searching dewp. Good work! :-) Cheers --Saibo (Δ) 21:35, 11 April 2011 (UTC)

I think all iw bots do that. (except if there are too many changes ..). --  Docu  at 21:40, 11 April 2011 (UTC)
My question is just about the edit comment. ;) It would be even more beneficial if the bot could note in the edit comment whether the target article once existed and has since been deleted - maybe even quote the log? Many iws I have fixed now broke because of "unused"/"useless" redir deletions at dewp... If we could see here that the target was deleted because it was an "unused"/"useless" redir, it would greatly increase the chance that the interwiki gets fixed here (i.e. the correct one inserted by some human editor). Cheers --Saibo (Δ) 22:42, 11 April 2011 (UTC)
(ec) At the moment my bot uses the wiki language as returned by the api/query/general/@lang attribute and the messages as returned by translatewiki for pywikipediabot in this language [1]. If there is no such language, it uses the same fallback language as MediaWiki (that's why, e.g., on wuuwiki my bot uses zh-hans and pywikipediabot zh). Pywikipediabot does not actually use these messages; that's why translation bugs are reported to me [2].
The placeholders are replaced by a long (full title) or short (interwiki prefix only) list depending on the size. My bot can also use a mixed list (e.g. [3]: only na and mk in prefix mode, the others in full mode).
Adding a reason could be a new feature for my bot in the far future. Perhaps one day I will also be able to use translatewiki, so that this feature can be used in different languages.
Could you explain what your "why" should look like? Only "missing" (easy), or something like "never exist" or "deleted" (needs some additional server requests)?
The dewiki interwiki will be re-added when I start my bot for the Commons task from dewiki. Today it was running with commonswiki as the start wiki. Merlissimo (talk) 22:47, 11 April 2011 (UTC)
Thanks for your reply!
Currently it is:
robot Removing: de:Kalvarienbergkirche in Graz
Good would be
(based on history of Category:Kalvarienbergkirche (Graz)): robot Removing: de:Kalvarienbergkirche in Graz because this page does not exist.
Even better would be
(based on history of Category:Kalvarienbergkirche (Graz)): robot Removing: de:Kalvarienbergkirche in Graz because this page does not exist and did never exist.
(based on history of Category:Church of Saint Adalbert in Kraków): robot Removing: de:Adalbertkirche (Krakau) because this page does not exist (it did exist before).
Even better than better would be:
(based on history of Category:Church of Saint Adalbert in Kraków): robot Removing: de:Adalbertkirche (Krakau) because this page does not exist (it did exist before). Deletion log contains "verschoben".
The same could be reported when the keyword "Weiterleitung" (redirect) is found in the deletion log.
The reason why I think this would be good: if you see "does not exist and did never exist" in your watchlist and there is no typo or strange/unusual syntax in the removed interwiki link, you do not need to try to correct it. Conversely, if you see "Deletion log contains "verschoben"" in your watchlist, you know the article is still there but has a new name. Even "because this page does not exist" would help to understand why the bot is removing. If I remember correctly, it even removed links pointing to a disambiguation page. I am not even sure whether such links should be removed (since they are nearly okay for normal users and can be fixed). But if the bot just removes it and no one cares to check why, the interwiki is gone. "because this page is a disambiguation page" would be better than nothing.
Just some crazy ideas. ;) Cheers --Saibo (Δ) 01:02, 12 April 2011 (UTC)
There are also disambiguation cats on Commons, e.g. Category:Washington, which contains a valid interwiki to the disambig de:Washington. So these disambiguation interwikis cannot be generally removed, and my bot won't do so. Merlissimo (talk) 01:22, 12 April 2011 (UTC)
Okay, fine, then I remembered it incorrectly - I will try to find the reason why I thought this. However, that is just about the disambig pages; my other proposals are still valid. Cheers --Saibo (Δ) 02:29, 12 April 2011 (UTC)
I did not respond to your other proposals because I had to sleep on them.
Your third proposal, adding the deletion log entry, isn't a good idea. An admin could have deleted the page without a comment, so that the first part of the page content is shown in the deletion log. And I don't know whether admins on 271 wikis are aware of this. So the deletion log could contain content that should normally be hidden by admins or oversighters, and my bot could have copied it before it is suppressed.
I would also like to keep the additional summary part as short as possible, because otherwise the summary could exceed 256 characters. So my suggestion is:
  1. robot Removing: de:Adalbertkirche (Krakau) (moved)
  2. robot Removing: de:Adalbertkirche (Krakau) (deleted)
  3. robot Removing: de:Adalbertkirche (Krakau) (moved/deleted)
  4. robot Removing: de:Adalbertkirche (Krakau) (never exist)
  5. robot Removing: fr:La-Selle-en-Coglès (never exist;Suggestion: fr:La Selle-en-Coglès)
Number 5 would be possible by using opensearch if there is a title with the same normalized form ("-"->" "; "è"->"e"). Merlissimo (talk) 17:09, 12 April 2011 (UTC)
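The normalized-form comparison behind suggestion number 5 can be sketched in Python as follows. This is only an illustration under assumptions: the candidate titles are passed in as a plain list (in practice they might come from a search such as opensearch), and the function names, the case-insensitive comparison, and the treatment of underscores are additions not mentioned in the comment above.

```python
import unicodedata


def normalize_title(title):
    """Reduce a title to a comparison key ("-" -> " ", "è" -> "e")."""
    # Decompose accented letters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", title)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Treat dashes and underscores as spaces (underscores are an assumption).
    for ch in "-_":
        stripped = stripped.replace(ch, " ")
    # Collapse whitespace and ignore case (ignoring case is an assumption).
    return " ".join(stripped.split()).casefold()


def suggest_title(missing_title, candidates):
    """Return the first candidate whose normalized form matches, else None."""
    wanted = normalize_title(missing_title)
    for candidate in candidates:
        if candidate != missing_title and normalize_title(candidate) == wanted:
            return candidate
    return None


suggestion = suggest_title("La-Selle-en-Coglès", ["Paris", "La Selle-en-Coglès"])
print("never exist;Suggestion: fr:" + suggestion)
```

A match found this way is only a hint for a human editor; two different articles can share the same normalized form.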
Hi again Merlissimo! :-) I am not sure what you mean by "moved". If the page is moved but a redir exists? Why does your bot remove the iw then?
What is the problem if in some cases we have more than 256 chars? It just gets cut off, doesn't it? No big problem in those rare(?) cases. I'd like to have this "moved" explained a bit more. Otherwise - I guess - those comments do not help the average user.
However, any added explanation - even your single words - would be great. Especially number 5. This would save much time! Cheers --Saibo (Δ) 00:13, 13 April 2011 (UTC)
moved=entry in move log exists (if an admin or bot moved a page while suppressing the redirect)
deleted=entry in deletion log exists (page deleted)
moved/deleted=entry in both logs exists (in most cases after a move with redirect created and then a deletion request on the redirect)
On enwiki my bot removed 132 interwikis on one page while doing test edits. Even the short form was too long for the summary. And to be exact: the summary limit is 256 bytes, not chars; one character can consume up to four bytes (e.g. in Chinese). Merlissimo (talk) 04:18, 13 April 2011 (UTC)
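To make the byte-versus-character point concrete, here is a small Python sketch. It assumes UTF-8 and the 256-byte limit quoted above; the function names are hypothetical, and the mixed list mode mentioned earlier in this thread is not covered.

```python
def truncate_utf8(text, max_bytes):
    """Cut text to at most max_bytes of UTF-8 without splitting a character."""
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # errors="ignore" drops the incomplete multi-byte sequence at the cut.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


def build_summary(prefix, links, max_bytes=256):
    """links is a list of (interwiki_prefix, title) pairs.

    Use the long form (full titles) if it fits, otherwise the short form
    (interwiki prefixes only), and cut on a character boundary as a last resort.
    """
    full = prefix + ", ".join(f"{lang}:{title}" for lang, title in links)
    if len(full.encode("utf-8")) <= max_bytes:
        return full
    short = prefix + ", ".join(lang for lang, _ in links)
    return truncate_utf8(short, max_bytes)


print(build_summary("robot Removing: ", [("de", "Kalvarienbergkirche in Graz")]))
```

Measuring len(text) instead of len(text.encode("utf-8")) would undercount titles in non-Latin scripts, which is why the check is done on the encoded bytes.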
Do what you like to do and what you think is useful and possible. I think you have understood what my request was about. If you want me to comment on another proposal on this issue, I would be glad to. If you want to implement these single words, that is also fine. Thank you for your bot's work! :-) Best regards --Saibo (Δ) 20:52, 13 April 2011 (UTC)
The deletion log and move log cannot be queried in a single request. This will need some further work on the framework. So for now the improvement will be limited to (deleted) and (missing) for nonexistent pages. On removals for existing pages a reason is also added (stronger connected to [[commons:...]]). Merlissimo (talk) 16:03, 18 April 2011 (UTC)
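The labels proposed earlier in this thread (moved, deleted, moved/deleted, never exist) map onto two yes/no facts about the logs. A minimal Python sketch of that mapping follows; the two boolean inputs are hypothetical and stand for log lookups that are not shown here, and the function names are assumptions.

```python
def removal_reason(in_move_log, in_deletion_log):
    """Map log lookups to the short labels proposed in this thread."""
    if in_move_log and in_deletion_log:
        return "moved/deleted"
    if in_move_log:
        return "moved"
    if in_deletion_log:
        return "deleted"
    return "never exist"


def removal_summary(link, in_move_log, in_deletion_log):
    """Append the label in parentheses, as in the numbered examples."""
    return f"robot Removing: {link} ({removal_reason(in_move_log, in_deletion_log)})"


print(removal_summary("de:Adalbertkirche (Krakau)", False, True))
```

With only the deletion log available, as described above, the first two branches would never be reached and the sketch reduces to a choice between two labels.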
Okay, thanks for your effort! :-) Cheers --Saibo (Δ) 00:01, 19 April 2011 (UTC)


Finnlines

Hi Merlissimo,

A few of the subcategories of Finnlines were recently renamed and I tried to add interwikis to fi/sv where available. Could your bot go through these subcategories, add missing interwikis here, and add/update Commonscat links at Wikipedia when needed? If it works, we might want to extend this to other parts of Ships by name. --  Docu  at 09:56, 30 April 2011 (UTC)

I have still not implemented commonscat for wikis other than dewiki. I started my development version on this category to update interwikis, but that wasn't a good idea ... (at the moment I am working on adding langlinks to incubator for projects other than Wikipedia).
In the future I plan to create a web interface on toolserver for my bot, so that you can submit page titles to check there. Merlissimo (talk) 10:15, 4 May 2011 (UTC)

Namespace problem

Hi Merlissimo,

It seems that your bot is operating in namespace 0. Would you check and fix this? --  Docu  at 04:10, 4 May 2011 (UTC)

Yes, I already know. On toolserver I have a config file which forbids this. But I forgot to add it on my own computer when doing test edits. On Sunday I added this config file to my git repository, so this should not happen again. Merlissimo (talk) 09:08, 4 May 2011 (UTC)
The second reason why my bot started working on these gallery titles is that there is no primary namespacename for the category namespace in the toolserver database (select * from toolserver.namespacename where ns_type='primary' and dbname='commonswiki_p' is empty for ns_id 14). So my prerun script, which creates the list of titles my bot will work on, added Hieracium umbellatum instead of Category:Hieracium umbellatum. Merlissimo (talk) 11:35, 4 May 2011 (UTC)
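One way to guard a title list against this kind of gap is to fall back to the canonical namespace name and to fail loudly when nothing is known, rather than silently emitting a bare title. The Python sketch below is an assumption for illustration only: the fallback table, the function name, and the dictionary standing in for the database result are not taken from the prerun script.

```python
# Canonical MediaWiki names for two namespace ids; a hypothetical fallback table.
CANONICAL_NAMESPACES = {0: "", 14: "Category"}


def full_title(ns_id, title, primary_names):
    """Prefix title with its namespace name.

    primary_names maps ns_id to the name found in the database; it may lack
    entries, as in the case of ns_id 14 described above.
    """
    name = primary_names.get(ns_id)
    if name is None:
        if ns_id not in CANONICAL_NAMESPACES:
            # Refuse to guess: a bare title would land in namespace 0.
            raise ValueError(f"no namespace name known for ns_id {ns_id}")
        name = CANONICAL_NAMESPACES[ns_id]
    return f"{name}:{title}" if name else title


# An empty lookup result for ns_id 14 still yields the category title.
print(full_title(14, "Hieracium umbellatum", {}))
```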

template:Commonscat

I just noticed your bot. This task is really needed, since many pages and categories have missing or invalid interwiki links. One thing that concerned me was your mention of the use of template:Commonscat somewhere in the above discussion. At some point I was trying to associate some people categories from Commons with biographical articles in the DE and EN wikis, as part of copying {{Authority control}} data from DE wiki to Commons. What I found was that correlations based on interwiki (aka interlanguage) links were mostly reliable, while correlations based on the Wikipedia template:Commonscat were not. What seems to happen is that people use template:Commonscat to add links to Commons that are often related to some part of the article instead of the whole thing. I do not remember any examples right now, but hypothetically the en:Mark Twain article might have a section for each of his books, and each section could have en:template:Commonscat pointing to the proper Commons category. --Jarekt (talk) 23:09, 28 September 2011 (UTC)

by the Rhine vs. on the Rhine

May I ask the person responsible for this bot why it redirected Category:Cities and villages on the Rhine to the linguistically wrong Category:Cities by the Rhine? Of course it's correct to have a category like Category:Cities by river, but as soon as you specify a given river by name you must change by to on. I would be grateful if you could revert your edit or, even better, create Category:Cities on the Rhine. --Pingelig (talk) 09:17, 3 November 2011 (UTC)

User:Siebrand made the redirect. See the history. Cheers --Saibo (Δ) 17:39, 3 November 2011 (UTC)

Adding also multilingual descriptions

To alleviate multilingual issues and have the categories searchable in multiple languages, I've been using Sum-it-up to add descriptions and interwiki links to categories (see for example Category:Dacia). But I find the task terribly redundant and repetitive. Can bots like User:MerlIwBot do this for each category and gallery, since articles and their leads in multiple languages appear/change all the time? It currently does it for interwiki links, which are also followed and generated by Sum-it-up. I think it would be a tremendous feature. See also the conversation at Commons:Village pump#Language mix in category naming. Thanks--Codrin.B (talk) 15:57, 31 January 2012 (UTC)

Please don't do that before there is agreement on how it needs to be done. Categories with 270 descriptions at the top are not very practical. --Foroa (talk) 07:06, 3 February 2012 (UTC)


Please stop removing links to ru wiki

See http://wikimedia.ru/blog/2012/07/10/zabastovka-vikipedii-na-russkom-yazyke/

Cheers. --  Docu  at 17:15, 10 July 2012 (UTC)

Bug

As far as I understand, your bot cannot recognize category renames on the individual wiki projects. Takabeg (talk) 06:25, 12 July 2012 (UTC)

I checked several of them, and only occasionally is the destination category in the edit summary. Exploiting that would, however, require complex parsing. I think that the bot removes all invalid links in the first round and will add the new ones during the next round (with data from valid items on other Wikipedias). --Foroa (talk) 07:25, 12 July 2012 (UTC)

The initial stage in the development of human society is called the primitive communal system. Primitive society lasted longer than the other periods. The primitive communal system began with the emergence of the first humans and went through a long course of development. This system lasted until different periods in different territories. The development and collapse of the communal system depended on each country's natural and geographical conditions, natural resources, and other factors. According to archaeology*, the primitive communal system is conventionally divided into three periods: the Stone Age, the Bronze Age, and the Iron Age.

de

Please see Commons:Village_pump#Problems_with_de_interwikis. Not really an issue with your bot. --  Docu  at 06:50, 8 February 2013 (UTC)

Error report about the bot

The article has not been deleted; it was moved to de:Prince of Wales (Schiff, 1922). --Atamari (talk) 10:59, 26 March 2013 (UTC)

Categories for processing

Hi, could you please run the bot on this list (non-recursive)? There are many missing interwiki links in these categories. -- Ivan A. Krestinin (talk) 18:49, 15 April 2013 (UTC)