Commons:Bots/Requests/YiFeiBot (20)
Operator: Zhuyifei1999 (talk)
Bot's tasks for which permission is being sought: For every creator template in Category:Creator templates without Wikidata link: Add Wikidata item ID to the templates. IDs would be fetched from the related Wikipedia pages (first match) specified in the interwiki links (if any) in {{{Name}}} and {{{Alternative names}}} of {{Creator}}.
Automatic or manually assisted: Automatic unsupervised
Edit type (e.g. Continuous, daily, one time run): One time run, may rerun on request
Maximum edit rate (e.g. edits per minute): 6 edits per min
Bot flag requested: (Y/N): N
Programming language(s): Python: pywikibot + mwparserfromhell
Zhuyifei1999 (talk) 10:39, 21 August 2014 (UTC)
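As a rough illustration of the "first match" lookup described in the task, the sketch below scans the {{{Name}}} and {{{Alternative names}}} values for the first interwiki link. It is not the bot's actual code: the helper name `first_interwiki_link`, the regular expression, and the assumed link shapes (`[[:en:Title|label]]`, `[[en:Title]]`, `[[w:en:Title]]`) are all hypothetical, and the sample wikitext is made up.

```python
import re

# Assumed link shapes (an assumption, not taken from the request):
#   [[:en:Title|label]], [[en:Title]], [[w:en:Title]]
INTERWIKI_RE = re.compile(
    r"\[\[\s*:?(?:w:)?(?P<lang>[a-z]{2,3}(?:-[a-z]+)*)\s*:\s*(?P<title>[^\]|#]+)"
)


def first_interwiki_link(fields):
    """Return (lang, title) for the first interwiki link found, or None.

    `fields` is an ordered list of wikitext strings, e.g. the values of
    Name and Alternative names, so earlier fields take priority.
    """
    for text in fields:
        match = INTERWIKI_RE.search(text)
        if match:
            return match.group("lang"), match.group("title").strip()
    return None


# Hypothetical field values, for illustration only.
name = "[[:de:Example Artist|Example Artist]]"
alternative_names = "E. Artist"
print(first_interwiki_link([name, alternative_names]))
```

With pywikibot, the resulting page could then be resolved to its Wikidata item, for example through `pywikibot.ItemPage.fromPage`; that step is left out here so the sketch stays free of network access.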
Discussion
- Test run at [1] --Zhuyifei1999 (talk) 10:45, 21 August 2014 (UTC)
- Looks OK to me. --Steinsplitter (talk) 10:48, 21 August 2014 (UTC)
- Looks OK to me. It would also be a good idea to match authority IDs, like VIAF, and report mismatches. --EugeneZelenko (talk) 13:36, 21 August 2014 (UTC)
- Separate task? --Zhuyifei1999 (talk) 13:42, 21 August 2014 (UTC)
- I think it's a reasonable extension of this task. --EugeneZelenko (talk) 14:02, 22 August 2014 (UTC)
- I'd rather use a separate task to go over all creator templates with both a Wikidata link and authority control data and see if they match (report to a userpage?). Finding those in Category:Creator templates without Wikidata link is just too inefficient. --Zhuyifei1999 (talk) 01:54, 23 August 2014 (UTC)
- My point was to use identifier comparison for additional validation. Sure, it would be a good idea to compare identifiers for all templates. I know that VIAF is quite problematic, with a lot of duplicates in some cases. --EugeneZelenko (talk) 14:10, 23 August 2014 (UTC)
- From what I can see here:
- Creator:Adolf_Milman has VIAF 96666104, but d:Q4293803 has no identifier information;
- Creator:Adolf_Kaufmann has VIAF 84930403 and GND 1013172744, d:Q362901 has VIAF 84930403, GND 1013172744, and ISNI 0000 0001 2029 7495;
- Creator:Adolf_Hohenstein has VIAF 7657063, d:Q76185 has VIAF 7657063, ISNI 0000 0000 3729 2976, LCNAF n2003045589, GND 124996027, Freebase /m/076t82x, and BnF 14959457j;
- Creator:Adam_Fischer has no identifier information, neither does d:Q9139560;
- Creator:Achille_Beltrame has no identifier information, neither does d:Q3604346.
- I would like to know how to use identifiers comparison for additional validation for these templates. --Zhuyifei1999 (talk) 04:01, 25 August 2014 (UTC)
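One way to read the examples above is as a per-identifier comparison between the creator template and the Wikidata item. The sketch below is only an illustration of that idea, not the bot's code: the function name `compare_identifiers`, the status labels, and the plain-dict representation are assumptions. The identifier values are the ones quoted in the comment above.

```python
def compare_identifiers(creator_ids, wikidata_ids):
    """Classify each identifier type found on either side.

    Both arguments map an identifier type (e.g. "VIAF") to its value.
    Returns a dict mapping each type to one of the (hypothetical) labels
    "match", "mismatch", "only on Commons", or "only on Wikidata".
    """
    report = {}
    for key in sorted(set(creator_ids) | set(wikidata_ids)):
        ours = creator_ids.get(key)
        theirs = wikidata_ids.get(key)
        if ours is None:
            report[key] = "only on Wikidata"
        elif theirs is None:
            report[key] = "only on Commons"
        elif ours == theirs:
            report[key] = "match"
        else:
            report[key] = "mismatch"
    return report


# Creator:Adolf_Kaufmann versus d:Q362901, values as quoted above.
kaufmann = compare_identifiers(
    {"VIAF": "84930403", "GND": "1013172744"},
    {"VIAF": "84930403", "GND": "1013172744", "ISNI": "0000 0001 2029 7495"},
)
print(kaufmann)

# Creator:Adolf_Milman versus d:Q4293803 (no identifiers on Wikidata).
print(compare_identifiers({"VIAF": "96666104"}, {}))
```

Under this reading, only a "match" adds confidence and only a "mismatch" signals a problem; identifiers present on one side only, as for Adolf_Milman, neither confirm nor contradict the link.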
- @EugeneZelenko: ping? --Zhuyifei1999 (talk) 07:31, 7 September 2014 (UTC)
- I think other people should be involved in discussion. We have different views over usability of identifiers :-) --EugeneZelenko (talk) 14:23, 7 September 2014 (UTC)
- Sure. @Jarekt, Multichill, and Jheald: --Zhuyifei1999 (talk) 15:03, 7 September 2014 (UTC)
- Oh, I'm sorry. I didn't know this was open. So this weekend I wrote a little bot to match things in Category:Creator templates without Wikidata link. When I started it was 8000+ items and now about 4500. So I'm pretty sure I got all the easy ones.
- I used two queries to find the possible hits: creator_no_wikidata_cat and creator_no_wikidata_terms. The first one is completely done. The second leftovers are at Wikidata:User:Multichill/Zandbak.
- As you can see in the query I match on name. So no hits if the name is somehow different. So exactly the same name is the first thing that needs to match. Second thing is one of these (in this order):
- Link to same authority control id (viaf and the likes)
- Year of birth and year of death are the same
- Homecat in {{Creator}} and on Wikidata d:Property:P373 are the same
- The date of birth is exactly the same
- A file on a connected article uses the creator template
- I include what it was matched on in the edit summary. I plan to publish the source code so it can be reused. Multichill (talk) 16:29, 7 September 2014 (UTC)
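The matching order described above (exact name first, then the first secondary criterion that holds) could be sketched roughly as follows. This is not Multichill's published code: the `Record` class, its field names, the `match_reason` function, and the reason strings are all hypothetical, and the last criterion is reduced to a boolean flag because checking it would require querying file usage.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Record:
    """Hypothetical flattened view of a creator template or a Wikidata item."""
    name: str
    authority_ids: dict = field(default_factory=dict)  # e.g. {"VIAF": "7657063"}
    birth_year: Optional[int] = None
    death_year: Optional[int] = None
    birth_date: Optional[str] = None  # assumed ISO date string
    homecat: Optional[str] = None     # Commons category (P373 on Wikidata)


def match_reason(creator, item, creator_used_on_article=False):
    """Return a short reason for the edit summary, or None if no match."""
    # The exact same name is required before anything else is checked.
    if creator.name != item.name:
        return None
    shared = sorted(
        key for key, value in creator.authority_ids.items()
        if item.authority_ids.get(key) == value
    )
    if shared:
        return "same " + shared[0]
    if (creator.birth_year is not None and creator.death_year is not None
            and creator.birth_year == item.birth_year
            and creator.death_year == item.death_year):
        return "same year of birth and death"
    if creator.homecat is not None and creator.homecat == item.homecat:
        return "same homecat (P373)"
    if creator.birth_date is not None and creator.birth_date == item.birth_date:
        return "same date of birth"
    if creator_used_on_article:
        return "creator template used by a file on a connected article"
    return None


# Identifier values taken from the thread; the names are illustrative.
creator = Record("Adolf Hohenstein", {"VIAF": "7657063"})
item = Record("Adolf Hohenstein", {"VIAF": "7657063", "GND": "124996027"})
print(match_reason(creator, item))
```

Returning the reason rather than a bare boolean mirrors the stated practice of recording the matching criterion in the edit summary.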
- @Multichill: Looks like I still have a lot to run :) @EugeneZelenko: Is there something that seriously blocks the BR? Or should I do this manually and check every match (or generate a list and give it to Multichill)? --Zhuyifei1999 (talk) 08:41, 10 September 2014 (UTC)
- Even logging mismatches may be useful, if identifiers cannot be used on a wider scale. --EugeneZelenko (talk) 14:09, 10 September 2014 (UTC)
- Done. Do these look good? --Zhuyifei1999 (talk) 10:47, 11 September 2014 (UTC)
- Looks useful. I think it would be a good idea to highlight matches/mismatches in some way (bold, strike, etc). --EugeneZelenko (talk) 14:20, 11 September 2014 (UTC)
- Done (code should work) but no mismatches found --Zhuyifei1999 (talk) 07:08, 12 September 2014 (UTC)
- There were matches in one case, but they were not highlighted. --EugeneZelenko (talk) 14:27, 12 September 2014 (UTC)
- Done. The next 5 test run caught nothing, but another 5 test run caught both matches and mismatches. --Zhuyifei1999 (talk) 10:32, 13 September 2014 (UTC)
- The bot now automatically tries to prepend "cb" when comparing (as in User:YiFeiBot/sandbox/3). --Zhuyifei1999 (talk) 11:00, 13 September 2014 (UTC)
- Looks OK to me. --EugeneZelenko (talk) 14:29, 13 September 2014 (UTC)
- Looks excellent. Multichill (talk) 14:45, 13 September 2014 (UTC)
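The "cb" handling and the highlighting discussed above might look something like the sketch below. It is an assumption-laden illustration rather than the bot's code: the function names are hypothetical, the prefix is tried in both directions because the thread does not say which side carries it, and the choice of bold for matches and strike for mismatches is a guess at the suggested markup. The BnF value is the one quoted earlier in the thread; the mismatching VIAF value is made up.

```python
def same_identifier(commons_value, wikidata_value):
    """Compare two identifier strings, tolerating a "cb" prefix on either side."""
    if commons_value == wikidata_value:
        return True
    # Assumption: one side may store the value with "cb" and the other without.
    return (commons_value == "cb" + wikidata_value
            or "cb" + commons_value == wikidata_value)


def highlight(commons_value, wikidata_value):
    """Render the Wikidata value as wikitext: bold if it matches, struck if not."""
    if same_identifier(commons_value, wikidata_value):
        return "'''" + wikidata_value + "'''"
    return "<s>" + wikidata_value + "</s>"


# BnF value from the thread, with a hypothetical "cb"-prefixed Commons form.
print(highlight("cb14959457j", "14959457j"))
# Made-up mismatching pair, for illustration only.
print(highlight("7657063", "7657064"))
```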
If there are no objections, I think the task should be approved. --EugeneZelenko (talk) 13:55, 14 September 2014 (UTC)
- Support. By the way, this task is quite similar to the tasks of User:JarektBot/Commons creator maintenance.py, which I have run several times. --Jarekt (talk) 13:24, 29 October 2014 (UTC)