Commons:Bots/Requests/ShufaBot

From Wikimedia Commons, the free media repository

ShufaBot (talk · contribs)

We have 18,000 of these files to manage, with more to come. There are ~30,000 encoded characters and a dozen major styles, and personal styles can be documented as well.

Operator: Yug (talk · contributions · Statistics · Recent activity · block log · User rights log · uploads · Global account information)

Bot's tasks for which permission is being sought:

  • Upload files
  • Rename files
  • Edit pages
    • Create pages

Scope: « Files » here means Chinese character media files that are part of the Commons:ACC, Commons:SO, and Commons:MCC projects. These projects gather 18,000+ files that follow strict naming conventions and carry a data-rich {{ACClicense}} template to maintain. A bot would be a helpful assistant. It needs user rights with a decent rate limit.

Automatic or manually assisted: Automatic, unsupervised.
Runs will be occasional and well prepared beforehand:

  1. discussed within the project's community
  2. tested on a few cases (3~5)
  3. run at larger scale (30 to 3,000 file pages affected).
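
The staged approach in the list above could be sketched as follows. This is only an illustration, not the bot's actual code: `planRun` and the example titles are hypothetical names, and the default pilot size of 5 is taken from the "3~5" cases mentioned above.

```javascript
// Hypothetical helper: split a prepared batch into a small pilot and the remainder,
// so the pilot can be reviewed before the larger run starts.
function planRun(titles, pilotSize = 5) {
  if (!Array.isArray(titles) || titles.length === 0) {
    throw new Error("Nothing to run: the batch is empty.");
  }
  // Never take more pilot items than the batch contains, and at least one.
  const size = Math.max(1, Math.min(pilotSize, titles.length));
  return {
    pilot: titles.slice(0, size),
    remainder: titles.slice(size),
  };
}

// Example with 30 placeholder titles (hypothetical file names).
const batch = Array.from({ length: 30 }, (_, i) => `File:Example-${i + 1}.png`);
const plan = planRun(batch, 5);
console.log(plan.pilot.length, plan.remainder.length); // 5 25
```

The remainder would only be processed once the pilot edits have been checked by hand.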

Edit type (e.g. Continuous, daily, one time run): occasional for maintenance or mass uploads.

  • 4th stage: complete ! API:Upload: occasional. When a new batch of files is ready, upload it.
  • 4th stage: complete ! API:Edit: occasional. When a project template such as {{ACClicense}} evolves, edit and update the pages.
  • 4th stage: complete ! API:Move: occasional. When a group of files needs renaming, rename them.
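
As a rough illustration of the three API modules listed above, the request parameters might be assembled like this. The helper names are hypothetical; the parameter names (`action`, `title`, `text`, `summary`, `bot`, `from`, `to`, `reason`, `filename`, `comment`, `token`) are those of the MediaWiki Action API, but login, token retrieval, and the HTTP call itself are deliberately left out, and the bot's real code (built on Wikiapi) may look quite different.

```javascript
// Hypothetical helpers that only build parameter objects; nothing is sent.
function buildEditParams(title, text, summary, token) {
  return { action: "edit", format: "json", title, text, summary, bot: "1", token };
}

function buildMoveParams(from, to, reason, token) {
  return { action: "move", format: "json", from, to, reason, token };
}

function buildUploadParams(filename, pageText, comment, token) {
  // The file content itself would travel in a multipart "file" field (not shown).
  return { action: "upload", format: "json", filename, text: pageText, comment, token };
}

// Encode a parameter object as an application/x-www-form-urlencoded body.
function toFormBody(params) {
  return new URLSearchParams(params).toString();
}

// "TOKEN" is a placeholder, not a real CSRF token.
const moveBody = toFormBody(
  buildMoveParams("File:A-numbered.png", "File:A-kaishu.png", "Apply agreed naming convention", "TOKEN")
);
console.log(new URLSearchParams(moveBody).get("to")); // File:A-kaishu.png
```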

Maximum edit rate (e.g. edits per minute): per community policy for non-urgent tasks (1 edit per 5 seconds).
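
A minimal sketch of how such a rate could be respected, assuming one action every 5 seconds. `scheduleOffsets`, `runThrottled`, and the injectable `sleep` are hypothetical names introduced for illustration only.

```javascript
const MIN_INTERVAL_MS = 5000; // one action per 5 seconds, as stated above

// Offsets (in ms from the start of the run) at which each action may begin.
function scheduleOffsets(count, intervalMs = MIN_INTERVAL_MS) {
  return Array.from({ length: count }, (_, i) => i * intervalMs);
}

const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Run async tasks strictly one at a time, pausing between consecutive tasks.
async function runThrottled(tasks, intervalMs = MIN_INTERVAL_MS, sleep = defaultSleep) {
  const results = [];
  for (let i = 0; i < tasks.length; i++) {
    if (i > 0) await sleep(intervalMs);
    results.push(await tasks[i]());
  }
  return results;
}

console.log(scheduleOffsets(3)); // [ 0, 5000, 10000 ]
```

At this pace, the largest run mentioned above (3,000 pages) would take a little over four hours.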

Bot flag requested: (Y/N): Yes.

Programming language(s): JavaScript, MediaWiki JS, NodeJS.

Yug (talk) 19:15, 19 September 2020 (UTC)

Discussion

Stale. Can be reopened when ready. --Krd 18:37, 22 November 2020 (UTC)


Reopened. --Krd 07:13, 25 March 2021 (UTC)
Thanks Krd. And so...
Let the war begin! ^0^y Yug (talk) 11:20, 25 March 2021 (UTC)
The request has been updated. I have since become more familiar with bots, running User:Dragons Bot on Wikimedia LinguaLibre.org. My 3 demo scripts are nearly ready to be fired: 3 test types, likely this weekend, on 5~6 files. Demo edits are the last missing brick of this bot request, if I am not mistaken. The files will have to be removed thereafter. Yug (talk) 11:27, 25 March 2021 (UTC)
I made a user-rights request for filemover (Commons:Requests_for_rights#ShufaBot). This is necessary to test the bot on this aspect. Yug (talk) 18:52, 25 March 2021 (UTC)
Krd, does this ring any bell? I tried to rename an empty page (which failed, as expected), then went on to an existing page: the first rename worked, but the rename back was blocked and the bot visibly lost its move rights temporarily.
Error: abusefilter-autopromote-blocked: This action has been automatically identified as harmful, and it has been disallowed. In addition, as a security measure, some privileges routinely granted to established accounts have been temporarily revoked from your account. A brief description of the abuse rule which your action matched is: Pagemove throttle for new users.
Google can't find much on this throttle. I found en:Wikipedia:Edit_filter/Current##68:_Pagemove_throttle_for_new_users, which I actually cannot see : You may not view details of this filter, because it is hidden from public view. Yug (talk) 06:42, 26 March 2021 (UTC)
@Yug: It happened because of Special:AbuseFilter/60. I have temporarily granted the bot the autopatrol flag, so it should be good now. -- CptViraj (talk) 07:42, 26 March 2021 (UTC)
CptViraj, thanks for the clarification. These roadblocks are unannounced. It may be worth creating a set of warning templates, so that when a bot request includes 'renaming' or 'upload', the relevant guidance is provided and we go and request autopatrol, filemover, and any other needed user rights.
Your link shows me You may not view details of this filter, because it is hidden from public view.. I guess it's visible to admins only. Yug (talk) 08:05, 26 March 2021 (UTC)
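
A bot could stop its run as soon as it sees the error quoted above, rather than retrying and risking further restrictions. The sketch below assumes error lines shaped like the one quoted (`Error: <code>: <message>`); `parseBotError`, `shouldHaltRun`, and the choice of which codes halt a run are hypothetical.

```javascript
// Split "Error: <code>: <message>" into its parts.
function parseBotError(line) {
  const body = line.trim().replace(/^Error:\s*/, "");
  const match = /^([a-z0-9-]+):\s*(.*)$/i.exec(body);
  if (!match) return { code: "unknown", message: body };
  return { code: match[1], message: match[2] };
}

// Hypothetical policy: codes after which the whole run should stop at once.
const HALT_CODES = new Set(["abusefilter-autopromote-blocked"]);

function shouldHaltRun(line) {
  return HALT_CODES.has(parseBotError(line).code);
}

const quoted =
  "Error: abusefilter-autopromote-blocked: This action has been automatically identified as harmful, and it has been disallowed.";
console.log(parseBotError(quoted).code); // abusefilter-autopromote-blocked
console.log(shouldHaltRun(quoted)); // true
```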

I still don't understand what file moves are to be done with this task. Can you please show an example? --Krd 13:47, 28 March 2021 (UTC)

Krd hello,
(1.a) As a new category appears organically and grows, we periodically have to review filename conventions through consensus building to settle on a more definitive filename / suffix. This typically involves renaming 100~400 files. Files in such an emerging category are barely used across Wikimedia sites. The naming convention review and possible renaming then greenlight template usage within Wikimedia sites.
(1.b) Practical example: I plan to upload a new style this summer; the initial-draft naming convention will likely be File:{*}-numbered.png, with about 250 files in the first batch. Those will be neither announced nor used within Wikimedia projects. I will use this batch to open a review discussion on that style: errors, colors, styles, filename. Following consensus, I may then have to correct or regenerate the batch and/or rename the (300) files with a smarter filename. In that case, upload and rename rights are necessary. As time passes, updating those pages via bot edits may also be necessary. When the category and batch are mature, we announce them and start integrating them into major Wikimedia sites and their templating systems, typically :en, :fr, :de and :wp, :wikt. Other projects then slowly pick them up.
(2) On another axis, the Wikiapi NodeJS framework I use is undergoing a revamp due to my recent arrival on the project. One function I need is still missing, and I wish to revamp my bot as well. So I am delaying my tests a bit: about a week at minimum, two weeks at most. Yug (talk) 13:25, 3 April 2021 (UTC)
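
To make the renaming case in (1.a) and (1.b) concrete, a batch rename could first be computed as a list of from/to pairs and reviewed before anything is moved. This is a sketch only: `planRenames` is a hypothetical helper, and "newstyle" stands in for whatever suffix the community eventually agrees on.

```javascript
// Build {from, to} pairs for every title ending in "-<oldSuffix>.<ext>".
// Titles that do not match are left out of the plan untouched.
function planRenames(titles, oldSuffix, newSuffix) {
  const moves = [];
  for (const from of titles) {
    const dot = from.lastIndexOf(".");
    if (dot === -1) continue;
    const stem = from.slice(0, dot);
    const ext = from.slice(dot);
    if (!stem.endsWith(`-${oldSuffix}`)) continue;
    const to = stem.slice(0, stem.length - oldSuffix.length) + newSuffix + ext;
    moves.push({ from, to });
  }
  return moves;
}

const renamePlan = planRenames(
  ["File:一-numbered.png", "File:二-numbered.png", "File:髟-kaishu.png"],
  "numbered",
  "newstyle" // placeholder suffix, not an agreed convention
);
console.log(renamePlan.length); // 2
console.log(renamePlan[0].to); // File:一-newstyle.png
```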

Test runs status ✓ Done.

Krd, EugeneZelenko, CptViraj hello. WikiapiJS has been fixed, so I test-ran my scripts minutes ago. All 3 tasks went smoothly (upload, rename, edit), see Special:Contributions/ShufaBot. All those mock files will have to be deleted, so I limited myself to 6 test files. I can do 50 mock uploads, moves, and edits if required, but doing so just seems to pollute the site. Yug (talk) 07:25, 5 April 2021 (UTC)

Please use meaningful edit summaries: edit doesn't tell anything. Same for rename. Templates must be placed on a separate line, not "glued" to a category or placed between categories. The bot should also use content categories for uploads, not artificial test ones. --EugeneZelenko (talk) 15:31, 5 April 2021 (UTC)
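
The layout requested here (template on its own line, each category on its own line after it) might be generated as in the sketch below. `buildPageText` is a hypothetical helper, and the parameter name and category shown are placeholders rather than the real {{ACClicense}} parameters or project categories.

```javascript
// Build file-page wikitext: one template line, then one line per category.
function buildPageText(templateName, params, categories) {
  const args = Object.entries(params)
    .map(([key, value]) => `|${key}=${value}`)
    .join("");
  const lines = [`{{${templateName}${args}}}`];
  for (const category of categories) {
    lines.push(`[[Category:${category}]]`);
  }
  return lines.join("\n") + "\n";
}

const pageText = buildPageText(
  "ACClicense",
  { style: "kaishu" }, // placeholder parameter name and value
  ["Example content category"] // placeholder category
);
console.log(pageText);
```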
Thanks EugeneZelenko for the feedback on putting templates on separate lines. I missed this one.
I was asked to do a test run, so I indeed used mock files and did mock renames and mock edits, using mock categories. Three of the 4 points you raised will naturally be addressed when working on real files and real categories and doing real things.
Can the bot get approved on the basis of the technical ability demonstrated above? I will then be able to move on to my first batch of 2,000 real uploads. Yug (talk) 15:46, 5 April 2021 (UTC)
Please repeat the test run with real work, not mock work. --EugeneZelenko (talk) 15:49, 5 April 2021 (UTC)
Note: This bot review process has several aspects and wordings that confuse applicants.
I do not have actual renaming needs at hand at the moment (see 1.a and 1.b above). Yug (talk) 15:51, 5 April 2021 (UTC)
At least uploads may be real. --EugeneZelenko (talk) 16:35, 5 April 2021 (UTC)
EugeneZelenko, thank you for adapting to my situation. Please note that the guideline above states :
« Test run: You can be demanded to make a short test run with your bot account (30–50 edits/uploads) to allow other users to review your bot's tasks. Unauthorized test run is not allowed. »
“Can” suggests the test run is optional, and nothing says that the “Test run” requires actual contributions and files. Such a requirement strongly constrains the usual meaning of “test”. It is also impossible to meet for bot masters who manage periodic mass uploads/edits followed by periods of calm, and who have no action to carry out at the time of the bot request.
I have so far met the following predictable yet unannounced hurdles: 1. Test run bumps into abusefilter-autopromote-blocked → 2. Temp user-rights to request but NOT in Commons:Requests_for_rights → 3. Test run requires actual edits. As an applicant bot master, the feeling is one of an ever-moving finish line.
On a constructive note, I'm gathering a review of hurdles and corresponding recommendations to improve, smooth, and shorten this bot request process. I will submit these suggestions after my bot request. Yug (talk) 17:38, 5 April 2021 (UTC)
Please also note that each unannounced hurdle likely adds 3-8 hours of unplanned coding work to the bot request. If that time is not available, the bot request process is delayed by days, weeks, or occasionally months. Yug (talk) 19:23, 5 April 2021 (UTC)
Test run means real work. Otherwise it's hard to judge how correct the bot's actions are. --EugeneZelenko (talk) 19:26, 5 April 2021 (UTC)
EugeneZelenko, CptViraj Test run #2, status | ✓ Done.

There is one strange edit on my bot's user page : I reported it to WikiapiJS' developer.
This should satisfy the requirements stated on Commons:Bots/Requests. Yug (talk) 15:15, 7 April 2021 (UTC)


E.g. File:髟-kaishu.png: Is this file copyrighted? If yes, who is the copyright holder, and on what grounds? --Krd 22:38, 8 April 2021 (UTC)
✓ Done. I fixed the template present in all these files, adding the source.
I think the bot side is demonstrated enough. The issue you raise is a template and license issue. Yug (talk) 05:45, 9 April 2021 (UTC)
(A complementary custom PD-font template {{CJK-fonts}} with detailed source and copyright concerns is underway. I'm being assisted on this template work, but it will still require a few days.) --Yug (talk) 05:54, 9 April 2021 (UTC)
Uploads seem OK, but please don't make artificial ones.
Please make edit summaries like Help:Gadget-HotCat for category changes tasks and provide meaningful reason for move tasks.
Bot name is useless in summaries. --EugeneZelenko (talk) 14:08, 9 April 2021 (UTC)
Can you specify the HotCat edit summary convention (?) you are referring to?
Yes, actual moves will have more meaningful reasons. As said in the request, those moves are discussed within the project, with a typical waiting time of 3~12 months between first discussion and implementation, so we are sure of our choice.
True. No need for ShufaBot in edit summaries. Point taken.
Is there a new step in the bot approval process? Yug (talk) 19:24, 9 April 2021 (UTC)
See MediaWiki:Gadget-HotCat.js. --EugeneZelenko (talk) 14:09, 10 April 2021 (UTC)
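
For readers following this thread, one way to produce category-change summaries in the spirit of HotCat is sketched below. The wording ("removed [[Category:…]]; added [[Category:…]]") is an assumption loosely modeled on the kind of summaries HotCat leaves, not text copied from the gadget, and `categorySummary` is a hypothetical helper. No bot name appears in the summary, per the comment above.

```javascript
// Describe a category change in one short, link-bearing edit summary.
function categorySummary({ added = [], removed = [] }) {
  const parts = [
    ...removed.map((name) => `removed [[Category:${name}]]`),
    ...added.map((name) => `added [[Category:${name}]]`),
  ];
  if (parts.length === 0) {
    throw new Error("No category change to describe.");
  }
  return parts.join("; ");
}

const summary = categorySummary({
  removed: ["ShufaBot test: upload"],
  added: ["Example content category"], // placeholder category name
});
console.log(summary);
// removed [[Category:ShufaBot test: upload]]; added [[Category:Example content category]]
```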
EugeneZelenko, is this an official requirement for bot status, or is this a recommendation which I can work on later? (The 3000+ lines of JS code, focused on category management, contain no clear indication of an edit summary convention. I am also focused on file management, not on editing categories.)
If I understand well, the technical side of the bot has been demonstrated for edit, upload, and move capabilities. I have to finish the related templates: 1) add creator field ; 2) finish the license template. Yug (talk) 19:43, 10 April 2021 (UTC)
✓ Done 1) and 2) have been fixed. So :
  • The request's form defined the bot properly. Scope, purpose, usecases and project-specific consensus process have been detailed and shared.
  • Two series of test runs have been run, proving technical capabilities (April 5th: mock; April 7th: real).
  • The marginal errors pointed out (line breaks, license, source, templates) have been corrected via further bot-led edits and template editing. Current files have no significant error, and the templates are now mature.
  • Trustworthiness has not been mentioned as a blocking element.
EugeneZelenko, CptViraj, Krd: is there any opposition to bot status for ShufaBot, based on the process defined on Commons:Bots/Requests? Yug (talk) 21:24, 10 April 2021 (UTC)
It's not a requirement, but if there is a good example of an implementation, it's better to follow it than to reinvent the wheel. I'd like to see a new test run for categorization. Please note that test runs must be short (5-10 is enough). --EugeneZelenko (talk) 13:58, 11 April 2021 (UTC)
Hi Eugene,
(I did not find the Hotcat's edit summary convention to follow, if there is a clean best practice somewhere I would like to check it out for sure. Example of clean best practices I authored : ACC.)
About test runs... Test run on 6 mock files is easy. If I have a clear edit to do : test run on all real files of a category is easy ; test run on 5-10 real files of a category creates problems.
Categorization tests are, technically, equivalent to edit tests. Special:Contributions/ShufaBot already has 200+ such real edit tests (ex).
Commons:ACC and the other projects cited as benefiting from ShufaBot mainly use templates to add categories. All our files use the templates {{ACClicense}}, {{SOlicense}}, {{MCClicense}}, which dispatch the relevant categories.
Do you have a specific action (categorization change) you wish to see? Otherwise I can think up a categorization change... I could remove Category:ShufaBot test: upload, my working category. I initially wanted to keep it for a few months in case I find some error, but I can find a workaround via {{MCClicense}}. Alternatively, I could mock-edit the letters series.
Fine-tuning and mentoring are a lifelong process; I would prefer that we close this bot request and continue fine-tuning in the same way thereafter. Yug (talk) 16:50, 11 April 2021 (UTC)
« I'd like to see a new test run for categorization. » ✓ Done (ex). Third test run. Yug (talk) 23:26, 11 April 2021 (UTC)
File:Letter-e-colorful.svg still has no source, a questionable author (call me humorless), and a suboptimal summary in the recent edit. I agree fine tuning can and shall be a continuous process, but basics should be met as soon as possible for mass edits. --Krd 06:31, 12 April 2021 (UTC)
It was already stated on April 5th that « All those mock files will have to be deleted, so I limited myself to 6 test files. » The letter series is a temporary series of mock files: PD basic shapes, {{PD-ineligible}}-tagged, {{Own}}, and heading to speedy deletion. Its only, temporary, use is to test actions requested by the reviewers for which I have no actual case at the moment. This series was used to demonstrate technical capabilities:
  • upload (6+)
  • edit (6+)
  • move (30+)
The page content was not the focus, as previously stated, and the pages are heading to deletion.
For actual and full tests, the *-kaishu.png series has been provided, where content capability and edit summary capability are displayed. Those basic capabilities have already been demonstrated on 214 files, via upload and multiple edits. Example of typical history:
  • 21:40, 10 April 2021‎ ShufaBot talk contribs‎ m 184 bytes +17‎ Add fontseries parameter to provide better source.
  • 20:16, 10 April 2021‎ ShufaBot talk contribs‎ m 167 bytes +11‎ Add file creator-author
  • 19:35, 8 April 2021‎ ShufaBot talk contribs‎ 156 bytes +156‎ Upload Chinese radical 龠 in Kaishu style, raster format.
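
An upload summary like the last entry above could be derived from the file title itself. The sketch below assumes titles of the form `File:<glyph>-<style>.<ext>`, as in the *-kaishu.png series; `describeUpload`, the accepted extensions, and the raster/vector mapping are hypothetical.

```javascript
// Derive a descriptive upload summary from a title such as "File:龠-kaishu.png".
function describeUpload(title, kind = "character") {
  const match = /^File:(.+)-([a-z]+)\.(png|gif|svg)$/u.exec(title);
  if (!match) {
    throw new Error(`Title does not follow the assumed convention: ${title}`);
  }
  const [, glyph, style, extension] = match;
  const styleName = style[0].toUpperCase() + style.slice(1);
  const format = extension === "svg" ? "vector" : "raster";
  return `Upload Chinese ${kind} ${glyph} in ${styleName} style, ${format} format.`;
}

console.log(describeUpload("File:龠-kaishu.png", "radical"));
// Upload Chinese radical 龠 in Kaishu style, raster format.
```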
Edit capability had already been demonstrated 500 times when a reviewer requested categorization edits (a request redundant with edits already done). I nevertheless satisfied this artificial additional request, and it has been done.
Now 1) the mock files are criticized for not being properly documented. Indeed: they are mock files. 2) The reviewer's request asks for cleaner edit summaries while providing neither a practical example nor a proper definition of the requirement this bot request must satisfy. Given the 30-character changes in most (all?) modification edits, a more detailed edit summary would amount to the edits themselves. ⇒⇒ I do not understand what you want from me. ⇐⇐ If you have a better convention, mentor me by showing it to me and I will learn.
Commons:Bots/Requests presents the test run as optional. Where does this test run stop, and where are the criteria to satisfy written down? I ask sincerely.
The current mix of no written approval criteria, artificial requests, unclear requests, impractical requests, micro-scrutiny, and criticism without example solutions is confusing.
Yug (talk) 07:46, 12 April 2021 (UTC)
The announced temporary test files with a mock author pseudonym (ex: "Yug Queen of testers") and no source parameter have been 1) fixed by bot ; 2) nominated for speedy deletion ; 3) speedy deleted. AFAIK, there are no more files with mock or incomplete content.
Overall, the following technical and content capacities have been demonstrated through 3 test runs :
  • upload (200+)
  • edit (400+)
  • rename (30+)
… for the very scope defined in the early bot request. Yug (talk) 09:57, 12 April 2021 (UTC)

I suggest closing this request as declined. I think the bot owner should start with a task of smaller scale and stricter definition. --EugeneZelenko (talk) 14:21, 2 May 2021 (UTC)