Commons:Bots/Requests/Embedded Data Bot (aggressive algorithm)


Embedded Data Bot (aggressive algorithm)

Operator: Zhuyifei1999 & Steinsplitter

Bot's tasks for which permission is being sought: Alternate algorithm for Commons:Bots/Requests/Embedded Data Bot: aggressively (instead of passively) find archive signatures / magic numbers in newly uploaded files in recent changes (RC), parse the data according to the archive format specs to confirm the archive, and treat the file in the same way as the original task. Reupload or deletion are not direct actions of this algorithm unless the current algorithm wants those. (I'll file another BRFA if this new aggressive algorithm seems very stable and deletion can be added to its actions.) Deletion will not happen automatically during the test run, but will once this algorithm is approved.

Identifying embedded archives is getting harder and harder with the current passive algorithm. I cannot state the reason publicly, for anti-abuse reasons, but I'm fine with revealing it via private means if you want. An aggressive algorithm will mimic a decompressor, which is how the embedded archives are commonly abused.
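
For illustration only, the magic-number search described above could look roughly like the sketch below. This is not the bot's actual code: the function names find_rar_signatures and looks_like_rar4 are hypothetical, and the confirmation step is a deliberately weak stand-in for real parsing according to the format specs. It assumes the published RAR signatures (7 bytes for RAR 4, 8 bytes for RAR 5) and that a RAR 4 main archive header carries block type 0x73.

```python
# Hypothetical sketch of signature scanning; not the bot's real implementation.

# Published RAR magic numbers. Neither is a prefix of the other,
# so a single offset can match at most one of them.
RAR_SIGNATURES = {
    b"Rar!\x1a\x07\x00": "RAR 4",
    b"Rar!\x1a\x07\x01\x00": "RAR 5",
}


def find_rar_signatures(data: bytes) -> list[tuple[int, str]]:
    """Return sorted (offset, version) pairs for every RAR signature in data."""
    hits = []
    for magic, version in RAR_SIGNATURES.items():
        start = 0
        while True:
            pos = data.find(magic, start)
            if pos == -1:
                break
            hits.append((pos, version))
            start = pos + 1  # keep scanning after this hit
    return sorted(hits)


def looks_like_rar4(data: bytes, offset: int) -> bool:
    """Weak confirmation of a RAR 4 hit at offset.

    Assumption: the block after the 7-byte marker is the main archive
    header, laid out as a 2-byte CRC followed by a 1-byte type of 0x73.
    A real check would parse and validate the whole header chain.
    """
    type_index = offset + 7 + 2  # skip the marker, then the header CRC
    return type_index < len(data) and data[type_index] == 0x73
```

A caller would read the uploaded file's bytes, pass them to find_rar_signatures, and only act on hits that also survive a format-level check such as looks_like_rar4. A bare signature match can occur by chance in compressed image data, which is why the request pairs the search with parsing.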

Automatic or manually assisted: Automatic unsupervised

Edit type (e.g. Continuous, daily, one time run): Continuous via RC

Maximum edit rate (e.g. edits per minute): 6 per min

Bot flag requested: (Y/N): N

Programming language(s): Python (pywikibot)

Zhuyifei1999 (talk) 13:53, 10 April 2017 (UTC)

Discussion

Hopefully I'll get a test run ready by this weekend. --Zhuyifei1999 (talk) 13:53, 10 April 2017 (UTC)
As per Commons:Bots#Permission_to_run_a_bot no request is needed for minor modifications. I think it is the case here, and standard practice. --Steinsplitter (talk) 14:35, 10 April 2017 (UTC)
Another algorithm different in its principles is definitely not a minor change. I'd say it's almost a different task --Zhuyifei1999 (talk) 23:58, 10 April 2017 (UTC)
The change is live and running with support of RAR v4 and 5. Results will be marked as "via Magic" (indicating magic numbers) --Zhuyifei1999 (talk) 14:37, 15 April 2017 (UTC)

Approved. --Krd 00:32, 18 April 2017 (UTC)