Commons:Bots/Requests/SamoaBot 2

SamoaBot 2

Operator: Ricordisamoa

Bot's tasks for which permission is being sought: detecting and logging (in a user subpage) SVG images whose root <svg> element lacks a proper xmlns declaration. Such images cannot be viewed at all in some browsers and should be fixed; the fixing can be done later, either manually or automatically.

Automatic or manually assisted: automatic, supervised

Edit type (e.g. Continuous, daily, one time run): intermittently

Maximum edit rate (e.g. edits per minute): about 1 edit per minute (max)

Bot flag requested: (Y/N): N (already flagged, see Commons:Bots/Requests/SamoaBot)

Programming language(s): JavaScript, with Ajax (own code, will be published soon)

Ricordisamoa 04:56, 26 March 2013 (UTC)
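
Editorial illustration (not the bot's actual code, which is JavaScript): a minimal Python sketch of the check described in the request, assuming the whole file is available as text. It uses the standard library's ElementTree; the function name root_namespace_status and its return values are hypothetical. Files that the parser rejects outright (for example because of an undeclared prefix) are reported separately rather than guessed at.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def root_namespace_status(svg_text):
    """Classify the root element of an SVG document (hypothetical helper).

    Returns "ok" if the root is an svg element in the SVG namespace,
    "missing" if it is a plain svg element with no namespace,
    "unparseable" if the XML parser rejects the input,
    and "other" for any other root element.
    """
    try:
        root = ET.fromstring(svg_text)
    except ET.ParseError:
        return "unparseable"
    # ElementTree expands names to "{namespace}local", so a prefixed root
    # such as <svg:svg xmlns:svg="..."> is recognised as well.
    if root.tag == "{%s}svg" % SVG_NS:
        return "ok"
    if root.tag == "svg":
        return "missing"
    return "other"
```

For example, root_namespace_status('<svg width="1"/>') yields "missing", which is the case this request proposes to log.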

Discussion

The log of all images detected so far can be seen here. --Ricordisamoa 06:02, 26 March 2013 (UTC)

I think it would be a good idea to add a maintenance template to the image page too.
The edit summary for the log action could be just the file name with a link; the log page name is self-explanatory.
EugeneZelenko (talk) 14:44, 27 March 2013 (UTC)
Does a suitable template exist for these cases? (I'll also work on the edit summary.) --Ricordisamoa 06:14, 28 March 2013 (UTC)
{{BadSVG}} may be adapted for this purpose, or {{Cleanup image}} may be used. --EugeneZelenko (talk) 14:47, 28 March 2013 (UTC)
  • Info: In Firefox, you can use the sendAsBinary() method of your XHR instance to upload the SVG. This avoids encoding issues (some SVGs also work with the send method, but it corrupts others). Downloading is very easy through a GET request, since the cross-origin issue is resolved.
  • I created a sample that works with FF 19 (tested): User:Rillke/fastTransfer.js. It downloads and uploads a file immediately when invoked. You have access to the raw data in the function _gotFile, so you can manipulate that data. -- Rillke(q?) 20:43, 28 March 2013 (UTC)
  • I had a look with Firefox 19. It displayed two of the nine files on the list.
(E)-pent-2-ène.svg: does not display
(Z)-pent-2-ène.svg: does not display
(±)-Ethyl-2-methylbutyrate_Structural_Formulae_V.1.svg: has been re-uploaded several times; the version of 13:06, 26 March 2013 does not display and MediaWiki did not generate a thumbnail for it; other versions okay
1-Chlornaphthalin.svg: displays correctly
1-jpg.svg: MediaWiki says there's an error in the file; does not display in browser
1025arud.svg: MediaWiki says there's an error in the file; does not display in browser
10th_Panzer_Division_logo_1.svg: has been re-uploaded, but both versions displayed correctly
1422_Zeta_in_the_Serbian_Despotate_after_death_Balsa_III.svg: does not display
1885ArmenianFlag.svg: has since been re-uploaded; original file did not display
Rybec (talk) 22:21, 28 March 2013 (UTC)

OK, now I'm going to run a fixed version of the script; let's see... --Ricordisamoa 23:16, 28 March 2013 (UTC)

  • I tried to view the new additions to the list, with similar results: most of the files did not display, but one did. I don't know enough about the subject to say definitively that there are false positives.
1969_draft_lottery_scatterplot.svg: file has been replaced; old version did not display in Firefox
1988_Illinois_Constitutional_Convention_Vote_pie_chart.svg: file has been replaced; old version did not display in Firefox
1st_Panzer_Division_logo.svg: displays correctly in Firefox
2-propil-amine.svg: does not display in Firefox
201globe.svg: file has been replaced; old version did not display in Firefox
250x250Feld.svg: does not display in Firefox
2NOGCMOS.svg: does not display in Firefox
Even with the possible false positives, the usefulness of this list is apparent. If making the list is all that this request covers, I fully support it. If the request is also about automatically fixing the problems that are found, I don't support that part: some unneeded changes might be made, and test edits fixing the SVG files haven't yet been done. Rybec (talk) 23:55, 30 March 2013 (UTC)
After 1st_Panzer_Division_logo.svg I changed the code, so it should work well now. --Ricordisamoa 10:54, 31 March 2013 (UTC)

I Support this request, but I'm leaning against applying the bot flag. If it's only editing one page, and even that only once per minute, I don't see it as necessary. Am I missing a reason you need the flag? --99of9 (talk) 11:10, 31 March 2013 (UTC)

The purpose of the bot flag is to avoid flooding RCs (if the bot's speed is out of control)... anyway, there's also the first request. --Ricordisamoa 13:57, 31 March 2013 (UTC)
What is the final functionality? Will the bot re-upload fixed files, or only log problematic files? If the latter, will a clean-up template be added to the file page? --EugeneZelenko (talk) 14:26, 31 March 2013 (UTC)
If it is only logging, I can try to write a JavaScript bot that fixes the issue and needs no host other than a compatible browser to run. IMHO a logging bot does not need a bot flag, but it would be great if we could combine the two functions (detection and fixing). Then we would not need to log each occurrence or add a template either. -- Rillke(q?) 09:16, 1 April 2013 (UTC)
You get my {{Support}} --Ricordisamoa 11:15, 1 April 2013 (UTC)
I need your source code. -- Rillke(q?) 09:54, 2 April 2013 (UTC)
User:Ricordisamoa/XMLNSense.js --Ricordisamoa 10:23, 2 April 2013 (UTC)

Thank you. → User:Rillke/MwJSBot.SVGXmlNSFixer.js. The log and continue-params are written to my user space by default, but this can be customized by creating your own instance of window.SVGXmlNSFixer. Detection of an svg root in its own svg namespace, as in this file, still has to be fixed.

-- Rillke(q?) 20:35, 3 April 2013 (UTC)
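
Editorial illustration (not the code of either script linked above): a minimal Python sketch of the kind of fix discussed here, under the assumption that adding the default SVG namespace to an unprefixed root tag is all that is wanted. The helper name add_missing_xmlns is hypothetical. A root written with a prefix, such as <svg:svg xmlns:svg="...">, already sits in its own namespace and is deliberately left untouched. The text-based approach is a simplification: it would be misled by a "<svg" inside a comment or by a ">" inside an attribute value, so the result should be re-checked with a real XML parser before any upload.

```python
import re

SVG_NS = "http://www.w3.org/2000/svg"

# Opening of an unprefixed svg start tag: "<svg" followed by whitespace,
# ">" or "/". "<svg:svg" does not match because of the colon.
ROOT_TAG = re.compile(r"<svg(?=[\s>/])")


def add_missing_xmlns(svg_text):
    """Return svg_text with xmlns added to the first <svg> tag if absent."""
    match = ROOT_TAG.search(svg_text)
    if match is None:
        return svg_text  # no unprefixed <svg> start tag found
    end = svg_text.find(">", match.end())
    if end == -1:
        return svg_text  # start tag is not closed; leave the text alone
    start_tag = svg_text[match.start():end]
    # "xmlns:xlink=" and similar do not count: only a default namespace does.
    if re.search(r"\sxmlns\s*=", start_tag):
        return svg_text
    return (svg_text[:match.end()]
            + ' xmlns="%s"' % SVG_NS
            + svg_text[match.end():])
```

Inserting the attribute directly after the tag name leaves the rest of the file byte-for-byte unchanged, which keeps the diff between file versions small.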

So could you operate a real bot for this task? --Ricordisamoa 20:52, 3 April 2013 (UTC)
I've no dedicated server/computer for this task, if that is your question. But I think if I or someone else continue(s) running this JavaScript over all 666,540+ SVG files, it would be a good idea to ask whether other issues with SVG files should be considered as well. Also the speed/bandwidth is not really sufficient… -- Rillke(q?) 16:58, 4 April 2013 (UTC)
I meant "a flagged bot account"; BTW, I got the flag (for another task). --Ricordisamoa 18:47, 4 April 2013 (UTC)
I've no flagged bot account under which it would be appropriate to run the task. -- Rillke(q?) 20:18, 4 April 2013 (UTC)
Would your script work on Chromium/Chrome? --Ricordisamoa 21:16, 4 April 2013 (UTC)
It seems so. At least the loop runs and the parser does its job. Not entirely sure whether the upload will work but it's very likely (it's using the usual XHR.send() as SVGs are UTF-8 encoded). -- Rillke(q?) 22:00, 5 April 2013 (UTC)
Should I continue running the script (e.g. under RillkeBot (not flagged)), would you like to continue running it, or do we want to split the load? Are there any questions left? -- Rillke(q?) 20:44, 14 April 2013 (UTC)
For me it's ok, but you should get a bot flag; I'm thinking it may be better to switch to Pywikipediabot, though. --Ricordisamoa 13:30, 15 April 2013 (UTC)

How many files are we approximately looking at? If it is of the order of a hundred or fewer, then I suggest just going ahead with it. --Dschwen (talk) 21:26, 9 May 2013 (UTC)

It would have to check all files on Commons, and upload at least several hundred images... However, I have PWB installed, and will start coding something tomorrow. --Ricordisamoa 21:54, 9 May 2013 (UTC)
First of all you'll have to check all SVG files. Secondly, when you code, please make sure to do an HTTP range request! Try downloading just the first kilobyte of each file you are analyzing. This should speed things up. Let me know if you need help with that in Python; I've implemented it before. --Dschwen (talk) 22:38, 9 May 2013 (UTC)
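
Editorial illustration (not Dschwen's implementation): a minimal Python sketch of the range request suggested above, using only the standard library. The function names build_range_request and fetch_head, and the User-Agent string, are hypothetical placeholders. Note that the first kilobyte of a file is usually not a complete XML document, so whatever inspects the result has to tolerate truncated input.

```python
import urllib.request


def build_range_request(url, num_bytes=1024,
                        user_agent="ExampleSvgChecker/0.1 (hypothetical)"):
    """Prepare a GET request for only the first num_bytes bytes of url."""
    request = urllib.request.Request(url)
    # Byte ranges are inclusive, so the first kilobyte is bytes=0-1023.
    request.add_header("Range", "bytes=0-%d" % (num_bytes - 1))
    request.add_header("User-Agent", user_agent)
    return request


def fetch_head(url, num_bytes=1024):
    """Download at most num_bytes bytes from the start of url."""
    with urllib.request.urlopen(build_range_request(url, num_bytes)) as resp:
        # A server that honours the header answers 206 Partial Content.
        # A plain 200 means the whole file is being sent, so cap the read.
        return resp.read(num_bytes)
```

Building the request separately from sending it makes the header logic easy to check without touching the network.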

Update: I finally succeeded in uploading a fixed file with PWB: T Third Street logo.svg. For now, I'm not going to run it unsupervised until it becomes stable. However, I'd like to have some DB reports about missing XMLNS attributes. Thanks, --Ricordisamoa 16:53, 12 September 2013 (UTC)

Great! Are you using a SAX parser? -- Rillke(q?) 17:03, 12 September 2013 (UTC)
The bot's version of File:T Third Street logo.svg displays correctly in Firefox 23, whereas the previous version does not. Rybec (talk) 18:44, 12 September 2013 (UTC)
Great. Can we close this as good to go, or are you hoping for further feedback? --99of9 (talk) 13:06, 13 September 2013 (UTC)
I think it would be a good idea to have Rillke's question answered and to wait a couple more days for other possible comments. --EugeneZelenko (talk) 15:09, 13 September 2013 (UTC)
Ping @User:Ricordisamoa. Can you answer Rillke's question? --99of9 (talk) 03:27, 9 January 2014 (UTC)
@99of9: I found that "Simple API for XML" doesn't appear so simple to me... and again: could we get a database report about all SVG files missing that attribute? It would be very helpful to me, and it shouldn't be very hard to obtain it via WMFlabs. Then, I could make some tests and improve the current code. --Ricordisamoa 01:24, 10 January 2014 (UTC)
I know nothing about this, sorry. Back to you User:Rillke or User:EugeneZelenko. --99of9 (talk) 02:43, 10 January 2014 (UTC)
Pinging @Rillke and @EugeneZelenko again, as they can provide some professional advice here :-) odder (talk) 15:17, 10 February 2014 (UTC)
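
Editorial illustration (not code from this discussion): a minimal sketch of how a SAX parser could be used for this task, assuming Python's standard xml.sax module. The names root_name, _RootHandler and _RootFound are hypothetical. The handler stops at the very first start tag, and the data is fed incrementally without closing the parser, so a truncated chunk (such as the first kilobyte of a file) is acceptable as long as it contains the complete root start tag.

```python
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces

SVG_NS = "http://www.w3.org/2000/svg"


class _RootFound(Exception):
    """Raised to stop parsing as soon as the root element is seen."""


class _RootHandler(ContentHandler):
    def startElementNS(self, name, qname, attrs):
        # name is a (namespace_uri, local_name) tuple; the URI is None
        # when the element is in no namespace.
        raise _RootFound(name)


def root_name(data):
    """Return (namespace_uri, local_name) of the root element, or None."""
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    parser.setContentHandler(_RootHandler())
    try:
        # feed() without close(): the parser does not complain that the
        # document ends early.
        parser.feed(data)
    except _RootFound as found:
        return found.args[0]
    except xml.sax.SAXParseException:
        return None  # malformed before the root tag was complete
    return None  # no complete start tag in the data


def lacks_svg_namespace(data):
    """True if the root is an svg element with no namespace at all."""
    return root_name(data) == (None, "svg")
```

With this sketch, lacks_svg_namespace(b'<svg width="1"><rect') is True even though the input is cut off mid-tag, while a root that declares the SVG namespace, with or without a prefix, gives False.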