Commons:Tools/pywiki file description cleanup

From Wikimedia Commons, the free media repository

This is a fix that adapts pywikipediabot's cosmetic_changes.py module for Commons. Almost all of the module's standard features are disabled.

Any bot running pywikipediabot can use it. The cleanup is applied only when the bot is already making another edit to the page.

The code is currently being debugged. For discussion, see Commons:Bots/Work requests#Changes to allow localization (i18n) file description pages cleanup.

Settings

Settings for user_config.py
cosmetic_changes = True
cosmetic_changes_mylang_only = True
Settings to enable special "cosmetic changes" for Commons

In cosmetic_changes.py, find:

    def change(self, text):
        """
        Given a wiki source code text, returns the cleaned up version.
        """
        oldText = text

and insert this after the "oldText = text" line:

        if self.site.sitename() == 'commons:commons':
            if self.namespace == 6:
                text = self.standardizeCategories(text)
                text = self.commonsfiledesc(text)
        else:

Then indent the rest of the method body by one level, up to the "if self.debug:" line, so that it sits under the new "else:" branch.
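The resulting control flow can be sketched as follows. This is only an illustration under stated assumptions: the class name ChangeSketch, the sitename attribute and the standard_cleanups method are hypothetical stand-ins, and the helper methods merely tag the text so that the path taken is visible. In the real module, the "else:" branch holds the original body of change(), indented one level.

```python
class ChangeSketch:
    """Hypothetical stand-in for the cosmetic changes class (not the real one)."""

    def __init__(self, sitename, namespace):
        self.sitename = sitename    # e.g. 'commons:commons'
        self.namespace = namespace  # 6 is the File namespace

    # Placeholder helpers (assumptions): they only tag the text.
    def standardizeCategories(self, text):
        return text + '[categories]'

    def commonsfiledesc(self, text):
        return text + '[filedesc]'

    def standard_cleanups(self, text):
        # Stands in for the original, now indented, body of change().
        return text + '[standard]'

    def change(self, text):
        if self.sitename == 'commons:commons':
            if self.namespace == 6:
                text = self.standardizeCategories(text)
                text = self.commonsfiledesc(text)
        else:
            text = self.standard_cleanups(text)
        return text


print(ChangeSketch('commons:commons', 6).change('x'))
print(ChangeSketch('commons:commons', 0).change('x'))
print(ChangeSketch('wikipedia:en', 6).change('x'))
```

With this layout, a Commons page outside the File namespace is returned unchanged, and the standard cleanups run only on other sites.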

Change the edit summary to:

msg_append = {
    'en': u'; [[Commons talk:Tools/pywiki file description cleanup|desc page fmt]]',
}

Fixes

Fixes for Commons (insert this into cosmetic_changes.py too). Depending on your version of pywikipediabot, the framework module is referred to as either "wikipedia" (older versions) or "pywikibot" (newer versions). Before you insert this passage, make sure every occurrence of that name in it matches the name your copy of cosmetic_changes.py uses.

    #TEST CODE: NO WARRANTY GIVEN OR IMPLIED
    def commonsfiledesc(self, text):
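        # tags whose content replaceExcept leaves untouched
        exceptions = ['comment', 'includeonly', 'math', 'noinclude', 'nowiki',
                      'pre', 'source', 'ref', 'timeline']
        # section headers to {{int:}} versions
        # (the trailing True makes each match case-insensitive)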
        text = pywikibot.replaceExcept(text,
                                       r"([\r\n]|^)\=\= *Summary *\=\=",
                                       r"\1== {{int:filedesc}} ==",
                                       exceptions, True)
        text = pywikibot.replaceExcept(
            text,
            r"([\r\n])\=\= *\[\[Commons:Copyright tags\|Licensing\]\]: *\=\=",
            r"\1== {{int:license-header}} ==", exceptions, True)
        text = pywikibot.replaceExcept(
            text,
            r"([\r\n])\=\= *(Licensing|License information|{{int:license}}) *\=\=",
            r"\1== {{int:license-header}} ==", exceptions, True)

        # frequent field values to {{int:}} versions
        text = pywikibot.replaceExcept(
            text,
            r'([\r\n])\|([Ss])ource( *)\=( *)([Oo]wn work by uploader|[Oo]wn work|[Ee]igene [Aa]rbeit)( *)([\r\n])',
            r'\1|\2ource\3=\4{{own}}\7', exceptions, True)
        text = pywikibot.replaceExcept(
            text,
            r'\|( *)Permission( *)=( *)([Ss]ee below|[Ss]iehe unten)( *)([\r\n])',
            r'|\1Permission\2=\6', exceptions, True)

        # added to transwikied pages
        text = pywikibot.replaceExcept(text, r'__NOTOC__', '', exceptions, True)

        # tracker element for js upload form
        # (exceptions[1:] drops 'comment' so the HTML comment itself can match)
        text = pywikibot.replaceExcept(
            text,
            r'<!-- *{{ImageUpload\|(?:full|basic)}} *-->',
            '', exceptions[1:], True)
        text = pywikibot.replaceExcept(text, r'{{ImageUpload\|(?:basic|full)}}',
                                       '', exceptions, True)

        # duplicated section headers
        text = pywikibot.replaceExcept(
            text,
            r'([\r\n]|^)\=\= *{{int:filedesc}} *\=\=(?:[\r\n ]*)\=\= *{{int:filedesc}} *\=\=',
            r'\1== {{int:filedesc}} ==', exceptions, True)
        text = pywikibot.replaceExcept(
            text,
            r'([\r\n]|^)\=\= *{{int:license-header}} *\=\=(?:[\r\n ]*)\=\= *{{int:license-header}} *\=\=',
            r'\1== {{int:license-header}} ==', exceptions, True)

        return text
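To preview what some of these substitutions do without a running bot, the same patterns can be tried with Python's standard re module. This is a minimal sketch, not the bot's code path: the function name preview_cleanup and the sample text are made up for illustration, and plain re.sub does not honour the exceptions list, so text inside comments, nowiki and similar tags would be rewritten here as well.

```python
import re


def preview_cleanup(text):
    """Apply three of the patterns above with plain re.sub (hypothetical helper)."""
    # "Summary" header to {{int:filedesc}}
    text = re.sub(r"([\r\n]|^)\=\= *Summary *\=\=",
                  r"\1== {{int:filedesc}} ==", text, flags=re.IGNORECASE)
    # licensing headers to {{int:license-header}}
    text = re.sub(
        r"([\r\n])\=\= *(Licensing|License information|{{int:license}}) *\=\=",
        r"\1== {{int:license-header}} ==", text, flags=re.IGNORECASE)
    # "own work" source values to {{own}}
    text = re.sub(
        r'([\r\n])\|([Ss])ource( *)\=( *)'
        r'([Oo]wn work by uploader|[Oo]wn work|[Ee]igene [Aa]rbeit)( *)([\r\n])',
        r'\1|\2ource\3=\4{{own}}\7', text, flags=re.IGNORECASE)
    return text


sample = ("==Summary==\n"
          "{{Information\n"
          "|Source=own work\n"
          "}}\n"
          "\n"
          "== Licensing ==\n"
          "{{cc-zero}}\n")
print(preview_cleanup(sample))
```

The licensing pattern requires a preceding line break, so a licensing header on the very first line of a page is left as it is.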