Commons:Batch uploading/Fonds Ancely


Fonds Ancely

This upload is part of a partnership between Wikimédia France and the Library of Toulouse. It consists of 2,085 public-domain files. General notes and work in progress can be found at User:Jean-Frédéric/Ancely.

The metadata is held in an OAI-PMH repository. The code harvests the repository and retrieves the records; where applicable, the various fields are then matched against a manual, community-curated alignment to Commons categories and tags. The result is fed to a data ingestion template, which translates the metadata to {{Artwork}}. The actual upload is made with Pywikipedia-rewrite by User:AncelyBot.
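The harvesting and translation steps can be sketched as follows. This is only an illustration, not the code actually used for the upload: it assumes the repository serves Dublin Core (`oai_dc`) records, it renders {{Artwork}} directly instead of going through a data ingestion template, and the names `parse_records`, `to_artwork` and `SAMPLE` (including the `example.org` identifier) are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}
DC_PREFIX = "{http://purl.org/dc/elements/1.1/}"


def parse_records(xml_text):
    """Parse one ListRecords response; return (records, resumption_token)."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", OAI_NS):
        identifier = rec.findtext(
            "oai:header/oai:identifier", default="", namespaces=OAI_NS
        )
        fields = {}
        # Collect every non-empty Dublin Core element; fields may repeat.
        for el in rec.iter():
            if el.tag.startswith(DC_PREFIX) and el.text and el.text.strip():
                name = el.tag[len(DC_PREFIX):]
                fields.setdefault(name, []).append(el.text.strip())
        records.append({"identifier": identifier, "fields": fields})
    # A non-empty resumptionToken means more pages remain to be fetched.
    token = root.findtext(".//oai:resumptionToken", default="", namespaces=OAI_NS)
    return records, (token.strip() or None)


def to_artwork(fields):
    """Render a minimal {{Artwork}} block (assumed field-to-parameter mapping)."""
    def first(name):
        return fields.get(name, [""])[0]

    return "\n".join([
        "{{Artwork",
        " |title = " + first("title"),
        " |artist = " + first("creator"),
        " |date = " + first("date"),
        "}}",
    ])


# Hypothetical response, for illustration only.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:record-1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example title</dc:title>
          <dc:subject>Montagnes</dc:subject>
          <dc:subject>Costumes</dc:subject>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>token-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

records, token = parse_records(SAMPLE)
print(to_artwork(records[0]["fields"]))
```

A real harvester would repeat the `ListRecords` request with the returned token until none is given. The sketch does not escape characters such as `|` in field values, which wikitext would need.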

In its current state, the categorisation system, combined with the alignment, outputs 31,801 categories (1,694 distinct). The drawback is that many are high-level categories (“Shawls”, “men”, etc.).
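As a rough sketch of how such an alignment might be applied, the snippet below maps the subject terms of one record to Commons categories. The alignment entries are invented for illustration (only the category names are taken from this page), and `ALIGNMENT` and `categorise` are hypothetical names, not part of the actual code.

```python
# Hypothetical alignment: source subject term (lower-cased) -> Commons categories.
ALIGNMENT = {
    "montagnes": ["Mountains in art"],
    "costumes": ["National costumes in art"],
    "châles": ["Shawls"],
}


def categorise(subjects, alignment=ALIGNMENT):
    """Return (categories, unmatched) for the subject terms of one record."""
    categories, unmatched = [], []
    for term in subjects:
        targets = alignment.get(term.strip().lower())
        if targets is None:
            # Keep terms without an alignment so they can be reviewed later.
            unmatched.append(term)
            continue
        for cat in targets:
            if cat not in categories:  # avoid duplicate categories per file
                categories.append(cat)
    return categories, unmatched


cats, missing = categorise(["Montagnes", "Costumes ", "Inconnu"])
wikitext = "\n".join("[[Category:%s]]" % c for c in cats)
```

Returning the unmatched terms separately makes it easy to see which values the community-curated alignment does not cover yet.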

Looking forward to your thoughts, Jean-Fred (talk) 22:49, 6 March 2013 (UTC)

Opinions

Conclusion

Dupes

The following files were already on Commons; we might want to update their file descriptions (current: 33)

Errors

The following files failed to upload (current: 11)

Categorisation statistics
Per category

30266 categories, 1760 distinct. Mean: 17.1965909091; Median: 2.0; Max: 1045; Min: 1

Top 10: Mountains in art (1045), Men in art (992), Women in art (878), Trees in art (780), Houses in art (736), Pyrénées-Atlantiques (693), Hautes-Pyrénées (617), Pyrenees (470), National costumes in art (468), Rivers in art (440)

Bottom 10: Estrades (1), Pierre Bayle (1), Morlaàs (1), Louis-François Couché (1), Jean Racine (1), Faience in France (1), Marmite (1), Corsica (1), Dordogne River (1), Esera River (1)

Per file

Mean: 14.5160671463; Median: 13.0; Max: 47; Min: 0

Top 5: B315556101_A_LEVASSEUR_066 (47), B315556101_A_LEVASSEUR_068 (46), B315556101_A_LEVASSEUR_018 (44), B315556101_A_LEVASSEUR_056 (42), B315556101_A_LEVASSEUR_057 (42)

Bottom 5: B315556101_A_BERTHIER_010 (1), B315556101_A_BERTHIER_024 (0), B315556101_A_BERTHIER_021 (0), B315556101_A_BERTHIER_018 (0), B315556101_A_BERTHIER_013 (0)
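Figures of this kind can be reproduced with a few lines of Python. The sketch below is an assumption about how such statistics might be computed, not the script that produced the numbers above; `category_stats` and its input format (a mapping from file identifier to its list of categories) are hypothetical.

```python
from collections import Counter
from statistics import mean, median


def category_stats(file_categories, n=10):
    """Summarise category usage per category and per file.

    file_categories maps a file identifier to its list of categories
    and must contain at least one file with at least one category.
    """
    # How many files use each category.
    per_category = Counter(
        cat for cats in file_categories.values() for cat in cats
    )
    # How many categories each file received (may be 0).
    per_file = {name: len(cats) for name, cats in file_categories.items()}

    def summarise(counts):
        values = list(counts.values())
        ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
        return {
            "total": sum(values),
            "distinct": len(values),
            "mean": mean(values),
            "median": median(values),
            "max": max(values),
            "min": min(values),
            "top": ranked[:n],
            "bottom": ranked[-n:],
        }

    return summarise(per_category), summarise(per_file)


# Tiny made-up example: three files, two categories.
by_category, by_file = category_stats(
    {"a": ["X", "Y"], "b": ["X"], "c": []}, n=2
)
```

The two means above are consistent with this definition: 30266 / 1760 ≈ 17.197 assignments per category and 30266 / 2085 ≈ 14.516 categories per file.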

Assigned to | Job | Progress
Jean-Frédéric | Metadata pre-processing | Done
Jean-Frédéric, Symac, Léna, PierreSelim | Metadata alignment | Done
User:Jean-Frédéric | Upload | Done
(none listed) | Dupes and errors processing | To do