Commons:Structured data/Computer-aided tagging/Blocklist

From Wikimedia Commons, the free media repository

 Request new items on the talk page.

Purpose of the blocklist

The Suggested Tags feature receives suggestions from a machine learning system built by Google. Some of those suggestions are unsuitable for Commons for various reasons, so the teams developing the tool implemented a blocklist that filters them out whenever they are suggested.

There is currently a known issue where tags for some files are not updated when the blocklist changes. We’re working on a solution for that, but most new uploads will not be affected.

The blocklist is maintained behind the scenes, so for now only developers can add items to it. Editors can request blocklist additions on this page's talk page.

Blocklist

The canonical list lives in InitialiseSettings.php, so the list on this page may be out of date.
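
Because this page can drift from the config file, a small script can help spot differences. The following is only a sketch: it assumes both texts use the `'Q123', // label` entry format shown on this page, and the function names `parse_entries` and `diff_ids` are hypothetical helpers, not part of any Wikimedia tooling.

```python
import re

# Matches one entry in the format used on this page, e.g.  'Q467', // woman
# (assumption: the config file writes its entries the same way).
ENTRY_RE = re.compile(r"'(Q\d+)',\s*//\s*(.*)")


def parse_entries(text):
    """Return a list of (wikidata_id, label) pairs found in text."""
    entries = []
    for line in text.splitlines():
        match = ENTRY_RE.search(line)
        if match:
            entries.append((match.group(1), match.group(2).strip()))
    return entries


def diff_ids(page_text, config_text):
    """Return (only_on_page, only_in_config) as sorted lists of IDs."""
    page_ids = {qid for qid, _ in parse_entries(page_text)}
    config_ids = {qid for qid, _ in parse_entries(config_text)}
    return sorted(page_ids - config_ids), sorted(config_ids - page_ids)
```

Calling `diff_ids` with the text of this page and the relevant part of the config file would return the IDs that appear in only one of the two.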

Gender

  • 'Q467', // woman
  • 'Q3010', // boy
  • 'Q3031', // girl
  • 'Q8441', // man
  • 'Q1378024', // lady

Colors

  • 'Q1088', // blue
  • 'Q23445', // black
  • 'Q3142', // red
  • 'Q3133', // green
  • 'Q943', // yellow
  • 'Q39338', // orange
  • 'Q428124', // violet
  • 'Q3257809', // purple
  • 'Q23444', // white
  • 'Q42519', // gray
  • 'Q47071', // brown
  • 'Q843607', // beige
  • 'Q10296772', // carmine
  • 'Q373058', // Azure
  • 'Q317802', // silver
  • 'Q1670336', // tan
  • 'Q2778382', // bronze
  • 'Q208045', // gold
  • 'Q1078214', // teal
  • 'Q906936', // lime
  • 'Q543923', // rust
  • 'Q2015138', // salmon
  • 'Q2268159', // scarlet
  • 'Q767608', // sepia
  • 'Q3518540', // terra cotta
  • 'Q5960345', // turquoise
  • 'Q3014419', // wine
  • 'Q454847', // amaranth
  • 'Q679355', // amber
  • 'Q1324818', // apricot
  • 'Q372973', // aquamarine
  • 'Q797446', // burgundy
  • 'Q5043987', // carnelian
  • 'Q2541418', // celadon
  • 'Q313120', // cerulean
  • 'Q3309916', // chocolate
  • 'Q2936397', // cinnamon
  • 'Q2411228', // coral
  • 'Q2730433', // cream
  • 'Q303826', // crimson
  • 'Q180778', // cyan
  • 'Q5005364', // fuchsia
  • 'Q5967009', // indigo
  • 'Q650770', // khaki
  • 'Q2720565', // lemon
  • 'Q2294993', // lilac
  • 'Q3276756', // magenta
  • 'Q25203611', // Mango
  • 'Q25393814', // maroon
  • 'Q864152', // olive
  • 'Q533047', // rose

Adjectives

  • 'Q48997611', // vintage
  • 'Q10770146', // monochrome
  • 'Q6453656', // monochrome
  • 'Q838368', // black-and-white
  • 'Q296001', // close-up

Odd nouns

  • 'Q1301433', // land vehicle
  • 'Q1802779', // land vehicle
  • 'Q6478447', // properties of water
  • 'Q2083958', // pattern
  • 'Q738168', // pattern
  • 'Q696160', // psychedelic art
  • 'Q7239', // organism
  • 'Q638', // music
  • 'Q11634', // art of sculpture
  • 'Q11629', // art of painting
  • 'Q16502', // world
  • 'Q1792644', // art style
  • 'Q1190554', // occurrence
  • 'Q295469', // ecoregion
  • 'Q1049799', // water resource
  • 'Q334166', // mode of transport
  • 'Q42889', // vehicle
  • 'Q82821', // tradition
  • 'Q3248864', // Terrestrial animal
  • 'Q53875', // parallelism
  • 'Q37073', // pop music
  • 'Q101998', // biome
  • 'Q2262382', // masai lion
  • 'Q721221', // serpent
  • 'Q1634416', // stock photography
  • 'Q1395149', // demonstration
  • 'Q12554', // Middle Ages
  • 'Q83180', // roof
  • 'Q167510', // bitumen
  • 'Q43619', // natural environment
  • 'Q309', // history
  • 'Q11016', // technology
  • 'Q1172903', // loch
  • 'Q165848', // wind wave
  • 'Q211778', // Lake District
  • 'Q43238', // Poaceae
  • 'Q5135744', // religious institute
  • 'Q486972', // human settlement
  • 'Q826939', // canard

Time-oriented terms

  • 'Q1187312', // yesterday
  • 'Q988377', // day before yesterday
  • 'Q1036448', // day after tomorrow
  • 'Q1209716', // tomorrow
  • 'Q3151690', // today

Implementation

This blocklist is currently implemented via the $wgMachineVisionWikidataIdBlacklist config variable in InitialiseSettings.php. The code that handles the actual filtering lives in GoogleCloudVisionClient::filterIdBlocklist() in the MachineVision extension.
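
To illustrate the idea, here is a minimal sketch of ID-based filtering. It is written in Python for brevity and is not the extension's PHP code. The function name `filter_blocklisted` and the three-item `BLOCKLIST` subset are assumptions made for this example; the real values come from the config variable named above.

```python
# Hypothetical subset of the blocklist, taken from the entries on this page:
# Q467 (woman), Q1088 (blue), Q3151690 (today).
BLOCKLIST = frozenset({"Q467", "Q1088", "Q3151690"})


def filter_blocklisted(suggested_ids, blocklist=BLOCKLIST):
    """Return the suggested Wikidata IDs that are not on the blocklist.

    The original order of the suggestions is preserved.
    """
    return [qid for qid in suggested_ids if qid not in blocklist]


# "Q144" stands in for any ID that is not on the blocklist.
remaining = filter_blocklisted(["Q144", "Q1088", "Q467"])
```

Here `remaining` contains only `"Q144"`. Using a set for the blocklist keeps each membership check cheap, which matters little at this list size but keeps the sketch simple.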

WithholdImageList

In addition to the blocklist, and as part of a temporary hold on images of people, the system uses a separate list of identifiers which indicate that a person is in the image and that the image should be excluded from the suggestion queue. This list is currently implemented via the $wgMachineVisionWithholdImageList config variable in InitialiseSettings.php.

  • 'Q467', // woman
  • 'Q3010', // boy
  • 'Q3031', // girl
  • 'Q8441', // man
  • 'Q1378024', // lady
  • 'Q255274', // white collar worker
  • 'Q327968', // facial expression
  • 'Q41055', // forehead
  • 'Q2472587', // people
  • 'Q1155908', // elder
  • 'Q15173', // lip
  • 'Q82714', // chin
  • 'Q1886338', // makeover
  • 'Q3080415', // Jheri curl
  • 'Q170579', // laughter
  • 'Q28472', // hair
  • 'Q327496', // hairstyle
  • 'Q371174', // gesture
  • 'Q82714', // chin
  • 'Q9633', // neck
  • 'Q1886338', // Makeover
  • 'Q1922956', // black hair
  • 'Q1255864', // fun
  • 'Q14130', // long hair
  • 'Q202466', // blond hair
  • 'Q327496', // hairstyle
  • 'Q37017', // face
  • 'Q43748', // eyebrow
  • 'Q1190554', // occurrence
  • 'Q82821', // tradition
  • 'Q182832', // concert
  • 'Q132241', // festival
  • 'Q349', // sport
  • 'Q2755547', // individual sport
  • 'Q874405', // social group
  • 'Q327245', // team
  • 'Q12068677', // selfie
  • 'Q23640', // head
  • 'Q83360', // thumb
  • 'Q36864', // nail
  • 'Q33767', // hand
  • 'Q319604', // passenger
  • 'Q205398', // social work
  • 'Q7242', // beauty
  • 'Q12684', // fashion
  • 'Q749212', // gentleman
  • 'Q639669', // musician
  • 'Q184485', // performing arts
  • 'Q855091', // guitarist
  • 'Q159992', // surfing
  • 'Q2021379', // wakesurfing
  • 'Q911069', // boardsport
  • 'Q8037570', // wrangler
  • 'Q273283', // beggar
  • 'Q134307', // portrait
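
The withhold check differs from the blocklist in that it applies to the whole image rather than to a single tag. The sketch below shows one way this could work, in Python rather than the extension's PHP. It assumes that a single matching identifier is enough to withhold an image, which this page does not state explicitly. The name `should_withhold` and the three-item `WITHHOLD_IDS` subset are hypothetical.

```python
# Hypothetical subset of the withhold list, taken from the entries above:
# Q467 (woman), Q37017 (face), Q134307 (portrait).
WITHHOLD_IDS = frozenset({"Q467", "Q37017", "Q134307"})


def should_withhold(image_label_ids, withhold_ids=WITHHOLD_IDS):
    """Return True if any suggested ID for the image is on the withhold list.

    Assumption: one match is sufficient to keep the image out of the
    suggestion queue.
    """
    return not withhold_ids.isdisjoint(image_label_ids)
```

For example, `should_withhold(["Q144", "Q37017"])` is true because Q37017 (face) is on the list, while `should_withhold(["Q144"])` is false.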