User:TabulistBot

From Wikimedia Commons, the free media repository

Purpose

The purpose of this bot is to copy data from Wikidata into tables in the Data: namespace using SPARQL queries. This allows all wikis to use complex Wikidata query results in articles, through Lua scripts (e.g. for lists and tables) and Graphs. TabulistBot is similar to the Listeria bot, except that it generates tabular data instead of wikitext.

Operation

To use this bot, create an empty tabular dataset page in the Data: namespace on Commons, and add a {{Wikidata tabular}} template to the corresponding talk page. The dataset page must define all the appropriate headers for the dataset, but its "data": [] element must be left as an empty array.

The bot will automatically detect all pages with {{Wikidata tabular}} templates and update the data periodically from the Wikidata database. Do not edit the dataset page's data section manually, as it will be overwritten with the next update. If you want the bot to stop updating the dataset, remove or comment out the data source template from the talk page.

Data page

First, create a tabular dataset page in the Data: namespace with a .tab extension. You might want to prefix the name with Sandbox/<username>/ until the dataset is ready, and rename it later. For example, Data:Sandbox/Smalyshev/test.tab might be:

{
  "license": "CC0-1.0+",
  "description": {
    "en": "Behold the list of cats in Wikidata. This list is generated by TabulistBot."
  },
  "schema": {
    "fields": [
      {
        "name": "qid",
        "type": "string",
        "title": {"en": "QID"}
      },
      {
        "name": "name",
        "type": "localized",
        "title": {"en": "Name"}
      },
      {
        "name": "desc",
        "type": "localized",
        "title": {"en": "Description"}
      },
      {
        "name": "birth",
        "type": "string",
        "title": {"en": "Date of birth"}
      },
      {
        "name": "death",
        "type": "string",
        "title": {"en": "Date of death"}
      }
    ]
  },
  "data": []
}
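
Before saving such a page, it can help to sanity-check the skeleton locally. The following Python sketch is not part of TabulistBot; the function name check_empty_dataset is hypothetical, and the rule that every field needs at least a "name" and a "type" is an assumption based on the example above rather than on the bot's code.

```python
import json

# Assumption: every schema field needs at least these keys, as in the example page.
REQUIRED_FIELD_KEYS = ("name", "type")


def check_empty_dataset(page_text: str) -> list[str]:
    """Return human-readable problems found in a dataset skeleton (empty list if none)."""
    page = json.loads(page_text)
    problems = []
    # The bot fills in the rows, so the page must start with an empty array.
    if page.get("data") != []:
        problems.append('"data" must be an empty array')
    fields = page.get("schema", {}).get("fields", [])
    if not fields:
        problems.append("schema defines no fields")
    for index, field in enumerate(fields):
        for key in REQUIRED_FIELD_KEYS:
            if key not in field:
                problems.append(f"field #{index} is missing {key!r}")
    return problems


# A shortened version of the example page, used only for illustration.
skeleton = """
{
  "license": "CC0-1.0+",
  "schema": {
    "fields": [
      {"name": "qid", "type": "string", "title": {"en": "QID"}},
      {"name": "name", "type": "localized", "title": {"en": "Name"}}
    ]
  },
  "data": []
}
"""

print(check_empty_dataset(skeleton))
```

An empty list means the skeleton passed these basic checks; it does not guarantee that the page will be accepted by the wiki or processed by the bot.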

Data source template

The data source template is an instance of the {{Wikidata tabular}} template, placed on the talk page of the corresponding data page. The important parameters are sparql, which defines the SPARQL query to run, and columns, which defines which columns are exported to the table. Each exported column must have a corresponding definition in the schema of the data page, and vice versa. See the template documentation for more information.
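
The two-way correspondence between exported columns and schema fields can be checked with a small script. This is a minimal sketch under stated assumptions: compare_columns is a hypothetical helper, and the columns are assumed to be available as a plain list of names (the actual syntax of the columns parameter is described in the template documentation, not here).

```python
def compare_columns(schema_fields, columns):
    """Compare schema field names with exported column names.

    Returns a pair of sorted lists:
    (columns with no schema field, schema fields with no column).
    """
    schema_names = {field["name"] for field in schema_fields}
    column_names = set(columns)
    return (
        sorted(column_names - schema_names),
        sorted(schema_names - column_names),
    )


# Field names taken from the example data page above.
schema_fields = [
    {"name": "qid"},
    {"name": "name"},
    {"name": "desc"},
    {"name": "birth"},
    {"name": "death"},
]

# Hypothetical column list that forgets "death" and adds an unknown "image".
extra, missing = compare_columns(
    schema_fields, ["qid", "name", "desc", "birth", "image"]
)
print("Columns without a schema field:", extra)
print("Schema fields without a column:", missing)
```

Both lists should be empty before the template is saved.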

Note that the query should currently contain an ?item column that specifies the QID of the item the row belongs to. This requirement may be removed in the future.
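
A rough way to catch a missing ?item column before saving the template is to inspect the SELECT clause of the query. The sketch below is a regex heuristic, not a SPARQL parser, and is not how the bot itself validates queries; the function name selects_item and the sample queries are illustrative assumptions.

```python
import re


def selects_item(query: str) -> bool:
    """Heuristically check whether ?item appears in the SELECT projection."""
    # Capture the text between SELECT and the start of the WHERE pattern.
    match = re.search(
        r"\bSELECT\b(.*?)(?:\bWHERE\b|\{)",
        query,
        flags=re.IGNORECASE | re.DOTALL,
    )
    if match is None:
        return False
    projection = match.group(1)
    # \b keeps ?itemLabel from being mistaken for ?item.
    return re.search(r"\?item\b", projection) is not None


good_query = "SELECT ?item ?itemLabel WHERE { ?item wdt:P31 wd:Q146 }"
bad_query = "SELECT ?cat ?catLabel WHERE { ?cat wdt:P31 wd:Q146 }"

print(selects_item(good_query))
print(selects_item(bad_query))
```

The heuristic does not handle every valid query form (for example SELECT *), so treat a negative result as a prompt to re-read the query rather than as proof of an error.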

See the example: Data talk:Sandbox/Smalyshev/test.tab

More datasets

Source