Commons:Batch uploading/Rijksdienst voor het Cultureel Erfgoed

From Wikimedia Commons, the free media repository
  • Source to upload from: RCE image bank in Europeana
    • URL pattern: yes, see this for the files. The API below is useful.
    • API: yes, here is an XML database; from it we can find the image links.
    • Did you contact the site owner? Yes, we are in contact.
  • Description: 550,000 images of monuments (buildings) in the Netherlands (of which 3,000 are in other countries). Probably 50-80% depict a Rijksmonument. Around 30% have been identified. Another part could be identified based on address information.


  • license: CC-BY-SA-3.0-NL, see here. 1200x1200px release in OTRS 2012121010014322.


  • Templates: There is a template {{RCE-license}} and a template for linking to the database, {{RCE-source}}.

Opinions


Question: Did I understand it correctly: only images up to max. 800x800 px are under CC-BY-SA-3.0-NL? So we can only use images up to these dimensions? --Slick (talk) 11:48, 1 December 2012 (UTC)

We are still figuring that out because it's unclear. It seems that all sizes are free, but if you want to download images over 800x800 pixels there is a problem with downloading costs. The site states that images up to 800x800 are available under a free license and that all images are free (it states both, one right after the other). For 800x800 it states that they can be downloaded freely; for other sizes it does not state this. Basvb (talk) 11:01, 2 December 2012 (UTC)

If you would like to download the full size, just look at the HTML code and analyse the requests made by the Flash viewer:

Step 1) Find the numeric picture ID, e.g. from the URL: http://beeldbank.cultureelerfgoed.nl/20312817 -> ID=20312817
Step 2) Fetch the URL: http://beeldbank.cultureelerfgoed.nl/index.php?option=com_memorixbeeld&view=record&format=topviewxml&tstart=0&id=<ID> and you will get XML output [1]. The values of interest are filepath and the layer with scalefactor=1:
...
<filepath>39abc504-df68-c0ad-0c2b-b33296769b30.tjp</filepath>
...
<layer no="5" starttile="45" cols="9" rows="12" scalefactor="1" width="2075" height="2880"/>
...
Step 3) Read the layer line. Now you know that the picture is split into 9 cols and 12 rows, and that the starttile is 45. You now just download all cols*rows tiles, starting at 45 and ending at 45+(cols*rows)-1, and join them together by cols and rows. To get a single tile use: http://images.memorix.nl/rce/getpic?<FILEPATH>&<TILENUMBER>, e.g. http://images.memorix.nl/rce/getpic?39abc504-df68-c0ad-0c2b-b33296769b30.tjp&102

All of this should be very easy to do with a script. To check that your joined image is fine, compare its size with the given width and height.
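The steps above could be sketched in Python roughly as follows. This is only a sketch that lists the tile URLs and their grid positions; it does not download or join anything. The `<topview>` wrapper element, the function name `full_size_tiles` and the row-by-row tile ordering are assumptions that the description above does not confirm.

```python
import xml.etree.ElementTree as ET

# Tile URL pattern taken from Step 3 above.
TILE_URL = "http://images.memorix.nl/rce/getpic?{filepath}&{tile}"

# Excerpt from Step 2. The <topview> wrapper is hypothetical: the real
# root element name is not shown on this page.
SAMPLE_XML = """<topview>
  <filepath>39abc504-df68-c0ad-0c2b-b33296769b30.tjp</filepath>
  <layer no="5" starttile="45" cols="9" rows="12" scalefactor="1" width="2075" height="2880"/>
</topview>"""


def full_size_tiles(xml_text):
    """Return (tiles, (width, height)) for the layer with scalefactor=1."""
    root = ET.fromstring(xml_text)
    filepath = root.findtext(".//filepath")
    layer = next(l for l in root.iter("layer") if l.get("scalefactor") == "1")
    start = int(layer.get("starttile"))
    cols = int(layer.get("cols"))
    rows = int(layer.get("rows"))
    tiles = []
    for index in range(cols * rows):
        tiles.append({
            "row": index // cols,  # assumption: tiles are numbered row by row
            "col": index % cols,
            "url": TILE_URL.format(filepath=filepath, tile=start + index),
        })
    # The expected size lets you verify the joined image afterwards.
    return tiles, (int(layer.get("width")), int(layer.get("height")))


tiles, size = full_size_tiles(SAMPLE_XML)
print(len(tiles), size)
print(tiles[0]["url"])
```

If the tiles turn out to be ordered column by column instead, only the two lines computing `row` and `col` would need to change.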

We didn't know the tile/col procedure, but we did know easy ways to download 1600x1600px files (just change the links). Permissions are the problem there. We are trying to clear that up, but as it seems now we will only get permission to download the 800x800 (or maybe 1200x1200px) files. Basvb (talk) 12:54, 6 December 2012 (UTC)
I got an explicit release up to 1200x1200px; I will forward this to OTRS. Basvb (talk) 20:34, 10 December 2012 (UTC)

Some notes from me:

Some JSON I used to play around with:

{"adlibJSON":
 {"recordList":
  {"record":
   [
    {"@attributes": {"priref":"80000109","created":"2011-04-05T19:07:04","modification":"2011-04-05T21:47:18","selected":"False"},
     "Description":
      [
       {"description":
        ["Schildering op de schoorsteenboezem in de Renzumaborg in Uithuizermeeden.\u000d\u000a- begane grond, linker achterkamer: landschap."]
       }
      ],
      "Monument":
       [
        {"monument.complex_number":["515612"],
         "monument.geographical_keyword":[""],
         "monument.house_number":["3"],
         "monument.name":["Rensumaborg"],
         "monument.number":["21320"],
         "monument.number.x_coordinates":["6.71402490110"],
         "monument.number.y_coordinates":["53.41522222070"],
         "monument.place":["Uithuizermeeden"],
         "monument.province":["Groningen"],
         "monument.record_number":["279499"],
         "monument.street":["Rensumalaan"],
         "monument.type":[""],
         "monument.zipcode":["9982 BH"]
        }
       ],
       "object_number":["100109"],
       "priref":["80000109"],
       "Reproduction":
        [
         {"reproduction.reference":
          ["d6071e44-eb0a-4bb0-f345-d0a311489ae6"]
         }
        ]
       }
      ]
     },
     "diagnostic":{"hits":"1","xmltype":"Grouped","first_item":"1","search":"priref Equals 80000109","sort":"","limit":"1","hits_on_display":"1","response_time":"0","xml_creation_time":"15,6229","link_resolve_time":"15,6229","dbname":"collect","dsname":"","cgistring":"images"}}}


{"adlibJSON":
 {"recordList":
  {"record":
   [{"@attributes":{"priref":"20000001","created":"2009-04-19T11:05:45","modification":"2012-10-12T15:25:58","selected":"False"},
    "collection":["Fotocollectie"],
    "Content_subject":[{"content.subject":["Grachtenpand"]}],
    "creative_commons":[{"value":["RCE","CC-BY-SA","CC-BY-SA"]}],
    "Description":[{"description":["Exterieur, overzicht voorgevel pand Vrouwenverband"]}],
    "Monument":
     [
      {"monument.complex_number":["518301"],
       "monument.geographical_keyword":[""],
       "monument.house_number":["15"],
       "monument.name":["Vrouwenverband"],
       "monument.number":["518303"],
       "monument.number.x_coordinates":["4.89397111487"],
       "monument.number.y_coordinates":["52.36897955310"],
       "monument.place":["Amsterdam"],
       "monument.province":["Noord-Holland"],
       "monument.record_number":["417272"],
       "monument.street":["Turfdraagsterpad"],
       "monument.type":[""],
       "monument.zipcode":["1012 XT"]
      }
     ],
     "object_number":["321.954"],
     "priref":["20000001"],
     "Production":
      [
       {"creator":
        [
         {"value":["Dukker, G.J."]}
        ],
        "creator.role":["Fotograaf"]
       }
      ],
      "Production_date":[{"production.date.start":["1998-07"]}],
      "Reproduction":[{"reproduction.reference":["d99c8594-4a8c-acf9-b498-6f3a0a4e5f4b"]}],
      "Rights":[{"rights.notes":["http:\/\/creativecommons.org\/licenses\/by-sa\/3.0\/"]}],
      "Technique":[{"technique":["zwart wit negatief"]}]}]},

Multichill (talk) 23:10, 16 December 2012 (UTC)
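To show how the responses above might be read, here is a minimal Python sketch that pulls the monument details out of one record. The function name `monument_summary` is made up for illustration, the sample is a trimmed copy of the first response, and treating `x_coordinates` as longitude and `y_coordinates` as latitude is an inference from the sample values (about 6.7 and 53.4 for Uithuizermeeden), not something the API documents here.

```python
import json

# Trimmed copy of the first response above (only the fields used below).
SAMPLE_RESPONSE = """{"adlibJSON": {"recordList": {"record": [{
  "Monument": [{
    "monument.name": ["Rensumaborg"],
    "monument.number": ["21320"],
    "monument.number.x_coordinates": ["6.71402490110"],
    "monument.number.y_coordinates": ["53.41522222070"],
    "monument.place": ["Uithuizermeeden"]
  }],
  "priref": ["80000109"]
}]}}}"""


def monument_summary(response_text):
    """Return a small dict describing the first record in the response."""
    data = json.loads(response_text)
    record = data["adlibJSON"]["recordList"]["record"][0]
    monument = record["Monument"][0]

    def first(field):
        # Every field is a list of strings; empty fields hold [""].
        values = monument.get(field) or [""]
        return values[0]

    def as_float(text):
        return float(text) if text else None

    return {
        "priref": record["priref"][0],
        "monument_number": first("monument.number"),
        "name": first("monument.name"),
        "place": first("monument.place"),
        # Assumption: x is longitude and y is latitude (WGS84-like values).
        "longitude": as_float(first("monument.number.x_coordinates")),
        "latitude": as_float(first("monument.number.y_coordinates")),
    }


print(monument_summary(SAMPLE_RESPONSE))
```

The `monument.number` value is presumably what would link an image to its Rijksmonument entry, but that mapping should be verified before relying on it.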

User:Basvb/Current RCE images - List of images from the database that were not uploaded by the bot (these have to be checked for duplicates afterwards).

First test running


Created a lot of templates:

Looping over the images from 20000000 onwards. Only images which have Rights_rights.notes==http://creativecommons.org/licenses/by-sa/3.0/ are uploaded. The data gets flattened from JSON into key-value pairs. Some fields occur more than once (see {{RCE data ingestion layout}}). Some of the fields available to normal users are not available in the API (for example municipality). What to do:

Multichill (talk) 22:17, 23 December 2012 (UTC)
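The flattening and license check described in this comment might look something like the Python sketch below. It is not the bot's actual code. The only key name taken from this page is `Rights_rights.notes`; joining nested names with an underscore is inferred from that key, and collecting repeated fields into lists is an assumption (the real layout is described in {{RCE data ingestion layout}}). The names `flatten` and `is_uploadable` are hypothetical.

```python
CC_BY_SA = "http://creativecommons.org/licenses/by-sa/3.0/"


def flatten(node, prefix="", flat=None):
    """Flatten a nested record into {"Group_field": [values, ...]}."""
    if flat is None:
        flat = {}
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "@attributes":
                continue  # record metadata, skipped in this sketch
            flatten(value, f"{prefix}_{key}" if prefix else key, flat)
    elif isinstance(node, list):
        for item in node:
            flatten(item, prefix, flat)
    else:
        # Assumption: fields that occur more than once are kept as a list.
        flat.setdefault(prefix, []).append(node)
    return flat


def is_uploadable(flat):
    """True only if the record carries the CC-BY-SA 3.0 rights note."""
    return CC_BY_SA in flat.get("Rights_rights.notes", [])


# Trimmed versions of the two sample records shown earlier on this page.
LICENSED = {
    "priref": ["20000001"],
    "Production": [{"creator": [{"value": ["Dukker, G.J."]}],
                    "creator.role": ["Fotograaf"]}],
    "Rights": [{"rights.notes": [CC_BY_SA]}],
}
UNLICENSED = {
    "priref": ["80000109"],
    "Monument": [{"monument.number": ["21320"]}],
}

for record in (LICENSED, UNLICENSED):
    flat = flatten(record)
    print(flat["priref"][0], is_uploadable(flat))
```

Keeping every value in a list avoids silently dropping the second occurrence of a repeated field, at the cost of an extra `[0]` when a single value is expected.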

Assigned to | Progress | Bot name | Category
Multichill, Basvb | Mostly done | BotMultichill |

Threat and deletion


In June 2014 the RCE threatened me and some other users, demanding that we delete 5000 images "or else". The strong-arming worked and the images were deleted in July 2014. I don't respond kindly to threats, so this marked the end of this project. In the end about 465,000 images were uploaded. Multichill (talk) 12:30, 6 July 2014 (UTC)