Commons:Batch uploading/Los Angeles County Museum of Art

  • Source to upload from:

collections.lacma.org

  • Describe the works to be uploaded in detail (audio files, images by …):

« The Los Angeles County Museum of Art has released some 20,000 PD images of their collection ([1], example: [2]). » Jean-Fred (talk) 14:00, 13 March 2013

  • Which license tag(s) should be applied?


Opinions

  • Unless someone wants to pick this up early, I would be happy to look at it in a few weeks; it seems right up my alley. -- (talk) 17:00, 16 March 2013 (UTC)
  • How will it be done? Not all of the works are PD. I think LACMA could create an XML file for us. Dominikmatus (talk) 06:20, 20 March 2013 (UTC)
    • An easy test seems to be whether the image is marked as "Image not zoomable due to copyright restrictions"1, has a copyright note (looking like <div class="field-name-field-copyright-text">© John Baldessari</div>[3]) or whether it has a download link. The text does not quite match the Terms of Use[4], and for that reason I would email LACMA just to explain what Wikimedia Commons is and which photographs are going to be uploaded. I doubt that LACMA could offer much more than is already in the online gallery, as curator and conservation notes and so forth are likely to have unclear copyright. (BTW some images have extensive curator notes available.[5] I would check whether the full text is intended to be reusable, off-limits as "Protected Content" as defined in the Terms of Use, or whether a limited extract might be okay, such as the first 50 words as I have done with other batch uploads.) -- (talk) 07:52, 20 March 2013 (UTC)
    • An easy filter in LACMA's website search is "has_unrestricted_image", so this might either be better than the above checks or be run in addition to them. See this example search http://collections.lacma.org/search/site/?f[0]=bm_field_has_image%3Atrue&f[1]=im_field_chronology%3A14337&f[]=bm_field_has_unrestricted_image%3Atrue for unrestricted images of ancient artefacts (1,816 images). In practice the upload might usefully be staged by chronology, as by default this starts with the material least likely to be contentious in terms of copyright. -- (talk) 09:09, 21 March 2013 (UTC)
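The checks and the search filter discussed above could be sketched roughly as follows. This is only an illustration, not the code used for the upload: it uses Python's standard-library HTML parser rather than BeautifulSoup so that it runs on its own, the rule that a download link is an <a> whose href contains "download" is an assumption about the page markup, and numbering the third filter as f[2] instead of f[] is assumed to be equivalent. The function names are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import quote

RESTRICTED_NOTE = "Image not zoomable due to copyright restrictions"
COPYRIGHT_CLASS = "field-name-field-copyright-text"
SEARCH_BASE = "http://collections.lacma.org/search/site/"


class _SignalParser(HTMLParser):
    """Collects the copyright signals from one catalogue page."""

    def __init__(self):
        super().__init__()
        self.has_copyright_div = False
        self.has_download_link = False
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if COPYRIGHT_CLASS in classes:
            self.has_copyright_div = True
        # Assumption: a download link is an <a> whose href mentions "download".
        if tag == "a" and "download" in (attrs.get("href") or "").lower():
            self.has_download_link = True

    def handle_data(self, data):
        self.text_parts.append(data)


def looks_unrestricted(page_html):
    """Return True only if no restriction signal is found and a download link exists."""
    parser = _SignalParser()
    parser.feed(page_html)
    parser.close()
    # Collapse whitespace so the note is found even if it spans several text nodes.
    text = " ".join(" ".join(parser.text_parts).split())
    if RESTRICTED_NOTE in text or parser.has_copyright_div:
        return False
    return parser.has_download_link


def unrestricted_search_url(chronology_id):
    """Build a search URL like the example above for one chronology id."""
    filters = [
        "bm_field_has_image:true",
        f"im_field_chronology:{chronology_id}",
        "bm_field_has_unrestricted_image:true",
    ]
    query = "&".join(f"f[{i}]={quote(f, safe='')}" for i, f in enumerate(filters))
    return f"{SEARCH_BASE}?{query}"
```

Treating a page as uploadable only when every signal agrees errs on the side of skipping doubtful images, which fits the idea of running the page checks in addition to the search filter.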
  • Question: I am close to finishing a nice mapping using BeautifulSoup to generate image description pages, but I have a problem with the way LACMA appear to have "updated" their website. We have a prior upload of an 18th C. waistcoat at a very high resolution of 4,000x6,000+ px. The original source is at [6] but I cannot see a way of getting from the current catalogue entry [7] to that old version. The new system shows 5 images: the first duplicates the old upload but is half the resolution (when the expand button is selected), whilst the other 4 are good detail shots that appear to be clipped from the high resolution one we already have. Unfortunately id=159291 is the only relevant old reference, and there is no mention of that number anywhere on the new catalogue entry, or of the ids for its images. -- (talk) 22:37, 21 March 2013 (UTC)
I think we should email LACMA about this problem. It is not good (for SEO) to change URLs without redirection. Dominikmatus (talk) 09:42, 22 March 2013 (UTC)
Yes, I was coming to the conclusion that this should be my next step. It might not be solvable technically, so if LACMA cannot help, or don't have the time to, then the solution might be to go ahead with the batch upload even if a few files will be scaled down (but still high quality) duplicates of some high resolution photos we already have. I'll do my best not to be left in that position and I'll start drafting an email - no hurry as I don't expect a same day answer from the museum on a Friday. :-) -- (talk) 10:29, 22 March 2013 (UTC)
I have written today to the web contact at LACMA and asked how to use the database id to track down the high resolution image and whether it is okay to scrape the text from the catalogue entry (such as curator notes). -- (talk) 10:35, 26 March 2013 (UTC)
I did a bunch of those high-res downloads (by hand). I'll be interested to see what LACMA says. - PKM (talk) 01:12, 6 April 2013 (UTC)
No response yet. I might get on with an initial batch for testing as soon as the mobile upload problem is resolved, rather than expecting a reply. -- (talk) 01:56, 6 April 2013 (UTC)
My recent experience in handling non-identical duplicates may help with the lack of an obvious unique ID. I will trial some passive testing to check this out in a couple of weeks' time. -- (talk) 11:57, 5 July 2013 (UTC)
First 1,000 uploads look good now, after a few tweaks. Going ahead with the upload. Non-identical duplicates are detected by searches like this for "LACMA" with the accession number. If there is any match then the image upload is skipped. This might skip a few valid alternatives, but it will probably be a very small number. -- (talk) 15:42, 11 July 2013 (UTC)
Exceptions: File:Pendant LACMA M.76.97.948.jpg appears to have digital damage at source; comment raised with LACMA. -- (talk) 06:27, 12 July 2013 (UTC)
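The duplicate check described above (search for "LACMA" together with the accession number and skip the upload on any match) might look something like the sketch below. It is an assumption-laden illustration rather than the script that was actually run: the function names, the User-Agent string and the result limit are made up, and the search goes through the standard MediaWiki API search module restricted to the File namespace.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API = "https://commons.wikimedia.org/w/api.php"


def commons_search(query, limit=5):
    """Return titles of files on Commons matching the search query."""
    params = urlencode({
        "action": "query",
        "list": "search",
        "srsearch": query,
        "srnamespace": 6,  # File: namespace
        "srlimit": limit,
        "format": "json",
    })
    # Hypothetical User-Agent; a real bot should identify itself properly.
    request = Request(f"{API}?{params}",
                      headers={"User-Agent": "lacma-dupe-check-sketch/0.1"})
    with urlopen(request, timeout=30) as response:
        data = json.load(response)
    return [hit["title"] for hit in data["query"]["search"]]


def should_skip(accession_number, search=commons_search):
    """Skip the upload if any existing file mentions LACMA and this accession number."""
    query = f'"LACMA" "{accession_number}"'
    return len(search(query)) > 0
```

Passing the search function in as a parameter means the skip rule can be tried out against a stub before anything touches the live site. As noted above, any match at all causes a skip, so a few valid alternative views may be left out.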

Progress

Job | Assigned to | Progress | Links
Code and initial batch (some ancient artefacts) | (talk) | Status: Done | Images from LACMA uploaded by Fæ

More than 500 uploads. One glitch turned out to be due to the same image on the source site being displayed as different views. Identifying the date in the text turned out to be a challenge and needed a bit of hacking, but seems to be working consistently now.

Resolve multiple view artefacts | (talk) | Status: Done | Examples: Ax2 Bx2 Cx3 Dx4 Ex11 Fx12 Gx35
Inform LACMA | (talk) | Status: Done |
Create digestion template | (talk) | Status: Not needed | Defaulting to {{Artwork}}
Complete upload | (talk) | Status: Done |

110% completed (estimate)

Promote to community | (talk) | Status: Done | Notice on COM:VP#20,000 photographs for Art History enthusiasts