English subtitles for clip: File:Internet-archive-brewster-kahle-2013-0329.webm

From Wikimedia Commons, the free media repository
1
00:00:03,113 --> 00:00:05,272
This is the scanning center of the Internet Archive.

2
00:00:05,272 --> 00:00:10,424
We operate about 30 of these in eight countries.

3
00:00:10,424 --> 00:00:12,816
Most of them are smaller, a couple of them are larger.

4
00:00:12,816 --> 00:00:17,652
In this one we're doing books, microfilm and movie film.

5
00:00:17,652 --> 00:00:22,153
We designed and built our own scanners. We tried the robots,

6
00:00:22,153 --> 00:00:27,276
couldn't get them to work. Umm, there's two professional-grade

7
00:00:27,276 --> 00:00:30,829
digital cameras, museum-grade lighting, and they raise and

8
00:00:30,829 --> 00:00:36,106
lower glass to go and get a really good image, to flatten the image.

9
00:00:36,106 --> 00:00:39,742
You could go and take more pictures and then try to get rid of it with software,

10
00:00:39,742 --> 00:00:43,537
but it ends up looking all kind of crappy, well, kind of like, well Google Books.

11
00:00:43,537 --> 00:00:47,666
Umm, so we are really interested in getting really, really good images,

12
00:00:47,666 --> 00:00:53,715
and then we crop, deskew, upload, process them for about 12 hours,

13
00:00:53,715 --> 00:00:58,652
to go into optical character recognition as well as compression

14
00:00:58,652 --> 00:01:04,457
into different formats, PDF. Then we make it available to the MOBI

15
00:01:04,457 --> 00:01:09,452
for Kindle and different kinds of ePubs for the Nooks, etc.

16
00:01:09,452 --> 00:01:13,005
And if it's Public Domain anybody can download them in bulk.

17
00:01:13,005 --> 00:01:16,549
We actively <i>encourage</i> people like Aaron Swartz to go and download

18
00:01:16,549 --> 00:01:20,758
millions of books at a time. We publish tools on how to do it.

19
00:01:20,758 --> 00:01:27,494
This is what libraries are for! So we think that there's a reason

20
00:01:27,494 --> 00:01:32,664
that we should be accessible in bulk as well as one-sy, two-sy.

21
00:01:35,322 --> 00:01:40,895
This is a movie scanner. You know, this is YouTube before YouTube, right.

22
00:01:40,895 --> 00:01:46,886
These are home movies, often, and they give a very direct view

23
00:01:46,886 --> 00:01:50,778
of what it is the 20th century was like. We have a very visual generation,

24
00:01:50,778 --> 00:01:56,522
so when I think of my generation, myself as very literate, literature-oriented

25
00:01:56,522 --> 00:02:00,487
towards imagining and understanding what the world was like, but if we

26
00:02:00,487 --> 00:02:04,463
can basically use moving images as a mechanism to understand what

27
00:02:04,463 --> 00:02:07,827
the world was like, that communicates with this new generation.