Commons:YouTube files/Downloading
This article describes how you can download videos and subtitles from YouTube.
YouTube's own download option
If you are downloading a video from your own YouTube account, follow the instructions in "Download your own YouTube videos".
Software, tools or scripts
yt-dlp
Single video download
yt-dlp (download page) is an open-source tool available as a package in many GNU/Linux distributions, as well as for macOS and Windows. It is a command-line program and does not offer a graphical user interface.
Command line example for download of a single video:
yt-dlp -o "%(title)s-%(id)s.%(ext)s" --match-filters "license=Creative Commons Attribution license (reuse allowed)" --embed-metadata --embed-subs --parse-metadata "CC-BY-3.0:%(meta_license)s" --parse-metadata "%(tags)l:%(meta_keywords)s" --parse-metadata "%(categories)l:%(meta_subject)s" --parse-metadata "%(uploader_url)s:%(meta_artist_url)s" --parse-metadata "%(channel_url)s:%(meta_channel_url)s" --sponsorblock-remove default -f bestvideo+bestaudio --recode-video webm --continue --retries 4 --ignore-errors https://www.youtube.com/watch?v=aaaaaaaaaaa
Important arguments to note are:
- -f bestvideo+bestaudio downloads both the video and the audio track. For some videos it is better to drop the audio track, because it is unnecessary background music or is under a license incompatible with that of the Creative Commons licensed video track. Change this argument to -f bestvideo to ignore the audio track and download only the video track.
- --embed-metadata embeds basic YouTube video metadata, including the source URL, YouTube user handle, video title, description and, for longer videos, chapter markers. This metadata can include advertisements or promotional links, which are best removed either with the metadata modification arguments described in the yt-dlp manual or by editing the metadata manually after download. Additional --parse-metadata arguments can be added to extract other YouTube video metadata and perform basic manipulation on it. The WebM container format uses the metadata fields defined in the Matroska Media Container Format Specifications.
- --sponsorblock-remove default uses the crowd-sourced SponsorBlock database to remove segments of a video that are tangential to the content, including opening animations, closing credits and requests to subscribe, like and comment on YouTube videos. Embedded advertisement segments are also removed in this default mode, although they are unlikely to occur in Creative Commons licensed content. Refer to the yt-dlp manual for other SponsorBlock options that control which tangential content of a video is kept or removed.
- --match-filters "license=Creative Commons Attribution license (reuse allowed)" double-checks that the video's metadata on YouTube declares the Commons-compatible CC-BY-3.0 license. If you are sure the video is in the public domain or otherwise has a different Commons-compatible license, omit this argument.
- --parse-metadata "CC-BY-3.0:%(meta_license)s" should be changed to a different SPDX license identifier if the video is in the public domain or otherwise has a different Commons-compatible license.
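The two adjustments described in the list above (dropping the audio track and recording a different license) can be combined in a small wrapper. This is a minimal sketch, not part of yt-dlp: the function name fetch_video_only and its optional second argument are illustrative assumptions, while every yt-dlp option is taken from the example command above. The --match-filters check is left out on the assumption that the license has already been verified by hand.

```shell
# Hypothetical helper: download only the video track and record a
# caller-supplied SPDX license identifier in the file metadata.
# Usage: fetch_video_only URL [SPDX-ID]
fetch_video_only() {
    url=$1
    spdx=${2:-CC-BY-3.0}   # default matches the example command above
    yt-dlp -o "%(title)s-%(id)s.%(ext)s" \
        --embed-metadata \
        --parse-metadata "${spdx}:%(meta_license)s" \
        --sponsorblock-remove default \
        -f bestvideo --recode-video webm \
        --continue --retries 4 \
        "$url"
}
```

For example, fetch_video_only "https://www.youtube.com/watch?v=aaaaaaaaaaa" CC0-1.0 would tag the resulting file with CC0-1.0 instead of the default identifier.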
Bulk video download
Sometimes an entire channel or all videos of a particular user account are Creative Commons licensed. To download all videos of a channel or user account at once, use yt-dlp's --download-archive argument, which creates a text file database that keeps track of which videos have been downloaded. This makes it possible to cancel the archiving command and resume it at a later time.
Command line example for downloading all videos of a channel or user handle:
yt-dlp -o "%(title)s-%(id)s.%(ext)s" --match-filters "license=Creative Commons Attribution license (reuse allowed)" --embed-metadata --embed-subs --parse-metadata "CC-BY-3.0:%(meta_license)s" --parse-metadata "%(tags)l:%(meta_keywords)s" --parse-metadata "%(categories)l:%(meta_subject)s" --parse-metadata "%(uploader_url)s:%(meta_artist_url)s" --parse-metadata "%(channel_url)s:%(meta_channel_url)s" --sponsorblock-remove default -f bestvideo+bestaudio --recode-video webm --continue --retries 4 --ignore-errors --download-archive example_user_archive.txt https://www.youtube.com/@example_user/videos
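Because the archive is a plain text file, the progress of an interrupted bulk download can be checked without running yt-dlp. The following is a minimal sketch under one assumption: that the archive holds one line per downloaded video in the form "youtube <video id>". Check this against your own archive file before relying on it. The function name archive_count is illustrative.

```shell
# Hypothetical helper: count the YouTube videos recorded in a yt-dlp
# download archive (assumed format: "youtube <video id>" per line).
# Usage: archive_count example_user_archive.txt
archive_count() {
    grep -c '^youtube ' "$1"
}
```

Running archive_count example_user_archive.txt before and after resuming the download shows how many additional videos were fetched.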
Video metadata review
After downloading YouTube videos as shown in the examples above, you can review the metadata of the resulting WebM files with ffprobe, a utility bundled with the open-source ffmpeg software. Command line example:
ffprobe downloaded_video.webm
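The plain ffprobe call prints codec and stream details along with the tags. To review only the embedded metadata, the output can be narrowed to the tag sections. This is a minimal sketch: the function name show_tags is an illustrative assumption, and the choice of JSON output is only one of the formats ffprobe offers.

```shell
# Hypothetical helper: print only the metadata tags of a media file as JSON.
# format_tags selects file-level tags; stream_tags selects tags that
# ffprobe reports per stream rather than per file.
# Usage: show_tags downloaded_video.webm
show_tags() {
    ffprobe -v error -show_entries format_tags:stream_tags -of json "$1"
}
```

Requesting both sections means the same call can be used on any downloaded file, regardless of where its container stores the tags.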
Single audio download
yt-dlp supports downloading YouTube videos with a Creative Commons licensed or public domain audio track as an audio file (no video track). This is particularly useful when the video track is a static image or an unrelated animation that adds no value to the audio track.
Command line example for download of a single audio track:
yt-dlp -o "%(title)s-%(id)s.%(ext)s" --match-filters "license=Creative Commons Attribution license (reuse allowed)" --embed-metadata --parse-metadata "CC-BY-3.0:%(meta_license)s" --parse-metadata "%(tags)l:%(meta_keywords)s" --parse-metadata "%(categories)l:%(meta_subject)s" --parse-metadata "%(uploader_url)s:%(meta_artist_url)s" --parse-metadata "%(channel_url)s:%(meta_channel_url)s" --sponsorblock-remove default -f bestaudio -x --audio-format opus --continue --retries 4 --ignore-errors https://www.youtube.com/watch?v=aaaaaaaaaaa
Important arguments to note are:
- Opus audio files use the metadata fields defined in the Ogg Vorbis I format specification: comment field and header specification. For basic metadata fields such as title, artist and license, there is no difference from the Matroska metadata fields used for the WebM container format.
- --sponsorblock-remove default uses the crowd-sourced SponsorBlock database to remove segments of an audio track that are tangential to the content, including non-music portions of music videos. Embedded advertisement segments are also removed in this default mode, although they are unlikely to occur in Creative Commons licensed content. Refer to the yt-dlp manual for other SponsorBlock options that control which tangential content of an audio track is kept or removed.
Bulk audio download
Command line example for downloading all audio tracks of videos within a channel or user handle:
yt-dlp -o "%(title)s-%(id)s.%(ext)s" --match-filters "license=Creative Commons Attribution license (reuse allowed)" --embed-metadata --parse-metadata "CC-BY-3.0:%(meta_license)s" --parse-metadata "%(tags)l:%(meta_keywords)s" --parse-metadata "%(categories)l:%(meta_subject)s" --parse-metadata "%(uploader_url)s:%(meta_artist_url)s" --parse-metadata "%(channel_url)s:%(meta_channel_url)s" --sponsorblock-remove default -f bestaudio -x --audio-format opus --continue --retries 4 --download-archive example_user_archive.txt https://www.youtube.com/@example_user/videos
Audio metadata review
After downloading YouTube audio tracks as shown in the examples above, you can review the metadata of the resulting Opus files with the following ffprobe command line example:
ffprobe downloaded_audio.opus
JDownloader
JDownloader can download all videos from a specific account (useful for large numbers of videos) and lets you choose specific formats, but it is partly proprietary.
Youtube Subtitle Downloader
If the YouTube video has subtitles, Youtube Subtitle Downloader can be used to download them in SRT format. It is a userscript, so you need to install the Greasemonkey, Tampermonkey or Violentmonkey browser extension before using it.
Website / Web app
You can search for "youtube downloader" or "download youtube video" on Google to find websites and web apps that let you enter a YouTube video URL and get download links in many video formats.
To download subtitles, you can search for "youtube subtitle download".
Note that these sites could track what you are trying to get, but should be less risky than "unknown code from unknown 3rd party" browser extensions.
Here is a popular Firefox add-on for downloading YouTube videos that has been around for years:
Conversion
Files not available in WebM have to be converted into WebM or Ogg Theora. See: Help:Converting video. If a file from YouTube uses VP9 video but Vorbis audio, it can be converted with ffmpeg (or avconv) using a command like the following, which re-encodes the video with libvpx and copies the audio stream unchanged:
avconv -i inputfile.webm -acodec copy -vcodec libvpx out.webm
If you can get the (typically best) MP4 DASH video, you might also want the M4A audio, so that both can be muxed and transcoded into OGV or WebM. Example:
ffmpeg -i input.mp4 -i input.m4a -f ogg -c:v libtheora -q:v 9 -c:a libvorbis -q:a 6 output.ogv
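The command above targets OGV. For the WebM case that the text also mentions, a comparable sketch follows. It assumes an ffmpeg build that includes the libvpx-vp9 and libopus encoders. The quality value 30 and the function name to_webm are illustrative choices, not recommendations from this page.

```shell
# Hypothetical helper: mux an MP4 video and an M4A audio file and
# transcode both into a WebM file (VP9 video, Opus audio).
# Usage: to_webm input.mp4 input.m4a output.webm
to_webm() {
    # -crf with -b:v 0 selects constant-quality mode for libvpx-vp9;
    # lower -crf values give higher quality and larger files.
    ffmpeg -i "$1" -i "$2" \
        -c:v libvpx-vp9 -crf 30 -b:v 0 \
        -c:a libopus \
        "$3"
}
```

As with the OGV example, both inputs are re-encoded, so the conversion can take considerably longer than the download itself.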
WebM
In May 2010, Google introduced the free WebM video format, and many videos on YouTube are available in this format. As of November 2012, Wikimedia Commons accepts WebM uploads. Videos that are available in .webm format on YouTube may not need to be converted to a different format before being uploaded to Commons. However, the VP9 video streams from YouTube often have no audio and need to be recombined with an audio stream before being uploaded to Commons.
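Recombining a video-only stream with a separate audio stream can be done with ffmpeg without re-encoding. This is a minimal sketch, assuming the audio file already uses a codec that WebM can carry (Vorbis or Opus); otherwise the audio has to be transcoded instead of copied. The function name mux_webm and the file names are illustrative.

```shell
# Hypothetical helper: combine a video-only WebM file with a separate
# audio file, copying both streams without re-encoding.
# Usage: mux_webm video.webm audio.webm output.webm
mux_webm() {
    # -map picks the first video stream of input 0 and the first
    # audio stream of input 1; -c copy leaves both streams untouched.
    ffmpeg -i "$1" -i "$2" \
        -map 0:v:0 -map 1:a:0 \
        -c copy \
        "$3"
}
```

Because nothing is re-encoded, this step is fast and does not reduce quality.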
Moving to Commons with youtube2mediawiki
If the YouTube video is available in the WebM format and you are comfortable with running Python scripts on your machine, one possibility is to use youtube2mediawiki:
- Download and install Python 2.7
- Download youtube2mediawiki (e.g. as ZIP-file)
- Extract the archive and use the command line to start the Python script.
See also
- Commons:Video
- Commons:Chunked uploads for files with 100MB or more
- Commons:video2commons
- Help:Converting video#WebM with mkvtoolnix
- Look at the Frequently Asked Questions.
- If you place {{helpme}} on your talk page, a volunteer will visit you there as soon as possible!
- Join the #wikimedia-commons IRC channel for real-time chat. New to IRC? Click here to be connected instantly!
- Go to the Commons Help Desk.