Commons:Requests for comment/Technical needs survey/Video with multiple audio tracks

From Wikimedia Commons, the free media repository

Video with multiple audio tracks

Description

  • Problem description: i was just thinking of making a video tutorial for v2c, or it could be a cooking recipe. in consideration of different languages, it would be best to make a video suitable for audio commentary in all languages. then other users can make their dubs and add them to the videos.

    here comes the problem. to add additional soundtracks to a video, the whole video has to be remuxed. it's a daunting task for many people, and each new edit would mean creating a new version of the video, i.e. lots of redundant big files.--RZuo (talk) 18:40, 29 December 2023 (UTC)

  • Proposal type: feature request
  • Proposed solution: additional soundtracks can be hosted separately, and during video playback, users can choose which tracks to play. that would allow playing different languages, or enjoying a movie (with the original sound) while listening to a commentary soundtrack.

    youtube is just experimenting with multiple tracks https://support.google.com/youtube/thread/129769858/updates-to-captions-and-audio-features-on-youtube .--RZuo (talk) 18:40, 29 December 2023 (UTC)

    i think i should clarify my concept.
    it's probably not to enable a video file with multiple soundtracks, because adding soundtracks to a video is harder?
    it's to have a video file (with or without a single soundtrack), and separate audio files that are soundtracks to this video.
    during playback, users can select which soundtrack to play, just like how users can select which timedtext file for cc now. RZuo (talk) 10:44, 7 January 2024 (UTC)
  • Phabricator ticket:
  • Further remarks: copied from https://commons.wikimedia.org/w/index.php?title=Commons:Idea_Lab&oldid=836423569#Video_with_multiple_audio_tracks .
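
    The playback idea in the proposed solution (one video, separately hosted audio files, and a per-viewer choice of soundtrack) can be sketched roughly as follows. This is only an illustration under assumptions: the `AudioTrack` model, the `choose_track` function and the file names are hypothetical and do not describe any existing Commons or MediaWiki feature.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class AudioTrack:
    """A separately hosted soundtrack for a video (hypothetical model)."""
    language: str   # language code such as "en" or "de"
    file_name: str  # name of the audio file holding this track (made up here)


def choose_track(tracks: Sequence[AudioTrack],
                 preferred_languages: Sequence[str]) -> Optional[AudioTrack]:
    """Return the first track matching the viewer's language preferences.

    Returns None when nothing matches; a player could then keep the
    video's own soundtrack, if it has one.
    """
    by_language = {track.language: track for track in tracks}
    for language in preferred_languages:
        if language in by_language:
            return by_language[language]
    return None


# Example: two dubs exist; the viewer prefers Portuguese, then French.
tracks = [
    AudioTrack("de", "Tutorial.de.ogg"),
    AudioTrack("fr", "Tutorial.fr.ogg"),
]
selected = choose_track(tracks, ["pt", "fr", "en"])
print(selected.file_name if selected else "original soundtrack")
```

    Because each dub stays its own file, adding a language would not require remuxing or re-uploading the video itself.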

Discussion

Votes

  •  Support It's a large strain on storage space, and inconvenient and burdensome for users and contributors, to have separate video files for each language rather than separate audio tracks. See the categories about redubbed explanatory videos in this new cat as an example of how this can be useful: Wikimedia projects and audio.
It would make the site more multilingual, improve global education and access, reduce storage requirements (example), etc. Moreover, there really should be machine-translated video captions for all languages where that's feasible; people could use them as a basis for manmade captions, because the texts would only need to be edited and are already set to the right timings. That's a separate issue though. Moreover, once a video has captions and separate audio tracks, AI-generated voice, which recently improved dramatically, could be used for auto-redubbed videos (audio tracks per language) – sometimes also manually renarrated videos by WMC narrators – which could substantially improve global education and the usefulness of WMC files. That's also another issue. Copied this over from Commons:Idea Lab. However, I wouldn't consider this a top-priority issue, just an important one where the sooner it's done the better and where the potential benefit can be large, mainly due to easily redubbed videos. --Prototyperspective (talk) 23:50, 29 December 2023 (UTC)
...Machine translation and AI voices aren't nearly at a level where that's feasible. Adam Cuerden (talk) 05:51, 3 February 2024 (UTC)
Say that again after using DeepL translator and listening to these free AI-narrated audiobooks. I didn't say humans aren't involved anymore; it just speeds things up by a lot. People already have the roughly translated text with the exact timing, rather than writing and timing everything from scratch, and only need to edit things slightly. I didn't say we're there yet, but we're certainly close, and you'd have to give a good reason why the two examples above don't demonstrate that we actually are there already. It would be useful to have e.g. explanatory videos redubbed this way. It's already very feasible, the one issue being that this AI voice tech is not yet free. Prototyperspective (talk) 11:44, 3 February 2024 (UTC)
Basically, it's a niche tool, and, while it would be amazing in that niche, the WMF track record hasn't been great for such improvements. Simple projects like the reply buttons on talk pages? Great. Something they'll put all possible resources into fixing, like VisualEditor, amazing. But this is in the same middle ground as problematic projects like the half-broken MediaViewer and abandoned Flow (a talk page rethink that failed). I don't think that we could reasonably get the massive resources needed. Adam Cuerden (talk) 05:48, 3 February 2024 (UTC)
Briefly: the AI things are just related to the proposal, not part of it. There are very many videos with just voice, and these are usually the most useful. Concrete example: all of these. The number of useful free videos we get may rise if we offered this. Regarding the cost-benefit ratio: a) consider the redundancy and storage space avoided this way; b) it depends on the cost of implementation, and maybe somebody could come up with a low-effort way to do this (e.g. just an ffmpeg command being run if somebody clicks on "upload new audio track"). Prototyperspective (talk) 11:51, 3 February 2024 (UTC)
Basically, I do agree the idea is extremely useful in the specific set of cases it's valid for. I just have doubts as to whether there are enough resources available to actually do it when it only applies to a relatively small set of videos. It'd probably also need a certain amount of review of the new audio track uploads for accuracy.
If the WMF had a better track record with this sort of thing, I'd be more enthusiastic. Adam Cuerden (talk) 19:25, 3 February 2024 (UTC)
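
For reference, the "low-effort" route mentioned in the thread (an ffmpeg command run when somebody clicks "upload new audio track") could look roughly like the sketch below. It only builds the command line and does not run it. The function name, the file names and the `existing_audio_streams` parameter are assumptions made for illustration. It also assumes the new audio file holds exactly one stream in a codec the output container accepts, since `-c copy` copies streams without re-encoding.

```python
import shlex
from typing import List


def build_add_track_command(video_path: str, audio_path: str, language: str,
                            existing_audio_streams: int,
                            output_path: str) -> List[str]:
    """Build an ffmpeg command that muxes one extra audio track into a video.

    existing_audio_streams is the number of audio streams already in the
    video; the new track becomes the output audio stream with that
    zero-based index, which is where the language tag is attached.
    """
    return [
        "ffmpeg",
        "-i", video_path,   # input 0: the existing video
        "-i", audio_path,   # input 1: the new dub
        "-map", "0",        # keep every stream of the video
        "-map", "1:a",      # add the audio stream(s) of the dub
        "-c", "copy",       # copy streams as-is, no re-encoding
        f"-metadata:s:a:{existing_audio_streams}", f"language={language}",
        output_path,
    ]


# Hypothetical file names; the video is assumed to have one audio stream.
command = build_add_track_command(
    "video.webm", "dub_de.opus", "deu", 1, "video_multi.webm")
print(shlex.join(command))
```

Note that this still writes a new container file for every added track, so it would reduce the effort of remuxing for contributors but not the redundant storage described in the problem statement.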