
Re: how to save video on web page



On Tuesday, April 7, 2020 12:27:20 PM PDT David Wright wrote:

> On Tue 07 Apr 2020 at 19:11:57 (+0300), Anastasios Lisgaras wrote:

> > Youtube-dl <https://github.com/ytdl-org/youtube-dl> is indeed a

> > powerful and very good software for this job with many features and

> > options, but can you download videos *from anywhere ?*

> >

> > What I want to say is that there are many web pages which greatly

> > hinder (prohibit) this possibility.

> > In this case, what can we do? Can we always find the hidden link

> > (source) of the video? If so, how?

> > If the page requires you to be logged in, what can we do?

>

> I'm not sure what the implications are of having to log in to a site.

> But in general you need different tools for different web sites.

> The BBC iplayer and youtube-dl are two such tools, and sometimes

> a download link is even available, which either the browser or

> wget can use (the latter preserving the metadata).

>

> Where videos exist in their entirety, some sites still play them

> by downloading to a temporary file (and you can see the

> download in the progress bar, ahead of what's actually playing).

> A technique there is to examine /proc/N/fd where N is the

> process number of the browser tab. (The process name used to

> be xul-runner, Web Content etc, and looks as if it's currently

> /usr/lib/firefox-esr/firefox-esr -contentproc.)

> If you find an fd number F that's pointing to a file (deleted) in

> /tmp, then try copying that /proc/N/fd/F (following links). Do it

> when the download progress bar has reached the end, but the file

> is still playing. (Sometimes everything disappears as soon as the

> end is reached.)
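
For anyone who wants to script the /proc/N/fd lookup described above, here is a minimal Python sketch. The function names, the `proc_root` parameter and the `/tmp/` prefix default are my own choices for illustration, not anything from David's setup. It relies only on the fact that Linux shows unlinked files with " (deleted)" appended to the link text.

```python
import os
import shutil


def deleted_tmp_fds(pid, proc_root="/proc", prefix="/tmp/"):
    """Return (fd_path, target) pairs for fds of `pid` that point at
    deleted files under `prefix`."""
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    found = []
    for name in sorted(os.listdir(fd_dir), key=int):
        fd_path = os.path.join(fd_dir, name)
        try:
            target = os.readlink(fd_path)
        except OSError:
            continue  # fd was closed between listdir() and readlink()
        # The kernel appends " (deleted)" to the link text of unlinked files.
        if target.startswith(prefix) and target.endswith(" (deleted)"):
            found.append((fd_path, target))
    return found


def rescue(fd_path, dest):
    """Copy the data behind /proc/N/fd/F. Opening the path follows the
    link, so the data is readable even though the name is gone."""
    with open(fd_path, "rb") as src, open(dest, "wb") as out:
        shutil.copyfileobj(src, out)
```

Usage would be something like `for fd, target in deleted_tmp_fds(N): rescue(fd, "video.bin")`, where N is the pid of the content process, run as the same user as the browser and while the file is still open.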

>

> Another technique is where the source is streaming (and might be

> open-ended). Here, the video can end up as fragments in your

> browser cache. How you handle them depends on whether they are

> audiovisual or in two separate streams, and whether they are

> timestamped. Some are, some aren't. Timestamped ones are relatively

> easy to reassemble with ffprobe to read the timings and ffmpeg

> to concatenate the pieces (and merge audio/video if necessary).
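
A rough Python sketch of the timestamped case, under some assumptions: that each fragment's container reports a usable `start_time` (ffprobe prints "N/A" otherwise, which this sketch does not handle), and that the pieces can be joined with ffmpeg's concat demuxer using stream copy. The function names are mine, and merging separate audio and video streams is not covered here.

```python
import subprocess


def ffprobe_start_time(path):
    """Ask ffprobe for the container start time of `path`, in seconds."""
    out = subprocess.run(
        ["ffprobe", "-v", "error",
         "-show_entries", "format=start_time",
         "-of", "default=noprint_wrappers=1:nokey=1", path],
        check=True, capture_output=True, text=True,
    ).stdout
    return float(out.strip())


def order_fragments(paths, probe=ffprobe_start_time):
    """Sort fragments by the start time reported by `probe`."""
    return sorted(paths, key=probe)


def write_concat_list(paths, list_path):
    """Write a list file in the format read by ffmpeg's concat demuxer."""
    with open(list_path, "w") as fh:
        for p in paths:
            escaped = p.replace("'", "'\\''")  # quote-escape for the list file
            fh.write(f"file '{escaped}'\n")


def concat_command(list_path, output):
    """Build (but do not run) the ffmpeg command that joins the pieces."""
    return ["ffmpeg", "-f", "concat", "-safe", "0",
            "-i", list_path, "-c", "copy", output]
```

The `probe` argument exists only so that the ordering can be tried out without ffprobe installed; pass the result of `concat_command()` to `subprocess.run()` to do the actual join.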

>

> Where there's no internal timestamping, you can sometimes rely

> on the filesystem's own timestamps to figure out the correct ordering.

> But I prefer to run a script that watches files in the cache

> as they are closed (with inotifywait), and immediately copies

> them out (if the filetype is of interest) with a sequence

> number and the file type in the filename. The relevant segments

> can then be concatenated quite easily. A time format of

> %Y%m%d-%H%M%S works well as a more meaningful sequence number,

> particularly if you append %N to include nanoseconds for the

> necessary time resolution.
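
David's script isn't shown, so the following is only my guess at the shape of such a watcher, written in Python around the inotifywait command he mentions. The names `stamped_name`, `watch_and_copy` and `filetype_of` are hypothetical, and so is the choice of the close_write event. Python's strftime has no %N, so the nanoseconds are appended by hand.

```python
import os
import shutil
import subprocess
import time


def stamped_name(ns, filetype):
    """Build a sortable name in the style of %Y%m%d-%H%M%S%N plus the
    file type, from a timestamp in nanoseconds since the epoch."""
    secs, nanos = divmod(ns, 1_000_000_000)
    stamp = time.strftime("%Y%m%d-%H%M%S", time.localtime(secs))
    return f"{stamp}{nanos:09d}.{filetype}"


def watch_and_copy(cache_dir, dest_dir, filetype_of):
    """Copy each file out of `cache_dir` as soon as it is closed after
    writing. `filetype_of(path)` is supplied by the caller and returns
    a short type label, or None for files of no interest."""
    proc = subprocess.Popen(
        ["inotifywait", "-m", "-q", "-e", "close_write",
         "--format", "%w%f", cache_dir],
        stdout=subprocess.PIPE, text=True,
    )
    for line in proc.stdout:
        path = line.rstrip("\n")
        try:
            filetype = filetype_of(path)
            if filetype is None:
                continue
            name = stamped_name(time.time_ns(), filetype)
            shutil.copy2(path, os.path.join(dest_dir, name))
        except OSError:
            continue  # the cache entry vanished before it could be copied
```

Because the names sort in copy order, a plain `sorted(os.listdir(dest_dir))` afterwards gives the sequence for concatenation.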

>

> Be aware that the fragments in your cache might not all be

> identified by the file program's defaults. For example, I use

> 0 string G@ TS transport stream

> in ~/.magic to pick up files that file might otherwise label

> as 'data'.
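
The same check can be done without file(1). This Python sketch is a slightly stricter variant of the two-byte magic above, not a copy of it: MPEG transport stream packets are 188 bytes long and each begins with the sync byte 0x47 (ASCII 'G'), so it looks for that byte at the start of several consecutive packets. The function names and the default of five packets are arbitrary choices of mine.

```python
TS_PACKET_SIZE = 188   # bytes per MPEG-TS packet
TS_SYNC_BYTE = 0x47    # ASCII 'G', first byte of every packet


def looks_like_ts(data, packets=5):
    """True if `data` has a sync byte at the start of each of the first
    `packets` packets."""
    if len(data) < TS_PACKET_SIZE * packets:
        return False
    return all(data[i * TS_PACKET_SIZE] == TS_SYNC_BYTE
               for i in range(packets))


def file_looks_like_ts(path, packets=5):
    """Apply looks_like_ts() to the beginning of the file at `path`."""
    with open(path, "rb") as fh:
        return looks_like_ts(fh.read(TS_PACKET_SIZE * packets), packets)
```

Fragments shorter than five packets are reported as not matching, which may or may not be what you want for very small cache entries.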

>

> Sometimes, even then, you have to use a little ingenuity for

> the quiet life: eg there's a UK railway site that has three

> webcams (two stations and the yard) which run simultaneously

> on the same web page. Fortunately, each webcam runs with a

> different frame speed, so it's quick and easy to distinguish

> their files and divide them up.
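
David doesn't say how he reads the frame speed, so this is just one possible way, sketched in Python: ask ffprobe for the first video stream's `r_frame_rate` and bucket the files by the answer. The function names are mine, and whether `r_frame_rate` is the right field for a given site's fragments is an assumption to verify.

```python
import subprocess
from collections import defaultdict


def ffprobe_frame_rate(path):
    """Return the first video stream's r_frame_rate as ffprobe prints
    it, e.g. '25/1'."""
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-select_streams", "v:0",
         "-show_entries", "stream=r_frame_rate",
         "-of", "default=noprint_wrappers=1:nokey=1", path],
        check=True, capture_output=True, text=True,
    ).stdout
    return out.strip()


def group_by_frame_rate(paths, probe=ffprobe_frame_rate):
    """Map each distinct frame rate to the files that report it,
    keeping the input order within each group."""
    groups = defaultdict(list)
    for p in paths:
        groups[probe(p)].append(p)
    return dict(groups)
```

With three webcams at three different rates, the result should have three keys, one list of fragments per camera.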

>

> Finally, when all else fails, and if you've read this far,

> you can just capture the screen contents with ffmpeg's

> x11grab and record it to an mpg file. The disadvantages are

> that you capture extraneous screen decorations, and you've got

> to dedicate the whole screen to watching the video, remembering

> to increase your blanking timeout too. If you can only record

> audio through the microphone, you get more extraneous rubbish

> there too.

>

 

That is one comprehensive write up!

Thanks David, today I learned something new thanks to you.

 

--

Ihor Antonov

https://useplaintext.email

