Machine learning algorithms for video processing typically work on frames (images) rather than video.
In a typical use case, FFmpeg can extract images from a video. This example extracts a 50-frame sequence starting at 1:47 (107 seconds into the video):
ffmpeg -i input.vid -vf "select='gte(t,107)*lt(selected_n,50)'" -vsync passthrough '107+%06d.png'
Omit the -vf option if extracting the entire video.
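When the start time and frame count vary between runs, it can help to generate the command rather than edit it by hand. The following is a minimal Python sketch of that idea, not part of the original workflow: the function names (timestamp_to_seconds, build_extract_command, extract_frames) are hypothetical, and it assumes the ffmpeg binary is on PATH and accepts the same options as the command above.

```python
import subprocess


def timestamp_to_seconds(timestamp: str) -> int:
    """Convert 'SS', 'M:SS' or 'H:MM:SS' to whole seconds."""
    seconds = 0
    for part in timestamp.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def build_extract_command(input_path, start, frame_count, output_pattern):
    """Build the FFmpeg argument list for extracting a frame sequence.

    Hypothetical helper: it only assembles arguments and does not run FFmpeg.
    """
    start_seconds = timestamp_to_seconds(start)
    # Same filter expression as the command above: select frames at or after
    # the start time until frame_count frames have been selected.
    select_expr = (
        f"select='gte(t,{start_seconds})*lt(selected_n,{frame_count})'"
    )
    return [
        "ffmpeg", "-i", input_path,
        "-vf", select_expr,
        "-vsync", "passthrough",
        output_pattern,
    ]


def extract_frames(input_path, start, frame_count, output_pattern):
    """Run FFmpeg with the generated arguments (assumes ffmpeg is on PATH)."""
    command = build_extract_command(
        input_path, start, frame_count, output_pattern
    )
    # Passing a list avoids shell quoting issues; check=True raises on failure.
    subprocess.run(command, check=True)
```

Calling extract_frames("input.vid", "1:47", 50, "107+%06d.png") would be expected to run the same extraction as the command above. Because the arguments are passed as a list, only the inner single quotes of the filter expression are needed.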