FWIW: The issue is keyframes. If there aren't enough of them, then (I think) moving to another frame requires pix_film to find the preceding keyframe and decode every delta frame between it and the target. If that's the case, then increasing the number of keyframes puts a cap on how many deltas have to be decoded per jump.
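To put rough numbers on that guess, here is a minimal sketch in Python. It assumes a keyframe sits at every multiple of the keyframe interval and that each delta depends only on the frame before it (so no B-frames and no extra scene-cut keyframes). The function names are made up for illustration, and this is not a description of how pix_film actually seeks.

```python
def frames_to_decode(target: int, keyint: int) -> int:
    """Frames that must be decoded to display frame `target`.

    Assumption: keyframes at 0, keyint, 2*keyint, ... and every other
    frame is a delta on its immediate predecessor.
    """
    if keyint < 1:
        raise ValueError("keyint must be at least 1")
    # One keyframe, plus every delta between it and the target.
    return target % keyint + 1


def average_cost(keyint: int) -> float:
    """Mean decode cost of a jump to a uniformly random frame."""
    return sum(frames_to_decode(t, keyint) for t in range(keyint)) / keyint
```

Under those assumptions, an interval of 250 frames (the x264 default, as far as I know) averages about 125 decoded frames per random jump, while an interval of 2 averages 1.5.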
So I just re-encoded part of a video so that every other frame is a keyframe:
ffmpeg -ss (secs_to_skip) -i (original file).mp4 -t 20 -vcodec libx264 -an -x264-params keyint=2 (output file).mp4
and Pd's CPU usage with pix_film dropped from maxing out one core to 9-10%.
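If you have several clips to convert, the same command can be assembled in a script. This is only a sketch: `keyframe_cmd` and its parameter names are hypothetical, and the flags are exactly the ones from the command above, nothing more.

```python
import shlex


def keyframe_cmd(src: str, dst: str, start_s: float,
                 duration_s: float = 20, keyint: int = 2) -> list[str]:
    """Build the ffmpeg argument list used above (hypothetical helper)."""
    return [
        "ffmpeg",
        "-ss", str(start_s),       # seconds to skip before the excerpt
        "-i", src,
        "-t", str(duration_s),     # length of the excerpt in seconds
        "-vcodec", "libx264",
        "-an",                     # drop the audio track
        "-x264-params", f"keyint={keyint}",  # maximum keyframe interval
        dst,
    ]


def as_shell(cmd: list[str]) -> str:
    """Render the argument list as a shell-quoted string for inspection."""
    return shlex.join(cmd)
```

Passing the list to `subprocess.run(cmd, check=True)` would run it. I haven't compared other `keyint` values, so the default of 2 is just what worked for me.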
Maybe that will help someone later.
I have a feeling, for the next iteration of this course, that I'll probably use Ofelia with your abstractions. (Edit: FWIW, performance of random frame access in Ofelia is no better than in Gem -- though Ofelia seems to use multiple threads distributed between cores. In any case, in Ofelia as well, it seems to be necessary to encode with a lot of keyframes.)
hjh