GPU based DCP encoding

mytbyte
Posts: 20
Joined: Fri Apr 08, 2016 8:03 pm

Re: GPU based DCP encoding

Post by mytbyte »

Carsten wrote: Mon Feb 05, 2018 1:33 pm Hi Guddu - we have a Ryzen 1700 benchmark here - it's impressive for a single-CPU machine...

https://dcpomatic.com/benchmarks/input.php?id=1

- Carsten
Oh, I'm so happy you mentioned Ryzen... I know it's not exactly the topic, but I am underwhelmed. I have an Asus ROG with a Ryzen 7 1700. Total CPU usage is mostly in the 35-40% range, 8 threads run at a much lower load than the others, and none is anywhere near 100%. Average speed is only 10-11 fps from an SSD (500 MB/s), 8 fps from a hard disk, and only 2 fps at moments. The sources are 16-bit uncompressed TIFFs. I thought the frames were not being loaded from the hard disk fast enough to saturate the Ryzen, so I moved them to the SSD, but the increase in speed is marginal. The number of threads is set to 16.

Any ideas? Why are the cores not saturated, or at least near 100%, as they are with an FX8320? What can I do to improve this?
Carsten
Posts: 2804
Joined: Tue Apr 15, 2014 9:11 pm
Location: Germany

Re: GPU based DCP encoding

Post by Carsten »

Did you try the Sintel benchmark to find out how it compares to that Ryzen 7 in the list? But at 10-11 fps with TIFFs, I guess you are not that far off.
Try configuring some more threads, like 20-32.


- Carsten
mytbyte
Posts: 20
Joined: Fri Apr 08, 2016 8:03 pm

Re: GPU based DCP encoding

Post by mytbyte »

Thanks... I set 32 threads, no change. I'll try Sintel to compare, and I'll try compressed TIFFs to see if smaller files improve the fps (I guess CPU usage will be higher due to the need to decompress)...
Carsten
Posts: 2804
Joined: Tue Apr 15, 2014 9:11 pm
Location: Germany

Re: GPU based DCP encoding

Post by Carsten »

Hmm, core load should reach nearly 100% when you add 50-100% more threads... If that doesn't happen, the encoding may already be starved right at the start of the pipeline.

- Carsten
mytbyte
Posts: 20
Joined: Fri Apr 08, 2016 8:03 pm

Re: GPU based DCP encoding

Post by mytbyte »

I tried 16-bit LZW-compressed TIFFs. Performance is even lower than with uncompressed TIFFs: 7.4 fps (???) with a 25% CPU usage ceiling. Basically, performance doesn't scale beyond 4 threads (it's a 16-thread CPU), no matter how many threads above 4 are specified. The Bunny benchmark averaged 14 fps, Sintel 12.

However, both benchmarks did saturate all threads to a constant 100% CPU usage (I could heat the room with the laptop's exhaust), but I suspect H.264 decoding and scaling made up the difference to 100%, so the DCP encoding performance itself could be roughly the same...

I will test with some other file types when I get the time, to see if it's really the source flow that starves the CPU... or the fact that the files are 16-bit...
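
One way to check the "source flow" theory separately from the encoder is to time how fast the frames can be read from disk at all. The snippet below is only a minimal sketch: the function name `measure_read_rate` and the `*.tif` pattern are made up for illustration, and it measures raw file reads only, not TIFF decoding.

```python
import time
from pathlib import Path

def measure_read_rate(folder, pattern="*.tif"):
    """Read every matching file once; return (files_per_second, mb_per_second).

    Hypothetical helper: it times raw reads only, with no image decoding,
    and counts 1 MB as 10**6 bytes.
    """
    paths = sorted(Path(folder).glob(pattern))
    total_bytes = 0
    start = time.perf_counter()
    for path in paths:
        total_bytes += len(path.read_bytes())
    # Guard against a zero interval on very small inputs.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return len(paths) / elapsed, total_bytes / elapsed / 1e6
```

If the files-per-second figure is far above the encoding fps, the disk itself is probably not what starves the CPU. Note that a second run over the same folder is usually served from the operating system's file cache and will look much faster than a cold read.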
Carsten
Posts: 2804
Joined: Tue Apr 15, 2014 9:11 pm
Location: Germany

Re: GPU based DCP encoding

Post by Carsten »

That's not too far from a previous Ryzen 7 1700X benchmark. You may try enabling the High Performance power plan in Windows, as that removes some of the power-management penalty. However, from a notebook you simply can't expect the same performance as from a desktop system.

I don't see why you should expect more performance compared to other CPUs. Using a 2K 16-bit TIFF series will need a disk throughput close to 200 MB/s. You are pushing it. How much memory do you have installed?
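
As a rough check on that throughput figure, here is a back-of-the-envelope sketch. It assumes uncompressed 1920x1080 frames with three 16-bit channels and negligible header overhead; none of that is confirmed for the actual files, and the function names are invented for the example.

```python
def frame_bytes(width, height, channels=3, bits_per_sample=16):
    """Size of one uncompressed frame in bytes, ignoring file headers."""
    return width * height * channels * (bits_per_sample // 8)

def required_mb_per_s(width, height, fps, channels=3, bits_per_sample=16):
    """Sustained read rate in MB/s (1 MB = 10**6 bytes) needed for `fps`."""
    return frame_bytes(width, height, channels, bits_per_sample) * fps / 1e6

if __name__ == "__main__":
    # Assumed frame geometry; the frame rates are sample points only.
    for fps in (11, 16, 24):
        rate = required_mb_per_s(1920, 1080, fps)
        print(f"{fps:>2} fps needs about {rate:.1f} MB/s")
```

Under these assumptions one frame is about 12.4 MB, so roughly 16 fps already works out to about 200 MB/s, and 24 fps to about 300 MB/s.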

- Carsten
mytbyte
Posts: 20
Joined: Fri Apr 08, 2016 8:03 pm

Re: GPU based DCP encoding

Post by mytbyte »

OK, even with compressed TIFFs, which are pretty small (flat-color animation), reading separate files from disk is obviously a bottleneck after all, regardless of file size (I knew that, but this is a bit ridiculous...). However, as I just found out, DPX files, be they 8, 10 or 12 bit, saturate the threads to a constant 100%, just like an MPEG-based single file would... Speed is still around 14 fps...

I have 16 GB of DDR4-2400 RAM... I accept the fps compared to other systems, I just expected more considering I tested from an SSD and it's a 16-thread CPU...
Carsten
Posts: 2804
Joined: Tue Apr 15, 2014 9:11 pm
Location: Germany

Re: GPU based DCP encoding

Post by Carsten »

It's still not bad for a single CPU; the i7-7820 currently leading the single-CPU, single-machine list costs twice as much.

I have talked with Carl about GPU encoding from time to time, and currently we are not sure which parts of the whole process could become a bottleneck even if very fast GPU encoding were available. We have seen that a decent machine can feed 50 fps in the BigBuckBunny network benchmark, but this sends low-resolution 8-bit images to many CPUs. It may be that a fast GPU could do 50+ or 100+ frames per second, but that the pre- or post-processing would then still need more time than the J2C encoding.
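
To make the bottleneck argument concrete: in a pipelined chain, the sustained frame rate cannot exceed that of the slowest stage. The sketch below uses purely illustrative numbers (not measurements) and hypothetical stage names; it does not describe how DCP-o-matic is actually structured.

```python
def pipeline_fps(stage_fps):
    """Sustained rate of a pipelined chain, bounded by its slowest stage."""
    return min(stage_fps.values())

def bottleneck(stage_fps):
    """Name of the stage that limits the whole chain."""
    return min(stage_fps, key=stage_fps.get)

# Illustrative figures only: a fast GPU encoder behind a slow pre-processing step.
stages = {
    "pre_process": 14.0,   # assumed: image decode, colour conversion, padding
    "j2k_encode": 100.0,   # assumed: hypothetical GPU encoder
    "post_process": 50.0,  # assumed: wrapping and writing to disk
}

if __name__ == "__main__":
    print(bottleneck(stages), pipeline_fps(stages))
```

With these made-up numbers the chain runs at 14 fps no matter how fast the encoder is, which is the situation described above.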

Interesting to see that DPX is faster. There were some issues with DPX interpretation lately that made me hesitate to recommend DPX (too many gamma/range encoding choices).

Most of the initial file handling is done by ffmpeg, and image series in particular are not speed-optimized, I guess because they are rarely used in a realtime workflow.

Then of course, we know that a lot of software and many compilers do not yet optimize for Ryzen.

I have seen Ryzen AM4 mainboards for as little as 60€, so I guess it's still cheap to build a low-cost render slave if you put inexpensive components around a Ryzen. I have done a lot of benchmarking, but so far never got my hands on a Ryzen for testing.

- Carsten
mytbyte
Posts: 20
Joined: Fri Apr 08, 2016 8:03 pm

Re: GPU based DCP encoding

Post by mytbyte »

I have no special knowledge of how DPX differs from other picture formats or why decoding it takes all threads to the max... someone should ask the ffmpeg developers, I guess.

Further testing, including moving the source files to a RAM disk, doesn't do anything for performance (!), nor does skipping colorspace conversion (also, I guess there is no scaling going on since it's just padding from 1920x1080 to 1998x1080)... It seems that the whole pre-encoding process may be compromised (i.e. not optimised for picture files and/or multiple threads)... Please note that playing uncompressed 16-bit TIFFs from SSD and RAM disk in Resolve results in full 25 fps playback (no caching) from the get-go.
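
For reference, the padding case is simple arithmetic. The sketch below assumes the picture is centred in the container, which is a guess and not confirmed DCP-o-matic behaviour; the function name is invented for the example.

```python
def centred_padding(src_w, src_h, dst_w, dst_h):
    """Return (left, right, top, bottom) padding in pixels.

    Assumes the source is centred and must fit without scaling.
    """
    if src_w > dst_w or src_h > dst_h:
        raise ValueError("source does not fit; scaling would be needed")
    pad_x = dst_w - src_w
    pad_y = dst_h - src_h
    left = pad_x // 2
    top = pad_y // 2
    # Any odd leftover pixel goes to the right/bottom edge.
    return left, pad_x - left, top, pad_y - top

if __name__ == "__main__":
    print(centred_padding(1920, 1080, 1998, 1080))
```

For 1920x1080 inside 1998x1080 this gives 39 pixels on each side and none at top or bottom, so no resampling is involved under that assumption.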

That said, GPU encoding does seem a bit dubious. Perhaps for MPEG sources, decoding could be kept on the CPU while J2C encoding runs on the GPU, so that the CPU doesn't need to do both decoding and encoding. That might bring some performance gain with GPU encoding...
checcontr
Posts: 5
Joined: Tue Feb 04, 2020 6:10 pm

Re: GPU based DCP encoding

Post by checcontr »

Hi everyone,

I'm so sorry, but I'm desperate... I'm spending 16-18 hours making a DCP because my PC is very old... My question is: does the amazing DCP-o-matic still use the CPU, or has it started to use the GPU for rendering? All the best