Hi,
I am running into some speed issues with conversion to DCP. Some of the posts talk about switching back to 2.11.9, but that version is not available on the download link. Is there another link to it that I missed?
Thanks
Access to older versions of dcpomatic
- Site Admin
- Posts: 2548
- Joined: Thu Nov 14, 2013 2:53 pm
Re: Access to older versions of dcpomatic
I don't keep all the old versions around as it's too expensive to pay for the hard disc space on the server! Especially with test versions.
Can you link the post that suggests going back to 2.11.9, or describe your problems in a bit more detail?
- Posts: 10
- Joined: Fri Jun 16, 2017 3:32 pm
Re: Access to older versions of dcpomatic
I have several encoding servers that, when run stand-alone, add up to 60-70fps, but when running through the batch processor over a 10Gb network I only get 25-30fps.
The post talked about a test someone had done where DCP conversion slowed from 9fps to 4fps after upgrading to 2.6.4, but went back to 9fps when they downgraded to 2.11.8 or 2.11.9. I will look for the post.
I used the process Carsten used to approximate fps from the CPU PassMark rating.
All tests used the Sintel movie at 2K.
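For reference, the approximation itself is simple; this is a rough sketch of the idea (the reference PassMark score and fps below are made-up numbers for illustration, not the actual calibration):

```python
# Sketch of the PassMark -> fps approximation mentioned above.
# ASSUMPTION: encoding speed scales roughly linearly with the CPU's
# multithreaded PassMark score; the reference numbers below are made up
# for illustration and are not the actual calibration figures.

REFERENCE_PASSMARK = 10000   # hypothetical CPU score
REFERENCE_FPS = 30           # hypothetical measured 2K encode fps on that CPU

def estimate_fps(passmark_score: float) -> float:
    """Estimate stand-alone 2K J2K encoding fps from a PassMark CPU score."""
    return REFERENCE_FPS * passmark_score / REFERENCE_PASSMARK

for score in (8000, 15000, 22000):
    print(f"PassMark {score}: ~{estimate_fps(score):.0f} fps")
```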
Thanks
- Posts: 2804
- Joined: Tue Apr 15, 2014 9:11 pm
- Location: Germany
Re: Access to older versions of dcpomatic
I think you are misinterpreting a bug that caused conversion to MP4/ProRes ('Export') to be slow - not conversion into J2K/DCP. That was somewhere between 2.11 and 2.12, and it is fixed. I am not aware of any version-related J2K encoding slowdown.
When you are in the 30-50fps ballpark, I don't think you can increase encoding speed much with remote encoders, not even with 10GE. What is your source footage format?
We have seen network encoding speeds of around 50fps with 1GE - but that was with the BigBuckBunny benchmark, which sends 'tiny' 853/480/8-bit images to the encoding servers. I think it is tough to achieve the same with higher-quality source footage. I think Carl once mentioned that source images are sent to the remote encoding servers losslessly compressed - but I guess that compression has its limits, as far as bandwidth savings and master/slave CPU resources go. Maybe with 10GE, that compression could even become a bottleneck compared to sending raw images. Maybe Carl can add some config options to allow proper diagnosing. Sending him an encoding log might help as well.
We recently had a few threads here, and some private discussions, about very fast remote encoding servers not performing as expected. I think it's not easy to overcome this. The only solution, especially if you go to 12/16-bit images and 4K, is to have as many local cores as possible - that is, a dual-CPU Xeon machine.
I think commercial encoding solutions solve this by keeping the source content strictly organized in reels on a very fast shared network. So each remote encoder has very fast access to its own reel, and the content doesn't need to travel over the network at encoding time - neither source nor MXF/J2K.
I would be happy to dive into this more deeply, but I think that with faster/cheaper multicore CPUs arriving, and an increasing need for 4K, we will experience bottlenecks there, even with 10GE.
- Carsten
- Posts: 10
- Joined: Fri Jun 16, 2017 3:32 pm
Re: Access to older versions of dcpomatic
In lots of discussions you point to benchmarks using 12 iMacs to render. Is there a level at which the host machine cannot hand out enough data to keep the remote servers busy? What kind of machine is best suited to being the host? If there is such a limit, is there a way to tie a host to specific encoding servers so that I can set up multiple host/encoder groups?
All of the source footage is the Sintel benchmark.
We are just starting this year's processing for the Traverse City Film Festival and are topping out at half of the theoretical output (obtained by running the source on each machine to get its local fps, then adding all the results to get a theoretical total).
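That "theoretical total" arithmetic is just a sum; a minimal sketch, where the server names and fps figures are placeholders rather than our actual machines:

```python
# Sketch of the "theoretical total" described above: measure stand-alone fps
# on each machine, sum them, and compare with what the batch converter
# actually delivers. Names and numbers here are placeholders.

standalone_fps = {
    "encode-1": 16,
    "encode-2": 16,
    "encode-3": 30,
}

theoretical_total = sum(standalone_fps.values())   # 62 fps in this example
measured_total = 30                                # fps seen through the batch converter

print(f"theoretical: {theoretical_total} fps")
print(f"measured:    {measured_total} fps "
      f"({measured_total / theoretical_total:.0%} of theoretical)")
```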
Maybe I could run some 16fps servers in pairs to try to achieve real-time processing.
Any recommendations for the best host and batch-converter configuration would be appreciated.
Thanks
- Posts: 2804
- Joined: Tue Apr 15, 2014 9:11 pm
- Location: Germany
Re: Access to older versions of dcpomatic
That 12-iMac benchmark is the one using BigBuckBunny - it uses only 853/480/24-bit images. If DCP-o-matic actually uses on-the-fly compression, this footage will also compress nicely. 24fps at 853/480/24-bit uses only 30MByte/s of network bandwidth to send the source footage to the remote encoding servers. The J2C images that come back will also be comparably small (about 0.5MByte each). So, well below the saturation level of a 1GE network.
Unfortunately, there doesn't seem to be benchmark data for this network with the Sintel video. You would expect that 2048×872/24-bit for Sintel makes a difference, but not as much as you are experiencing.
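To put rough numbers on the network side, here is a quick back-of-the-envelope sketch. It assumes uncompressed 24-bit RGB frames, which is an upper bound - as noted above, the frames are apparently sent losslessly compressed:

```python
# Back-of-the-envelope bandwidth check for shipping source frames to the
# remote encoders. ASSUMPTION: uncompressed 24-bit RGB (3 bytes/pixel), so
# these are upper bounds - lossless compression on the wire will reduce
# the real numbers somewhat.

def source_bandwidth_mbyte_s(width, height, fps=24, bytes_per_pixel=3):
    return width * height * bytes_per_pixel * fps / 1e6

bbb = source_bandwidth_mbyte_s(853, 480)      # ~29 MByte/s
sintel = source_bandwidth_mbyte_s(2048, 872)  # ~129 MByte/s

print(f"BigBuckBunny 853x480 : {bbb:6.1f} MByte/s")
print(f"Sintel 2048x872      : {sintel:6.1f} MByte/s")
# 1GE tops out around 110-120 MByte/s in practice, so uncompressed Sintel
# frames alone would roughly saturate it, whereas 10GE leaves plenty of headroom.
```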
I guess for a festival situation, with many individual files, and very fast physically separate servers, the best approach would be not to distribute the source footage frames over the network, but to copy the individual submission videos to remote servers and issue a local conversion there. That would be a very different batch converter.
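As a very rough illustration of that idea - not an existing DCP-o-matic feature - something like the following could copy submissions to per-server shares and start a conversion locally on each machine. The paths, host names and the "run-dcp-conversion" command are all hypothetical placeholders; substitute whatever CLI or scripted workflow your encoders actually provide:

```python
# Rough illustration only: copy each submission to a per-server share and
# trigger a conversion locally on that machine. Everything named here is a
# hypothetical placeholder.

import shutil
import subprocess
from pathlib import Path

SUBMISSIONS = Path("/festival/submissions")        # hypothetical source folder
REMOTE_DROPS = {
    "encoder-1": Path("/mnt/encoder1/incoming"),   # hypothetical mounted shares
    "encoder-2": Path("/mnt/encoder2/incoming"),
}

def convert_locally(host: str, movie: Path) -> None:
    # Placeholder: kick off a conversion on the remote machine, e.g. via ssh.
    subprocess.run(["ssh", host, "run-dcp-conversion", str(movie)], check=True)

def distribute_round_robin() -> None:
    hosts = list(REMOTE_DROPS)
    for i, movie in enumerate(sorted(SUBMISSIONS.glob("*.mov"))):
        host = hosts[i % len(hosts)]
        dest = REMOTE_DROPS[host] / movie.name
        shutil.copy2(movie, dest)       # the source travels once, before encoding
        convert_locally(host, dest)

if __name__ == "__main__":
    distribute_round_robin()
```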
Can you give us more details on all the systems you use?
Part of the encoding process on the Master GUI/Batch converter is single-threaded. As such, I would think the best setup would be a CPU with a decent number of HW threads (like a 6c/8c HT machine) that can automatically clock very high when fewer threads are in use. Some of the more recent Intel and AMD desktop CPUs allow this.
https://www.cpubenchmark.net/singleThread.html
You can still choose whether or not the local machine participates in the J2K encoding. The more remote encoding power you need to feed, the less J2K encoding should be allowed on the Batch master, so that it does not starve the remote servers. I am very interested in that sort of optimization and benchmarking, but unfortunately I don't have access to such systems. Some performance optimization can be done by closely watching the CPU load on local and remote systems, as well as network and mass-storage throughput.
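As a starting point for that kind of load watching, here is a minimal sketch using the third-party psutil package (assuming it is installed, e.g. via pip) that prints CPU, network and disk rates on whichever machine you run it on:

```python
# Minimal load-watching sketch. Run it on the master and on each encoding
# server during a job to see whether CPU, network or disk throughput is the
# first thing to top out. Requires the third-party psutil package.

import psutil

INTERVAL = 5  # seconds between samples

def byte_totals():
    net = psutil.net_io_counters()
    disk = psutil.disk_io_counters()
    return net.bytes_sent + net.bytes_recv, disk.read_bytes + disk.write_bytes

prev_net, prev_disk = byte_totals()
while True:
    cpu = psutil.cpu_percent(interval=INTERVAL)  # blocks for INTERVAL seconds
    net, disk = byte_totals()
    print(f"cpu {cpu:5.1f}%   "
          f"net {(net - prev_net) / INTERVAL / 1e6:6.1f} MByte/s   "
          f"disk {(disk - prev_disk) / INTERVAL / 1e6:6.1f} MByte/s")
    prev_net, prev_disk = net, disk
```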
Maybe Carl could give us more insight into which parts of the local software are now multithreaded and would benefit from more cores, and which are single-threaded and benefit from a higher clock speed.
- Carsten