View Issue Details
| ID | Project | Category | View Status | Date Submitted | Last Update |
|---|---|---|---|---|---|
| 0002636 | DCP-o-matic | Bugs | public | 2023-10-19 21:37 | 2023-10-22 21:46 |
| Reporter | carl | Assigned To | carl | ||
| Priority | normal | Severity | minor | Reproducibility | N/A |
| Status | acknowledged | Resolution | open | ||
| Target Version | 2.16.x | ||||
| Summary | 0002636: Check vectorisation of rgb_to_xyz in libdcp | ||||
| Description | Does it get vectorised? If not, can we re-arrange it? And does doing any of that make any difference in a benchmark? | ||||
| Tags | No tags attached. | ||||
| Branch | |||||
| Estimated weeks required | |||||
| Estimated work required | Undecided | ||||

Changed CXXFLAGS to

`conf.env.append_value('CXXFLAGS', ['-O3', '-msse2', '-fopt-info-vec-missed', '-ftree-vectorize', '-fno-trapping-math'])`

I think only -O3 is necessary..?

Typical runs of the benchmark are in the region of

`carl@shankly:~/src/libdcp$ time run/benchmark rgb_to_xyz`
`real 0m11,145s`

Not vectorised because "control flow in loop".

Comment out the `/* Out gamma LUT */` and the clamping, and

`carl@shankly:~/src/libdcp$ time run/benchmark rgb_to_xyz`
`real 0m1,708s`

~6.5x quicker (and no mention in the diagnostics of why it couldn't vectorise).

Maybe it's better not to use a LUT for that? It's piecewise and requires a ternary.

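A rough sketch of the "ternary instead of LUT" idea from the note above. This is not libdcp code: `encode_piecewise`, `encode_12bit` and the sRGB-style constants are hypothetical, chosen only to show the shape of a loop body with no `if` statements and no table lookup. Whether a compiler actually vectorises it would still need checking with `-fopt-info-vec-missed` and the benchmark; a call such as `std::pow` may itself block vectorisation unless the toolchain provides a vectorised version.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical piecewise encoding curve. The constants are sRGB-style and
// are an assumption for illustration, not taken from libdcp.
inline float encode_piecewise(float v)
{
    // Ternary select: both arms are plain expressions, no statements.
    return v <= 0.0031308f ? 12.92f * v
                           : 1.055f * std::pow(v, 1.0f / 2.4f) - 0.055f;
}

// Hypothetical kernel: encode linear values to 12-bit integers.
void encode_12bit(const float* in, std::int32_t* out, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i) {
        // Clamp to [0, 1] with min/max rather than if statements.
        float const v = std::min(std::max(in[i], 0.0f), 1.0f);
        float const e = encode_piecewise(v);
        // Round to nearest; e is non-negative so truncation after +0.5 is safe.
        out[i] = static_cast<std::int32_t>(e * 4095.0f + 0.5f);
    }
}
```

Calling `encode_12bit` on a buffer containing out-of-range values exercises the clamp; building it with the CXXFLAGS above would show whether the loop is reported as vectorised.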
Does |
| Date Modified | Username | Field | Change |
|---|---|---|---|
| 2023-10-19 21:37 | carl | New Issue | |
| 2023-10-19 21:37 | carl | Assigned To | => carl |
| 2023-10-19 21:37 | carl | Status | new => acknowledged |
| 2023-10-20 01:56 | carl | Note Added: 0006038 | |
| 2023-10-20 01:56 | carl | Note Added: 0006039 | |
| 2023-10-22 21:46 | carl | Target Version | 2.16.67 => 2.16.x |
| 2023-10-22 21:46 | carl | Estimated work required | => Undecided |