November 17, 2014

The maturity of H.265/HEVC video compression: a study of x265 v.1.4

In this post I share my test results on x265 coding efficiency to track its development progress. x265 version 1.4 was used for this test, together with the JCT-VC test sequences. It is worth mentioning that the Class E sequence set has changed, as described in JCTVC-O0022. However, to keep the results comparable with earlier tests I still use the old test sequence set.

Two configurations of the x265 encoder were prepared. The "ultra fast" preset is used to measure the maximum achievable encoding speed. The command line arguments for this configuration are the following:
x265 -p ultrafast -t psnr --rect --amp --keyint -1 --bframes 3 --ipratio 1.0 --pbratio 1.0 --b-adapt 0
The second configuration uses the "very slow" preset to estimate the maximum quality that x265 is able to provide. Its command line arguments are:
x265 -p veryslow -t psnr --rect --amp --keyint -1 --bframes 3 --ipratio 1.0 --pbratio 1.0 --b-adapt 0
Both configurations produce the same GOP structure, with one P frame followed by three B frames. All frames in a sequence are coded with the same quantization parameter; no rate control is applied.
The test computer has an Intel Core i7-4700HQ CPU (2.4 GHz) and 8 GB of RAM. The x265 encoder is built as a 64-bit (x64) binary.
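The two command lines above differ only in the preset, so a test script can build both from one shared option list. The following is a minimal sketch of that idea, not the script used for this post. The QP values (22, 27, 32, 37, the usual JCT-VC common test condition points), the file names, and the helper names are assumptions made for illustration; the post itself only states that a constant QP is used.

```python
# Options shared by both configurations, taken from the command lines above
# (long option names are used instead of -p and -t).
COMMON_ARGS = [
    "--tune", "psnr", "--rect", "--amp",
    "--keyint", "-1", "--bframes", "3",
    "--ipratio", "1.0", "--pbratio", "1.0", "--b-adapt", "0",
]

PRESETS = ("ultrafast", "veryslow")

# Assumption: the four QP points of the JCT-VC common test conditions.
QP_VALUES = (22, 27, 32, 37)


def build_command(preset, source, width, height, fps, qp, output):
    """Return the x265 argument list for one constant-QP encode."""
    return [
        "x265", "--preset", preset, *COMMON_ARGS,
        "--qp", str(qp),
        "--input", source,
        "--input-res", f"{width}x{height}",
        "--fps", str(fps),
        "--output", output,
    ]


def build_runs(source, width, height, fps):
    """Return one command per (preset, QP) pair for a raw YUV source."""
    runs = []
    for preset in PRESETS:
        for qp in QP_VALUES:
            output = f"{preset}_qp{qp}.hevc"  # hypothetical output name
            runs.append(build_command(preset, source, width, height, fps, qp, output))
    return runs


if __name__ == "__main__":
    # Hypothetical input file name; print the commands instead of running them.
    for command in build_runs("Kimono_1920x1080_24.yuv", 1920, 1080, 24):
        print(" ".join(command))
```

Each printed line can be passed to a shell or to `subprocess.run` once the x265 binary and the source file are in place.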

The compression efficiency is compared to the HM reference encoder v. 13.0. The "Low Delay Main" configuration file is used as the basis, and the GOP structure was changed to follow the PBBB pattern. The comparison to the HM reference encoder is provided in terms of the Bjontegaard delta rate (BD-RATE). In short, the BD-RATE estimates the average increase (for positive values) or decrease (for negative values) in bitrate of one test set over another at equal quality. In this experiment the HM is the anchor, so the BD-RATE expresses the bitrate change of x265 as a percentage of the HM bitrate. For instance, a BD-RATE value of 15.89% means that x265 on average needs 15.89% more bitrate than the HM at the same PSNR quality level. The average YUV PSNR is calculated as follows:

\(PSNR_{YUV}=\frac{1}{8}\cdot (6\cdot PSNR_{Y}+PSNR_{U}+PSNR_{V})\)
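As a small illustration of the formula above, the weighted average can be written as a one-line function. The sketch also includes the usual PSNR definition for a single plane; the 8-bit peak value of 255 and the function names are assumptions, since the post does not say how the per-plane PSNR values were obtained.

```python
import math


def psnr_from_mse(mse, peak=255.0):
    """PSNR in dB for one plane; peak=255 assumes 8-bit samples."""
    if mse == 0:
        return math.inf  # identical planes
    return 10.0 * math.log10(peak * peak / mse)


def psnr_yuv(psnr_y, psnr_u, psnr_v):
    """Weighted YUV PSNR: luma counts six times, each chroma plane once."""
    return (6.0 * psnr_y + psnr_u + psnr_v) / 8.0


if __name__ == "__main__":
    # Made-up per-plane values, only to show the call.
    print(psnr_yuv(38.2, 41.5, 42.1))
```

The 6:1:1 weighting keeps the result dominated by luma, which is why the BD-RATE and BD-Y-RATE columns in the tables usually move together.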

The BD-Y-RATE is the bitrate change with respect to the Y-PSNR (luma) quality level. The BD-Y-PSNR is the Y-PSNR quality difference between the x265 and the HM encoders at the same bitrate. Likewise, the BD-UV-PSNR is the UV-PSNR (the average of the two chroma PSNRs) quality difference between the x265 and the HM encoders at the same bitrate.
FPS stands for 'frames per second' and gives an estimate of the average encoding speed.
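To make the BD-RATE figure more concrete, here is a simplified sketch of how such a number can be computed from two rate-distortion curves. It is not the tool used for the tables in this post. The usual Bjontegaard calculation fits a cubic polynomial to each curve; this sketch swaps in piecewise-linear interpolation of log-bitrate over PSNR to stay dependency-free, so its results will differ slightly from a cubic fit. The function names and the sample points are made up for illustration.

```python
import math


def _interp(x, xs, ys):
    """Linear interpolation of y at x; xs must be strictly increasing."""
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            t = (x - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + t * (ys[i + 1] - ys[i])
    raise ValueError("x is outside the interpolation range")


def _mean_log_rate(points, lo, hi):
    """Average of log10(bitrate) over the PSNR interval [lo, hi]."""
    pts = sorted(points, key=lambda p: p[1])
    psnr = [p[1] for p in pts]
    log_rate = [math.log10(p[0]) for p in pts]
    # The curve is piecewise linear, so the trapezoid rule on its knots is exact.
    knots = [lo] + [q for q in psnr if lo < q < hi] + [hi]
    values = [_interp(q, psnr, log_rate) for q in knots]
    area = sum(
        (knots[i + 1] - knots[i]) * (values[i] + values[i + 1]) / 2.0
        for i in range(len(knots) - 1)
    )
    return area / (hi - lo)


def bd_rate(anchor, test):
    """Average bitrate change of `test` over `anchor`, in percent.

    Both arguments are lists of (bitrate, psnr) pairs with distinct PSNR values.
    A positive result means `test` needs more bitrate at the same PSNR.
    """
    lo = max(min(p[1] for p in anchor), min(p[1] for p in test))
    hi = min(max(p[1] for p in anchor), max(p[1] for p in test))
    if hi <= lo:
        raise ValueError("the two curves do not overlap in PSNR")
    diff = _mean_log_rate(test, lo, hi) - _mean_log_rate(anchor, lo, hi)
    return (10.0 ** diff - 1.0) * 100.0


if __name__ == "__main__":
    # Made-up (kbit/s, dB) points: the test encoder needs 20% more bitrate.
    hm = [(1000.0, 32.0), (2000.0, 35.0), (4000.0, 38.0), (8000.0, 41.0)]
    other = [(rate * 1.2, psnr) for rate, psnr in hm]
    print(round(bd_rate(hm, other), 2))
```

Passing Y-PSNR values gives the BD-Y-RATE, while passing the weighted YUV PSNR gives the BD-RATE as defined in this post.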

Table 1. The performance results for x265 v.1.4 in "Very Slow" preset
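Columns: Class, Sequence, Resolution, BD-RATE (%), BD-Y-RATE (%), BD-Y-PSNR (dB), BD-UV-PSNR (dB), FPS. All BD values are relative to the HM with the PBBB GOP structure. A row that omits the class or resolution shares it with the row above.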
A Traffic 2560×1600 15.89 12.15 -0.37 -0.58 0.44
PeopleOnStreet 12.13 5.96 -0.26 -1.13 0.20
B Kimono 1920×1080 7.95 1.59 -0.04 -0.65 0.53
ParkScene 20.45 15.52 -0.45 -0.77 0.74
Cactus 10.60 6.20 -0.11 -0.24 0.67
BQTerrace -2.48 -14.02 0.26 -0.75 0.75
BasketballDrive 17.75 10.54 -0.18 -0.62 0.54
C RaceHorses (C) 832×480 8.68 4.23 -0.18 -0.78 1.05
BQMall 20.13 15.95 -0.63 -1.00 2.09
PartyScene 18.33 14.06 -0.69 -1.11 1.38
BasketballDrill 21.58 18.35 -0.68 -1.00 2.11
D RaceHorses (D) 416×240 12.03 7.53 -0.39 -1.02 2.71
BQSquare 7.77 3.46 -0.14 -0.66 4.59
BlowingBubbles 16.68 13.23 -0.54 -0.86 3.62
BasketballPass 25.25 20.47 -0.90 -1.50 6.99
E Vidyo1 1280×720 29.05 24.59 -0.65 -0.74 3.58
Vidyo3 23.37 17.11 -0.51 -0.77 2.64
Vidyo4 28.92 21.47 -0.49 -0.88 2.60
F BasketballDrillText 832×480 20.76 17.15 -0.68 -1.12 2.21
ChinaSpeed 1024×768 26.62 22.30 -1.30 -1.60 1.14
SlideEditing 1280×720 29.78 28.82 -4.23 -2.42 3.03
SlideShow 10.82 12.94 -0.89 -0.19 2.65

The performance results for the "very slow" preset are provided in Table 1. As can be seen, the BD-RATE varies from -2.48% for the BQTerrace test sequence, the only one where x265 needs less bitrate than the HM, to 29.78% for the SlideEditing test sequence. The compression speed is rather slow, but it is sufficient for file-based transcoding of SD and perhaps HD video. Nevertheless, the compression efficiency indicated by the BD-RATE estimation is significantly worse than that of the HM. This may be compensated by good rate control, though.
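The spread of the BD-RATE column in Table 1 is easy to check with a few lines of code. The sketch below copies the BD-RATE values from Table 1 and reports the best case, the worst case, and the plain arithmetic mean; the function name is made up for illustration, and an unweighted mean over all 22 sequences is only one of several reasonable ways to summarize the table.

```python
# BD-RATE (%) values copied from Table 1 ("very slow" preset versus HM).
TABLE1_BD_RATE = {
    "Traffic": 15.89, "PeopleOnStreet": 12.13,
    "Kimono": 7.95, "ParkScene": 20.45, "Cactus": 10.60,
    "BQTerrace": -2.48, "BasketballDrive": 17.75,
    "RaceHorses (C)": 8.68, "BQMall": 20.13,
    "PartyScene": 18.33, "BasketballDrill": 21.58,
    "RaceHorses (D)": 12.03, "BQSquare": 7.77,
    "BlowingBubbles": 16.68, "BasketballPass": 25.25,
    "Vidyo1": 29.05, "Vidyo3": 23.37, "Vidyo4": 28.92,
    "BasketballDrillText": 20.76, "ChinaSpeed": 26.62,
    "SlideEditing": 29.78, "SlideShow": 10.82,
}


def summarize(bd_rates):
    """Return (best sequence, worst sequence, mean BD-RATE) for a table column."""
    best = min(bd_rates, key=bd_rates.get)
    worst = max(bd_rates, key=bd_rates.get)
    mean = sum(bd_rates.values()) / len(bd_rates)
    return best, worst, mean


if __name__ == "__main__":
    best, worst, mean = summarize(TABLE1_BD_RATE)
    print(f"best:  {best} ({TABLE1_BD_RATE[best]:+.2f}%)")
    print(f"worst: {worst} ({TABLE1_BD_RATE[worst]:+.2f}%)")
    print(f"mean:  {mean:+.2f}%")
```

The same function can be applied to the BD-RATE column of any other results table in the same format.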

Table 2. The performance results for x265 v.1.4 in "Ultra Fast" preset
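Columns: Class, Sequence, Resolution, BD-RATE (%), BD-Y-RATE (%), BD-Y-PSNR (dB), BD-UV-PSNR (dB), FPS. All BD values are relative to the HM with the PBBB GOP structure. A row that omits the class or resolution shares it with the row above.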
A Traffic 2560×1600 58.15 58.80 -1.50 -0.81 10.09
PeopleOnStreet 57.31 52.96 -1.89 -1.57 3.82
B Kimono 1920×1080 62.10 60.37 -1.51 -1.00 8.33
ParkScene 51.18 51.72 -1.28 -0.77 16.10
Cactus 66.26 62.11 -1.06 -0.65 14.95
BQTerrace 67.79 59.40 -1.15 -0.77 20.53
BasketballDrive 75.55 69.34 -1.04 -1.08 11.74
C RaceHorses (C) 832×480 57.62 53.00 -1.89 -1.51 30.52
BQMall 66.94 64.75 -2.09 -1.46 62.34
PartyScene 62.36 59.46 -2.30 -1.70 43.72
BasketballDrill 72.62 70.14 -2.14 -1.73 60.07
D RaceHorses (D) 416×240 56.65 52.65 -2.20 -1.82 102.59
BQSquare 78.93 79.91 -2.52 -1.07 241.60
BlowingBubbles 63.45 62.16 -2.04 -1.46 146.21
BasketballPass 50.23 47.88 -1.84 -1.53 270.74
E Vidyo1 1280×720 75.05 73.47 -1.64 -0.98 93.92
Vidyo3 91.89 93.06 -2.09 -1.09 68.43
Vidyo4 85.45 82.20 -1.48 -1.26 68.51
F BasketballDrillText 832×480 72.24 69.75 -2.26 -1.94 60.14
ChinaSpeed 1024×768 104.44 104.90 -4.37 -2.63 27.37
SlideEditing 1280×720 113.78 117.56 -11.52 -4.63 62.47
SlideShow 86.47 85.50 -4.74 -3.91 37.21

Table 2 provides the performance results for the x265 encoder in the "ultra fast" preset. The compression efficiency is drastically worse, even compared to the "very slow" preset: the BD-RATE ranges from 50.23% (BasketballPass) to 113.78% (SlideEditing). This configuration does not deliver the benefits of the HEVC video compression technology and may be used only to test and debug real-time compression systems. On a more powerful CPU the x265 encoder in the "ultra fast" preset is able to provide real-time compression for FullHD 1080p@30 Hz video sequences.
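The gap to real-time operation on the test machine can be estimated from the FPS column of Table 2. The sketch below takes the Class B (1920×1080) rows and computes how much faster the encoder would have to run to reach 30 frames per second. The 30 Hz target comes from the paragraph above; the function name is made up, and the linear speed-up factor is only a rough guide because encoding speed does not scale linearly with clock frequency or core count.

```python
TARGET_FPS = 30.0

# FPS values copied from Table 2 for the Class B (1920x1080) sequences.
CLASS_B_ULTRAFAST_FPS = {
    "Kimono": 8.33,
    "ParkScene": 16.10,
    "Cactus": 14.95,
    "BQTerrace": 20.53,
    "BasketballDrive": 11.74,
}


def required_speedup(measured_fps, target_fps=TARGET_FPS):
    """Factor by which encoding must speed up to reach the target frame rate."""
    return target_fps / measured_fps


if __name__ == "__main__":
    for name, fps in CLASS_B_ULTRAFAST_FPS.items():
        print(f"{name}: {fps:.2f} fps, needs x{required_speedup(fps):.2f}")
```

None of the 1080p sequences reaches 30 frames per second on the test machine, so every factor printed here is greater than one.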

As for encoding speed on other hardware, some interesting results were found here. The encoder speed was measured on a FullHD video sequence, a trailer for the movie "Max Schmeling". The "fast" preset was used for testing, and the general x265 configuration is:
--crf 20 --preset fast
where --crf selects constant rate factor mode, a quality-targeted variable bitrate (VBR) rate control.


Table 3. The x265 v.1.4 performance on different CPUs at FullHD video coding
Processor model, Type*, Cores, Freq. (GHz), RAM type, RAM freq. (MHz), RAM channels, Average FPS (x64 build)
Intel Core i7-5960X DT 8 4.5 DDR4 3000 4 32.42
Intel Core i7-5960X DT 8 4.4 DDR4 2666 4 31.35
Intel Core i7-5820K DT 6 4.5 DDR4 2666 2 25.44
Intel Core i7-5820K DT 6 4.37 DDR4 3000 2 24.61
Intel Core i7-4930K DT 6 3.9 DDR3 1600 2 17.89
Intel Core i7-4790K DT 4 4.4 DDR3 2000 2 17.73
Intel Core i5-4670 DT 4 4.5 DDR3 2200 1 15.86
Intel Core i7-4790 DT 4 4.0 DDR3 1600 2 15.28
Intel Core i7-4770 DT 4 3.9 DDR3 1600 2 15.08
Intel Core i7-3770K DT 4 4.5 DDR3 2200 2 15.02
AMD FX-8350 DT 8 4.67 DDR3 1600 2 14.97
AMD FX-8320 DT 8 4.63 DDR3 1600 2 14.95
AMD FX-8320 DT 8 4.4 DDR3 2200 2 14.46
Intel Core i7-2600K DT 4 4.5 DDR3 1600 2 14.23
AMD FX-8320 DT 8 4.2 DDR3 1600 2 13.68
Intel Core i7-4770K DT 4 3.9 DDR3 1600 2 12.80
Intel Core i7-3770 DT 4 3.9 DDR3 1600 2 12.30
Intel Core i7-3570K DT 4 4.2 DDR3 1600 2 12.04
Intel Core i5-2500K DT 4 4.3 DDR3 2200 2 11.61
Intel Core i7-4700MQ M 4 3.4 DDR3 1600 2 11.60
Intel Core i7-2600K DT 4 3.6 DDR3 1333 2 10.69
Intel Core i5-3470 DT 4 3.6 DDR3 1600 2 9.76
AMD FX-8120 DT 8 3.4 DDR3 1333 2 9.70
Intel Core i7-860 DT 4 3.35 DDR3 1333 2 8.10
Intel Core i7-920 DT 4 2.67 DDR3 1066 3 8.01
Intel Core i3-4370 DT 2 3.8 DDR3 1600 2 7.93
AMD Opteron 6234 WS 12 2.4 DDR3 1333 2 7.88
Intel Core i5-2300 DT 4 2.8 DDR3 1333 2 7.79
Intel Core i7-2630QM M 4 2.8 DDR3 1333 2 7.43
AMD A8-7600 DT 4 3.8 DDR3 1800 2 6.12
Intel Core2 Quad Q9650 DT 4 3.0 DDR2 800 2 6.02
Intel Core i5-3230M M 2 3.2 DDR3 1600 2 5.07
Intel Core i7-3517U M 2 2.4 DDR3 1333 1 4.50
Intel Core i5-460M M 2 2.53 DDR3 1066 2 3.94
Intel Core i5-480M M 2 2.93 DDR3 1066 2 3.90
AMD A4-5300 DT 2 3.6 DDR3 1333 2 2.89
AMD Phenom II X6 1090T DT 6 3.4 DDR3 1600 2 2.84
Intel Core i5-4200U M 2 2.3 DDR3 1333 2 2.79
Intel Core 2 Quad Q6600 DT 4 3.6 DDR2 800 2 2.61
Intel Celeron 2955U M 2 1.4 DDR3 1600 2 2.36
AMD Athlon 64 X2 3800+ DT 2 2.0 DDR2 400 2 0.46
* WS — workstation, DT — desktop, M — mobile/notebook processor

