8 Commits

Author SHA1 Message Date
Nigel Tao
8874bef159 vector: change ϕ from 10 to 9.
This slight loss in quality allows us to use int32 math exclusively
throughout raster_fixed.go, instead of occasionally dropping into int64
math. The change in ϕ doesn't affect the benchmarks noticeably, but
staying in int32 does. The net effect:

name                              old time/op  new time/op  delta
GlyphAlpha16Over-8                3.36µs ± 0%  2.99µs ± 0%  -10.89%         (p=0.000 n=10+9)
GlyphAlpha16Src-8                 3.26µs ± 0%  2.89µs ± 1%  -11.34%         (p=0.000 n=9+10)
GlyphAlpha32Over-8                5.20µs ± 0%  4.53µs ± 0%  -12.76%         (p=0.000 n=8+10)
GlyphAlpha32Src-8                 4.81µs ± 1%  4.14µs ± 0%  -13.91%          (p=0.000 n=9+9)
GlyphAlpha64Over-8                10.2µs ± 0%   9.0µs ± 1%  -11.99%         (p=0.000 n=9+10)
GlyphAlpha64Src-8                 8.62µs ± 0%  7.42µs ± 1%  -13.89%         (p=0.000 n=9+10)
GlyphAlpha128Over-8               24.1µs ± 0%  21.8µs ± 0%   -9.32%          (p=0.000 n=9+9)
GlyphAlpha128Src-8                17.9µs ± 0%  15.6µs ± 0%  -12.68%         (p=0.000 n=9+10)
GlyphAlpha256Over-8               70.1µs ± 0%  66.3µs ± 1%   -5.44%        (p=0.000 n=10+10)
GlyphAlpha256Src-8                45.2µs ± 1%  41.2µs ± 1%   -8.92%        (p=0.000 n=10+10)
GlyphRGBA16Over-8                 5.12µs ± 0%  4.75µs ± 0%   -7.15%         (p=0.000 n=10+9)
GlyphRGBA16Src-8                  4.57µs ± 1%  4.20µs ± 0%   -8.18%          (p=0.000 n=9+8)
GlyphRGBA32Over-8                 12.1µs ± 0%  11.4µs ± 0%   -5.50%         (p=0.000 n=10+9)
GlyphRGBA32Src-8                  10.0µs ± 0%   9.3µs ± 1%   -6.80%         (p=0.000 n=10+9)
GlyphRGBA64Over-8                 37.2µs ± 0%  36.0µs ± 0%   -3.17%          (p=0.000 n=9+8)
GlyphRGBA64Src-8                  29.0µs ± 1%  27.9µs ± 1%   -4.05%         (p=0.000 n=9+10)
GlyphRGBA128Over-8                 134µs ± 1%   131µs ± 0%   -1.85%          (p=0.000 n=9+9)
GlyphRGBA128Src-8                  100µs ± 1%    98µs ± 0%   -2.27%         (p=0.000 n=10+9)
GlyphRGBA256Over-8                 506µs ± 0%   503µs ± 0%   -0.56%         (p=0.000 n=10+8)
GlyphRGBA256Src-8                  373µs ± 0%   370µs ± 0%   -1.01%         (p=0.000 n=10+9)
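
The int32 argument above can be illustrated with a minimal sketch. It assumes that ϕ counts the fractional bits of a fixed-point value, so the product of two such values carries 2ϕ fractional bits. The helper names (headroomBits, mulFitsInt32) and the 2048-pixel magnitude are hypothetical and are not taken from raster_fixed.go; the real rasterizer's intermediate products may differ.

```go
package main

import (
	"fmt"
	"math"
)

// headroomBits returns how many integer bits remain in a signed 32-bit
// word once the product of two fixed-point values, each with phi
// fractional bits, has claimed 2*phi bits for its fraction.
func headroomBits(phi uint) int {
	return 31 - 2*int(phi)
}

// mulFitsInt32 reports whether a*b is representable as an int32. The
// product is computed in int64 so that the check itself cannot overflow.
func mulFitsInt32(a, b int32) bool {
	p := int64(a) * int64(b)
	return p >= math.MinInt32 && p <= math.MaxInt32
}

func main() {
	const width = 2048 // hypothetical coordinate magnitude, in pixels
	for _, phi := range []uint{10, 9} {
		a := int32(width) << phi // width expressed in fixed point
		b := int32(1) << phi     // 1.0 expressed in fixed point
		fmt.Printf("phi=%d headroom=%d bits, product fits int32: %v\n",
			phi, headroomBits(phi), mulFitsInt32(a, b))
	}
}
```

Under these assumptions, dropping one fractional bit frees two bits of headroom in each two-factor product, which is the kind of margin that can let a computation stay in int32.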

Change-Id: Ie02afac6fd6fa95f090bf3fe0a5c744799ea9dc5
Reviewed-on: https://go-review.googlesource.com/31532
Reviewed-by: David Crawshaw <crawshaw@golang.org>
2016-10-20 04:12:57 +00:00
Nigel Tao
fa54d6fa1c vector: simplify fixedLineTo computation.
name                              old time/op  new time/op   delta
GlyphAlpha16Over-8                3.38µs ± 0%   3.36µs ± 0%   -0.54%        (p=0.000 n=10+10)
GlyphAlpha16Src-8                 3.28µs ± 0%   3.26µs ± 0%   -0.69%         (p=0.000 n=10+9)
GlyphAlpha32Over-8                5.23µs ± 1%   5.20µs ± 0%   -0.58%         (p=0.000 n=10+8)
GlyphAlpha32Src-8                 4.83µs ± 0%   4.81µs ± 1%   -0.46%         (p=0.001 n=10+9)
GlyphAlpha64Over-8                10.2µs ± 0%   10.2µs ± 0%   -0.21%         (p=0.003 n=10+9)
GlyphAlpha64Src-8                 8.68µs ± 1%   8.62µs ± 0%   -0.70%         (p=0.000 n=10+9)
GlyphAlpha128Over-8               24.2µs ± 1%   24.1µs ± 0%   -0.58%         (p=0.001 n=10+9)
GlyphAlpha128Src-8                18.0µs ± 1%   17.9µs ± 0%   -0.61%         (p=0.001 n=10+9)
GlyphAlpha256Over-8               70.3µs ± 0%   70.1µs ± 0%   -0.37%        (p=0.019 n=10+10)
GlyphAlpha256Src-8                45.4µs ± 0%   45.2µs ± 1%   -0.38%         (p=0.041 n=8+10)
GlyphRGBA16Over-8                 5.14µs ± 0%   5.12µs ± 0%   -0.43%         (p=0.000 n=9+10)
GlyphRGBA16Src-8                  4.59µs ± 0%   4.57µs ± 1%   -0.43%          (p=0.005 n=9+9)
GlyphRGBA32Over-8                 12.2µs ± 1%   12.1µs ± 0%   -0.70%        (p=0.000 n=10+10)
GlyphRGBA32Src-8                  10.0µs ± 0%   10.0µs ± 0%     ~            (p=0.092 n=7+10)
GlyphRGBA64Over-8                 37.5µs ± 1%   37.2µs ± 0%   -0.75%         (p=0.000 n=10+9)
GlyphRGBA64Src-8                  29.1µs ± 0%   29.0µs ± 1%     ~            (p=0.243 n=10+9)
GlyphRGBA128Over-8                 135µs ± 0%    134µs ± 1%   -0.72%          (p=0.000 n=9+9)
GlyphRGBA128Src-8                  101µs ± 1%    100µs ± 1%     ~           (p=0.197 n=10+10)
GlyphRGBA256Over-8                 511µs ± 0%    506µs ± 0%   -0.97%        (p=0.000 n=10+10)
GlyphRGBA256Src-8                  374µs ± 0%    373µs ± 0%   -0.29%        (p=0.002 n=10+10)

Change-Id: Ic05a900935cb59e55711374db1e62b055d75c8e3
Reviewed-on: https://go-review.googlesource.com/31116
Reviewed-by: David Crawshaw <crawshaw@golang.org>
2016-10-16 03:06:30 +00:00
Nigel Tao
f72412cfe3 vector: fix overflow when rasterizing a 30 degree line.
There are some obvious TODOs, but they will be follow-up commits.

This is about correctness, not performance, but for the record:

name                              old time/op  new time/op   delta
GlyphAlpha16Over-8                3.16µs ± 0%   3.38µs ± 0%   +6.96%         (p=0.000 n=9+10)
GlyphAlpha16Src-8                 3.06µs ± 0%   3.28µs ± 0%   +7.21%        (p=0.000 n=10+10)
GlyphAlpha32Over-8                4.92µs ± 1%   5.23µs ± 1%   +6.24%        (p=0.000 n=10+10)
GlyphAlpha32Src-8                 4.53µs ± 1%   4.83µs ± 0%   +6.69%        (p=0.000 n=10+10)
GlyphAlpha64Over-8                9.60µs ± 1%  10.21µs ± 0%   +6.36%        (p=0.000 n=10+10)
GlyphAlpha64Src-8                 8.04µs ± 0%   8.68µs ± 1%   +7.99%         (p=0.000 n=9+10)
GlyphAlpha128Over-8               23.1µs ± 0%   24.2µs ± 1%   +5.08%         (p=0.000 n=9+10)
GlyphAlpha128Src-8                16.8µs ± 0%   18.0µs ± 1%   +6.76%        (p=0.000 n=10+10)
GlyphAlpha256Over-8               68.6µs ± 1%   70.3µs ± 0%   +2.50%        (p=0.000 n=10+10)
GlyphAlpha256Src-8                43.6µs ± 0%   45.4µs ± 0%   +4.08%         (p=0.000 n=10+8)
GlyphRGBA16Over-8                 4.92µs ± 0%   5.14µs ± 0%   +4.48%          (p=0.000 n=9+9)
GlyphRGBA16Src-8                  4.39µs ± 0%   4.59µs ± 0%   +4.60%          (p=0.000 n=8+9)
GlyphRGBA32Over-8                 11.8µs ± 0%   12.2µs ± 1%   +2.89%         (p=0.000 n=9+10)
GlyphRGBA32Src-8                  9.79µs ± 1%  10.03µs ± 0%   +2.49%         (p=0.000 n=10+7)
GlyphRGBA64Over-8                 36.7µs ± 1%   37.5µs ± 1%   +2.23%        (p=0.000 n=10+10)
GlyphRGBA64Src-8                  28.5µs ± 0%   29.1µs ± 0%   +2.09%        (p=0.000 n=10+10)
GlyphRGBA128Over-8                 133µs ± 0%    135µs ± 0%   +1.51%         (p=0.000 n=10+9)
GlyphRGBA128Src-8                 99.1µs ± 0%  100.5µs ± 1%   +1.47%         (p=0.000 n=9+10)
GlyphRGBA256Over-8                 505µs ± 0%    511µs ± 0%   +1.18%         (p=0.000 n=9+10)
GlyphRGBA256Src-8                  372µs ± 0%    374µs ± 0%   +0.69%         (p=0.000 n=9+10)

Change-Id: Ice1d77de5bc2649f8cd88366bcae3c00e78d65c2
Reviewed-on: https://go-review.googlesource.com/31113
Reviewed-by: David Crawshaw <crawshaw@golang.org>
2016-10-14 22:25:13 +00:00
Nigel Tao
beb9675609 vector: fix overflow when rasterizing wide lines.
Change-Id: Iea92b74ca9533de2ef17534ee3acf4f40c3d03ef
Reviewed-on: https://go-review.googlesource.com/30899
Reviewed-by: David Crawshaw <crawshaw@golang.org>
2016-10-14 02:01:44 +00:00
Nigel Tao
8edbaf3f9e vector: add SIMD versions of xxxAccumulateOpOver.
name                             old time/op  new time/op  delta
GlyphAlpha16Over-8               3.55µs ± 0%  3.17µs ± 0%  -10.58%  (p=0.000 n=10+10)
GlyphAlpha32Over-8               6.73µs ± 1%  4.94µs ± 0%  -26.55%   (p=0.000 n=10+9)
GlyphAlpha64Over-8               16.4µs ± 0%   9.6µs ± 0%  -41.30%   (p=0.000 n=9+10)
GlyphAlpha128Over-8              47.3µs ± 0%  23.1µs ± 1%  -51.09%    (p=0.000 n=9+9)
GlyphAlpha256Over-8               159µs ± 0%    69µs ± 0%  -56.82%    (p=0.000 n=9+8)

A comparison of the non-SIMD and SIMD versions:

name                              time/op
FixedAccumulateOpOver16-8          579ns ± 0%
FixedAccumulateOpOverSIMD16-8      183ns ± 0%
FloatingAccumulateOpOver16-8       670ns ± 1%
FloatingAccumulateOpOverSIMD16-8   242ns ± 0%
FixedAccumulateOpOver64-8         9.61µs ± 0%
FixedAccumulateOpOverSIMD64-8     2.72µs ± 0%
FloatingAccumulateOpOver64-8      11.1µs ± 0%
FloatingAccumulateOpOverSIMD64-8  3.65µs ± 0%

Change-Id: I08273c40ac5445f39b77a88fb8b6b07fd3e5f84b
Reviewed-on: https://go-review.googlesource.com/30831
Reviewed-by: David Crawshaw <crawshaw@golang.org>
2016-10-13 03:04:36 +00:00
Nigel Tao
746988e7a2 vector: add SIMD versions of xxxAccumulateOpSrc.
name                         old time/op  new time/op  delta
GlyphAlpha16Src-8            3.37µs ± 0%  3.07µs ± 1%   -8.86%    (p=0.000 n=9+9)
GlyphAlpha32Src-8            6.01µs ± 1%  4.55µs ± 0%  -24.28%   (p=0.000 n=10+9)
GlyphAlpha64Src-8            13.2µs ± 0%   8.1µs ± 0%  -38.69%   (p=0.000 n=10+9)
GlyphAlpha128Src-8           32.9µs ± 0%  16.9µs ± 0%  -48.85%   (p=0.000 n=10+9)
GlyphAlpha256Src-8           98.0µs ± 0%  43.6µs ± 1%  -55.50%  (p=0.000 n=10+10)

A comparison of the non-SIMD and SIMD versions:

name                             time/op
FixedAccumulateOpSrc16-8          368ns ± 0%
FixedAccumulateOpSrcSIMD16-8     86.8ns ± 1%
FloatingAccumulateOpSrc16-8       434ns ± 0%
FloatingAccumulateOpSrcSIMD16-8   119ns ± 0%
FixedAccumulateOpSrc64-8         6.12µs ± 0%
FixedAccumulateOpSrcSIMD64-8     1.17µs ± 0%
FloatingAccumulateOpSrc64-8      7.15µs ± 0%
FloatingAccumulateOpSrcSIMD64-8  1.68µs ± 1%

Change-Id: I58e5c7a3ecd12e536aab8e765e94275453d0eac8
Reviewed-on: https://go-review.googlesource.com/30431
Reviewed-by: David Crawshaw <crawshaw@golang.org>
2016-10-10 07:32:32 +00:00
Nigel Tao
72141d56a2 vector: re-order some functions.
There are no code changes, just a re-ordering so that these files are
consistent with others in this package: OpOver, OpSrc, Mask.

Change-Id: Ib1d46a8e912dae0c760af655e919b77023688189
Reviewed-on: https://go-review.googlesource.com/30111
Reviewed-by: David Crawshaw <crawshaw@golang.org>
2016-10-02 02:10:30 +00:00
Nigel Tao
992afa5d48 vector: add a fixed point math implementation.
name                      old time/op  new time/op  delta

GlyphAlpha16Over-8        4.48µs ± 1%  3.56µs ± 0%  -20.70%   (p=0.000 n=9+10)
GlyphAlpha16Src-8         4.17µs ± 0%  3.38µs ± 1%  -19.09%  (p=0.000 n=10+10)
GlyphAlpha32Over-8        9.03µs ± 0%  6.74µs ± 0%  -25.33%   (p=0.000 n=9+10)
GlyphAlpha32Src-8         7.46µs ± 1%  5.98µs ± 0%  -19.80%   (p=0.000 n=10+9)
GlyphAlpha64Over-8        21.3µs ± 0%  16.4µs ± 0%  -22.84%  (p=0.000 n=10+10)
GlyphAlpha64Src-8         16.2µs ± 1%  13.1µs ± 0%  -19.33%  (p=0.000 n=10+10)
GlyphAlpha128Over-8       59.8µs ± 0%  47.2µs ± 0%  -21.11%    (p=0.000 n=9+9)
GlyphAlpha128Src-8        41.3µs ± 1%  33.0µs ± 0%  -20.26%   (p=0.000 n=9+10)
GlyphAlpha256Over-8        197µs ± 0%   158µs ± 0%  -19.44%   (p=0.000 n=9+10)
GlyphAlpha256Src-8         124µs ± 0%    98µs ± 0%  -21.17%    (p=0.000 n=9+9)

GlyphAlphaLoose16Over-8   4.73µs ± 0%  3.97µs ± 1%  -16.06%  (p=0.000 n=10+10)
GlyphAlphaLoose16Src-8    4.41µs ± 0%  3.64µs ± 1%  -17.50%  (p=0.000 n=10+10)
GlyphAlphaLoose32Over-8   9.62µs ± 0%  8.47µs ± 0%  -11.95%  (p=0.000 n=10+10)
GlyphAlphaLoose32Src-8    8.25µs ± 0%  7.19µs ± 0%  -12.88%    (p=0.000 n=9+9)
GlyphAlphaLoose64Over-8   25.6µs ± 0%  22.2µs ± 0%  -13.01%    (p=0.000 n=9+9)
GlyphAlphaLoose64Src-8    20.2µs ± 0%  17.2µs ± 1%  -14.98%  (p=0.000 n=10+10)
GlyphAlphaLoose128Over-8  83.4µs ± 1%  68.2µs ± 0%  -18.27%  (p=0.000 n=10+10)
GlyphAlphaLoose128Src-8   59.8µs ± 0%  47.4µs ± 0%  -20.77%   (p=0.000 n=10+9)
GlyphAlphaLoose256Over-8   273µs ± 1%   239µs ± 0%  -12.52%   (p=0.000 n=10+9)
GlyphAlphaLoose256Src-8    187µs ± 0%   155µs ± 1%  -16.91%   (p=0.000 n=9+10)

GlyphRGBA16Over-8         5.99µs ± 0%  5.24µs ± 1%  -12.60%   (p=0.000 n=9+10)
GlyphRGBA16Src-8          5.48µs ± 0%  4.68µs ± 0%  -14.68%   (p=0.000 n=9+10)
GlyphRGBA32Over-8         14.6µs ± 0%  13.5µs ± 0%   -7.60%    (p=0.000 n=9+9)
GlyphRGBA32Src-8          12.6µs ± 0%  11.4µs ± 0%   -9.62%    (p=0.000 n=9+9)
GlyphRGBA64Over-8         44.8µs ± 0%  42.2µs ± 0%   -5.69%    (p=0.000 n=9+9)
GlyphRGBA64Src-8          36.6µs ± 1%  33.5µs ± 1%   -8.55%    (p=0.000 n=9+9)
GlyphRGBA128Over-8         162µs ± 0%   148µs ± 1%   -8.85%   (p=0.000 n=10+9)
GlyphRGBA128Src-8          129µs ± 1%   114µs ± 0%  -11.61%   (p=0.000 n=9+10)
GlyphRGBA256Over-8         588µs ± 0%   573µs ± 0%   -2.53%   (p=0.000 n=9+10)
GlyphRGBA256Src-8          455µs ± 0%   426µs ± 1%   -6.51%   (p=0.000 n=9+10)

GlyphNRGBA16Over-8        27.0µs ± 4%  26.3µs ± 2%   -2.65%   (p=0.001 n=9+10)
GlyphNRGBA16Src-8         19.4µs ± 3%  18.6µs ± 1%   -4.35%   (p=0.000 n=9+10)
GlyphNRGBA32Over-8        97.4µs ± 3%  96.8µs ± 2%     ~      (p=0.447 n=9+10)
GlyphNRGBA32Src-8         66.6µs ± 3%  64.5µs ± 1%   -3.21%   (p=0.000 n=10+9)
GlyphNRGBA64Over-8         372µs ± 3%   368µs ± 1%     ~     (p=0.105 n=10+10)
GlyphNRGBA64Src-8          235µs ± 1%   234µs ± 1%     ~       (p=0.130 n=8+8)
GlyphNRGBA128Over-8       1.45ms ± 2%  1.48ms ± 3%   +2.06%    (p=0.014 n=9+9)
GlyphNRGBA128Src-8         926µs ± 3%   937µs ± 1%     ~      (p=0.113 n=10+9)
GlyphNRGBA256Over-8       5.76ms ± 2%  5.90ms ± 3%   +2.29%   (p=0.001 n=9+10)
GlyphNRGBA256Src-8        3.59ms ± 1%  3.86ms ± 1%   +7.46%   (p=0.000 n=9+10)

Change-Id: I72f25193b5be4e57af09e9eea4eee50545a34cbf
Reviewed-on: https://go-review.googlesource.com/29972
Reviewed-by: David Crawshaw <crawshaw@golang.org>
2016-09-30 06:47:22 +00:00