golang-image/vector/acc_amd64.go
Nigel Tao 8edbaf3f9e vector: add SIMD versions of xxxAccumulateOpOver.
name                             old time/op  new time/op  delta
GlyphAlpha16Over-8               3.55µs ± 0%  3.17µs ± 0%  -10.58%  (p=0.000 n=10+10)
GlyphAlpha32Over-8               6.73µs ± 1%  4.94µs ± 0%  -26.55%   (p=0.000 n=10+9)
GlyphAlpha64Over-8               16.4µs ± 0%   9.6µs ± 0%  -41.30%   (p=0.000 n=9+10)
GlyphAlpha128Over-8              47.3µs ± 0%  23.1µs ± 1%  -51.09%    (p=0.000 n=9+9)
GlyphAlpha256Over-8               159µs ± 0%    69µs ± 0%  -56.82%    (p=0.000 n=9+8)

A comparison of the non-SIMD and SIMD versions:

name                              time/op
FixedAccumulateOpOver16-8          579ns ± 0%
FixedAccumulateOpOverSIMD16-8      183ns ± 0%
FloatingAccumulateOpOver16-8       670ns ± 1%
FloatingAccumulateOpOverSIMD16-8   242ns ± 0%
FixedAccumulateOpOver64-8         9.61µs ± 0%
FixedAccumulateOpOverSIMD64-8     2.72µs ± 0%
FloatingAccumulateOpOver64-8      11.1µs ± 0%
FloatingAccumulateOpOverSIMD64-8  3.65µs ± 0%

Change-Id: I08273c40ac5445f39b77a88fb8b6b07fd3e5f84b
Reviewed-on: https://go-review.googlesource.com/30831
Reviewed-by: David Crawshaw <crawshaw@golang.org>
2016-10-13 03:04:36 +00:00

// Copyright 2016 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.

// +build !appengine
// +build gc
// +build !noasm

package vector

// haveSSE4_1 has no Go body; it is implemented outside this file.
func haveSSE4_1() bool

// haveFixedAccumulateSIMD reports whether the fixed-point SIMD routines
// below may be used. It is determined once, at package initialization.
var haveFixedAccumulateSIMD = haveSSE4_1()

// haveFloatingAccumulateSIMD is always true for this build configuration.
const haveFloatingAccumulateSIMD = true

//go:noescape
func fixedAccumulateOpOverSIMD(dst []uint8, src []uint32)

//go:noescape
func fixedAccumulateOpSrcSIMD(dst []uint8, src []uint32)

//go:noescape
func floatingAccumulateOpOverSIMD(dst []uint8, src []float32)

//go:noescape
func floatingAccumulateOpSrcSIMD(dst []uint8, src []float32)