amd64: Implement VFMADD213 for Iop_MAddF32 and Iop_MAddF64
commit a5693c1203c3a26443af13182a8082c2e9152f6c
author Mark Wielaard <mark@klomp.org>
Sat, 13 Apr 2024 12:33:19 +0000 (13 14:33 +0200)
committer Mark Wielaard <mark@klomp.org>
Sat, 13 Apr 2024 12:49:21 +0000 (13 14:49 +0200)
tree ae4f055b611bfd42782f2becaffa346add0db319
parent 176e46cd6cfa83361bc323b60a1f85deb646b473

Speed up F32 and F64 FMA on amd64. Add priv/host_amd64_maddf.c,
which implements h_amd64_calc_MAddF32_fma4 and h_amd64_calc_MAddF64_fma4.
These are used instead of the generic variants h_generic_calc_MAddF32
and h_generic_calc_MAddF64 when the host has VEX_HWCAPS_AMD64_FMA4.
Add fma3 and fma4 detection to m_machine.c (machine_get_hwcaps).
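
The following is a minimal sketch of the capability-gated dispatch described
above, not the code from the patch. All demo_* names and DEMO_HWCAP_* bits are
hypothetical stand-ins for the VEX_HWCAPS_AMD64_* flags and the h_*_calc_MAddF64
helpers. Both helper bodies are plain unfused placeholders. The CPUID bit
positions (leaf 1 ECX bit 12 for FMA3, leaf 0x80000001 ECX bit 16 for FMA4) are
the architecturally documented ones. Which flag actually gates the fast path in
the patch is taken from the wording of this message only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability bits; the real flags live in VEX/pub/libvex.h. */
#define DEMO_HWCAP_FMA3 (1u << 0)
#define DEMO_HWCAP_FMA4 (1u << 1)

/* Decode FMA support from raw CPUID register values.
   leaf1_ecx: ECX returned by CPUID leaf 1          (bit 12 = FMA3).
   ext_ecx:   ECX returned by CPUID leaf 0x80000001 (bit 16 = FMA4).
   A real detector must also confirm that the OS saves AVX state
   (OSXSAVE and XGETBV); that check is omitted in this sketch. */
static uint32_t demo_decode_fma_caps(uint32_t leaf1_ecx, uint32_t ext_ecx)
{
    uint32_t caps = 0;
    if (leaf1_ecx & (1u << 12)) caps |= DEMO_HWCAP_FMA3;
    if (ext_ecx & (1u << 16))   caps |= DEMO_HWCAP_FMA4;
    return caps;
}

typedef double (*demo_maddf64_fn)(double a, double b, double c);

/* Placeholder for the generic (software) helper. NOT fused. */
static double demo_maddf64_generic(double a, double b, double c)
{
    return a * b + c;
}

/* Placeholder for the hardware-backed helper. NOT fused here either;
   the real helper would execute a single FMA instruction. */
static double demo_maddf64_fast(double a, double b, double c)
{
    return a * b + c;
}

/* Choose the helper once, based on the host capability bits
   (assumption: the FMA4-style flag gates the fast path). */
static demo_maddf64_fn demo_select_maddf64(uint32_t caps)
{
    return (caps & DEMO_HWCAP_FMA4) ? demo_maddf64_fast
                                    : demo_maddf64_generic;
}

/* Convenience wrapper: select, then call. */
static double demo_maddf64(uint32_t caps, double a, double b, double c)
{
    demo_maddf64_fn fn = demo_select_maddf64(caps);
    assert(fn != NULL);
    return fn(a, b, c);
}
```

A caller would run the decode step once at startup and keep the resulting
capability bits, so that the per-operation cost is a single flag test or an
already-resolved function pointer.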

This patch also fixes the memcheck/tests/vcpu_fnfns and
none/tests/amd64/fma testcases when run on an x86-64-v3 system.

Patch contributed by Grazvydas Ignotas <notasas@gmail.com> and
Bruno Lathuilière <bruno.lathuiliere@edf.fr>

https://bugs.kde.org/show_bug.cgi?id=481127
https://bugs.kde.org/show_bug.cgi?id=463463
https://bugs.kde.org/show_bug.cgi?id=463458
Makefile.vex.am
NEWS
VEX/priv/host_amd64_defs.c
VEX/priv/host_amd64_defs.h
VEX/priv/host_amd64_isel.c
VEX/priv/host_amd64_maddf.c [new file with mode: 0644]
VEX/priv/host_amd64_maddf.h [new file with mode: 0644]
VEX/priv/main_main.c
VEX/pub/libvex.h
coregrind/m_machine.c