SSE4 is an ambiguous name for several almost disjoint x86 instruction set extensions: SSE4.1 and SSE4.2, both by Intel, and SSE4a by AMD.

Intel


SSE4.1

Intel introduced SSE4.1, with 47 new instructions, in 2007 with the Penryn generation of Core 2 processors, based on the Core microarchitecture.

| Mnemonic | Description | C-Intrinsic |
|----------|-------------|-------------|
| pcmpeqq | packed compare equal qword | __m128i _mm_cmpeq_epi64 (__m128i a, __m128i b) |

See Vulnerable on distant Checks with SSE4.
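
As an illustration of the pcmpeqq intrinsic listed above, here is a minimal sketch that compares two pairs of 64-bit words and condenses the result into a 2-bit mask. The function names (`equal_mask` and its helpers) are hypothetical, and the sketch assumes GCC or Clang, whose `__attribute__((target(...)))` and `__builtin_cpu_supports` extensions are used to compile and select the SSE4.1 path without global compiler flags. A scalar path covers CPUs without SSE4.1.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <smmintrin.h>  /* SSE4.1 intrinsics */

/* SSE4.1 path: pcmpeqq sets a 64-bit lane to all ones where a[i] == b[i]. */
__attribute__((target("sse4.1")))
static int equal_mask_sse41(const uint64_t a[2], const uint64_t b[2])
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    __m128i eq = _mm_cmpeq_epi64(va, vb);
    /* movmskpd collects the top bit of each 64-bit lane into bits 0 and 1. */
    return _mm_movemask_pd(_mm_castsi128_pd(eq));
}
#endif

/* Portable reference with the same result. */
static int equal_mask_scalar(const uint64_t a[2], const uint64_t b[2])
{
    return (a[0] == b[0] ? 1 : 0) | (a[1] == b[1] ? 2 : 0);
}

/* Bit i of the result is set if a[i] == b[i]. */
int equal_mask(const uint64_t a[2], const uint64_t b[2])
{
#if defined(__x86_64__) || defined(__i386__)
    if (__builtin_cpu_supports("sse4.1"))
        return equal_mask_sse41(a, b);
#endif
    return equal_mask_scalar(a, b);
}
```

Calling `equal_mask` on the pairs {1, 2} and {1, 3} yields 1, since only the low words match.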

SSE4.2

SSE4.2 was introduced in 2008 with the Nehalem-based Core i7 and added 7 new instructions.

STTNI

SSE4.2 includes four String and Text New Instructions (STTNI): PCMPESTRI, PCMPESTRM, PCMPISTRI and PCMPISTRM. They work on 128-bit XMM SIMD registers as well as general purpose registers and flags, and perform character searches and comparisons on two operands of 16 bytes at a time, e.g. PCMPESTRI (Packed Compare Explicit Length Strings, Return Index) [1].
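
A character search with PCMPESTRI might look like the following sketch, which finds the first byte of a short text that belongs to a given character set. The name `first_of` and the restriction to inputs of at most 16 bytes are assumptions made for illustration; the target attribute and `__builtin_cpu_supports` are GCC/Clang extensions. Both inputs are copied into padded buffers so the 128-bit loads never read beyond the caller's data.

```c
#include <assert.h>
#include <string.h>

#if defined(__x86_64__) || defined(__i386__)
#include <nmmintrin.h>  /* SSE4.2 intrinsics */

__attribute__((target("sse4.2")))
static int first_of_sse42(const char *text, int text_len,
                          const char *set, int set_len)
{
    char tbuf[16] = {0}, sbuf[16] = {0};
    memcpy(tbuf, text, (size_t)text_len);
    memcpy(sbuf, set, (size_t)set_len);
    __m128i vtext = _mm_loadu_si128((const __m128i *)tbuf);
    __m128i vset  = _mm_loadu_si128((const __m128i *)sbuf);
    /* EQUAL_ANY: the first operand is the character set, the second the
       text. The result is the lowest matching index in the text, or 16
       if no byte within text_len matches. */
    int idx = _mm_cmpestri(vset, set_len, vtext, text_len,
                           _SIDD_UBYTE_OPS | _SIDD_CMP_EQUAL_ANY |
                           _SIDD_LEAST_SIGNIFICANT);
    return idx < text_len ? idx : -1;
}
#endif

/* Portable reference with the same result. */
static int first_of_scalar(const char *text, int text_len,
                           const char *set, int set_len)
{
    for (int i = 0; i < text_len; ++i)
        if (memchr(set, text[i], (size_t)set_len))
            return i;
    return -1;
}

/* Index of the first byte of text that occurs in set, or -1 if none. */
int first_of(const char *text, int text_len, const char *set, int set_len)
{
    assert(text_len >= 0 && text_len <= 16);
    assert(set_len >= 0 && set_len <= 16);
#if defined(__x86_64__) || defined(__i386__)
    if (__builtin_cpu_supports("sse4.2"))
        return first_of_sse42(text, text_len, set, set_len);
#endif
    return first_of_scalar(text, text_len, set, set_len);
}
```

For longer strings the same comparison would be applied block by block, 16 bytes at a time.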

ATAI

POPCNT and CRC32, both working on general purpose registers, were dubbed Application-Targeted Accelerator Instructions (ATAI) as a subset of SSE4.2 [2] [3], but should be considered a disjoint instruction set as far as SSE4 compiler optimizations are concerned. POPCNT, for instance, has its own CPUID feature flag, separate from the SSE4.2 flag.

| Mnemonic | Description | C-Intrinsic |
|----------|-------------|-------------|
| popcnt | Population Count | int _mm_popcnt_u64 (unsigned __int64 a) |
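
Because POPCNT is reported separately from the SIMD parts of SSE4.2, code that uses the intrinsic usually guards it on its own. The sketch below is one way to do that under GCC or Clang; `popcount64` and its helpers are hypothetical names, and the hardware path is limited to x86-64 because `_mm_popcnt_u64` needs a 64-bit general purpose register.

```c
#include <stdint.h>

#if defined(__x86_64__)
#include <nmmintrin.h>  /* _mm_popcnt_u64 */

/* Hardware path: a single popcnt instruction. */
__attribute__((target("popcnt")))
static int popcount64_hw(uint64_t x)
{
    return (int)_mm_popcnt_u64(x);
}
#endif

/* Portable fallback: clear the lowest set bit until none remain. */
static int popcount64_sw(uint64_t x)
{
    int n = 0;
    while (x) {
        x &= x - 1;
        ++n;
    }
    return n;
}

/* Number of set bits in x. */
int popcount64(uint64_t x)
{
#if defined(__x86_64__)
    if (__builtin_cpu_supports("popcnt"))
        return popcount64_hw(x);
#endif
    return popcount64_sw(x);
}
```

In a real program the feature test would typically be done once at startup rather than on every call.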

AMD SSE4a

SSE4a was introduced by AMD with the K10 (Barcelona) microarchitecture.

SIMD

Two new kinds of SIMD instructions working on XMM registers were added: combined mask-shift instructions (EXTRQ/INSERTQ) and scalar streaming store instructions (MOVNTSD/MOVNTSS). These instructions are not available in Intel's SSE4.
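
Since the Intel and AMD extensions are largely disjoint, each one has to be detected individually. The following sketch reads the relevant CPUID bits using `__get_cpuid` from the `<cpuid.h>` header shipped with GCC and Clang; the struct and function names are hypothetical. On non-x86 targets every flag stays false.

```c
#include <stdbool.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>  /* __get_cpuid (GCC/Clang) */
#endif

struct sse4_features {
    bool sse41;   /* CPUID.1:ECX bit 19 */
    bool sse42;   /* CPUID.1:ECX bit 20 */
    bool popcnt;  /* CPUID.1:ECX bit 23 */
    bool lzcnt;   /* CPUID.80000001h:ECX bit 5 (ABM) */
    bool sse4a;   /* CPUID.80000001h:ECX bit 6 */
};

struct sse4_features detect_sse4(void)
{
    struct sse4_features f = { false, false, false, false, false };
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    /* Standard feature leaf: Intel's SSE4.1, SSE4.2 and POPCNT. */
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        f.sse41  = (ecx >> 19) & 1;
        f.sse42  = (ecx >> 20) & 1;
        f.popcnt = (ecx >> 23) & 1;
    }
    /* Extended feature leaf: LZCNT (ABM) and AMD's SSE4a. */
    if (__get_cpuid(0x80000001u, &eax, &ebx, &ecx, &edx)) {
        f.lzcnt = (ecx >> 5) & 1;
        f.sse4a = (ecx >> 6) & 1;
    }
#endif
    return f;
}
```

A caller would check, for example, `detect_sse4().sse4a` before using EXTRQ or INSERTQ.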

Advanced Bit Manipulation

The two Advanced Bit Manipulation instructions, LZCNT and POPCNT, work on general purpose registers. Leading Zero Count was not part of Intel's Application-Targeted Accelerator Instructions of SSE4.2; Intel incorporated it later, together with BMI.

| Mnemonic | Description | C-Intrinsic |
|----------|-------------|-------------|
| lzcnt | Leading Zero Count | unsigned __int64 __lzcnt64 (unsigned __int64 a) |
| popcnt | Population Count | unsigned __int64 __popcnt64 (unsigned __int64 a) |
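
A guarded use of Leading Zero Count could be sketched as follows. Unlike a bit scan, LZCNT is defined for a zero input and returns the operand width, 64. The example assumes GCC or Clang on x86-64, where the intrinsic is available as `__lzcnt64` in `<immintrin.h>`; `leading_zeros64` and its helpers are hypothetical names. The CPUID bit is read directly here, on the assumption that not every compiler version accepts "lzcnt" as a name in `__builtin_cpu_supports`.

```c
#include <stdint.h>

#if defined(__x86_64__)
#include <cpuid.h>      /* __get_cpuid (GCC/Clang) */
#include <immintrin.h>  /* __lzcnt64 */

/* CPUID.80000001h:ECX bit 5 reports LZCNT (AMD names the flag ABM). */
static int has_lzcnt(void)
{
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(0x80000001u, &eax, &ebx, &ecx, &edx))
        return 0;
    return (int)((ecx >> 5) & 1);
}

/* Hardware path: a single lzcnt instruction. */
__attribute__((target("lzcnt")))
static int lzcnt64_hw(uint64_t x)
{
    return (int)__lzcnt64(x);
}
#endif

/* Portable fallback: shift left until the top bit is set. */
static int lzcnt64_sw(uint64_t x)
{
    int n = 0;
    if (x == 0)
        return 64;
    while (!(x >> 63)) {
        x <<= 1;
        ++n;
    }
    return n;
}

/* Number of leading zero bits in x; 64 for x == 0. */
int leading_zeros64(uint64_t x)
{
#if defined(__x86_64__)
    if (has_lzcnt())
        return lzcnt64_hw(x);
#endif
    return lzcnt64_sw(x);
}
```

The runtime check matters because the feature is not implied by SSE4.1 or SSE4.2 support.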

References

  1. PCMPESTRI - Packed Compare Explicit Length Strings, Return Index
  2. MSDN - Streaming SIMD Extensions 4 Instructions, 2.3 SSE4.2 Instruction Set, 2.3.3 Application-Targeted Accelerator Instructions
  3. Application Targeted Accelerators Intrinsics
