SSE4 is an ambiguous name for a set of almost disjoint x86 instruction set extensions by Intel and AMD: SSE4.1 and SSE4.2, both by Intel, and SSE4a by AMD.
Intel
SSE4.1
Intel introduced SSE4.1, with 47 new instructions, in 2007 with the Penryn generation of Core 2 processors, based on the Core microarchitecture. See Vulnerable on distant Checks with SSE4.
SSE4.2
SSE4.2 was introduced in 2008 with the Nehalem-based Core i7, adding 7 new instructions.
STTNI
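As an illustration of the STTNI instructions, here is a minimal sketch of how PCMPESTRI can be reached from C through the `_mm_cmpestri` intrinsic. It searches a block of up to 16 bytes for the first byte that belongs to a small character set. The function names `find_any`, `find_any_sse42` and `find_any_scalar` and the macro `USE_X86_64` are hypothetical; the sketch assumes GCC or Clang (for the `target` attribute and `__builtin_cpu_supports`) and falls back to a scalar loop when SSE4.2 is not available.

```c
#include <string.h>

#if defined(__x86_64__)
#include <nmmintrin.h>  /* SSE4.2 intrinsics, including _mm_cmpestri */
#define USE_X86_64 1    /* hypothetical macro, used only in this sketch */
#endif

/* Scalar reference: index of the first byte of text that occurs in set,
   or 16 if there is none (the value PCMPESTRI reports for "no match"). */
static int find_any_scalar(const char *set, int set_len,
                           const char *text, int text_len) {
    for (int i = 0; i < text_len; i++)
        for (int j = 0; j < set_len; j++)
            if (text[i] == set[j])
                return i;
    return 16;
}

#ifdef USE_X86_64
/* Same search with one PCMPESTRI. Both inputs are copied into
   zero-padded 16-byte buffers so the loads never read past the caller's
   data; the explicit lengths tell the instruction which bytes count. */
__attribute__((target("sse4.2")))
static int find_any_sse42(const char *set, int set_len,
                          const char *text, int text_len) {
    char a[16] = {0}, b[16] = {0};
    memcpy(a, set, (size_t)set_len);
    memcpy(b, text, (size_t)text_len);
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    return _mm_cmpestri(va, set_len, vb, text_len,
                        _SIDD_UBYTE_OPS | _SIDD_CMP_EQUAL_ANY |
                        _SIDD_LEAST_SIGNIFICANT);
}
#endif

/* Precondition (assumed, not checked): 0 <= set_len, text_len <= 16. */
int find_any(const char *set, int set_len, const char *text, int text_len) {
#ifdef USE_X86_64
    if (__builtin_cpu_supports("sse4.2"))
        return find_any_sse42(set, set_len, text, text_len);
#endif
    return find_any_scalar(set, set_len, text, text_len);
}
```

For example, `find_any("aeiou", 5, "rhythm and", 10)` yields 7, the position of the first vowel. Longer strings would be processed in 16-byte blocks, which this sketch leaves out.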
SSE4.2 includes five String and Text New Instructions (STTNI), which work on 128-bit XMM SIMD registers as well as general purpose registers and flags to perform character searches and comparisons on two operands of 16 bytes at a time, e.g. PCMPESTRI (Packed Compare Explicit Length Strings, Return Index) [1].
ATAI
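As an illustration of the two ATAI instructions, POPCNT and CRC32, the following sketch wraps the `_mm_popcnt_u64` and `_mm_crc32_u8` intrinsics and checks each CPU feature separately at run time. The function names and the macro `USE_X86_64` are hypothetical, GCC or Clang is assumed, and the scalar fallbacks are ordinary textbook routines rather than anything prescribed by the source. The CRC32 instruction uses the CRC-32C (Castagnoli) polynomial, so the fallback uses its reflected form 0x82F63B78.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__)
#include <nmmintrin.h>  /* _mm_popcnt_u64, _mm_crc32_u8 */
#define USE_X86_64 1    /* hypothetical macro, used only in this sketch */
#endif

/* Scalar population count: clears the lowest set bit per iteration. */
static unsigned popcount64_scalar(uint64_t x) {
    unsigned n = 0;
    while (x) {
        x &= x - 1;
        n++;
    }
    return n;
}

/* Scalar bitwise CRC-32C update, reflected polynomial 0x82F63B78. */
static uint32_t crc32c_update_scalar(uint32_t crc,
                                     const unsigned char *p, size_t n) {
    while (n--) {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return crc;
}

#ifdef USE_X86_64
/* POPCNT has its own feature flag, so it gets its own target. */
__attribute__((target("popcnt")))
static unsigned popcount64_hw(uint64_t x) {
    return (unsigned)_mm_popcnt_u64(x);
}

/* One CRC32 instruction per input byte. */
__attribute__((target("sse4.2")))
static uint32_t crc32c_update_hw(uint32_t crc,
                                 const unsigned char *p, size_t n) {
    while (n--)
        crc = _mm_crc32_u8(crc, *p++);
    return crc;
}
#endif

unsigned popcount64(uint64_t x) {
#ifdef USE_X86_64
    if (__builtin_cpu_supports("popcnt"))
        return popcount64_hw(x);
#endif
    return popcount64_scalar(x);
}

/* CRC-32C with the usual initial value and final inversion. */
uint32_t crc32c(const void *data, size_t n) {
    const unsigned char *p = (const unsigned char *)data;
    uint32_t crc = 0xFFFFFFFFu;
#ifdef USE_X86_64
    if (__builtin_cpu_supports("sse4.2"))
        crc = crc32c_update_hw(crc, p, n);
    else
#endif
        crc = crc32c_update_scalar(crc, p, n);
    return crc ^ 0xFFFFFFFFu;
}
```

In a bitboard setting, `popcount64(0x00FF00000000FF00)` counts the 16 pawns of the initial position. Testing "popcnt" and "sse4.2" separately is a design choice of this sketch.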
POPCNT and CRC32, which work on general purpose registers, were dubbed Application-Targeted Accelerator Instructions (ATAI) as a subset of SSE4.2 [2] [3], but should be considered a disjoint instruction set as far as SSE4 compiler optimizations are concerned.
AMD SSE4a
SSE4a was introduced by AMD with the K10 (Barcelona) microarchitecture.
SIMD
SSE4a adds two kinds of SIMD instructions working on XMM registers: combined mask-shift instructions (EXTRQ/INSERTQ) and scalar streaming store instructions (MOVNTSD/MOVNTSS). These instructions are not available in Intel's SSE4.
Advanced Bit Manipulation
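As an illustration of Leading Zero Count, the sketch below uses the `_lzcnt_u64` intrinsic when the compiler is told to target LZCNT (for instance with `-mlzcnt` on GCC or Clang) and a scalar loop otherwise. The names `lzcnt64` and `msb_index` are hypothetical. Compile-time selection is an assumption of this sketch, chosen because no run-time check is made here.

```c
#include <stdint.h>

#if defined(__LZCNT__) && defined(__x86_64__)
#include <immintrin.h>  /* _lzcnt_u64 */
#endif

/* Number of leading zero bits in x. Returns 64 for x == 0, which is the
   result the LZCNT instruction gives for a zero 64-bit operand. */
unsigned lzcnt64(uint64_t x) {
#if defined(__LZCNT__) && defined(__x86_64__)
    return (unsigned)_lzcnt_u64(x);
#else
    unsigned n = 0;
    if (x == 0)
        return 64;
    while ((x >> 63) == 0) {  /* shift left until the top bit is set */
        x <<= 1;
        n++;
    }
    return n;
#endif
}

/* Index (0..63) of the most significant set bit.
   The caller must guarantee x != 0. */
unsigned msb_index(uint64_t x) {
    return 63u - lzcnt64(x);
}
```

With bitboards, `msb_index` plays the role of a reverse bit scan: for the rank-2 mask `0xFF00` it returns 15.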
The two instructions, POPCNT and LZCNT (Leading Zero Count), work on general purpose registers. LZCNT was not part of Intel's Application-Targeted Accelerator Instructions of SSE4.2, but Intel later incorporated it with BMI.
See also
Manuals
Forum Posts
External Links
References