Word


A **Word**, or computer word, is the natural unit of data used by a particular computer architecture. On modern computers the word size is usually a power-of-two multiple of the unit of address resolution, typically a Byte: two, four, or eight Bytes, that is 16, 32, or 64 bits. Many other sizes have been used in the past, including 8 (a Byte), 9, 12, 18, 24, 36, 39, 40, 48, and 60 bits. Some early computers were decimal rather than binary, with a word size of 10 or 12 decimal digits, and some had no fixed word length at all.

=16-bit Word=
The size of a word is often fixed for compatibility with earlier computers. Intel's x86 and x86-64 architectures, for example, still use **Word** for the 16-bit unit of the original 8086 16-bit microprocessor. Subsequently Intel introduced the terms Double Word (**dword**) for 32-bit words, Quad Word (**qword**) for 64-bit words, and even Double Quad Word for 128-bit words. x86 and x86-64 registers may still be accessed as word registers (ax versus eax or even rax), but it is recommended to use the native 32-bit double word, because word-wise access requires a prefix byte to override the default operand width. SIMD instruction sets like MMX, AltiVec and SSE2 provide operations on vectors of four or eight words inside appropriate SIMD registers. The IBM 360 and its successors, which have 32-bit words, refer to the 16-bit size as **halfword**.
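
The word, double word and quad word naming can be mirrored in C with fixed-width integer types. The following is only a minimal sketch: the type names `WORD`, `DWORD`, `QWORD` and the helpers `lowWord`, `highWord` and `makeDWORD` are hypothetical names chosen for illustration, and the sketch assumes a C11 compiler that provides `<stdint.h>`.

```c
#include <assert.h>  /* static_assert (C11) */
#include <stdint.h>  /* uint16_t, uint32_t, uint64_t */

/* Hypothetical names following the x86 convention described above. */
typedef uint16_t WORD;   /* 16-bit word        */
typedef uint32_t DWORD;  /* 32-bit double word */
typedef uint64_t QWORD;  /* 64-bit quad word   */

static_assert(sizeof(DWORD) == 2 * sizeof(WORD),  "a double word is two words");
static_assert(sizeof(QWORD) == 4 * sizeof(WORD),  "a quad word is four words");

/* Low 16 bits of a double word, comparable to ax inside eax. */
WORD lowWord(DWORD d)  { return (WORD)(d & 0xFFFFu); }

/* High 16 bits of a double word. */
WORD highWord(DWORD d) { return (WORD)(d >> 16); }

/* Combine two words into one double word: high * 65536 + low. */
DWORD makeDWORD(WORD high, WORD low) { return ((DWORD)high << 16) | low; }
```

For instance, `lowWord(0x12345678)` yields `0x5678` and `highWord(0x12345678)` yields `0x1234`.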

=Short=
On recent 32-bit and 64-bit processors, most compilers implement the primitive C datatypes **short** and **unsigned short** as 16-bit words. In Java, **short** is guaranteed to be 16 bits wide. Signed short in C is assumed to use Two's Complement, although the C standard traditionally did not strictly require it. A Word type explicitly type-defined in C is therefore usually unsigned, which also avoids issues with arithmetical right shifts:
code format="cpp"
typedef unsigned char BYTE;
typedef unsigned short WORD;
code

=Ranges=
||~ type ||~ language ||~ min ||~ max ||
|| unsigned short || C, C++ ||> 0 ||> 65535 ||
|| hexadecimal ||  ||> 0x0000 ||> 0xFFFF ||
|| #include <limits.h> ||  ||  ||> USHRT_MAX ||
|| short || C, C++, Java ||> -32768 ||> 32767 ||
|| hexadecimal ||  ||> 0x8000 ||> 0x7FFF ||
|| #include <limits.h> ||  ||> SHRT_MIN ||> SHRT_MAX ||
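
The ranges above imply that unsigned 16-bit arithmetic wraps around modulo 65536 and that a right shift of an unsigned word always fills with zeros. A minimal sketch, assuming `unsigned short` is 16 bits wide as on the processors discussed here; `addWord` and `shiftRight` are hypothetical helper names:

```c
#include <assert.h>
#include <limits.h>  /* USHRT_MAX, SHRT_MIN, SHRT_MAX */

typedef unsigned short WORD;  /* assumed to be 16 bits wide */

/* Addition wraps modulo 65536: the operands are promoted to int,
   and the cast back to WORD keeps only the low 16 bits. */
WORD addWord(WORD a, WORD b) { return (WORD)(a + b); }

/* Right shift of an unsigned word: vacated high bits are zero.
   For a negative signed short the result of >> is
   implementation-defined in C, which is one reason to prefer
   an unsigned WORD type. */
WORD shiftRight(WORD w, unsigned n) {
    assert(n < 16);  /* shift count must stay inside the word */
    return (WORD)(w >> n);
}
```

With a 16-bit `WORD`, `addWord(USHRT_MAX, 1)` gives 0 and `shiftRight(0x8000, 15)` gives 1.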

=Alignment=
Words should be stored in memory at even byte addresses. Otherwise, accessing them at runtime causes a misalignment exception on some processors and a significant performance penalty on others.
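
When a word may sit at an odd address, for instance inside a packed byte buffer, one common way to stay safe in C is to copy it with `memcpy` instead of dereferencing a misaligned pointer. The sketch below is an illustration under the assumption of a 16-bit `unsigned short`; `isWordAligned` and `loadWord` are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>  /* NULL */
#include <stdint.h>  /* uintptr_t */
#include <string.h>  /* memcpy */

typedef unsigned short WORD;  /* assumed to be 16 bits wide */

/* Returns 1 if p is an even byte address, suitable for a 16-bit word. */
int isWordAligned(const void *p) {
    return ((uintptr_t)p & 1u) == 0;
}

/* Reads a word in host byte order from any address, aligned or not.
   memcpy leaves it to the compiler to emit an access that is valid
   for the target processor. */
WORD loadWord(const void *p) {
    WORD w;
    assert(p != NULL);
    memcpy(&w, p, sizeof w);
    return w;
}
```

A direct cast such as `*(const WORD *)p` on an odd address is what triggers the exception or penalty described above, so the copy is the safer default when alignment is not known.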

=Endianness=
//Main article: Endianness.//
An issue with words consisting of two or more bytes is the order in which the bytes appear inside a word in memory. According to their usual arithmetical significance, a 16-bit word has a low and a high byte, either of which may be stored at the lower byte address in memory. Intel x86 processors have always been so-called little-endian machines: the least significant byte (LSB) is at the lowest address. Other processors, including the IBM 370 family, the PDP-10 (36 bit), the Motorola microprocessor families, and most of the various RISC designs, are big-endian and store the most significant byte first ('big-end-first').
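
The byte order of the host can be observed at runtime by looking at the first byte of a known word, and a word can be converted between the two orders by exchanging its bytes. A minimal sketch, assuming a 16-bit `unsigned short`; `isLittleEndian` and `swapBytes` are hypothetical helper names:

```c
#include <assert.h>
#include <string.h>  /* memcpy */

typedef unsigned char  BYTE;
typedef unsigned short WORD;  /* assumed to be 16 bits wide */

/* Returns 1 if the low byte of a word is stored at the lowest
   address (little-endian), 0 otherwise (big-endian). */
int isLittleEndian(void) {
    WORD w = 0x0001;
    BYTE first;
    assert(sizeof(WORD) == 2);
    memcpy(&first, &w, 1);  /* byte at the lowest address */
    return first == 0x01;
}

/* Exchanges the low and the high byte of a 16-bit word. */
WORD swapBytes(WORD w) {
    return (WORD)((w << 8) | (w >> 8));
}
```

For example, `swapBytes(0xAA55)` yields `0x55AA`, and applying it twice returns the original value.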

=Extracting Bytes=
The following C union, used to extract bytes from words or to synthesize words from bytes, is not portable and should be avoided:
code format="cpp"
union {
   BYTE b[2];
   WORD s;
} u;

u.s = 0xaa55;
assert (u.b[0] == 0x55); // fails, if big-endian
code
The portable way in C uses inlined functions or C preprocessor macros that divide by 256 or take the remainder modulo 256, that is, shift and mask with bitwise 'and'. For the synthesis, the high byte is multiplied by 256 and the low byte is added:
code format="cpp"
BYTE lowByte (WORD s) {return (BYTE)(s & 255);} // mod 256
BYTE highByte(WORD s) {return (BYTE)(s >> 8);}  // div 256

WORD makeWORD (BYTE high, BYTE low) {
   WORD s = high;
   return (WORD)((s << 8) + low); // high * 256 + low
}
code
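
The same shift-and-mask idea extends to reading and writing words in a byte buffer with a fixed byte order, independent of the host's endianness. The following is a sketch under the assumption of a 16-bit `unsigned short`; the function names `storeLittleEndian`, `storeBigEndian`, `loadLittleEndian` and `loadBigEndian` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>  /* NULL */

typedef unsigned char  BYTE;
typedef unsigned short WORD;  /* assumed to be 16 bits wide */

/* Writes w with the low byte first, regardless of host byte order. */
void storeLittleEndian(BYTE *p, WORD w) {
    assert(p != NULL);
    p[0] = (BYTE)(w & 255);  /* low byte:  w mod 256 */
    p[1] = (BYTE)(w >> 8);   /* high byte: w div 256 */
}

/* Writes w with the high byte first, regardless of host byte order. */
void storeBigEndian(BYTE *p, WORD w) {
    assert(p != NULL);
    p[0] = (BYTE)(w >> 8);
    p[1] = (BYTE)(w & 255);
}

/* Reads a word whose low byte is stored first. */
WORD loadLittleEndian(const BYTE *p) {
    assert(p != NULL);
    return (WORD)(p[0] | (p[1] << 8));  /* p[1] * 256 + p[0] */
}

/* Reads a word whose high byte is stored first. */
WORD loadBigEndian(const BYTE *p) {
    assert(p != NULL);
    return (WORD)((p[0] << 8) | p[1]);  /* p[0] * 256 + p[1] */
}
```

Because only arithmetic on values is involved and no word is ever reinterpreted as bytes in memory, these functions behave identically on little-endian and big-endian machines.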

=See also=
 * Byte
 * Double Word
 * Quad Word

=External Links=
 * [|Word from Wikipedia]
 * [|Byte from Wikipedia]
 * [|Endianness from Wikipedia]
 * [|Understanding Big and Little Endian Byte Order]
 * [|IEN 137 - DAV's Endian FAQ - On Holy Wars and a Plea for Peace] by [|Danny Cohen], [|U S C/I S I], April 1, 1980
 * Mahavishnu Orchestra - [|One Word], 1973, [|YouTube] Video
