A memory management unit ( MMU ), sometimes called paged memory management unit ( PMMU ), is a computer hardware unit that examines all memory references on the memory bus , translating these requests, known as virtual memory addresses , into physical addresses in main memory .
116-398: Data structure alignment is the way data is arranged and accessed in computer memory . It consists of three separate but related issues: data alignment , data structure padding , and packing . The CPU in modern computer hardware performs reads and writes to memory most efficiently when the data is naturally aligned , which generally means that the data's memory address is a multiple of
232-441: A 1 KB tiny page. ARM uses a two-level page table if using 4 KB and 64 KB pages, or just a one-level page table for 1 MB sections and 16 MB sections. TLB updates are performed automatically by page table walking hardware. PTEs include read/write access permission based on privilege, cacheability information, an NX bit , and a non-secure bit. DEC Alpha processors divide memory into 8 KB , 16 KB , 32 KB , or 64 KB ;
348-437: A is said to be n- byte aligned when a is a multiple of n (where n is a power of 2). In this context, a byte is the smallest unit of memory access, i.e. each memory address specifies a different byte. An n -byte aligned address would have a minimum of log 2 ( n ) least-significant zeros when expressed in binary . The alternate wording b-bit aligned designates a b/8 byte aligned address (ex. 64-bit aligned
464-441: A mass storage cache and write buffer to improve both reading and writing performance. Operating systems borrow RAM capacity for caching so long as it is not needed by running software. If needed, contents of the computer memory can be transferred to storage; a common way of doing this is through a memory management technique called virtual memory . Modern computer memory is implemented as semiconductor memory , where data
580-426: A 16-bit address that made it too small as memory sizes increased in the 1970s. This was addressed by expanding the physical memory bus to 18-bits, and using an MMU to add two more bits based on other pins on the processor bus to indicate which program was accessing memory. Another use of this same technique, although not referred to as paging but bank switching , was widely used by early 8-bit microprocessors like
696-786: A 32-bit machine, a data structure containing a 16-bit value followed by a 32-bit value could have 16 bits of padding between the 16-bit value and the 32-bit value to align the 32-bit value on a 32-bit boundary. Alternatively, one can pack the structure, omitting the padding, which may lead to slower access, but uses three quarters as much memory. Although data structure alignment is a fundamental issue for all modern computers, many computer languages and computer language implementations handle data alignment automatically. Fortran , Ada , PL/I , Pascal , certain C and C++ implementations, D , Rust , C# , and assembly language allow at least partial control of data structure padding, which may be useful in certain special circumstances. A memory address
812-434: A 32-bit system are: Some data types are dependent on the implementation. Here is a structure with members of various types, totaling 8 bytes before compilation: After compilation the data structure will be supplemented with padding bytes to ensure a proper alignment for each of its members: The compiled size of the structure is now 12 bytes. The last member is padded with the number of bytes required so that
928-401: A C structure when the purpose is the efficient mapping of that area through a hardware address translation mechanism (PCI remapping, operation of a MMU ). For instance, on a 32-bit operating system, a 4 KiB (4096 bytes) page is not just an arbitrary 4 KiB chunk of data. Instead, it is usually a region of memory that's aligned on a 4 KiB boundary. This is because aligning
1044-408: A TLB exception occurs when processing a TLB exception, a double fault TLB exception, it is dispatched to its own exception handler . MIPS32 and MIPS32r2 support 32 bits of virtual address space and up to 36 bits of physical address space. MIPS64 supports up to 64 bits of virtual address space and up to 59 bits of physical address space. The original Sun-1 is a single-board computer built around
1160-539: A TLB resolution of 0x2CFC7 to 0x12345 to issue a physical access to pa=0x12345ABC. Here, the 20/12-bit split luckily matches the hexadecimal representation split at 5/3 digits. The hardware can implement this translation by simply combining the first 20 bits of the physical address (0x12345) and the last 12 bits of the virtual address (0xABC). This is also referred to as virtually indexed (ABC) physically tagged (12345). A block of data of size 2 − 1 always has one sub-block of size 2 aligned on 2 bytes. This
1276-421: A compiled size of 6 bytes on a 32-bit system. The above directives are available in compilers from Microsoft , Borland , GNU , and many others. Another example: On some Microsoft compilers, particularly for RISC processors, there is an unexpected relationship between project default packing (the /Zp directive) and the #pragma pack directive. The #pragma pack directive can only be used to reduce
SECTION 10
#17331148615991392-643: A context is 1024 pages or 2 MB. The maximum physical address that can be mapped simultaneously is also 2 MB. The context register is important in a multitasking operating system because it allows the CPU to switch between processes without reloading all the translation state information. The 4-bit context register can switch between 16 sections of the segment map under supervisor control, which allows 16 contexts to be mapped concurrently. Each context has its own virtual address space. Sharing of virtual address space and inter-context communications can be provided by writing
1508-459: A contiguous series of fixed-sized blocks. This is similar to the modern demand paging system in that the result is a series of pages, but in these earlier systems the list of pages is fixed in size and normally stored in some form of fast memory like static RAM to improve performance. In this case, the two parts of the address stored by the MMU are known as the segment number and page index . Consider
1624-458: A fault on write bit. The MIPS architecture supports one to 64 entries in the TLB. The number of TLB entries is configurable at CPU configuration before synthesis. TLB entries are dual. Each TLB entry maps a virtual page number (VPN2) to either one of two page frame numbers (PFN0 or PFN1), depending on the least significant bit of the virtual address that is not part of the page mask . This bit and
1740-442: A fixed set of blocks instead of loading them on demand. The difference between these two approaches is the size of the contiguous block of memory; paged systems break up main memory into a series of equal sized blocks, while segmented systems generally allow for variable sizes. Early memory management systems, often implemented in software, set aside a portion of memory to hold a series of mappings. These consisted of pairs of values,
1856-516: A fixed-size list of pages that divided up memory; this meant that the block size was a function of the number of pages and the installed memory. Another common technique, found mostly on larger machines, was segmented translation, which allowed for variable-size blocks of memory that better mapped onto program requests. This was efficient but did not map as well onto virtual memory. Some early systems, especially 8-bit systems, used very simple MMUs to perform bank switching . Modern MMUs typically divide
1972-425: A form of space–time tradeoff . Although use of "packed" structures is most frequently used to conserve memory space , it may also be used to format a data structure for transmission using a standard protocol. However, in this usage, care must also be taken to ensure that the values of the struct members are stored with the endianness required by the protocol (often network byte order ), which may be different from
2088-418: A list of page number originally expressed by the program and the actual page number in main memory. When it attempts to access memory, the MMU reads the segment number from the processor's memory bus, finds the corresponding entry for that program in its internal memory, and expresses the mapped version of the value on the memory's bus while the lower bits of the original address are passed through unchanged. Like
2204-483: A multiple of 4 ( alignof(float) )). In this example the total size of the structure sizeof (FinalPadShort) == 6 , not 5 (not 8 either) (so that the size is a multiple of 2 ( alignof(short) == 2 on linux-32bit/gcc)). It is possible to change the alignment of structures to reduce the memory they require (or to conform to an existing format) by reordering structure members or changing the compiler's alignment (or “packing”) of structure members. The compiled size of
2320-420: A page on a page-sized boundary lets the hardware map a virtual address to a physical address by substituting the higher bits in the address, rather than doing complex arithmetic. Example: Assume that we have a TLB mapping of virtual address 0x2CFC7000 to physical address 0x12345000. (Note that both these addresses are aligned at 4 KiB boundaries.) Accessing data located at virtual address va=0x2CFC7ABC causes
2436-433: A page table entry or other per-page information prohibits access to a particular virtual page, perhaps because no physical random-access memory (RAM) has been allocated to that virtual page. In this case, the MMU signals a page fault to the CPU. The operating system (OS) then handles the situation, perhaps by trying to find a spare frame of RAM and set up the page map to map it to the requested virtual address. If no RAM
SECTION 20
#17331148615992552-481: A processor design with 24-bit addressing, like the original Motorola 68000 . In such a system, the MMU splits the virtual address into parts, for instance, the 13 least significant bits for the page index and the remaining 11 most significant bits as the segment number. This results a list of 2048 pages of 8 kB each. In this approach, memory requests result in one or more pages being granted to that program, which may not be contiguous in main memory. The MMU maintains
2668-411: A program refers to a location in a page that is not in physical memory, the MMU will cause an interrupt to the operating system . The OS will then select a lesser-used block in memory, write it to backing storage such as a hard drive if it has been modified since it was read in, read the page from backing storage into that block, and set up the MMU to map the block to the originally requested page so
2784-553: A request, but this is spread out and cannot be allocated. On systems where programs start and stop over time, this can eventually lead to memory being highly fragmented and no large blocks remaining. A number of algorithms were developed to address this problem. Segmenting was widely used on microcomputer platforms of the 1980s. Among the MMUs that used this concept were the Motorola 68451 and Signetics 68905, but many other examples exist. It
2900-785: A set of bits to index the root level of the tree, a set of bits to index the middle level of the tree, a set of bits to index the leaf level of the tree, and remaining bits that pass through to the physical address without modification, indexing a byte within the page. The sizes of the fields are dependent on the page size; all three tree index fields are the same size. The OpenVMS AXP PALcode supports full read and write permission bits for user, supervisor, executive, and kernel modes, and also supports fault on read/write/execute bits are also supported. The DEC OSF/1 PALcode supports full read and write permission bits for user and kernel modes, and also supports fault on read/write/execute bits are also supported. The Windows NT AXP PALcode can either walk
3016-473: A set of bits to index the root level of the tree, a set of bits to index the top level of the tree, a set of bits to index the leaf level of the tree, and remaining bits that pass through to the physical address without modification, indexing a byte within the page. The sizes of the fields are dependent on the page size. The Windows NT AXP PALcode supports a page being accessible only from kernel mode or being accessible from user and kernel mode, and also supports
3132-399: A single memory word is accessed the operation is atomic, i.e. the whole memory word is read or written at once and other devices must wait until the read or write operation completes before they can access it. This may not be true for unaligned accesses to multiple memory words, e.g. the first word might be read by one device, both words written by another device and then the second word read by
3248-427: A single-level page table in a virtual address space or a two-level page table in physical address space. The upper 32 bits of an address are ignored. For a single-level page table, addresses are broken down into a set of bits to index the page table and remaining bits that pass through to the physical address without modification, indexing a byte within the page. For a two-level page table, addresses are borken down into
3364-482: A structure may be declared UNALIGNED to eliminate all padding except around bit strings. One use for such "packed" structures is to conserve memory. For example, a structure containing a single byte (such as a char) and a four-byte integer (such as uint32_t) would require three additional bytes of padding. A large array of such structures would use 37.5% less memory if they are packed, although accessing each structure might take longer. This compromise may be considered
3480-487: A uniform fashion. The MMU is implemented in hardware on the CPU board. The MMU consists of a context register, a segment map and a page map. Virtual addresses from the CPU are translated into intermediate addresses by the segment map, which in turn are translated into physical addresses by the page map. The page size is 2 KB and the segment size is 32 KB which gives 16 pages per segment. Up to 16 contexts can be mapped concurrently. The maximum logical address space for
3596-470: A very fast memory and thus reduces the need to talk to the slower main memory. In some implementations, they are also responsible for bus arbitration , controlling access to the memory bus among the many parts of the computer that desire access. Prior to VM systems becoming widespread in the 1990s, earlier MMU designs were more varied. Common among these was paged translation, which was similar to modern demand paging in that it used fixed-size blocks, but had
Data structure alignment - Misplaced Pages Continue
3712-593: Is semi-volatile . The term is used to describe a memory that has some limited non-volatile duration after power is removed, but then data is ultimately lost. A typical goal when using a semi-volatile memory is to provide the high performance and durability associated with volatile memories while providing some benefits of non-volatile memory. For example, some non-volatile memory types experience wear when written. A worn cell has increased volatility but otherwise continues to work. Data locations which are written frequently can thus be directed to use worn circuits. As long as
3828-402: Is 8 bytes aligned). A memory access is said to be aligned when the data being accessed is n bytes long and the datum address is n -byte aligned. When a memory access is not aligned, it is said to be misaligned . Note that by definition byte memory accesses are always aligned. A memory pointer that refers to primitive data that is n bytes long is said to be aligned if it
3944-464: Is a system where each program is given an area of memory to use and is prevented from going outside that range. If the operating system detects that a program has tried to alter memory that does not belong to it, the program is terminated (or otherwise restricted or redirected). This way, only the offending program crashes, and other programs are not affected by the misbehavior (whether accidental or intentional). Use of protected memory greatly enhances both
4060-409: Is also used to describe semi-volatile behavior constructed from other memory types, such as nvSRAM , which combines SRAM and a non-volatile memory on the same chip , where an external signal copies data from the volatile memory to the non-volatile memory, but if power is removed before the copy occurs, the data is lost. Another example is battery-backed RAM , which uses an external battery to power
4176-470: Is computer memory that requires power to maintain the stored information. Most modern semiconductor volatile memory is either static RAM (SRAM) or dynamic RAM (DRAM). DRAM dominates for desktop system memory. SRAM is used for CPU cache . SRAM is also found in small embedded systems requiring little memory. SRAM retains its contents as long as the power is connected and may use a simpler interface, but commonly uses six transistors per bit . Dynamic RAM
4292-549: Is free, it may be necessary to choose an existing page (known as a victim ), using some replacement algorithm , and save it to disk (a process called paging ). With some MMUs, there can also be a shortage of PTEs, in which case the OS will have to free one for the new mapping. The MMU may also generate illegal access error conditions or invalid page faults upon illegal or non-existing memory accesses, respectively, leading to segmentation fault or bus error conditions when handled by
4408-410: Is how a dynamic allocator that has no knowledge of alignment, can be used to provide aligned buffers, at the price of a factor two in space loss. where aligntonext( p , r ) works by adding an aligned increment, then clearing the r least significant bits of p . A possible implementation is Computer memory Computer memory stores information, such as data and programs, for immediate use in
4524-918: Is more complicated for interfacing and control, needing regular refresh cycles to prevent losing its contents, but uses only one transistor and one capacitor per bit, allowing it to reach much higher densities and much cheaper per-bit costs. Non-volatile memory can retain the stored information even when not powered. Examples of non-volatile memory include read-only memory , flash memory , most types of magnetic computer storage devices (e.g. hard disk drives , floppy disks and magnetic tape ), optical discs , and early computer storage methods such as magnetic drum , paper tape and punched cards . Non-volatile memory technologies under development include ferroelectric RAM , programmable metallization cell , Spin-transfer torque magnetic RAM , SONOS , resistive random-access memory , racetrack memory , Nano-RAM , 3D XPoint , and millipede memory . A third category of memory
4640-430: Is not present when compiling for x86. It would be beneficial to allocate memory aligned to cache lines . If an array is partitioned for more than one thread to operate on, having the sub-array boundaries unaligned to cache lines could lead to performance degradation. Here is an example to allocate memory (double array of size 10) aligned to cache of 64 bytes. Alignment concerns can affect areas much larger than
4756-446: Is one of the benefits of paging . However, paged mapping causes another problem, internal fragmentation . This occurs when a program requests a block of memory that does not cleanly map into a page, for instance, if a program requests a 1 KB buffer to perform file work. In this case, the request results in an entire page being set aside even though only 1 KB of the page will ever be used; if pages are larger than 1 KB,
Data structure alignment - Misplaced Pages Continue
4872-424: Is only allowed to contain addresses that are n -byte aligned, otherwise it is said to be unaligned . A memory pointer that refers to a data aggregate (a data structure or array) is aligned if (and only if) each primitive datum in the aggregate is aligned. Note that the definitions above assume that each primitive datum is a power of two bytes long. When this is not the case (as with 80-bit floating-point on x86 )
4988-411: Is organized into memory cells each storing one bit (0 or 1). Flash memory organization includes both one bit per memory cell and a multi-level cell capable of storing multiple bits per cell. The memory cells are grouped into words of fixed word length , for example, 1, 2, 4, 8, 16, 32, 64 or 128 bits. Each word can be accessed by a binary address of N bits, making it possible to store 2 words in
5104-479: Is physically stored or whether the user's computer will have enough memory. The operating system will place actively used data in RAM, which is much faster than hard disks. When the amount of RAM is not sufficient to run all the current programs, it can result in a situation where the computer spends more time moving data from RAM to disk and back than it does accomplishing tasks; this is known as thrashing . Protected memory
5220-404: Is properly aligned. In addition, the data structure as a whole may be padded with a final unnamed member. This allows each member of an array of structures to be properly aligned. Padding is only inserted when a structure member is followed by a member with a larger alignment requirement or at the end of the structure. By changing the ordering of members in a structure, it is possible to change
5336-488: Is stored within memory cells built from MOS transistors and other components on an integrated circuit . There are two main kinds of semiconductor memory: volatile and non-volatile . Examples of non-volatile memory are flash memory and ROM , PROM , EPROM , and EEPROM memory. Examples of volatile memory are dynamic random-access memory (DRAM) used for primary storage and static random-access memory (SRAM) used mainly for CPU cache . Most semiconductor memory
5452-399: Is very small. An OS may treat multiple pages as if they were a single larger page. For example, Linux on VAX groups eight pages together. Thus, the system is viewed as having 4 KB pages. The VAX divides memory into four fixed-purpose regions, each 1 GB in size. They are: Page tables are big linear arrays. Normally, this would be very wasteful when addresses are used at both ends of
5568-490: The Electrotechnical Laboratory in 1972. Flash memory was invented by Fujio Masuoka at Toshiba in the early 1980s. Masuoka and colleagues presented the invention of NOR flash in 1984, and then NAND flash in 1987. Toshiba commercialized NAND flash memory in 1987. Developments in technology and economies of scale have made possible so-called very large memory (VLM) computers. Volatile memory
5684-581: The MOS 6502 . For instance, the Atari MMU would express additional bits on the address bus to select among several banks of DRAM memory based on which of the chips was currently active, normally the CPU or ANTIC . This was used to expand the available memory on the Atari 130XE to 128 kB. The Commodore 128 used a similar approach. Most modern systems divide memory into pages that are 4–64 KB in size, often with
5800-489: The Motorola 68000 microprocessor and introduced in 1982. It includes the original Sun 1 memory management unit that provides address translation, memory protection, memory sharing and memory allocation for multiple processes running on the CPU. All access of the CPU to private on-board RAM, external Multibus memory, on-board I/O and the Multibus I/O runs through the MMU, where address translation and protection are done in
5916-488: The Royal Radar Establishment proposed digital storage systems that use CMOS (complementary MOS) memory cells, in addition to MOSFET power devices for the power supply , switched cross-coupling, switches and delay-line storage . The development of silicon-gate MOS integrated circuit (MOS IC) technology by Federico Faggin at Fairchild in 1968 enabled the production of MOS memory chips . NMOS memory
SECTION 50
#17331148615996032-544: The System/360 Model 95 . Toshiba introduced bipolar DRAM memory cells for its Toscal BC-1411 electronic calculator in 1965. While it offered improved performance, bipolar DRAM could not compete with the lower price of the then dominant magnetic-core memory. MOS technology is the basis for modern DRAM. In 1966, Robert H. Dennard at the IBM Thomas J. Watson Research Center was working on MOS memory. While examining
6148-535: The Whirlwind I computer in 1953. Magnetic-core memory was the dominant form of memory until the development of MOS semiconductor memory in the 1960s. The first semiconductor memory was implemented as a flip-flop circuit in the early 1960s using bipolar transistors . Semiconductor memory made from discrete devices was first shipped by Texas Instruments to the United States Air Force in 1961. In
6264-576: The Zilog Z8000 family of processors. Later microprocessors (such as the Motorola 68030 and the Zilog Z280 ) placed the MMU together with the CPU on the same integrated circuit, as did the Intel 80286 and later x86 microprocessors. While this article concentrates on modern MMUs, commonly based on demand paging, early systems used base and bounds addressing that further developed into segmentation , or used
6380-494: The base and limit , although many other terms have been used. When the operating system requested memory to load a program, or a program requested more memory to hold data from a file for instance, it would call the memory handling library . This examined the mappings to look for an area in main memory large enough to hold the request. If such a block was found, a new entry was entered into the table. From then on, when that program accessed memory, all of its addresses were offset by
6496-434: The computer . The term memory is often synonymous with the terms RAM , main memory , or primary storage . Archaic synonyms for main memory include core (for magnetic core memory) and store . Main memory operates at a high speed compared to mass storage which is slower but less expensive per bit and higher in capacity. Besides storing opened programs and data being actively processed, computer memory serves as
6612-457: The 1980s. This problem can be reduced by making the pages larger, say 64 kB instead of 8. Now the page index uses 16 bits and the resulting page table is 64 kB, which is more tractable. Moving to a larger page size leads to the second problem, increased internal fragmentation. A program that generates a series of requests for small block will be assigned large blocks and thereby waste large amounts of memory. The paged translation approach
6728-630: The ARMv6 ISA require mandatory aligned memory access for all multi-byte load and store instructions. Depending on which specific instruction was issued, the result of attempted misaligned access might be to round down the least significant bits of the offending address turning it into an aligned access (sometimes with additional caveats), or to throw an MMU exception (if MMU hardware is present), or to silently yield other potentially unpredictable results. The ARMv6 and later architectures support unaligned access in many circumstances, but not necessarily all. When
6844-541: The Arma Division of the American Bosch Arma Corporation. In 1967, Dawon Kahng and Simon Sze of Bell Labs proposed that the floating gate of a MOS semiconductor device could be used for the cell of a reprogrammable ROM, which led to Dov Frohman of Intel inventing EPROM (erasable PROM) in 1971. EEPROM (electrically erasable PROM) was developed by Yasuo Tarui, Yutaka Hayashi and Kiyoko Naga at
6960-555: The CPU, with four processor registers holding base values accessed directly by the program. These mapped only the upper 4 bits of the 20-bit address, and there was no equivalent of a limit, which was simply the lower 16-bits of the address and thus a fixed 64 kB. Later entries in the x86 architecture series used different approaches. Some systems, such as the GE 645 and its successors, used both segmentation and paging. The table of segments, instead of containing per-segment entries giving
7076-464: The MMU when trapping into OS code. The IBM System/360 Model 67 , which was introduced August, 1965, included an MMU called a dynamic address translation (DAT) box. It has the unusual feature of storing accessed and dirty bits outside of the page table (along with the four bit protection key for all S/360 processors). They refer to physical memory rather than virtual memory, and are accessed by special-purpose instructions. This reduces overhead for
SECTION 60
#17331148615997192-509: The OS, which would otherwise need to propagate accessed and dirty bits from the page tables to a more physically oriented data structure. This makes OS-level virtualization , later called paravirtualization , easier. Starting in August, 1972, the IBM System/370 has a similar MMU, although it initially supported only a 24-bit virtual address space rather than the 32-bit virtual address space of
7308-517: The System/360 Model 67. It also stores the accessed and dirty bits outside the page table. In early 1983, the System/370-XA architecture expanded the virtual address space to 31 bits, and in 2000, the 64-bit z/Architecture was introduced, with the address space expanded to 64 bits; those continue to store the accessed and dirty bits outside the page table. VAX pages are 512 bytes, which
7424-454: The accessed bit if they are to operate efficiently. Typically, the OS will periodically unmap pages so that page-not-present faults can be used to let the OS set an accessed bit. ARM architecture -based application processors implement an MMU defined by ARM's virtual memory system architecture. The current architecture defines PTEs for describing 4 KB and 64 KB pages, 1 MB sections and 16 MB super-sections; legacy versions also defined
7540-400: The addresses from each program into separate areas in physical memory, which is generally much smaller than the theoretical maximum. This is possible because programs rarely use large amounts of memory at any one time. Most modern operating systems (OS) work in concert with an MMU to provide virtual memory (VM) support. The MMU tracks memory use in fixed-size blocks known as pages , and if
7656-425: The amount of padding required to maintain alignment. For example, if members are sorted by descending alignment requirements a minimal amount of padding is required. The minimal amount of padding required is always less than the largest alignment in the structure. Computing the maximum amount of padding required is more complicated, but is always less than the sum of the alignment requirements for all members minus twice
7772-411: The base value. When the program is done with the memory it requested and releases, or the program exits, the entries associated with it are released. This style of access, over time, became common in the mainframe market and was known as segmented translation , although a variety of terms are used here as well. This style has the advantage of simplicity; the memory blocks are continuous and thus only
7888-402: The capability to use so-called huge pages of 2 MB or 1 GB in size (often both variants are possible). Page translations are cached in a translation lookaside buffer (TLB). Some systems, mainly older RISC designs, trap into the OS when a page translation is not found in the TLB. Most systems use a hardware-based tree walker. Most systems allow the MMU to be disabled, but some disable
8004-487: The characteristics of MOS technology, he found it was possible to build capacitors , and that storing a charge or no charge on the MOS capacitor could represent the 1 and 0 of a bit, while the MOS transistor could control writing the charge to the capacitor. This led to his development of a single-transistor DRAM memory cell. In 1967, Dennard filed a patent for a single-transistor DRAM memory cell based on MOS technology. This led to
8120-442: The computer, aligned accesses will always access a single memory word. This may not be true for misaligned data accesses. If the highest and lowest bytes in a datum are not within the same memory word, the computer must split the datum access into multiple memory accesses. This requires a lot of complex circuitry to generate the memory accesses and coordinate them. To handle the case where the memory words are in different memory pages
8236-399: The context influences the conditions where the datum is considered aligned or not. Data structures can be stored in memory on the stack with a static size known as bounded or on the heap with a dynamic size known as unbounded . The CPU accesses memory by a single memory word at a time. As long as the memory word size is at least as large as the largest primitive data type supported by
8352-416: The data size. For instance, in a 32-bit architecture, the data may be aligned if the data is stored in four consecutive bytes and the first byte lies on a 4-byte boundary. Data alignment is the aligning of elements according to their natural alignment. To ensure natural alignment, it may be necessary to insert some padding between structure elements or after the last element of a structure. For example, on
8468-582: The delay line, the Williams tube and Selectron tube , originated in 1946, both using electron beams in glass tubes as means of storage. Using cathode-ray tubes , Fred Williams invented the Williams tube, which was the first random-access computer memory . The Williams tube was able to store more information than the Selectron tube (the Selectron was limited to 256 bits, while the Williams tube could store thousands) and
8584-401: The early 1940s. Through the construction of a glass tube filled with mercury and plugged at each end with a quartz crystal, delay lines could store bits of information in the form of sound waves propagating through the mercury, with the quartz crystals acting as transducers to read and write bits. Delay-line memory was limited to a capacity of up to a few thousand bits. Two alternatives to
8700-428: The endianness used natively by the host machine. The following formulas provide the number of padding bytes required to align the start of a data structure (where mod is the modulo operator ): For example, the padding to add to offset 0x59d for a 4-byte aligned structure is 3. The structure will then start at 0x5a0, which is a multiple of 4. However, when the alignment of offset is already equal to that of align ,
8816-530: The first commercial DRAM IC chip, the Intel 1103 in October 1970. Synchronous dynamic random-access memory (SDRAM) later debuted with the Samsung KM48SL2000 chip in 1992. The term memory is also often used to refer to non-volatile memory including read-only memory (ROM) through modern flash memory . Programmable read-only memory (PROM) was invented by Wen Tsing Chow in 1956, while working for
8932-461: The first device so that the value read is neither the original value nor the updated value. Although such failures are rare, they can be very difficult to identify. Although the compiler (or interpreter ) normally allocates individual data items on aligned boundaries, data structures often have members with different alignment requirements. To maintain proper alignment the translator normally inserts additional unnamed data members so that each member
9048-518: The following types: Virtual memory is a system where physical memory is managed by the operating system typically with assistance from a memory management unit , which is part of many modern CPUs . It allows multiple types of memory to be used. For example, some data can be stored in RAM while other data is stored on a hard drive (e.g. in a swapfile ), functioning as an extension of the cache hierarchy . This offers several advantages. Computer programmers no longer need to worry about where their data
9164-449: The late 1960s. The invention of the metal–oxide–semiconductor field-effect transistor ( MOSFET ) enabled the practical use of metal–oxide–semiconductor (MOS) transistors as memory cell storage elements. MOS memory was developed by John Schmidt at Fairchild Semiconductor in 1964. In addition to higher performance, MOS semiconductor memory was cheaper and consumed less power than magnetic core memory. In 1965, J. Wood and R. Ball of
9280-447: The location is updated within some known retention time, the data stays valid. After a period of time without update, the value is copied to a less-worn circuit with longer retention. Writing first to the worn area allows a high write rate while avoiding wear on the not-worn circuits. As a second example, an STT-RAM can be made non-volatile by building large cells, but doing so raises the cost per bit and power requirements and reduces
9396-411: The main memory every time a virtual address is mapped. Other MMUs may have a private array of memory, registers, or static RAM that holds a set of mapping information. The virtual page number may be directly used as an index into the page table or other mapping information, or it may be further divided, with bits at a given level used as an index into a table of lower-level tables into which bits at
9512-457: The memory device in case of external power loss. If power is off for an extended period of time, the battery may run out, resulting in data loss. Proper management of memory is vital for a computer system to operate properly. Modern operating systems have complex systems to properly manage memory. Failure to do so can lead to bugs or slow performance. Improper management of memory is a common cause of bugs and security vulnerabilities, including
9628-443: The memory used by other programs. This is done by viruses and malware to take over computers. It may also be used benignly by desirable programs which are intended to modify other programs, debuggers , for example, to insert breakpoints or hooks. Memory management unit In modern systems, programs generally have addresses that access the theoretical maximum memory of the computer architecture , 32 or 64 bits. The MMU maps
9744-474: The memory. In the early 1940s, memory technology often permitted a capacity of a few bytes. The first electronic programmable digital computer , the ENIAC , using thousands of vacuum tubes , could perform simple calculations involving 20 numbers of ten decimal digits stored in the vacuum tubes. The next significant advance in computer memory came with acoustic delay-line memory , developed by J. Presper Eckert in
9860-560: The next level down are used as an index, with two or more levels of indexing. The physical page number is combined with the page offset to give the complete physical address. A page table entry or other per-page information may also include information about whether the page has been written to (the dirty bit ), when it was last used (the accessed bit , for a least recently used (LRU) page replacement algorithm ), what kind of processes ( user mode or supervisor mode ) may read and write it, and whether it should be cached . Sometimes,
9976-432: The operating system. In some cases, a page fault may indicate a software bug , which can be prevented by using memory protection as one of key benefits of an MMU: an operating system can use it to protect against errant programs by disallowing access to memory that a particular program should not have access to. Typically, an operating system assigns each program its own virtual address space. A paged MMU also mitigates
10092-458: The packing size of a structure from the project default packing. This leads to interoperability problems with library headers which use, for example, #pragma pack(8) , if the project packing is smaller than this. For this reason, setting the project packing to any value other than the default of 8 bytes would break the #pragma pack directives used in library headers and result in binary incompatibilities between structures. This limitation
10208-451: The page mask bits are not stored in the VPN2. Each TLB entry has its own page size, which can be any value from 1 KB to 256 MB in multiples of four. Each PFN in a TLB entry has a caching attribute, a dirty and a valid status bit. A VPN2 has a global status bit and an OS assigned ID which participates in the virtual address TLB entry match, if the global status bit is set to zero. A PFN stores
10324-400: The page size is dependent on the processor. pages. After a TLB miss, low-level firmware machine code (here called PALcode ) walks a page table. The OpenVMS AXP PALcode and DEC OSF/1 PALcode walk a three-level tree-structured page table. Addresses are broken down into an unused set of bits (containing the same value as the uppermost bit of the index into the root level of the tree),
10440-401: The physical address without the page mask bits. A TLB refill exception is generated when there are no entries in the TLB that match the mapped virtual address. A TLB invalid exception is generated when there is a match but the entry is marked invalid. A TLB modified exception is generated when a store instruction references a mapped address and the matching entry's dirty status is not set. If
10556-414: The physical base address and length of the segment, contains entries giving the physical base address of a page table for the segment, in addition to the length of the segment. Physical memory is divided into fixed-size pages, and the same techniques used for purely page-based demand paging are used for segment-and-page-based demand paging. Another approach to memory handling is to break up main memory into
10672-462: The possible range, but the page tables for P0 and P1 space are stored in the paged S0 space. Thus, there is effectively a two-level tree , allowing applications to have sparse memory layout without wasting a lot of space on unused page table entries. Unlike page table entries in most MMUs, page table entries in the VAX MMU lack an accessed bit . OSes which implement paging must find some way to emulate
10788-416: The problem of external fragmentation of memory. After blocks of memory have been allocated and freed, the free memory may become fragmented (discontinuous) so that the largest contiguous block of free memory may be much smaller than the total amount. With virtual memory, a contiguous range of virtual addresses can be mapped to several non-contiguous blocks of physical memory; this non-contiguous allocation
10904-464: The processor must either verify that both pages are present before executing the instruction or be able to handle a TLB miss or a page fault on any memory access during the instruction execution. Some processor designs deliberately avoid introducing such complexity, and instead yield alternative behavior in the event of a misaligned memory access. For example, implementations of the ARM architecture prior to
11020-432: The program can use it. This is known as demand paging . Modern MMUs generally perform additional memory-related tasks as well. Memory protection blocks attempts by a program to access memory it has not previously requested, which prevents a misbehaving program from using up all memory or malicious code from reading data from another program. They also often manage a processor cache , which stores recently accessed data in
11136-409: The reliability and security of a computer system. Without protected memory, it is possible that a bug in one program will alter the memory used by another program. This will cause that other program to run off of corrupted memory with unpredictable results. If the operating system's memory is corrupted, the entire computer system may crash and need to be rebooted . At times programs intentionally alter
11252-595: The remainder of the page is wasted. If many small allocations of this sort are made, memory can be used up even though much of it remains empty. In some early microprocessor designs, memory management was performed by a separate integrated circuit such as the VLSI Technology VI475 (1986), the Motorola 68851 (1984) used with the Motorola 68020 CPU in the Macintosh II , or the Z8010 and Z8015 (1985) used with
11368-443: The same values in to the segment or page maps of different contexts. Additional contexts can be handled by treating the segment map as a context cache and replacing out-of-date contexts on a least-recently used basis. The context register makes no distinction between user and supervisor states. Interrupts and traps do not switch contexts, which requires that all valid interrupt vectors always be mapped in page 0 of context, as well as
11484-475: The same year, the concept of solid-state memory on an integrated circuit (IC) chip was proposed by applications engineer Bob Norman at Fairchild Semiconductor . The first bipolar semiconductor memory IC chip was the SP95 introduced by IBM in 1965. While semiconductor memory offered improved performance over magnetic-core memory, it remained larger and more expensive and did not displace magnetic-core memory until
11600-400: The second modulo in (align - (offset mod align)) mod align will return zero, therefore the original value is left unchanged. Since the alignment is by definition a power of two, the modulo operation can be reduced to a bitwise AND operation. The following formulas produce the correct values (where & is a bitwise AND and ~ is a bitwise NOT ) – providing the offset is unsigned or
11716-518: The segmented case, programs see its memory as a single contiguous block. There are two disadvantages to this approach. The first is that as the virtual address space expands, the amount of memory needed to hold the mapping increases as well. For instance, in the 68020 the addresses are 32-bits wide, meaning the segment number for the same 8 kB page size is now the upper 19 bits and the mapping table expands to 512 kB in size, far beyond what could be implemented in hardware for reasonable cost in
11832-406: The structure members and thus no padding bytes would be inserted. While there is no standard way of defining the alignment of structure members (while C and C++ allow using the alignas specifier for this purpose it can be used only for specifying a stricter alignment), some compilers use #pragma directives to specify packing inside source files. Here is an example: This structure would have
11948-477: The structure now matches the pre-compiled size of 8 bytes . Note that Padding1[1] has been replaced (and thus eliminated) by Data4 and Padding2[3] is no longer necessary as the structure is already aligned to the size of a long word. The alternative method of enforcing the MixedData structure to be aligned to a one byte boundary will cause the pre-processor to discard the pre-determined alignment of
12064-442: The structure usually has a default alignment, meaning that it will, unless otherwise requested by the programmer, be aligned on a pre-determined boundary. The following typical alignments are valid for compilers from Microsoft ( Visual C++ ), Borland / CodeGear ( C++Builder ), Digital Mars (DMC), and GNU ( GCC ) when compiling for 32-bit x86: The only notable differences in alignment for an LP64 64-bit system when compared to
12180-529: The sum of the alignment requirements for the least aligned half of the structure members. Although C and C++ do not allow the compiler to reorder structure members to save space, other languages might. It is also possible to tell most C and C++ compilers to "pack" the members of a structure to a certain level of alignment, e.g. "pack(2)" means align data members larger than a byte to a two-byte boundary so that any padding members are at most one byte long. Likewise, in PL/I
12296-519: The system uses two's complement arithmetic: Data structure members are stored sequentially in memory so that, in the structure below, the member Data1 will always precede Data2; and Data2 will always precede Data3: If the type "short" is stored in two bytes of memory then each member of the data structure depicted above would be 2-byte aligned. Data1 would be at offset 0, Data2 at offset 2, and Data3 at offset 4. The size of this structure would be 6 bytes. The type of each member of
12412-400: The total size of the structure should be a multiple of the largest alignment of any structure member ( alignof(int) in this case, which = 4 on linux-32bit/gcc). In this case 3 bytes are added to the last member to pad the structure to the size of 12 bytes ( alignof(int) * 3 ). In this example the total size of the structure sizeof (FinalPad) == 8 , not 5 (so that the size is
12528-482: The two values, base and limit, need to be stored. Each entry corresponds to a block of memory used by a single program, and the translation is invisible to the program, which sees main memory starting at address zero and extending to some fixed value. The disadvantage of this approach is that it leads to an effect known as external fragmentation . This occurs when memory allocations are released but are non-contiguous. In this case, enough memory may be available to handle
12644-487: The valid supervisor stack. The Sun-2 workstations are similar; they are built around the Motorola 68010 microprocessor and have a similar memory management unit, with 2 KB pages and 32 KB segments. The context register has a 3-bit system context used in supervisor state and a 3-bit user context used in user state. The Sun-3 workstations, except for the Sun-3/80, Sun-3/460, Sun-3/470, and Sun-3/480, are built around
12760-407: The virtual address space (the range of addresses used by the processor) into pages , each having a size which is a power of 2, usually a few kilobytes , but they may be much larger. Programs reference memory using the natural address size of the machine, typically 32 or 64-bits in modern systems. The bottom bits of the address (the offset within a page) are left unchanged. The upper address bits are
12876-423: The virtual page numbers. Most MMUs use an in-memory table of items called a page table , containing one page table entry (PTE) per virtual page, to map virtual page numbers to physical page numbers in main memory. Multi-level page tables are often used to reduce the size of the page table. An associative cache of PTEs is called a translation lookaside buffer (TLB) and is used to avoid the necessity of accessing
12992-434: The write speed. Using small cells improves cost, power, and speed, but leads to semi-volatile behavior. In some applications, the increased volatility can be managed to provide many benefits of a non-volatile memory, for example by removing power but forcing a wake-up before data is lost; or by caching read-only data and discarding the cached data if the power-off time exceeds the non-volatile threshold. The term semi-volatile
13108-568: Was also supported in software implementations; one example is Apple's MultiFinder , released in 1987 for the Macintosh platform. Each program was allocated an amount of memory that was pre-selected in the Finder and translation from virtual to physical was accomplished within the programs using handles . A more common example is the Intel 8088 used in the IBM PC . This implemented a very simple MMU inside
13224-638: Was commercialized by IBM in the early 1970s. MOS memory overtook magnetic core memory as the dominant memory technology in the early 1970s. The two main types of volatile random-access memory (RAM) are static random-access memory (SRAM) and dynamic random-access memory (DRAM). Bipolar SRAM was invented by Robert Norman at Fairchild Semiconductor in 1963, followed by the development of MOS SRAM by John Schmidt at Fairchild in 1964. SRAM became an alternative to magnetic-core memory, but requires six transistors for each bit of data. Commercial use of SRAM began in 1965, when IBM introduced their SP95 SRAM chip for
13340-421: Was less expensive. The Williams tube was nevertheless frustratingly sensitive to environmental disturbances. Efforts began in the late 1940s to find non-volatile memory . Magnetic-core memory allowed for memory recall after power loss. It was developed by Frederick W. Viehe and An Wang in the late 1940s, and improved by Jay Forrester and Jan A. Rajchman in the early 1950s, before being commercialized with
13456-518: Was widely used by microprocessor MMUs in the 1970s and early 80s, including the Signetics 68905 (which could operate in either mode). Both Signetics and Philips produced a version of the 68000 that combined the 68905 on the same physical chip, the 68070. Another use of this technique is to expand the size of the physical address when the virtual address is too small. For instance, the PDP-11 originally had
#598401