Misplaced Pages

Tandem Computers

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license.

Tandem Computers, Inc. was the dominant manufacturer of fault-tolerant computer systems for ATM networks, banks, stock exchanges, telephone switching centers, 911 systems, and other similar commercial transaction processing applications requiring maximum uptime and no data loss. The company was founded by Jimmy Treybig in 1974 in Cupertino, California. It remained independent until 1997, when it became a server division within Compaq. It is now a server division within Hewlett Packard Enterprise, following Hewlett-Packard's acquisition of Compaq and the split of Hewlett-Packard into HP Inc. and Hewlett Packard Enterprise.

114-533: Tandem's NonStop systems use a number of independent identical processors, redundant storage devices, and redundant controllers to provide automatic high-speed "failover" in the case of a hardware or software failure. To contain the scope of failures and of corrupted data, these multi-computer systems have no shared central components, not even main memory. Conventional multi-computer systems all use shared memories and work directly on shared data objects. Instead, NonStop processors cooperate by exchanging messages across

228-504: A data warehouse and business intelligence server line, HP Neoview, based on the NonStop line. It acted as a database server, providing NonStop OS and NonStop SQL, but lacked the transaction processing functionality of the original NonStop systems. The line was retired, and no longer marketed, as of 24 January 2011.

Computer bus

In computer architecture, a bus (historically also called data highway or databus)

342-418: A processor or DMA-enabled device needs to read or write to a memory location, it specifies that memory location on the address bus (the value to be read or written is sent on the data bus). The width of the address bus determines the amount of memory a system can address. For example, a system with a 32-bit address bus can address 2^32 (4,294,967,296) memory locations. If each memory location holds one byte,
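The relationship between address bus width and addressable memory described above can be checked with a short calculation. This is a minimal sketch; the function names are illustrative and not taken from the article.

```python
def addressable_locations(address_bits: int) -> int:
    """Number of distinct locations an address bus of this width can select."""
    if address_bits < 0:
        raise ValueError("address width must be non-negative")
    return 2 ** address_bits


def addressable_bytes(address_bits: int, bytes_per_location: int = 1) -> int:
    """Total addressable memory, given how many bytes each location holds."""
    return addressable_locations(address_bits) * bytes_per_location


if __name__ == "__main__":
    # Compare a few common address widths.
    for width in (16, 20, 32):
        print(width, "bits ->", addressable_locations(width), "locations")
```

With one byte per location, a 32-bit bus gives 4,294,967,296 bytes, which is 4 GB when counted in units of 2^30 bytes.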

456-726: A bidirectional data bus, re-using the same wires for input and output at different times. Some processors use a dedicated wire for each bit of the address bus, data bus, and the control bus. For example, the 64-pin STEbus is composed of 8 physical wires dedicated to the 8-bit data bus, 20 physical wires dedicated to the 20-bit address bus, 21 physical wires dedicated to the control bus, and 15 physical wires dedicated to various power buses. Bus multiplexing requires fewer wires, which reduced costs in many early microprocessors and DRAM chips. One common multiplexing scheme, address multiplexing, has already been mentioned. Another multiplexing scheme re-uses

570-505: A business campus; a cluster of clusters with a total of 224 CPUs. This allowed further scale-up for taking on the largest mainframe applications. Like the CPU modules within the computers, the Guardian operating system could fail over entire task sets to other machines in the network. Worldwide clusters of 4000 CPUs could also be built via conventional long-haul network links. In 1986, Tandem introduced

684-726: A card plugged into the bus, which is why computers have so many slots on the bus. But through the 1980s and 1990s, new systems like SCSI and IDE were introduced to serve this need, leaving most slots in modern systems empty. Today there are likely to be about five different buses in the typical machine, supporting various devices. "Third generation" buses have been emerging into the market since about 2001, including HyperTransport and InfiniBand. They also tend to be very flexible in terms of their physical connections, allowing them to be used both as internal buses and for connecting different machines together. This can lead to complex problems when trying to service different requests, so much of

798-424: A complex server farm. Most customers also have a backup server in a remote location for IT disaster recovery. There are standard products to keep the data of the production and the backup server in sync, for example, HPE's Remote Database Facility (RDF), so takeover is fast and little to no data is lost even in a disaster that disables or destroys the production server. HP also developed

912-455: A custom operating system which was significantly different from Unix or HP 3000's MPE. It was initially called T/TOS (Tandem Transactional Operating System) but soon named Guardian for its ability to protect all data from machine faults and software faults. In contrast to all other commercial operating systems, Guardian was based on message passing as the basic way for all processes to interact, without shared memory, regardless of where

1026-401: A failed part. Their fast clocks could not be synchronized as in strict lock stepping, so voting instead happened at each interrupt. Some other versions of Integrity used 4x "pair and spares" redundancy. Pairs of processors ran in lock-step to check each other. When they disagreed, both processors were marked untrusted, and their workload was taken over by a hot-spare pair of processors whose state

1140-510: A multiplexed address scheme, the address is sent in two equal parts on alternate bus cycles. This halves the number of address bus signals required to connect to the memory. For example, a 32-bit address bus can be implemented by using 16 lines and sending the first half of the memory address, immediately followed by the second half of the memory address. Typically two additional pins in the control bus – row-address strobe (RAS) and column-address strobe (CAS) – are used to tell
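The two-part address transfer described above can be sketched as a pair of helper functions. This is an illustrative model only: the function names are hypothetical, and sending the upper half first is an assumption (the text says only that the first half is followed by the second).

```python
from typing import Tuple


def split_address(address: int, total_bits: int = 32) -> Tuple[int, int]:
    """Split an address into two equal halves for a multiplexed bus.

    Assumption: the upper half goes out on the first bus cycle and the
    lower half on the second.
    """
    if total_bits <= 0 or total_bits % 2 != 0:
        raise ValueError("total_bits must be a positive even number")
    if not 0 <= address < (1 << total_bits):
        raise ValueError("address does not fit in total_bits")
    half = total_bits // 2
    mask = (1 << half) - 1
    first = (address >> half) & mask   # sent on the first cycle
    second = address & mask            # sent on the second cycle
    return first, second


def join_address(first: int, second: int, total_bits: int = 32) -> int:
    """Reassemble the full address on the receiving side."""
    half = total_bits // 2
    return (first << half) | second


if __name__ == "__main__":
    hi, lo = split_address(0x12345678)
    print(hex(hi), hex(lo), hex(join_address(hi, lo)))
```

Only 16 address lines are needed for a 32-bit address in this model, at the cost of one extra bus cycle per access.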

1254-469: A new way that was safe from all "single-point failures" yet would be only marginally more expensive than conventional non-fault-tolerant systems. They would be less expensive and support more throughput than some existing ad-hoc toughened systems that used redundant but usually required "hot spares". Each engineer was confident they could quickly pull off their own part of this complex new design but doubted that others' areas could be worked out. The parts of

1368-593: A novel way; the compiler, not the microcode, was responsible for deciding when full registers were spilled to the memory stack and when empty registers were re-filled from the memory stack. On the HP 3000, this decision took extra microcode cycles in every instruction. The HP 3000 supported COBOL with several instructions for calculating directly on arbitrary-length BCD (binary-coded decimal) strings of digits. The T/16 simplified this to single instructions for converting between BCD strings and 64-bit binary integers. In

1482-497: A pair of CPUs, controllers, or buses, so that the system would keep running without loss of connections if one power supply failed. The careful complex arrangement of parts and connections in customers' larger configurations was documented in a Mackie diagram, named after lead salesman David Mackie, who invented the notation. None of these duplicated parts were wasted "hot spares"; everything added to system throughput during normal operations. Besides recovering well from failed parts,

1596-463: A passive backplane connected directly or through buffer amplifiers to the pins of the CPU. Memory and other devices would be added to the bus using the same address and data pins as the CPU itself used, connected in parallel. Communication was controlled by the CPU, which read and wrote data from the devices as if they were blocks of memory, using the same instructions, all timed by a central clock controlling

1710-470: A process or CPU failure. Data integrity is maintained during those takeovers; no transactions or data are lost or corrupted. The operating system as a whole is branded NonStop OS and includes the Guardian layer, which is a low-level component of the operating system and the Open System Services (OSS) personality which runs atop this layer, which implements a Unix-like interface for other components of

1824-423: A reliable fabric, and software takes periodic snapshots for possible rollback of program memory state. Besides masking failures, this "shared-nothing" messaging system design also scales to the largest commercial workloads. Each doubling of the total number of processors doubles system throughput, up to the maximum configuration of 4000 processors. In contrast, the performance of conventional multiprocessor systems

1938-506: A serial bus inherently has no timing skew or crosstalk. USB, FireWire, and Serial ATA are examples of this. Multidrop connections do not work well for fast serial buses, so most modern serial buses use daisy-chain or hub designs. The transition from parallel to serial buses was enabled by Moore's law, which allowed SerDes to be incorporated into the integrated circuits used in computers. Network connections such as Ethernet are not generally regarded as buses, although

2052-408: A simpler hardware-only memory-centric design where all recovery was done by switching between hot spares. The most successful competitor was Stratus Technologies, whose machines were re-marketed by IBM as "IBM System/88". In such systems, the spare processors do not contribute to system throughput between failures, but merely redundantly execute exactly the same data thread as the active processor at

2166-446: A single source LRI/LRU or, as with ARINC 629, MIL-STD-1553B, and STANAG 3910, be duplex, allowing all the connected LRI/LRUs to act, at different times (half duplex), as transmitters and receivers of data. The frequency or the speed of a bus is measured in Hz, such as MHz, and determines how many clock cycles there are per second; there can be one or more data transfers per clock cycle. If there

2280-713: A small number of top-of-stack, 16-bit data registers plus some extra address registers for accessing the memory stack. Both used Huffman encoding of operand address offsets, to fit a large variety of address modes and offset sizes into the 16-bit instruction format with good code density. Both relied heavily on pools of indirect addresses to overcome the short instruction format. Both supported larger 32- and 64-bit operands via multiple ALU cycles, and memory-to-memory string operations. Both used "big-endian" addressing of long versus short memory operands. These features had all been inspired by Burroughs B5500–B6800 mainframe stack machines. The T/16 instruction set changed several features from

2394-569: A stronger foundation than its inherited HP 3000 traits. Rainbow's hardware was a 32-bit register-file machine that aimed to be better than a Digital Equipment Corporation VAX. For reliable programming, the main programming language was "TPL", a subset of Ada. At that time, programmers barely understood how to compile Ada to unoptimized code. There was no migration path for existing NonStop system software coded in TAL. The OS, database and Cobol compilers were entirely redesigned. Customers would see it as

2508-486: A third generation CPU, the NonStop VLX. It had 32-bit data paths, wider microcode, 12 MHz cycle time, and a peak rate of one instruction per cycle. It was built from three boards of ECL gate array chips (with TTL pinout). It had a revised Dynabus with speed raised to 20 MB/s per link, 40 MB/s total. Later, FOX II increased the physical diameter of TNS clusters to 4 kilometers. Tandem's initial database support

2622-533: A totally disjoint product line requiring all-new software from them. The software side of this project took much longer than planned. The hardware was already obsolete and outperformed by TXP before its software was ready, resulting in the Rainbow project being abandoned. All subsequent efforts emphasized upward compatibility and easy migration paths. Development of Rainbow's advanced client/server application development framework called "Crystal" continued a while longer and

2736-433: A unified system bus. In this case, a single mechanical and electrical system can be used to connect together many of the system components, or in some cases, all of them. Later computer programs began to share memory common to several CPUs. Access to this memory bus had to be prioritized, as well. The simple way to prioritize interrupts or bus access was with a daisy chain. In this case signals will naturally flow through
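The daisy-chain prioritization mentioned above can be modeled in a few lines. This is a simplified sketch, not a description of any specific bus: it assumes the grant signal enters at position 0 and is passed along the chain until the first requesting device absorbs it, so priority follows physical order. The function name is hypothetical.

```python
from typing import Optional, Sequence


def daisy_chain_grant(requests: Sequence[bool]) -> Optional[int]:
    """Return the chain position that receives the grant, or None.

    requests[i] is True if the device at position i is asking for the bus.
    The grant propagates from position 0; the first requester keeps it and
    does not pass it further down the chain.
    """
    for position, requesting in enumerate(requests):
        if requesting:
            return position
    return None  # nobody requested; the grant falls off the end of the chain


if __name__ == "__main__":
    # Devices 1 and 2 both request; device 1 is closer to the source and wins.
    print(daisy_chain_grant([False, True, True]))
```

No central scheduler is needed in this model, but a device far down the chain can be starved if devices ahead of it request continuously.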

2850-406: Is a communication system that transfers data between components inside a computer, or between computers. This expression covers all related hardware components (wire, optical fiber, etc.) and software, including communication protocols. In most traditional computer architectures, the CPU and main memory tend to be tightly coupled, with the internal bus connecting the two being known as

2964-665: Is a series of server computers introduced to market in 1976 by Tandem Computers Inc., beginning with the NonStop product line. It was followed by the Tandem Integrity NonStop line of lock-step fault-tolerant computers, now defunct (not to be confused with the later and much different Hewlett-Packard Integrity product line extension). The original NonStop product line has been offered by Hewlett Packard Enterprise since Hewlett-Packard Company's split in 2015. Because NonStop systems are based on an integrated hardware/software stack, Tandem and later HPE also developed

3078-412: Is a single transfer per clock cycle it is known as Single Data Rate (SDR), and if there are two transfers per clock cycle it is known as Double Data Rate (DDR) although the use of signalling other than SDR is uncommon outside of RAM. An example of this is PCIe which uses SDR. Within each data transfer there can be multiple bits of data. This is described as the width of a bus which is the number of bits

3192-431: Is limited by the speed of some shared memory, bus, or switch. Adding more than 4–8 processors in that manner gives no further system speedup. NonStop systems have more often been bought to meet scaling requirements than for extreme fault tolerance. They compete against IBM's largest mainframes, despite being built from simpler minicomputer technology. Tandem Computers was founded in 1974 by James Treybig. Treybig first saw

3306-475: Is provided by the bus, is not the case in many avionic systems, where data connections such as ARINC 429, ARINC 629, MIL-STD-1553B (STANAG 3838), and EFABus (STANAG 3910) are commonly referred to as "data buses" or, sometimes, "databuses". Such avionic data buses are usually characterized by having several pieces of equipment or Line Replaceable Items/Units (LRI/LRUs) connected to a common, shared medium. They may, as with ARINC 429, be simplex, i.e. have

3420-436: Is the case with PCI. While the term "peripheral bus" is sometimes used to refer to all other buses apart from the system bus, the "expansion bus" has also been used to describe a third category of buses separate from the peripheral bus, which includes bus systems like PCI. Early computer buses were parallel electrical wires with multiple hardware connections, but the term is now used for any physical arrangement that provides

3534-469: The IBM PC, although similar physical architecture can be employed, instructions to access peripherals (in and out) and memory (mov and others) have not been made uniform at all, and still generate distinct CPU signals that could be used to implement a separate I/O bus. These simple bus systems had a serious drawback when used for general-purpose computers. All the equipment on the bus had to talk at

3648-758: The InfiniBand industry standard. All S-Series machines used MIPS processors, including the R4400, R10000, R12000, and R14000. The design of the later, faster MIPS cores was primarily funded by Silicon Graphics Inc. But Intel's sixth generation Pentium Pro overtook the performance of RISC designs, and SGI's graphics business also shrank. After the R10000, there was no investment in significant new MIPS core designs for high-end servers. So Tandem needed to move its NonStop product line to another microprocessor architecture with competitive fast chips. Jimmy Treybig remained CEO of

3762-700: The Lockheed SR-71 Blackbird Mach 3 spy plane. Cyclone's name was supposed to represent its "unstoppable speed in roaring through OLTP workloads". Announcement day was October 17, 1989. That afternoon, the region was struck by the magnitude 6.9 Loma Prieta earthquake, causing freeway collapses in Oakland and major fires in San Francisco. Tandem offices were shaken, but no one was badly hurt on site. In 1980–1983, Tandem attempted to re-design its entire hardware and software stack to put its NonStop methods on

3876-508: The NonStop TXP CPU was the first entirely new implementation of the TNS instruction set architecture. It was built from standard TTL chips and Programmable Array Logic chips, with four boards per CPU module. It had Tandem's first use of cache memory. It had a more direct implementation of 32-bit addressing, but still sent addresses through 16-bit adders. A wider microcode store allowed a major reduction in

3990-876: The S-100 bus were used, but to reduce latency, modern memory buses are designed to connect directly to DRAM chips, and thus are designed by chip standards bodies such as JEDEC. Examples are the various generations of SDRAM, and serial point-to-point buses like SLDRAM and RDRAM. An exception is the Fully Buffered DIMM which, despite being carefully designed to minimize the effect, has been criticized for its higher latency. Buses can be parallel buses, which carry data words in parallel on multiple wires, or serial buses, which carry data in bit-serial form. The addition of extra power and control connections, differential drivers, and data connections in each direction usually means that most serial buses have more conductors than

4104-484: The SATA ports in modern computers support multiple peripherals, allowing multiple hard drives to be connected without an expansion card. In systems that have a similar architecture to multicomputers, but which communicate by buses instead of networks, the system bus is known as a front-side bus. In such systems, the expansion bus may not share any architecture with their host CPUs, instead supporting many different CPUs, as

4218-446: The system bus. In systems that include a cache, CPUs use high-performance system buses that operate at speeds greater than memory to communicate with memory. The internal bus (also known as the internal data bus, memory bus or system bus) connects internal components of a computer to the motherboard. Local buses connect the CPU and memory to the expansion bus, which in turn connects the computer to peripherals. Bus systems such as

4332-634: The Apache Trafodion project. In 1987, Tandem introduced the NonStop CLX, a low-cost less-expandable minicomputer system. Its role was to grow the low end of the fault-tolerant market, and to be deployed on the remote edges of large Tandem networks. Its initial performance was roughly similar to the TXP; later versions improved to where they were about 20% slower than a VLX. Its small cabinet could be installed into any "copier room" office environment. A CLX CPU

4446-525: The CPU core and shared a single bus and single bank of SRAM. As a result, CLX required at least two machine cycles per instruction. In 1989, Tandem introduced the NonStop Cyclone, a fast but expensive system for the mainframe end of the market. Each self-checking CPU took three boards full of hot-running ECL gate array chips, plus memory boards. Despite being microprogrammed, the CPU was superscalar, often completing two instructions per cache cycle. This

4560-506: The Cyclone/R, also known as CLX/R. This was a low-cost mid-range system based on CLX components but used R3000 microprocessors instead of the much slower CLX stack machine board. To minimize time to market, this machine was initially shipped without any MIPS native-mode software. Everything, including its NonStop Kernel (NSK) operating system (a follow-on to Guardian) and NonStop SQL database, was compiled to TNS stack machine code. That object code

4674-465: The DRAM whether the address bus is currently sending the first half of the memory address or the second half. Accessing an individual byte frequently requires reading or writing the full bus width (a word) at once. In these instances the least significant bits of the address bus may not even be implemented; it is instead the responsibility of the controlling device to isolate the individual byte required from
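The byte-isolation step described above can be sketched for a 32-bit-wide bus whose two least significant address bits are not implemented. This is an illustrative model; the function names are hypothetical, and little-endian byte-lane numbering (byte 0 in the lowest 8 bits of the word) is an assumption.

```python
def word_address(byte_address: int) -> int:
    """Address actually placed on the bus: the two low bits are dropped,
    so every transfer is an aligned 32-bit word."""
    return byte_address & ~0x3


def extract_byte(word: int, byte_address: int) -> int:
    """Pick the requested byte out of the 32-bit word that was transferred.

    Assumption: little-endian lanes, so lane 0 is bits 0-7 of the word.
    """
    lane = byte_address & 0x3
    return (word >> (8 * lane)) & 0xFF


if __name__ == "__main__":
    word = 0xAABBCCDD  # word read from aligned address 0x1000
    for addr in range(0x1000, 0x1004):
        print(hex(addr), hex(word_address(addr)), hex(extract_byte(word, addr)))
```

All four byte addresses map to the same word transfer; only the controlling device's lane selection differs.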

4788-520: The Dynamite to serving primarily as a smart terminal. It was quietly and quickly withdrawn from the market. Tandem's message-based NonStop operating system had advantages for scaling, extreme reliability, and efficiently using expensive "spare" resources. But many potential customers wanted just good-enough reliability in a small system, using a familiar Unix operating system and industry-standard programs. Tandem's various fault-tolerant competitors all adopted

4902-617: The HP 3000 design. The T/16 supported paged virtual memory from the beginning. The HP 3000 series did not add paging until the PA-RISC generation, 10 years later (although via MPE V it had a form of paging using the APL firmware, in 1978). Tandem added support for 32-bit addressing in its second machine; HP 3000 lacked this until its PA-RISC generation. Paging and long addresses were critical for supporting complex system software and large applications. The T/16 treated its top-of-stack registers in

5016-468: The IEEE "Superbus" study group, the open microprocessor initiative (OMI), the open microsystems initiative (OMI), the "Gang of Nine" that developed EISA, etc. Early computer buses were bundles of wire that attached computer memory and peripherals. Anecdotally termed the "digit trunk" in the early Australian CSIRAC computer, they were named after electrical power buses, or busbars. Almost always, there

5130-519: The NonStop Himalaya S-Series with a new top-level system architecture based on ServerNet connections. ServerNet replaced the Dynabus, FOX, and I/O buses. It was much faster, more general, and could be extended to more than just two-way redundancy via an arbitrary fabric of point-to-point connections. Tandem designed ServerNet for its own needs but then promoted its use by others; it evolved into

5244-661: The NonStop OS operating system for them. NonStop systems are, to an extent, self-healing. To avoid single points of failure, almost all of their components are redundant. When a mainline component fails, the system automatically falls back to the backup. These systems can be used by banks, stock exchanges, payment applications, retail companies, energy and utility services, healthcare organizations, manufacturers, telecommunication providers, transportation, and other enterprises requiring extremely high uptime. Originally introduced in 1976 by Tandem Computers Inc.,

5358-482: The OS to use. The operating system and application are both designed to support the fault tolerant hardware. The operating system continually monitors the status of all components, switching control as necessary to maintain operations. There are also features designed into the software that allow programs to be written as continuously available programs. That is accomplished using a pair of processes where one process performs all

5472-424: The OS, and systems can be expanded up to over 4000 CPUs. This is a shared-nothing architecture, a "share nothing" arrangement also known as loosely coupled multiprocessing. Due to the integrated hardware/software stack and a single system image for even the largest configurations, system management requirements for NonStop systems are rather low. In most deployments there is just a single production server, not

5586-493: The T/16 was also designed to detect as many kinds of intermittent failures as possible, as soon as possible. This prompt detection is called "fail fast". The point was to find and isolate corrupted data before it was permanently written into databases and other disk files. In the T/16, error detection was by added custom circuits that added little cost to the total design; no major parts were duplicated to get error detection. The T/16 CPU

5700-452: The T/16, each CPU consisted of two boards of TTL logic and SRAMs, and ran at about 0.7 MIPS. At any instant, it could access only four virtual memory segments (System Data, System Code, User Data, User Code), each limited to 128 KB in size. The 16-bit address spaces were already small for major applications when it shipped. The first release of T/16 had only a single programming language, Transaction Application Language (TAL). This

5814-533: The address bus pins as the data bus pins, an approach used by conventional PCI and the 8086. The various "serial buses" can be seen as the ultimate limit of multiplexing, sending each of the address bits and each of the data bits, one at a time, through a single pin (or a single differential pair). Over time, several groups of people worked on various computer bus standards, including the IEEE Bus Architecture Standards Committee (BASC),

5928-469: The addressable memory space is 4 GB. Early processors used a wire for each bit of the address width. For example, a 16-bit address bus had 16 physical wires making up the bus. As the buses became wider and lengthier, this approach became expensive in terms of the number of chip pins and board traces. Beginning with the Mostek 4096 DRAM, address multiplexing implemented with multiplexers became common. In

6042-411: The associated eSATA are one example of a system that would formerly be described as internal, while certain automotive applications use the primarily external IEEE 1394 in a fashion more similar to a system bus. Other examples, like InfiniBand and I²C were designed from the start to be used both internally and externally. An address bus is a bus that is used to specify a physical address . When

6156-489: The bits themselves, and allows for an increase in data transfer speed without increasing the frequency of the bus. The effective or real data transfer speed/rate may be lower due to the use of encoding that also allows for error correction such as 128/130b (b for bit) encoding. The data transfer speed is also known as the bandwidth. The simplest system bus has completely separate input data lines, output data lines, and address lines. To reduce cost, most microcomputers have

6270-493: The bus can transfer per clock cycle and can be synonymous with the number of physical electrical conductors the bus has if each conductor transfers one bit at a time. The data rate in bits per second can be obtained by multiplying the number of bits per clock cycle times the frequency times the number of transfers per clock cycle. Alternatively a bus such as PCIe can use modulation or encoding such as PAM4 which groups 2 bits into symbols which are then transferred instead of
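The data-rate product described above can be written out directly. This is a minimal sketch that reads the bus width as the number of bits moved per transfer; the function name and the example figures (a 64-bit bus at 100 MHz) are illustrative assumptions, not values from the article.

```python
def data_rate_bits_per_second(width_bits: int,
                              frequency_hz: int,
                              transfers_per_cycle: int = 1) -> int:
    """Raw data rate = bits per transfer x clock frequency x transfers per cycle.

    transfers_per_cycle is 1 for Single Data Rate and 2 for Double Data Rate.
    Encoding overhead is not modeled here.
    """
    if min(width_bits, frequency_hz, transfers_per_cycle) < 0:
        raise ValueError("arguments must be non-negative")
    return width_bits * frequency_hz * transfers_per_cycle


if __name__ == "__main__":
    sdr = data_rate_bits_per_second(64, 100_000_000, 1)
    ddr = data_rate_bits_per_second(64, 100_000_000, 2)
    print(sdr // 8, "bytes/s at SDR;", ddr // 8, "bytes/s at DDR")
```

Doubling the transfers per cycle doubles the raw rate without raising the clock frequency, which is the point of DDR signalling.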

6384-409: The bus had to talk at the same speed. While the CPU was now isolated and could increase speed, CPUs and memory continued to increase in speed much faster than the buses they talked to. The result was that the bus speeds were now much slower than what a modern system needed, and the machines were left starved for data. A particularly common example of this problem was that video cards quickly outran even

6498-511: The bus in physical or logical order, eliminating the need for complex scheduling. Digital Equipment Corporation (DEC) further reduced cost for mass-produced minicomputers, and mapped peripherals into the memory bus, so that the input and output devices appeared to be memory locations. This was implemented in the Unibus of the PDP-11 around 1969. Early microcomputer bus systems were essentially

6612-615: The bus supplied power, but often use a separate power source. This distinction is exemplified by a telephone system with a connected modem, where the RJ11 connection and associated modulated signalling scheme is not considered a bus, and is analogous to an Ethernet connection. A phone line connection scheme is not considered to be a bus with respect to signals, but the Central Office uses buses with cross-bar switches for connections between phones. However, this distinction, that power

6726-431: The cards to be much more complex. These buses also often addressed speed issues by being "bigger" in terms of the size of the data path, moving from 8-bit parallel buses in the first generation, to 16 or 32-bit in the second, as well as adding software setup (now standardised as Plug-n-play) to supplant or replace the jumpers. However, these newer systems shared one quality with their earlier cousins, in that everyone on

6840-563: The chips must be designed to be fully deterministic. Any hidden internal state must be cleared by the chip's reset mechanism. Otherwise, the matched chips can go out of sync for no visible reason and without any faults, long after the chips are restarted. Chip designers agree that these are good principles because it helps them test chips at manufacturing time. But all new microprocessor chips seemed to have bugs in this area and required months of shared work between MIPS (the third-party manufacturer used by Tandem) and Tandem to eliminate or work around

6954-575: The choice to abdicate its successful PA-RISC product lines in favor of Intel's Itanium microprocessors that HP helped to design. Shortly thereafter, Compaq and HP announced their plan to merge and consolidate their similar product lines. This contentious merger became official in May 2002. The consolidations were painful and destroyed the DEC and "HP Way" engineer-oriented cultures, but the combined company did know how to sell complex systems to enterprises and profit, so it

7068-476: The company he founded until a downturn in 1996. The next CEO was Roel Pieper, who joined the company in 1996 as president and CEO. Re-branding to promote itself as a true Wintel (Windows/Intel) platform was conducted by their in-house brand and creative team led by Ronald May, who later went on to co-found the Silicon Valley Brand Forum in 1999. The concept worked, and shortly thereafter the company

7182-520: The complete word transmitted. This is the case, for instance, with the VESA Local Bus which lacks the two least significant bits, limiting this bus to aligned 32-bit transfers. Historically, there were also some examples of computers which were only able to address words (word machines). The memory bus is the bus which connects the main memory to the memory controller in computer systems. Originally, general-purpose buses like VMEbus and

7296-491: The computer into two "worlds", the CPU and memory on one side, and the various devices on the other. A bus controller accepted data from the CPU side to be moved to the peripherals side, thus shifting the communications protocol burden from the CPU itself. This allowed the CPU and memory side to evolve separately from the device bus, or just "bus". Devices on the bus could talk to each other with no CPU intervention. This led to much better "real world" performance, but also required

7410-414: The cycles executed per instruction; speed increased to 2.0 MIPS. It used the same rack packaging, controllers, backplane, and buses as before. The Dynabus and I/O buses had been overdesigned in the T/16 so they would work for several generations of upgrades. Up to 14 TXP and NonStop II systems could now be combined via FOX, a long-distance fault-tolerant fibre optic bus for connecting TNS clusters across

7524-548: The data directly in memory, a concept known as direct memory access. Low-performance bus systems have also been developed, such as the Universal Serial Bus (USB). Given technological changes, the classical terms "system", "expansion" and "peripheral" no longer have the same connotations. Other common categorization systems are based on the bus's primary role, connecting devices internally or externally. However, many common modern bus systems can be used for both. SATA and

7638-455: The details of this in a semi-portable way. In 1981, all T/16 CPUs were replaced by the NonStop II. Its main difference from the T/16 was support for occasional 32-bit addressing via a user-switchable "extended data segment". This supported the next ten years of growth in software and was an advantage over the T/16 or HP 3000. Visible registers remained 16-bit, and this unplanned addition to

7752-489: The difference is largely conceptual rather than practical. An attribute generally used to characterize a bus is that power is provided by the bus for the connected hardware. This emphasizes the busbar origins of bus architecture as supplying switched or distributed power. This excludes, as buses, schemes such as serial RS-232, parallel Centronics, IEEE 1284 interfaces and Ethernet, since these devices also needed separate power supplies. Universal Serial Bus devices may use

7866-581: The duplicated parts are commodity single-chip microprocessors. Tandem's products for this market began with the Integrity line in 1989, using MIPS processors and a "NonStop UX" variant of Unix. It was developed in Austin, Texas. In 1991, the Integrity S2 used TMR, Triple Modular Redundancy, where each logical CPU used three MIPS R2000 microprocessors to execute the same data thread, with voting to find and lock out
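The voting step in triple modular redundancy can be sketched as a simple majority function. This is an illustrative model only, assuming each of three units produces a comparable output value; it is not a description of the Integrity S2 hardware, and the function name vote is hypothetical.

```python
from collections import Counter

# Hypothetical majority voter for three redundant units.
def vote(outputs):
    """Return (majority_value, indices_of_disagreeing_units) for three outputs."""
    if len(outputs) != 3:
        raise ValueError("triple modular redundancy needs exactly three outputs")
    value, count = Counter(outputs).most_common(1)[0]
    if count < 2:
        # All three units disagree, so no majority exists.
        raise RuntimeError("no majority among the three outputs")
    # Units whose output differs from the majority are candidates to lock out.
    faulty = [i for i, out in enumerate(outputs) if out != value]
    return value, faulty
```

When all three outputs agree, the list of disagreeing units is empty; when one differs, its index is reported so that the unit can be taken out of service.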

7980-487: The era, including large mainframes, had mean-time-between-failures (MTBF) on the order of a few days, the NonStop system was designed for failure intervals 100 times longer, with uptimes measured in years. Nevertheless, the NonStop was designed to be price-competitive with conventional systems, with a simple 2-CPU system priced at just over twice that of a competing single-processor mainframe, as opposed to four or more times of other fault-tolerant solutions. The first system

8094-579: The fastest-growing public company in America. By 1996, Tandem was a $2.3 billion company employing approximately 8,000 people worldwide. Over 40 years, Tandem's main NonStop product line grew and evolved in an upward-compatible way from the initial T/16 fault-tolerant system, with three major changes to its top-level modular architecture or its programming-level instruction set architecture. Within each series, there have been several major re-implementations as chip technology progressed. While conventional systems of

8208-640: The final subtle bugs. In 1993, Tandem released the NonStop Himalaya K-series with the faster MIPS R4400, a native mode NSK operating system, and fully expandable Cyclone system components. These were connected by Dynabus, Dynabus+, and the original I/O bus, which by now were all running out of performance headroom. In 1995, the NonStop Kernel was extended with a Unix-like POSIX environment called Open System Services. The original Guardian shell and ABI remained available. In 1997, Tandem introduced

8322-566: The hardware and software design that did not have to be different were largely based on incremental improvements to the familiar hardware and software designs of the HP 3000. Many subsequent engineers and programmers also came from HP. Tandem headquarters in Cupertino, California, were a quarter mile away from the HP offices. Initial venture capital investment in Tandem Computers came from Tom Perkins, who

8436-632: The input and output of a given bus. IBM introduced these on the IBM 709 in 1958, and they became a common feature of their platforms. Other high-performance vendors like Control Data Corporation implemented similar designs. Generally, the channel controllers would do their best to run all of the bus operations internally, moving data when the CPU was known to be busy elsewhere if possible, and only using interrupts when necessary. This greatly reduced CPU load, and provided better overall system performance. To provide modularity, memory and I/O buses can be combined into

8550-402: The instruction differences, even when debugging at machine code level. These Cyclone/R machines were updated with a faster native-mode NSK operating system in a follow-up release. The R3000 and later microprocessors had only a typical amount of internal error checking, insufficient for Tandem's needs. So, the Cyclone/R ran pairs of R3000 processors in lock step, running the same data thread. This

8664-529: The instruction set required executing many instructions per memory reference compared to most 32-bit minicomputers. All subsequent TNS computers were hampered by this instruction set inefficiency. As the NonStop II lacked wider internal data paths, it had to use additional microcode steps for 32-bit addresses. A NonStop II CPU had three boards, using chips and design similar to the T/16. The NonStop II also replaced core memory with battery-backed DRAM memory. In 1983,

8778-665: The line was later owned by Compaq (from 1997), Hewlett-Packard Company (from 2003) and Hewlett Packard Enterprise (since 2015). In 2005, the HP Integrity "NonStop i" (or TNS/E) servers, based on Intel Itanium microprocessors, were introduced. In 2014, the first "NonStop X" (or TNS/X) systems, based on Intel x86-64 processors, were introduced. Sales of the Itanium-based systems ended in July 2020. Early NonStop applications had to be specifically coded for fault tolerance. That requirement

8892-747: The market need for fault tolerance in OLTP (online transaction processing) systems while running a marketing team for Hewlett-Packard's HP 3000 computer division, but HP was not interested in developing for this niche. He then joined the venture capital firm Kleiner Perkins and developed the Tandem business plan there. Treybig pulled together a core engineering team hired away from the HP 3000 division: Mike Green, Jim Katzman, Dave Mackie and Jack Loustaunou. Their business plan called for ultra-reliable systems that never had outages and never lost or corrupted data. These were modular in

9006-516: The master process ran into trouble. This allowed the application to survive failures in any CPU or its associated devices, without data loss. It further allowed recovery from some intermittent-style software failures. Between failures, the monitoring by the slave process added some performance overhead but this was far less than the 100% duplication in other system designs. Some major early applications were directly coded in this checkpoint style, but most instead used various Tandem software layers which hid

9120-446: The minimum of one used in 1-Wire and UNI/O. As data rates increase, the problems of timing skew, power consumption, electromagnetic interference and crosstalk across parallel buses become more and more difficult to circumvent. One partial solution to this problem has been to double pump the bus. Often, a serial bus can be operated at higher overall data rates than a parallel bus, despite having fewer electrical connections, because

9234-401: The newer bus systems like PCI, and computers began to include AGP just to drive the video card. By 2004 AGP was outgrown again by high-end video cards and other peripherals and has been replaced by the new PCI Express bus. An increasing number of external devices started employing their own bus systems as well. When disk drives were first introduced, they would be added to the machine with

9348-469: The number of nodes added to the system, whereas most databases had performance that plateaued quite quickly, often after just two CPUs. A later version released in 1989 added transactions that could be spread over nodes, a feature that remained unique for some time. NonStop SQL continued to evolve, first as NonStop SQL/MP and then NonStop SQL/MX, which transitioned from Tandem to Compaq to HP. The code remains in use in both HP's NonStop SQL/MP, NonStop SQL/MX and

9462-452: The plants to fabricate the chips. Facing the challenges of this changing marketplace and manufacturing landscape, Tandem partnered with MIPS and adopted its R3000 and successor chipsets and their advanced optimizing compiler. Subsequent NonStop Guardian machines using the MIPS architecture were known to programmers as TNS/R machines and had a variety of marketing names. In 1991, Tandem released

9576-514: The primary processing and the other serves as a "hot backup", receiving updates to data whenever the primary reaches a critical point in processing. Should the primary stop, the backup steps in to resume execution using the current transaction. The systems support relational database management systems like NonStop SQL and hierarchical databases such as Enscribe. Languages supported include Java, C, C++, COBOL, SCOBOL (Screen COBOL), Transaction Application Language (TAL), etc. It uses

9690-404: The processes were running. This approach easily scaled to multiple-computer clusters and helped isolate corrupted data before it propagated. All file system processes and all transactional application processes were structured as master/slave pairs of processes running in separate CPUs. The slave process periodically took snapshots of the master's memory state and took over the workload if and when
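The process-pair idea described here, in which a backup holds periodic snapshots of the primary's state and resumes from the last one on failure, can be sketched as follows. This is a single-process toy model under stated assumptions: the class ProcessPair, its methods, and the "balance" state are all hypothetical, and real process pairs run in separate CPUs and exchange checkpoints by message.

```python
import copy

# Hypothetical toy model of a primary/backup process pair with checkpoints.
class ProcessPair:
    def __init__(self):
        self.primary_state = {"balance": 0}
        # The backup's view: the most recent snapshot of the primary's state.
        self.checkpoint = copy.deepcopy(self.primary_state)

    def apply(self, amount):
        """Primary performs work that changes its in-memory state."""
        self.primary_state["balance"] += amount

    def take_checkpoint(self):
        """Snapshot the primary's state for the backup at a critical point."""
        self.checkpoint = copy.deepcopy(self.primary_state)

    def fail_over(self):
        """Primary is lost; the backup resumes from the last snapshot."""
        self.primary_state = copy.deepcopy(self.checkpoint)
        return self.primary_state
```

Work done after the last checkpoint is not visible to the backup in this sketch, which is why checkpoints are taken at critical points in processing.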

9804-484: The program attempted to perform those other tasks, it might take too long for the program to check again, resulting in loss of data. Engineers thus arranged for the peripherals to interrupt the CPU. The interrupts had to be prioritized, because the CPU can only execute code for one peripheral at a time, and some devices are more time-critical than others. High-end systems introduced the idea of channel controllers, which were essentially small computers dedicated to handling

9918-403: The same instant, in "lock step". Faults are detected by seeing when the cloned processors' outputs diverged. To detect failures, the system must have two physical processors for each logical, active processor. To also implement automatic failover recovery, the system must have three or four physical processors for each logical processor. The triple or quadruple cost of this sparing is practical when

10032-529: The same logical function as a parallel electrical busbar. Modern computer buses can use both parallel and bit serial connections, and can be wired in either a multidrop (electrical parallel) or daisy chain topology, or connected by switched hubs. Many modern CPUs also feature a second set of pins similar to those for communicating with memory—but able to operate with different speeds and protocols—to ensure that peripherals do not slow overall system performance. CPUs can also feature smart controllers to place

10146-420: The same speed, as it shared a single clock. Increasing the speed of the CPU becomes harder, because the speed of all the devices must increase as well. When it is not practical or economical to have all devices as fast as the CPU, the CPU must either enter a wait state, or work at a slower clock frequency temporarily, to talk to other devices in the computer. While acceptable in embedded systems, this problem

10260-553: The scripting and job control language TACL (Tandem Advanced Command Language), and is written in TAL and C. The HPE Integrity NonStop computers are a line of fault-tolerant, message-based server computers based on the Intel Xeon processor platform, and optimized for transaction processing. Average availability levels of 99.999% have been observed. NonStop systems feature a massively parallel processing (MPP) architecture and provide linear scalability. Each CPU runs its own copy of
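To put the 99.999% availability figure in concrete terms, the arithmetic can be sketched as below. The helper downtime_minutes_per_year is hypothetical and assumes a 365-day year with availability expressed as a fraction between 0 and 1.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a 365-day year

# Hypothetical helper converting an availability fraction to yearly downtime.
def downtime_minutes_per_year(availability: float) -> float:
    """Return the expected unavailable minutes per year."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * MINUTES_PER_YEAR
```

Under these assumptions, an availability of 0.99999 corresponds to roughly 5.26 minutes of downtime per year.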

10374-518: The speed of the CPU. Still, devices interrupted the CPU by signaling on separate CPU pins. For instance, a disk drive controller would signal the CPU that new data was ready to be read, at which point the CPU would move the data by reading the "memory location" that corresponded to the disk drive. Almost all early microcomputers were built in this fashion, starting with the S-100 bus in the Altair 8800 computer system. In some instances, most notably in

10488-522: The work on these systems concerns software design, as opposed to the hardware itself. In general, these third generation buses tend to look more like a network than the original concept of a bus, with a higher protocol overhead needed than early systems, while also allowing multiple devices to use the bus at once. Buses such as Wishbone have been developed by the open source hardware movement in an attempt to further remove legal and patent constraints from computer design. The Compute Express Link (CXL)

10602-420: Was a proprietary design. It was greatly influenced by the HP 3000 minicomputer. They were both microprogrammed, 16-bit, stack-based machines with segmented, 16-bit virtual addressing. Both were intended to be programmed exclusively in high-level languages, with no use of assembler. Both were initially implemented via standard low-density TTL chips, each holding a 4-bit slice of the 16-bit ALU. Both had

10716-523: Was accomplished by having a separate microcode routine for every common pair of instructions. That fused pair of stack instructions generally accomplished the same work as a single instruction of normal 32-bit minicomputers. Cyclone processors were packaged as sections of four CPUs each, and the sections joined by a fiber optic version of Dynabus. Like Tandem's prior high-end machines, Cyclone cabinets were styled with much angular black to suggest strength and power. Advertising videos directly compared Cyclone to

10830-464: Was acquired by Compaq. Compaq's x86-based server division was an early outside adopter of Tandem's ServerNet/InfiniBand interconnect technology. In 1997, Compaq acquired the Tandem Computers company and NonStop customer base to balance Compaq's heavy focus on personal computers (PCs). In 1998, Compaq also acquired the much larger Digital Equipment Corporation and inherited its DEC Alpha RISC servers with OpenVMS and Tru64 Unix customer bases. Tandem

10944-586: Was already current. In 1995, the Integrity S4000 was the first to use ServerNet (a networked "bus" structure) and moved toward sharing peripherals with the NonStop line. In 1995–1997, Tandem partnered with Microsoft to implement high-availability features and advanced SQL configurations in clusters of commodity Microsoft Windows NT machines. This project was codenamed "Wolfpack" and first shipped as Microsoft Cluster Server in 1997. Microsoft benefited greatly from this partnership; Tandem did not. When Tandem

11058-434: Was an efficient machine-dependent systems programming language (for operating systems, compilers, etc.) but could also be used for non-portable applications. It was derived from HP 3000's System Programming Language (SPL). Both had semantics similar to C but a syntax based on Burroughs' ALGOL. Subsequent releases added support for Cobol74, Basic, Fortran, Java, C, C++, and MUMPS. The Tandem NonStop series ran

11172-518: Was an improvement for the surviving NonStop division and its customers. In some ways, Tandem's journey from HP-inspired start-up to an HP-inspired competitor, then to an HP division was "bringing Tandem back to its original roots", but this was not the same HP. The porting of the NSK-based NonStop product line from MIPS processors to Itanium-based processors was completed and was branded as "HP Integrity NonStop Servers". (This NSK Integrity NonStop

11286-459: Was duplicated and had dual connections to both CPUs and devices. Each disk was mirrored, with separate connections to two independent disk controllers. If a disk failed, its data was still available from its mirrored copy. If a CPU, controller or bus failed, the disk was still reachable through alternative CPU, controller, and/or bus. Each disk or network controller was connected to two independent CPUs. Power supplies were each wired to only one side of

11400-447: Was for purposes of data integrity, and not fault-tolerance – fault tolerance was handled by the other mechanisms still in place. It used a variation of lock stepping. The checker processor ran 1 cycle behind the primary processor. This allowed them to share a single copy of external code and data caches without putting excessive pinout load on the system bus and lowering the system clock rate. To successfully run microprocessors in lock step,

11514-419: Was formed in 1974, every computer company designed and built its CPUs from basic circuits, using its own proprietary instruction set, compilers, etc. With each year of semiconductor progress with Moore's Law, more of a CPU's core circuits could fit into single chips and run faster and cheaper as a result. However, it became increasingly expensive for a computer company to design those advanced custom chips or build

11628-400: Was formerly a general manager of the HP 3000 division. The business plan included detailed ideas for building a unique corporate culture reflecting Treybig's values. The design of the initial Tandem/16 hardware was completed in 1975, and the first system shipped to Citibank in May 1976. The company enjoyed uninterrupted exponential growth through 1983. Inc. magazine ranked Tandem as

11742-441: Was not tolerated for long in general-purpose, user-expandable computers. Such bus systems are also difficult to configure when constructed from common off-the-shelf equipment. Typically each added expansion card requires many jumpers in order to set memory addresses, I/O addresses, interrupt priorities, and interrupt numbers. "Second generation" bus systems like NuBus addressed some of these problems. They typically separated

11856-470: Was one board, containing six "compiled silicon" ASIC CMOS chips. The CPU core chip was duplicated and lock stepped for maximal error detection. This added no additional fault tolerance but assured data integrity as each CPU included checking logic that made certain that the results of both CPU chips were identical. Other processors would provide fault tolerance. Pinout was a main limitation of this chip technology. Microcode, cache, and TLB were all external to

11970-402: Was one bus for memory, and one or more separate buses for peripherals. These were accessed by separate instructions, with completely different timings and protocols. One of the first complications was the use of interrupts. Early computer programs performed I/O by waiting in a loop for the peripheral to become ready. This was a waste of time for programs that had other tasks to do. Also, if

12084-475: Was only for hierarchical, non-relational databases via the ENSCRIBE file system. This was extended into a relational database called ENCOMPASS. In 1986 Tandem introduced the first fault-tolerant SQL database, NonStop SQL. Developed totally in-house, NonStop SQL includes a number of features based on Guardian to ensure data validity across nodes. NonStop SQL is known for scaling linearly in performance with

12198-553: Was removed in 1983 with the introduction of the Transaction Monitoring Facility (TMF), along with Pathway transaction management software and SCOBOL applications (or, later, NonStop Tuxedo transaction management software), which handles the various aspects of fault tolerance on the system level. NonStop OS is a message-based operating system designed for fault tolerance. It works with process pairs and ensures that backup processes on redundant CPUs take over in case of

12312-404: Was sold to customers needing the utmost reliability. This new checking approach was called NSAA, NonStop Advanced Architecture. As in the earlier migration from stack machines to MIPS microprocessors, all customer software was carried forward without source changes. "Native mode" source code compiled directly to MIPS machine code was simply recompiled for Itanium. Some older "non-native" software

12426-475: Was spun off as the "Ellipse" product of Cooperative Systems Incorporated. In 1985, Tandem attempted to grab a piece of the rapidly growing personal computer market with its introduction of the MS-DOS based Dynamite PC/workstation. Numerous design compromises (including a unique 8086-based hardware platform incompatible with expansion cards of the day and extremely limited compatibility with IBM-based PCs) relegated

12540-547: Was still in TNS stack machine form. These were automatically ported onto Itanium via object code translation techniques. The next endeavor was to move from Itanium to the Intel x86 architecture. It was completed in 2014 with the first systems being made commercially available. The inclusion of the fault-tolerant 4X FDR (Fourteen Data Rate) InfiniBand double-wide switches provided more than 25 times increase in system interconnect capacity. NonStop (server computers) NonStop

12654-508: Was the Tandem/16 or T/16, later re-branded NonStop I. The machine consisted of between two and 16 CPUs, organized as a fault-tolerant computer cluster packaged in a single rack. Each CPU had its own private, unshared memory, its own I/O processor, its own private I/O bus to connect to I/O controllers, and dual connections to all the other CPUs over a custom inter-CPU backplane bus called Dynabus. Each disk controller or network controller

12768-500: Was then midway in porting its NonStop product line from MIPS R12000 microprocessors to Intel's new Itanium Merced microprocessors. This project was restarted with Alpha as the new target to align NonStop with Compaq's other large server lines. But in 2001, Compaq terminated all Alpha engineering investments in favor of the Itanium microprocessors, before any new NonStop products were released on Alpha. In 2001, Hewlett-Packard similarly made

12882-510: Was then translated to equivalent partially optimized MIPS instruction sequences at kernel install time by a tool called the Accelerator. Less-important programs could also be executed directly without pre-translation, via a TNS code interpreter. These migration techniques were successful and remain in use today. End-user software was brought over without extra work, the performance was good enough for mid-range machines, and programmers could ignore

12996-771: Was unrelated to Tandem's original "Integrity" series for Unix.) Because it was not possible to run Itanium McKinley chips with clock-level lock stepping, the Integrity NonStop machines instead lock stepped using comparisons between chip states at longer time scales, at interrupt points and at various software synchronization points in between interrupts. The intermediate synchronization points were automatically triggered at every nth taken branch instruction and were also explicitly inserted into long loop bodies by all NonStop compilers. The machine design supported both dual and triple redundancy, with either two or three physical microprocessors per logical Itanium processor. The triple version

#4995