Cray MTA-2 - Misplaced Pages

In computing, multiple instruction, multiple data (MIMD) is a technique employed to achieve parallelism. Machines using MIMD have a number of processor cores that function asynchronously and independently. At any time, different processors may be executing different instructions on different pieces of data.
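The MIMD idea described above can be sketched in a few lines. This is only an illustration of the programming model, not of any particular machine: the two worker functions and `run_mimd_sketch` are hypothetical names, and Python threads are assumed here as a stand-in for independent processors (the interpreter may not actually run them in parallel).

```python
from concurrent.futures import ThreadPoolExecutor


def sum_of_squares(values):
    """First instruction stream: numeric work on a list of integers."""
    return sum(v * v for v in values)


def longest_word(words):
    """Second instruction stream: string work on a different data set."""
    return max(words, key=len)


def run_mimd_sketch():
    """Run two different functions on two different inputs concurrently."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Each worker executes its own instructions on its own data.
        numeric = pool.submit(sum_of_squares, [1, 2, 3, 4])
        textual = pool.submit(longest_word, ["mesh", "hypercube", "bus"])
        return numeric.result(), textual.result()
```

In a SIMD design, by contrast, every worker would apply the same function to its own slice of one data set.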


The Cray MTA-2 is a shared-memory MIMD computer marketed by Cray Inc. It is an unusual design based on the Tera computer designed by Tera Computer Company. The original Tera computer (also known as the MTA) turned out to be nearly unmanufacturable due to its aggressive packaging and circuit technology. The MTA-2 was an attempt to correct these problems while maintaining essentially the same processor architecture, respun in one silicon ASIC, down from some 26 gallium arsenide ASICs in

A fat tree network of reduced instruction set computing (RISC) SPARC processors. To make programming easier, it was made to simulate a SIMD design. The later CM-5E replaces the SPARC processors with faster SuperSPARCs. A CM-5 was the fastest computer in the world in 1993 according to the TOP500 list, running 1024 cores with Rpeak of 131.0 GFLOPS, and for several years many of the top 10 fastest computers were CM-5s. Connection Machines were noted for their striking visual design. The CM-1 and CM-2 design teams were led by Tamiko Thiel. The physical form of

A mesh interconnection network, processors are placed in a two-dimensional grid. Each processor is connected to its four immediate neighbors. Wraparound connections may be provided at the edges of the mesh. One advantage of the mesh interconnection network over the hypercube is that the mesh system need not be configured in powers of two. A disadvantage is that the diameter of the mesh network is greater than

A problem with these machines. It is not economically feasible to connect a large number of processors directly to each other. A way to avoid this multitude of direct connections is to connect each processor to just a few others. This type of design can be inefficient because of the added time required to pass a message from one processor to another along the message path. The amount of time required for processors to perform simple message routing can be substantial. Systems were designed to reduce this time loss, and hypercube and mesh are two of

A redundant array of independent disks (RAID) hard disk system, called a DataVault, of up to 25 GB. Two later variants of the CM-2 were also produced, the smaller CM-2a with either 4096 or 8192 single-bit processors, and the faster CM-200. Due to its origins in AI research, the software for the CM-1/2/200 single-bit processor was influenced by the Lisp programming language and a version of Common Lisp, *Lisp (spoken: Star-Lisp),

A side, divided equally into eight smaller cubes. Each subcube contains 16 printed circuit boards and a main processor called a sequencer. Each circuit board contains 32 chips. Each chip contains a router, 16 processors, and 16 RAMs. The CM-1 as a whole has a 12-dimensional hypercube-based routing network (connecting the 2^12 chips), a main RAM, and an input-output processor (a channel controller). Each router contains five buffers to store

A specific CM, common bus system for all the clients. For example, if we consider a bus with clients A, B, C connected on one side and P, Q, R connected on the opposite side, any one of the clients will communicate with the other by means of the bus interface between them. MIMD machines with hierarchical shared memory use a hierarchy of buses (as, for example, in a "fat tree") to give processors access to each other's memory. Processors on different boards may communicate through inter-nodal buses. Buses support communication between boards. With this type of architecture,

A team to develop what would become the CM-1 Connection Machine, a design for a massively parallel hypercube-based arrangement of thousands of microprocessors, springing from his PhD thesis work at MIT in Electrical Engineering and Computer Science (1985). The dissertation won the ACM Distinguished Dissertation prize in 1985, and was presented as a monograph that overviewed the philosophy, architecture, and software for

Is difficult, and the shared memory model is less flexible than the distributed memory model. There are many examples of shared memory (multiprocessors): UMA (uniform memory access), COMA (cache-only memory access). MIMD machines with shared memory have processors which share a common, central memory. In the simplest form, all processors are attached to a bus which connects them to memory. This means that every machine with shared memory shares

The bus-based, extended, or hierarchical type. Distributed memory machines may have hypercube or mesh interconnection schemes. An example of an MIMD system is the Intel Xeon Phi, descended from the Larrabee microarchitecture. These processors have multiple processing cores (up to 61 as of 2015) that can execute different instructions on different data. Most parallel computers, as of 2013, are MIMD systems. In the shared memory model

The hypercube-based array of them was designed to perform the same operation on multiple data points simultaneously, i.e., to execute tasks in single instruction, multiple data (SIMD) fashion. The CM-1, depending on the configuration, has as many as 65,536 individual processors, each extremely simple, processing one bit at a time. CM-1 and its successor CM-2 take the form of a cube 1.5 meters on


The CM-1, CM-2, and CM-200 chassis was a cube-of-cubes, referencing the machine's internal 12-dimensional hypercube network, with the red light-emitting diodes (LEDs), which by default indicate the processor status, visible through the doors of each cube. By default, when a processor is executing an instruction, its LED is on. In a SIMD program, the goal is to have as many processors as possible working

The CM-5 design. A CM-5 was featured in the film Jurassic Park in the control room for the island (instead of a Cray X-MP supercomputer as in the novel). Two banks, one of four units and a single unit off to the right of the set, can be seen in the control room. The computer mainframes in Fallout 3 were inspired heavily by the CM-5. Cyberpunk 2077 features numerous CM-1/CM-2 style units in various portions of

The United States Naval Research Laboratory in 2002, and one 4-processor system sold to the Electronic Navigation Research Institute (ENRI) in Japan. The MTA computers pioneered several technologies, presumably to be used in future Cray Inc. products:

Multiple instruction, multiple data

MIMD architectures may be used in a number of application areas such as computer-aided design/computer-aided manufacturing, simulation, modeling, and as communication switches. MIMD machines can be of either shared memory or distributed memory categories. These classifications are based on how MIMD processors access memory. Shared memory machines may be of

The data being transmitted when a clear channel is not available. The engineers had originally calculated that seven buffers per chip would be needed, but this made the chip slightly too large to build. Nobel Prize-winning physicist Richard Feynman had previously calculated that five buffers would be enough, using a differential equation involving the average number of 1 bits in an address. They resubmitted

The design of the chip with only five buffers, and when they put the machine together, it worked fine. Each chip is connected to a switching device called a nexus. The CM-1 uses Feynman's algorithm for computing logarithms that he had developed at Los Alamos National Laboratory for the Manhattan Project. It is well suited to the CM-1, using, as it does, only shifting and adding, with a small table shared by all
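A shift-and-add logarithm of the kind mentioned above can be sketched as follows. This is an assumption-laden illustration, not the CM-1's actual code: it uses Python floats where the hardware would use fixed-point bit shifts, and `ITERATIONS`, `LOG_TABLE`, and `shift_add_log2` are hypothetical names. The idea is to build the input as a product of factors of the form 1 + 2^-k and sum the matching table entries.

```python
import math

ITERATIONS = 40  # assumed precision: one table entry per factor

# Small precomputed table of log2(1 + 2**-k), shared by every caller.
LOG_TABLE = [math.log2(1.0 + 2.0 ** -k) for k in range(1, ITERATIONS + 1)]


def shift_add_log2(x):
    """Approximate log2(x) for 1 <= x < 2 using only shifts and adds."""
    if not 1.0 <= x < 2.0:
        raise ValueError("x must lie in [1, 2)")
    product = 1.0
    result = 0.0
    for k in range(1, ITERATIONS + 1):
        # product * (1 + 2**-k); in fixed point: product + (product >> k)
        candidate = product + product * 2.0 ** -k
        if candidate <= x:
            product = candidate
            result += LOG_TABLE[k - 1]
    return result
```

Each step needs one shift, one add, one comparison, and one table lookup, which is why the method suits very simple processors that share a single small table.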

The early 1980s. Starting with CM-1, the machines were intended originally for applications in artificial intelligence (AI) and symbolic processing, but later versions found greater success in the field of computational science. Danny Hillis and Sheryl Handler founded Thinking Machines Corporation (TMC) in Waltham, Massachusetts, in 1983, moving in 1984 to Cambridge, MA. At TMC, Hillis assembled

The first Connection Machine, including information on its data routing between central processing unit (CPU) nodes, its memory handling, and the programming language Lisp applied in the parallel machine. Very early concepts contemplated just over a million processors, each connected in a 20-dimensional hypercube, which was later scaled down. Each CM-1 microprocessor has its own 4 kilobits of random-access memory (RAM), and

The hypercube for systems with more than four processors.

Connection Machine

The Connection Machine (CM) is a member of a series of massively parallel supercomputers sold by Thinking Machines Corporation. The idea for the Connection Machine grew out of doctoral research on alternatives to the traditional von Neumann architecture of computers by Danny Hillis at Massachusetts Institute of Technology (MIT) in

The machine may support over nine thousand processors. In distributed memory MIMD (multiple instruction, multiple data) machines, each processor has its own individual memory location. Each processor has no direct knowledge about other processors' memory. For data to be shared, it must be passed from one processor to another as a message. Since there is no shared memory, contention is not as great

The original MTA; and while regressing the network design from a 4-D torus topology to a less efficient but more scalable Cayley graph topology. The name Cray was added to the second version after Tera Computer Company bought the remains of the Cray Research division of Silicon Graphics in 2000 and renamed itself Cray Inc. The MTA-2 was not a commercial success, with only one moderately sized 40-processor system ("Boomer") being sold to


The popular interconnection schemes. Examples of distributed memory (multiple computers) include MPP (massively parallel processors), COW (clusters of workstations) and NUMA (non-uniform memory access). The first, MPP, is complex and expensive: many supercomputers coupled by broadband networks. Examples include hypercube and mesh interconnections. COW is the "home-made" version for a fraction of

The price. In an MIMD distributed memory machine with a hypercube system interconnection network containing four processors, a processor and a memory module are placed at each vertex of a square. The diameter of the system is the minimum number of steps it takes for one processor to send a message to the processor that is the farthest away. So, for example, the diameter of a 2-cube is 2. In a hypercube system with eight processors and each processor and memory module being placed in

The processors are all connected to a "globally available" memory, via either software or hardware means. The operating system usually maintains its memory coherence. From a programmer's point of view, this memory model is better understood than the distributed memory model. Another advantage is that memory coherence is managed by the operating system and not the written program. Two known disadvantages are: scalability beyond thirty-two processors

The processors. Feynman also discovered that the CM-1 would compute the Feynman diagrams for quantum chromodynamics (QCD) calculations faster than an expensive special-purpose machine developed at Caltech. To improve its commercial viability, TMC launched the CM-2 in 1987, adding Weitek 3132 floating-point numeric coprocessors and more RAM to the system. Thirty-two of the original one-bit processors shared each numeric processor. The CM-2 can be configured with up to 512 MB of RAM, and

The program at the same time, indicated by all LEDs being steadily on. Those unfamiliar with the use of the LEDs wanted to see the LEDs blink, or even spell out messages to visitors. The result is that finished programs often have superfluous operations to blink the LEDs. The CM-5, in plan view, had a staircase-like shape, and also had large panels of red blinking LEDs. Prominent sculptor-architect Maya Lin contributed to

The vertex of a cube, the diameter is 3. In general, for a system that contains 2^N processors, with each processor directly connected to N other processors, the diameter of the system is N. One disadvantage of a hypercube system is that it must be configured in powers of two, so a machine must be built that could potentially have many more processors than is really needed for the application. In an MIMD distributed memory machine with
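The hypercube diameter rule above (2^N processors, N links each, diameter N) can be checked with a short breadth-first search. This is a minimal sketch under one assumption not stated in the text: processors are labeled with N-bit integers, and two processors are linked when their labels differ in exactly one bit. The function names are hypothetical.

```python
from collections import deque


def hypercube_neighbors(node, dims):
    """Return the dims processors whose labels differ from node in one bit."""
    return [node ^ (1 << bit) for bit in range(dims)]


def hypercube_diameter(dims):
    """Longest shortest path, in hops, in a hypercube of 2**dims processors."""
    # Every node of a hypercube looks alike, so searching from node 0
    # is enough to find the farthest pair.
    distance = {0: 0}
    queue = deque([0])
    while queue:
        node = queue.popleft()
        for neighbor in hypercube_neighbors(node, dims):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return max(distance.values())
```

For example, `hypercube_diameter(2)` covers the four-processor square and `hypercube_diameter(3)` the eight-processor cube discussed in the text.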

Was implemented on the CM-1. Other early languages included Karl Sims' IK and Cliff Lasser's URDU. Much system utility software for the CM-1/2 was written in *Lisp. Many applications for the CM-2, however, were written in C*, a data-parallel superset of ANSI C. With the CM-5, announced in 1991, TMC switched from the CM-2's hypercubic architecture of simple processors to a new and different multiple instruction, multiple data (MIMD) architecture based on
