Misplaced Pages

MEGAN

Article snapshot taken from Wikipedia with creative commons attribution-sharealike license. Give it a read and then ask your questions in the chat. We can research this topic together.

A computer program is a sequence or set of instructions in a programming language for a computer to execute . It is one component of software , which also includes documentation and other intangible components.

#138861

53-445: MEGAN ("MEtaGenome ANalyzer") is a computer program that allows optimized analysis of large metagenomic datasets. Metagenomics is the analysis of the genomic sequences from a usually uncultured environmental sample. A large term goal of most metagenomics is to inventory and measure the extent and the role of microbial biodiversity in the ecosystem due to discoveries that the diversity of microbial organisms and viral agents in

106-462: A list of integers could be called integer_list . In object-oriented jargon, abstract datatypes are called classes . However, a class is only a definition; no memory is allocated. When memory is allocated to a class and bound to an identifier , it is called an object . Object-oriented imperative languages developed by combining the need for classes and the need for safe functional programming . A function, in an object-oriented language,

159-422: A programming language . Programming language features exist to provide building blocks to be combined to express programming ideals. Ideally, a programming language should: The programming style of a programming language to provide these building blocks may be categorized into programming paradigms . For example, different paradigms may differentiate: Each of these programming styles has contributed to

212-412: A runtime system , which implements runtime language features (such as task scheduling , exception handling , calling static constructors and destructors, etc.) and interactions with the operating system, notably passing arguments, environment, and returning an exit status , together with other startup and shutdown features such as releasing resources like file handles . For C, this is done by linking in

265-428: A store which consisted of memory to hold 1,000 numbers of 50 decimal digits each. Numbers from the store were transferred to the mill for processing. The engine was programmed using two sets of perforated cards. One set directed the operation and the other set inputted the variables. However, the thousands of cogged wheels and gears never fully worked together. Ada Lovelace worked for Charles Babbage to create

318-512: A computer "to perform indicated tasks according to encoded instructions ", as opposed to a data file that must be interpreted ( parsed ) by an interpreter to be functional. The exact interpretation depends upon the use. "Instructions" is traditionally taken to mean machine code instructions for a physical CPU . In some contexts, a file containing scripting instructions (such as bytecode ) may also be considered executable. Executable files can be hand-coded in machine language, although it

371-604: A description of the Analytical Engine (1843). The description contained Note G which completely detailed a method for calculating Bernoulli numbers using the Analytical Engine. This note is recognized by some historians as the world's first computer program . In 1936, Alan Turing introduced the Universal Turing machine , a theoretical device that can model every computation. It is a finite-state machine that has an infinitely long read/write tape. The machine can move

424-542: A file is executed by loading it into memory and jumping to the start of the address space and executing from there. In more complicated interfaces, executable files have additional metadata specifying a separate entry point . For example, in ELF, the entry point is defined in the header's e_entry field, which specifies the (virtual) memory address at which to start execution. In the GNU Compiler Collection , this field

477-580: A language's basic syntax . The syntax of the language BASIC (1964) was intentionally limited to make the language easy to learn. For example, variables are not declared before being used. Also, variables are automatically initialized to zero. Here is an example computer program, in Basic, to average a list of numbers: Once the mechanics of basic computer programming are learned, more sophisticated and powerful languages are available to build large computer systems. Improvements in software development are

530-521: A profound influence on programming language design. Emerging from a committee of European and American programming language experts, it used standard mathematical notation and had a readable, structured design. Algol was first to define its syntax using the Backus–Naur form . This led to syntax-directed compilers. It added features like: Algol's direct descendants include Pascal , Modula-2 , Ada , Delphi and Oberon on one branch. On another branch

583-557: A result, the computer could be programmed quickly and perform calculations at very fast speeds. Presper Eckert and John Mauchly built the ENIAC. The two engineers introduced the stored-program concept in a three-page memo dated February 1944. Later, in September 1944, John von Neumann began working on the ENIAC project. On June 30, 1945, von Neumann published the First Draft of a Report on

SECTION 10

#1732854834139

636-678: A single dataset while the latest version can analyse multiple datasets including new features (query different databases, new algorithm etc.). MEGAN analysis starts with collecting reads from any shotgun platform. Then, the reads are compared with sequence databases using BLAST or similar. Third, MEGAN assigns a taxon ID to processed read results based on NCBI taxonomy which creates a MEGAN file that contains required information for statistical and graphical analysis. Lastly, lowest common ancestor ( LCA ) algorithm can be run to inspect assignments, to analyze data and to create summaries of data based on different NCBI taxonomy levels. LCA algorithm simply finds

689-418: Is assigned to a class. An assigned function is then referred to as a method , member function , or operation . Object-oriented programming is executing operations on objects . Object-oriented languages support a syntax to model subset/superset relationships. In set theory , an element of a subset inherits all the attributes contained in the superset. For example, a student is a person. Therefore,

742-399: Is called an executable . Alternatively, source code may execute within an interpreter written for the language. If the executable is requested for execution, then the operating system loads it into memory and starts a process . The central processing unit will soon switch to this process so it can fetch, decode, and then execute each machine instruction. If the source code

795-444: Is far more convenient to develop software as source code in a high-level language that can be easily understood by humans. In some cases, source code might be specified in assembly language instead, which remains human-readable while being closely associated with machine code instructions. The high-level language is compiled into either an executable machine code file or a non-executable machine code – object file of some sort;

848-402: Is requested for execution, then the operating system loads the corresponding interpreter into memory and starts a process. The interpreter then loads the source code into memory to translate and execute each statement . Running the source code is slower than running an executable . Moreover, the interpreter must be installed on the computer. The "Hello, World!" program is used to illustrate

901-548: Is to alter the electrical resistivity and conductivity of a semiconductor junction . First, naturally occurring silicate minerals are converted into polysilicon rods using the Siemens process . The Czochralski process then converts the rods into a monocrystalline silicon , boule crystal . The crystal is then thinly sliced to form a wafer substrate . The planar process of photolithography then integrates unipolar transistors, capacitors , diodes , and resistors onto

954-517: The new statement. A module's other file is the source file . Here is a C++ source file for the GRADE class in a simple school application: Here is a C++ header file for the PERSON class in a simple school application: Executable In computer science , executable code , an executable file , or an executable program , sometimes simply referred to as an executable or binary , causes

1007-604: The IBM System/360 (1964) had a CPU made from circuit boards containing discrete components on ceramic substrates . The Intel 4004 (1971) was a 4- bit microprocessor designed to run the Busicom calculator. Five months after its release, Intel released the Intel 8008 , an 8-bit microprocessor. Bill Pentz led a team at Sacramento State to build the first microcomputer using the Intel 8008:

1060-480: The Sac State 8008 (1972). Its purpose was to store patient medical records. The computer supported a disk operating system to run a Memorex , 3- megabyte , hard disk drive . It had a color display and keyboard that was packaged in a single console. The disk operating system was programmed using IBM's Basic Assembly Language (BAL) . The medical records application was programmed using a BASIC interpreter. However,

1113-550: The circuits . At its core, it was a series of Pascalines wired together. Its 40 units weighed 30 tons, occupied 1,800 square feet (167 m ), and consumed $ 650 per hour ( in 1940s currency ) in electricity when idle. It had 20 base-10 accumulators . Programming the ENIAC took up to two months. Three function tables were on wheels and needed to be rolled to fixed function panels. Function tables were connected to function panels by plugging heavy black cables into plugboards . Each function table had 728 rotating knobs. Programming

SECTION 20

#1732854834139

1166-452: The crt0 object, which contains the actual entry point and does setup and shutdown by calling the runtime library . Executable files thus normally contain significant additional machine code beyond that directly generated from the specific source code. In some cases, it is desirable to omit this, for example for embedded systems development, or simply to understand how compilation, linking, and loading work. In C, this can be done by omitting

1219-404: The programming environment to advance from a computer terminal (until the 1990s) to a graphical user interface (GUI) computer. Computer terminals limited programmers to a single shell running in a command-line environment . During the 1970s, full-screen source code editing became possible through a text-based user interface . Regardless of the technology available, the goal is to program in

1272-494: The EDVAC , which equated the structures of the computer with the structures of the human brain. The design became known as the von Neumann architecture . The architecture was simultaneously deployed in the constructions of the EDVAC and EDSAC computers in 1949. The IBM System/360 (1964) was a family of computers, each having the same instruction set architecture . The Model 20 was

1325-433: The ENIAC also involved setting some of the 3,000 switches. Debugging a program took a week. It ran from 1947 until 1955 at Aberdeen Proving Ground , calculating hydrogen bomb parameters, predicting weather patterns, and producing firing tables to aim artillery guns. Instead of plugging in cords and turning switches, a stored-program computer loads its instructions into memory just like it loads its data into memory. As

1378-640: The cheaper Intel 8088 . IBM embraced the Intel 8088 when they entered the personal computer market (1981). As consumer demand for personal computers increased, so did Intel's microprocessor development. The succession of development is known as the x86 series . The x86 assembly language is a family of backward-compatible machine instructions . Machine instructions created in earlier microprocessors were retained throughout microprocessor upgrades. This enabled consumers to purchase new computers without having to purchase new application software . The major categories of instructions are: VLSI circuits enabled

1431-419: The computer was an evolutionary dead-end because it was extremely expensive. Also, it was built at a public university lab for a specific purpose. Nonetheless, the project contributed to the development of the Intel 8080 (1974) instruction set . In 1978, the modern software development environment began when Intel upgraded the Intel 8080 to the Intel 8086 . Intel simplified the Intel 8086 to manufacture

1484-537: The configuration, an execute button was pressed. This process was then repeated. Computer programs also were automatically inputted via paper tape , punched cards or magnetic-tape . After the medium was loaded, the starting address was set via switches, and the execute button was pressed. A major milestone in software development was the invention of the Very Large Scale Integration (VLSI) circuit (1964). Following World War II , tube-based technology

1537-434: The descendants include C , C++ and Java . BASIC (1964) stands for "Beginner's All-Purpose Symbolic Instruction Code". It was developed at Dartmouth College for all of their students to learn. If a student did not go on to a more powerful language, the student would still remember Basic. A Basic interpreter was installed in the microcomputers manufactured in the late 1970s. As the microcomputer industry grew, so did

1590-581: The environment is far greater than previously estimated. Tools that allow the investigation of very large data sets from environmental samples using shotgun sequencing techniques in particular, such as MEGAN, are designed to sample and investigate the unknown biodiversity of environmental samples where more precise techniques with smaller, better known samples, cannot be used. Fragments of DNA from an metagenomics sample, such as ocean waters or soil, are compared against databases of known DNA sequences using BLAST or another sequence comparison tool to assemble

1643-617: The equivalent process on assembly language source code is called assembly . Several object files are linked to create the executable. Object files -- executable or not -- are typically stored in a container format , such as Executable and Linkable Format (ELF) or Portable Executable (PE) which is operating system -specific. This gives structure to the generated machine code, for example dividing it into sections such as .text (executable code), .data (initialized global and static variables), and .rodata (read-only data, such as constants and strings). Executable files typically also include

MEGAN - Misplaced Pages Continue

1696-430: The extent of species diversity. Targeted or random sequencing are widely used with comparisons against sequence databases. Recent developments in sequencing technology increased the number of metagenomics samples. MEGAN is an easy to use tool for analysing such metagenomics data. First version of MEGAN was released in 2007 and the most recent version is MEGAN6 . First version is capable of analysing taxonomic content of

1749-460: The first Fortran standard in 1966. In 1978, Fortran 77 became the standard until 1991. Fortran 90 supports: COBOL (1959) stands for "COmmon Business Oriented Language". Fortran manipulated symbols. It was soon realized that symbols did not need to be numbers, so strings were introduced. The US Department of Defense influenced COBOL's development, with Grace Hopper being a major contributor. The statements were English-like and verbose. The goal

1802-475: The language BCPL was replaced with B , and AT&T Bell Labs called the next version "C". Its purpose was to write the UNIX operating system . C is a relatively small language, making it easy to write compilers. Its growth mirrored the hardware growth in the 1980s. Its growth also was because it has the facilities of assembly language , but uses a high-level syntax . It added advanced features like: C allows

1855-400: The language. Basic pioneered the interactive session . It offered operating system commands within its environment: However, the Basic syntax was too simple for large programs. Recent dialects added structure and object-oriented extensions. Microsoft's Visual Basic is still widely used and produces a graphical user interface . C programming language (1973) got its name because

1908-479: The lowest common ancestor of different species. Computer program A computer program in its human-readable form is called source code . Source code needs another computer program to execute because computers can only execute their native machine instructions . Therefore, source code may be translated to machine instructions using a compiler written for the language. ( Assembly language programs are translated using an assembler .) The resulting file

1961-485: The matrix was to burn out the unneeded connections. There were so many connections, firmware programmers wrote a computer program on another chip to oversee the burning. The technology became known as Programmable ROM . In 1971, Intel installed the computer program onto the chip and named it the Intel 4004 microprocessor . The terms microprocessor and central processing unit (CPU) are now used interchangeably. However, CPUs predate microprocessors. For example,

2014-443: The programmer to control which region of memory data is to be stored. Global variables and static variables require the fewest clock cycles to store. The stack is automatically used for the standard variable declarations . Heap memory is returned to a pointer variable from the malloc() function. In the 1970s, software engineers needed language support to break large projects down into modules . One obvious feature

2067-486: The result of improvements in computer hardware . At each stage in hardware's history, the task of computer programming changed dramatically. In 1837, Jacquard's loom inspired Charles Babbage to attempt to build the Analytical Engine . The names of the components of the calculating device were borrowed from the textile industry. In the textile industry, yarn was brought from the store to be milled. The device had

2120-551: The segments into discrete comparable sequences. MEGAN is then used to compare the resulting sequences with gene sequences from GenBank in NCBI . The program was used to investigate the DNA of a woolly mammoth recovered from the Siberian permafrost and Sargasso Sea data set. Metagenomics is the study of genomic content of samples from same habitat, which is designed to determine the role and

2173-438: The set of students is a subset of the set of persons. As a result, students inherit all the attributes common to all persons. Additionally, students have unique attributes that other people do not have. Object-oriented languages model subset/superset relationships using inheritance . Object-oriented programming became the dominant language paradigm by the late 1990s. C++ (1985) was originally called "C with Classes". It

MEGAN - Misplaced Pages Continue

2226-467: The smallest and least expensive. Customers could upgrade and retain the same application software . The Model 195 was the most premium. Each System/360 model featured multiprogramming —having multiple processes in memory at once. When one process was waiting for input/output , another could compute. IBM planned for each model to be programmed using PL/1 . A committee was formed that included COBOL , Fortran and ALGOL programmers. The purpose

2279-430: The synthesis of different programming languages . A programming language is a set of keywords , symbols , identifiers , and rules by which programmers can communicate instructions to the computer. They follow a set of rules called a syntax . Programming languages get their basis from formal languages . The purpose of defining a solution in terms of its formal language is to generate an algorithm to solve

2332-447: The tape back and forth, changing its contents as it performs an algorithm . The machine starts in the initial state, goes through a sequence of steps, and halts when it encounters the halt state. All present-day computers are Turing complete . The Electronic Numerical Integrator And Computer (ENIAC) was built between July 1943 and Fall 1945. It was a Turing complete , general-purpose computer that used 17,468 vacuum tubes to create

2385-553: The underlining problem. An algorithm is a sequence of simple instructions that solve a problem. The evolution of programming languages began when the EDSAC (1949) used the first stored computer program in its von Neumann architecture . Programming the EDSAC was in the first generation of programming language . Imperative languages specify a sequential algorithm using declarations , expressions , and statements : FORTRAN (1958)

2438-436: The usual runtime, and instead explicitly specifying a linker script, which generates the entry point and handles startup and shutdown, such as calling main to start and returning exit status to the kernel at the end. In order to be executed by the system (such as an operating system , firmware , or boot loader ), an executable file must conform to the system's application binary interface (ABI). In simple interfaces,

2491-448: The wafer to build a matrix of metal–oxide–semiconductor (MOS) transistors. The MOS transistor is the primary component in integrated circuit chips . Originally, integrated circuit chips had their function set during manufacturing. During the 1960s, controlling the electrical flow migrated to programming a matrix of read-only memory (ROM). The matrix resembled a two-dimensional array of fuses. The process to embed instructions onto

2544-427: Was designed to expand C's capabilities by adding the object-oriented facilities of the language Simula . An object-oriented module is composed of two files. The definitions file is called the header file . Here is a C++ header file for the GRADE class in a simple school application: A constructor operation is a function with the same name as the class name. It is executed when the calling operation executes

2597-436: Was replaced with point-contact transistors (1947) and bipolar junction transistors (late 1950s) mounted on a circuit board . During the 1960s , the aerospace industry replaced the circuit board with an integrated circuit chip . Robert Noyce , co-founder of Fairchild Semiconductor (1957) and Intel (1968), achieved a technological improvement to refine the production of field-effect transistors (1963). The goal

2650-405: Was to decompose large projects physically into separate files . A less obvious feature was to decompose large projects logically into abstract data types . At the time, languages supported concrete (scalar) datatypes like integer numbers, floating-point numbers, and strings of characters . Abstract datatypes are structures of concrete datatypes, with a new name assigned. For example,

2703-433: Was to design a language so managers could read the programs. However, the lack of structured statements hindered this goal. COBOL's development was tightly controlled, so dialects did not emerge to require ANSI standards. As a consequence, it was not changed for 15 years until 1974. The 1990s version did make consequential changes, like object-oriented programming . ALGOL (1960) stands for "ALGOrithmic Language". It had

SECTION 50

#1732854834139

2756-425: Was to develop a language that was comprehensive, easy to use, extendible, and would replace Cobol and Fortran. The result was a large and complex language that took a long time to compile . Computers manufactured until the 1970s had front-panel switches for manual programming. The computer program was written on paper for reference. An instruction was represented by a configuration of on/off settings. After setting

2809-423: Was unveiled as "The IBM Mathematical FORmula TRANslating system". It was designed for scientific calculations, without string handling facilities. Along with declarations , expressions , and statements , it supported: It succeeded because: However, non-IBM vendors also wrote Fortran compilers, but with a syntax that would likely fail IBM's compiler. The American National Standards Institute (ANSI) developed

#138861