World Community Grid (WCG) is an effort to create the world's largest volunteer computing platform to tackle scientific research that benefits humanity. Launched on November 16, 2004, with the proprietary Grid MP client from United Devices and adding support for Berkeley Open Infrastructure for Network Computing (BOINC) in 2005, World Community Grid eventually discontinued the Grid MP client and consolidated on the BOINC platform in 2008. In September 2021, it was announced that IBM had transferred ownership to the Krembil Research Institute of University Health Network in Toronto, Ontario.
World Community Grid utilizes unused processing power of consumer devices (PCs, laptops, Android smartphones, etc.) to analyse data created by the research groups that participate in the grid. WCG projects have analysed data related to the human genome, the human microbiome, HIV, dengue, muscular dystrophy, cancer, influenza, Ebola, Zika virus, virtual screening, rice crop yields, clean energy, water purification and COVID-19, among other research areas. There are currently five active projects and 26 completed projects. Several of these projects have published peer-reviewed papers based on
A Creative Commons public domain license. The Personal Genome Project (started in 2005) is among the few to make both genome sequences and corresponding medical phenotypes publicly available. The sequencing of individual genomes further unveiled levels of genetic complexity that had not been appreciated before. Personal genomics helped reveal the significant level of diversity in the human genome attributed not only to SNPs but structural variations as well. However,
A cause and effect relationship between aneuploidy and cancer has not been established. Whereas a genome sequence lists the order of every DNA base in a genome, a genome map identifies the landmarks. A genome map is less detailed than a genome sequence and aids in navigating around the genome. An example of a variation map is the HapMap being developed by the International HapMap Project. The HapMap
A July 2012 status report, the project scientists reported that the results generated by the WCG calculations are being used by Dr. Markus Landthaler of the Max Delbrück Center for Molecular Medicine (MDC) in Berlin. The HPF2 results helped Dr. Markus Landthaler and his collaborators in writing up a new paper on "The mRNA-Bound Proteome and Its Global Occupancy Profile on Protein-Coding Transcripts". The Help Defeat Cancer project seeks to improve
A chip is C·V²·A·f, where C is the capacitance being switched per clock cycle, V is voltage, A is the Activity Factor indicating the average number of switching events per clock cycle by the transistors in the chip (as a unitless quantity) and f is the clock frequency. Voltage is therefore the main determinant of power usage and heating. The voltage required for stable operation is determined by
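As a rough illustration of why voltage dominates, the sketch below evaluates the switching-power relation in its standard CMOS form, in which power grows with the square of the voltage. The function name and all numeric values are hypothetical and chosen only to make the arithmetic easy to follow; they do not describe any particular chip.

```python
def dynamic_power(c_farads, v_volts, activity, f_hz):
    """Switching power in watts, assuming P = C * V^2 * A * f."""
    return c_farads * v_volts ** 2 * activity * f_hz


# Hypothetical operating points (illustrative values only).
baseline = dynamic_power(1e-9, 1.2, 0.5, 3.0e9)       # full speed, full voltage
freq_only = dynamic_power(1e-9, 1.2, 0.5, 1.5e9)      # half frequency, same voltage
volt_and_freq = dynamic_power(1e-9, 0.9, 0.5, 1.5e9)  # half frequency, 25% lower voltage

# Halving the frequency alone halves the switching power; lowering the
# voltage as well cuts it further because of the squared term.
print(freq_only / baseline)
print(volt_and_freq / baseline)
```

Under these assumed numbers, frequency scaling alone saves half the switching power, while the combined voltage and frequency reduction saves noticeably more.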
A chromosome; ultra-rare means that they are only found in individuals or their family members and thus have arisen very recently. Single-nucleotide polymorphisms (SNPs) do not occur homogeneously across the human genome. In fact, there is enormous diversity in SNP frequency between genes, reflecting different selective pressures on each gene as well as different mutation and recombination rates across
A day due to a lack of clean water. On April 25, 2014, the project scientists released an update stating that they had exciting results to report when the paper is submitted and that the project on WCG was finished. Drug Search for Leishmaniasis (launched September 7, 2011) is spearheaded by the University of Antioquia in Medellín, Colombia, with assistance from researchers at the University of Texas Medical Branch in Galveston, Texas. The mission
A human female genome, filling all the gaps in the X chromosome (2020) and the 22 autosomes (May 2021). The previously unsequenced parts contain immune response genes that help to adapt to and survive infections, as well as genes that are important for predicting drug response. The completed human genome sequence will also provide better understanding of human formation as an individual organism and how humans vary both between each other and other species. Although
A large percentage of non-coding DNA. Some of this non-coding DNA is non-functional junk DNA, such as pseudogenes, but there is no firm consensus on the total amount of junk DNA. Although the sequence of the human genome was completely determined by DNA sequencing in 2022 (including the methylome), it is not yet fully understood. Most, but not all, genes have been identified by a combination of high throughput experimental and bioinformatics approaches, yet much work still needs to be done to further elucidate
A major role in sculpting the human genome. Some of these sequences represent endogenous retroviruses, DNA copies of viral sequences that have become permanently integrated into the genome and are now passed on to succeeding generations. There are also a significant number of retroviruses in human DNA, at least 3 of which have been proven to possess an important function (i.e., HIV-like functional HERV-K; envelope genes of non-functional viruses HERV-W and HERV-FRD play
A microsatellite hexanucleotide repeat of the sequence (TTAGGG)n. Tandem repeats of longer sequences (arrays of repeated sequences 10–60 nucleotides long) are termed minisatellites. Transposable genetic elements, DNA sequences that can replicate and insert copies of themselves at other locations within a host genome, are an abundant component in the human genome. The most abundant transposon lineage, Alu, has about 50,000 active copies, and can be inserted into intragenic and intergenic regions. One other lineage, LINE-1, has about 100 active copies per genome (the number varies between people). Together with non-functional relics of old transposons, they account for over half of total human DNA. Sometimes called "jumping genes", transposons have played
A particular gene sequence serves in a particular function of one organism, via comparing it to a similar gene sequence of known function in another organism. Help Cure Muscular Dystrophy is run by Décrypthon, a collaboration between the French Muscular Dystrophy Association, the French National Center for Scientific Research and IBM. Phase 1 was launched on December 19, 2006, and completed on June 11, 2007. The project investigated protein–protein interactions for 40,000 proteins whose structures are known, with particular focus on those proteins that play
A partnership between AFM (French Muscular Dystrophy Association), CNRS (French National Center for Scientific Research), Université Pierre et Marie Curie, and IBM were investigating protein–protein interactions for more than 2,200 proteins whose structures are known, with particular focus on those proteins that play a role in neuromuscular diseases. Phase 2 was launched on May 12, 2009, and completed on September 26, 2012. The database of information produced will help researchers design molecules to inhibit or enhance binding of particular macromolecules, hopefully leading to better treatments for muscular dystrophy and other neuromuscular diseases. Phase 2 of
A performance level range and an "efficiency/performance preference" hint from the OS. Dynamic frequency scaling reduces the number of instructions a processor can issue in a given amount of time, thus reducing performance. Hence, it is generally used when the workload is not CPU-bound. Dynamic frequency scaling by itself is rarely worthwhile as a way to conserve switching power. Saving the highest possible amount of power requires dynamic voltage scaling too, because of
A role in neuromuscular diseases. The database of information produced will help researchers design molecules to inhibit or enhance binding of particular macromolecules, hopefully leading to better treatments for muscular dystrophy and other neuromuscular diseases. This project was available only to agents running the Grid MP client, making it unavailable to users running BOINC. Discovering Dengue Drugs – Together
A role in placenta formation by inducing cell-cell fusion). Mobile elements within the human genome can be classified into LTR retrotransposons (8.3% of total genome), SINEs (13.1% of total genome) including Alu elements, LINEs (20.4% of total genome), SVAs (SINE-VNTR-Alu) and Class II DNA transposons (2.9% of total genome). There is no consensus on what constitutes a "functional" element in
A security measure for overheated systems (e.g. after poor overclocking). Dynamic frequency scaling almost always appears in conjunction with dynamic voltage scaling, since higher frequencies require higher supply voltages for the digital circuit to yield correct results. The combined topic is known as dynamic voltage and frequency scaling (DVFS). The dynamic power (switching power) dissipated by
A single individual, later revealed to have been Venter himself. Thus the Celera human genome sequence released in 2000 was largely that of one man. Subsequent replacement of the early composite-derived data and determination of the diploid sequence, representing both sets of chromosomes, rather than a haploid sequence originally reported, allowed the release of the first personal genome. In April 2008, that of James Watson
A single umbrella. Users are included in a subset of projects by default, but may opt out of projects as they choose. Even though WCG makes use of open source client software, the actual applications that perform the scientific calculations may not be. However, several of the science applications are available under a free license, although the source is not available directly from WCG. The World Community Grid software increases CPU usage by consuming unused processing time; in
A technology named LongHaul (PowerSaver), while Transmeta's version was called LongRun. The 36-processor AsAP 1 chip is among the first multi-core processor chips to support completely unconstrained clock operation (requiring only that frequencies are below the maximum allowed) including arbitrary changes in frequency, starts, and stops. The 167-processor AsAP 2 chip is the first multi-core processor chip which enables individual processors to make fully unconstrained changes to their own clock frequencies. According to
A uniform density. Thus follows the popular statement that "we are all, regardless of race, genetically 99.9% the same", although this would be somewhat qualified by most geneticists. For example, a much larger fraction of the genome is now thought to be involved in copy number variation. A large-scale collaborative effort to catalog SNP variations in the human genome is being undertaken by
Is overclocking, whereby processor performance is increased by ramping the processor's (dynamic) frequency beyond the manufacturer's design specifications. One major difference between the two is that in modern PC systems overclocking is mostly done over the Front Side Bus (mainly because the multiplier is normally locked), but dynamic frequency scaling is done with the multiplier. Moreover, overclocking
Is 60%. The throttle is coarse-grained; for example, if usage is set to 60% it will work at 100% for 3 seconds, then at 0% for 2 seconds, resulting in an average decrease of processor use. An add-on program for Windows computers, TThrottle, can solve the problem of overheating by directly limiting the BOINC project's use of the host computer. It does this by measuring the CPU and/or the GPU temperature and adjusts
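The coarse-grained throttle described above amounts to a simple duty cycle. The sketch below reproduces the 60% example; the 5-second period is an assumption inferred from the "3 seconds on, 2 seconds off" figures in the text, and `throttle_schedule` is a hypothetical helper, not part of the BOINC client.

```python
def throttle_schedule(usage_percent, period_s=5.0):
    """Split one throttle period into (run_seconds, idle_seconds).

    period_s is an assumed value; the text only gives the 60% example.
    """
    if not 0 <= usage_percent <= 100:
        raise ValueError("usage_percent must be between 0 and 100")
    run_s = period_s * usage_percent / 100
    return run_s, period_s - run_s


run_s, idle_s = throttle_schedule(60)
print(run_s, idle_s)  # 3.0 2.0

# The average load over one period equals the configured percentage,
# even though the instantaneous load is always either 100% or 0%.
average_load = run_s / (run_s + idle_s)
```

Because the processor alternates between full load and no load, temperature can still spike during the run phase, which is the gap a temperature-driven tool such as TThrottle is meant to address.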
Is a haplotype map of the human genome, "which will describe the common patterns of human DNA sequence variation." It catalogs the patterns of small-scale variations in the genome that involve single DNA letters, or bases. Researchers published the first sequence-based map of large-scale structural variation across the human genome in the journal Nature in May 2008. Large-scale structural variations are differences in
Is a diverse category that includes DNA coding for non-translated RNA, such as that for ribosomal RNA, transfer RNA, ribozymes, small nuclear RNAs, and several types of regulatory RNAs. It also includes promoters and their associated gene-regulatory elements, DNA playing structural and replicatory roles, such as scaffolding regions, telomeres, centromeres, and origins of replication, plus large numbers of transposable elements, inserted viral DNA, non-functional pseudogenes and simple, highly repetitive sequences. Introns make up
Is a limit of 16 states maximum. ACPI 5.0 (2011) introduces collaborative processor performance control (CPPC), exposing hundreds of performance levels to the OS for selection in the form of a "performance level" abstracted away from the frequency. This abstraction provides some leeway for the processor to adjust its workings in ways other than just the frequency. A number of modern CPUs can perform frequency scaling autonomously, using
Is analyzing millions of data points collected from thousands of healthy and cancerous patient tissue samples. These include tissues with lung, ovarian, prostate, pancreatic and breast cancers. By comparing these different data points, researchers aim to identify patterns of markers for different cancers and correlate them with different outcomes, including responsiveness to various treatment options. The project
Is another related power conservation technique that is often used in conjunction with frequency scaling, as the frequency that a chip may run at is related to the operating voltage. The efficiency of some electrical components, such as voltage regulators, decreases with increasing temperature, so the power usage may increase with temperature. Since increasing power use may increase the temperature, increases in voltage or frequency may increase system power demands even further than
Is deleterious to the organism and is under negative selective pressure is called garbage DNA. The first human genome sequences were published in nearly complete draft form in February 2001 by the Human Genome Project and Celera Corporation. Completion of the Human Genome Project's sequencing effort was announced in 2004 with the publication of a draft genome sequence, leaving just 341 gaps in
Is done for the grid overall. World Community Grid recognizes companies and organizations as partners if they promote WCG within their company or organization. As of April 2021, WCG had 452 partners. Also, as part of its commitment to improving human health and welfare, the results of all computations completed on World Community Grid are released into the public domain and made available to
Is focusing on 4 types of cancer, with the first focus being on lung cancer, and will move on to ovarian cancer, prostate cancer and sarcoma. Help Stop TB was launched in March 2016 to help combat tuberculosis, a disease caused by a bacterium that is evolving resistance to currently available treatments. The computations of this project target mycolic acids in the bacterium's protective coat, simulating
Is from 1 to 16 gigabytes. The first project launched on World Community Grid was the Human Proteome Folding Project, or HPF1, which aims to predict the structure of human proteins. The project was launched on November 16, 2004, and completed on July 18, 2006. This project was unique in that computation was done in tandem with the grid.org distributed computing project. Devised by Richard Bonneau at
Is minimized. Leakage current has become more and more important as transistor sizes have become smaller and threshold voltage levels are reduced. A decade ago, dynamic power accounted for approximately two-thirds of the total chip power. The power loss due to leakage currents in contemporary CPUs and SoCs tends to dominate the total power consumption. In the attempt to control the leakage power, high-k metal-gates and power gating have been common methods. Dynamic voltage scaling
Is no consensus in the literature on the amount of functional DNA since, depending on how "function" is understood, estimates range from up to 90% of the human genome being nonfunctional DNA (junk DNA) to up to 80% of the genome being functional. It is also possible that junk DNA may acquire a function in the future and therefore may play a role in evolution, but this is likely to occur only very rarely. Finally, DNA that
Is not to save battery life, as it is not used in AMD's mobile processor line, but instead to produce less heat, which in turn allows the system fan to spin down to slower speeds, resulting in cooler and quieter operation, hence the name of the technology. AMD's PowerNow! CPU throttling technology is used in its mobile processor line, though some supporting CPUs like the AMD K6-2+ can be found in desktops as well. AMD PowerTune and AMD ZeroCore Power are dynamic frequency scaling technologies for GPUs. VIA Technologies processors use
Is often static, while dynamic frequency scaling is always dynamic. Software can often incorporate overclocked frequencies into the frequency scaling algorithm, if the chip degradation risks are allowable. Intel's CPU throttling technology, SpeedStep, is used in its mobile and desktop CPU lines. AMD employs two different CPU throttling technologies. AMD's Cool'n'Quiet technology is used on its desktop and server processor lines. The aim of Cool'n'Quiet
Is to identify potential molecule candidates that could possibly be developed into treatments for Leishmaniasis. The extensive computing power of World Community Grid will be used to perform computer simulations of the interactions between millions of chemical compounds and certain target proteins. This will help find the most promising compounds that may lead to effective treatments for the disease. The mission of
Is to improve the results of protein X-ray crystallography, which helps researchers not only annotate unknown parts of the human proteome, but importantly improves their understanding of cancer initiation, progression and treatment. The HCC project was the first WCG project benefiting from graphics processing units (GPUs), which helped finish it much earlier than initially projected due to the massive power of GPUs. In
Is to provide deeper insight on the molecular scale into the origins of the efficient flow of water through a novel class of filter materials. This insight will in turn guide future development of low-cost and more efficient water filters. It is estimated that 1.2 billion people lack access to safe drinking water, and 2.6 billion have little or no sanitation. As a result, millions of people die annually – an estimated 3,900 children
Is unclear whether any significant phenotypic effect results from typical variation in repeats or heterochromatin. Most gross genomic mutations in gamete germ cells probably result in inviable embryos; however, a number of human diseases are related to large-scale genomic abnormalities. Down syndrome, Turner syndrome, and a number of other diseases result from nondisjunction of entire chromosomes. Cancer cells frequently have aneuploidy of chromosomes and chromosome arms, although
The Dengue, Hepatitis C, West Nile, Yellow Fever, and other related viruses. The extensive computing power of World Community Grid will be used to complete the structure-based drug discovery calculations required to identify these drug candidates. Computing for Clean Water (launched September 20, 2010) is sponsored by the Center for Nano and Micro Mechanics of Tsinghua University in Beijing. The project's mission
The GO Fight Against Malaria project (launched November 16, 2011) is to discover promising drug candidates that could be developed into new drugs that cure drug resistant forms of malaria. The computing power of World Community Grid will be used to perform computer simulations of the interactions between millions of chemical compounds and certain target proteins, to predict their ability to eliminate malaria. The best compounds will be tested by scientists at The Scripps Research Institute in La Jolla, California, U.S.A. and further developed into possible treatments for
The Institute for Systems Biology, the project used grid computing to produce the likely structures for each of the proteins using a Rosetta Score. From these predictions, researchers hope to predict the function of the myriad proteins. This increased understanding of the human proteins could prove vital in the search for cures to human diseases. Computing for this project was officially completed on July 18, 2006. Research results for
The International HapMap Project. The genomic loci and length of certain types of small repetitive sequences are highly variable from person to person, which is the basis of DNA fingerprinting and DNA paternity testing technologies. The heterochromatic portions of the human genome, which total several hundred million base pairs, are also thought to be quite variable within the human population (they are so repetitive and so long that they cannot be accurately sequenced with current technology). These regions contain few genes, and it
The Smallpox Research Grid Project to accelerate the discovery of a cure for smallpox. The smallpox study used a massive distributed computing grid to analyse compounds' effectiveness against smallpox. The project allowed scientists to screen 35 million potential drug molecules against several smallpox proteins to identify good candidates for developing into smallpox treatments. In the first 72 hours, 100,000 results were returned. By
The University of Washington. The project was launched on May 12, 2008, and completed on April 6, 2010. The purpose of this project is to predict the structure of proteins of major strains of rice, in order to help farmers breed better rice strains with higher crop yields, promote greater disease and pest resistance, and utilize a full range of bioavailable nutrients that can benefit people around
The grid.org distributed computing projects. Demand for Linux support led to the addition in November 2005 of open source Berkeley Open Infrastructure for Network Computing (BOINC) software, which powers projects such as SETI@home and Climateprediction. Mac OS and Linux support has been available since the introduction of BOINC. In 2007, the World Community Grid migrated from Grid MP to BOINC for all of its supported platforms. In September 2021, IBM announced that it had transferred ownership of
The 'completion' of the human genome project was announced in 2001, there remained hundreds of gaps, with about 5–10% of the total sequence remaining undetermined. The missing genetic information was mostly in repetitive heterochromatic regions and near the centromeres and telomeres, but also some gene-encoding euchromatic regions. There remained 160 euchromatic gaps in 2015 when the sequences spanning another 50 formerly unsequenced regions were determined. Only in 2020
The April 2013 status report the scientists report there is still a lot of data to analyze but that they are preparing a new project that will search for prognostic and predictive signatures (sets of genes, proteins, microRNAs, etc.) that help predict patient survival and response to treatment. The project finished in May 2013. The Nutritious Rice for the World project is carried out by Ram Samudrala's Computational Biology Research Group at
The CMOS formula indicates, and vice versa. ACPI 1.0 (1996) defines a way for a CPU to go to idle "C states", but defines no frequency-scaling system. ACPI 2.0 (2000) introduces a system of P states (power-performance states) that a processor can use to communicate its possible frequency–power settings to the OS. The operating system then sets the speed as needed by switching between these states. Throttling technology such as SpeedStep, PowerNow!/Cool'n'Quiet, and PowerSaver all work through P states. There
The Help Cure Muscular Dystrophy project began once the results from the first phase had been analyzed. Phase 2 ran on the BOINC platform. Discovering Dengue Drugs – Together – Phase 2 (launched February 17, 2010) is sponsored by The University of Texas Medical Branch (UTMB) in Galveston, Texas, United States and the University of Chicago in Illinois, USA. The mission is to identify promising drug candidates to combat
The Influenza Antiviral Drug Search project is to find new drugs that can stop the spread of an influenza infection in the body. The research will specifically address the influenza strains that have become drug resistant as well as new strains that are appearing. Identifying the chemical compounds that are the best candidates will accelerate the efforts to develop treatments that would be useful in managing seasonal influenza outbreaks, and future influenza epidemics and even pandemics. Phase 1 of the Influenza Antiviral Drug Search project finished on October 22, 2009. Now
The V component and the fact that modern CPUs are strongly optimized for low power idle states. In most constant-voltage cases, it is more efficient to run briefly at peak speed and stay in a deep idle state for longer time (called "race to idle" or computational sprinting), than it is to run at a reduced clock rate for a long time and only stay briefly in a light idle state. However, reducing voltage along with clock rate can change those trade-offs. A related-but-opposite technique
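The race-to-idle argument can be checked with a back-of-the-envelope energy comparison. The sketch below uses a deliberately simplified model that is an assumption of this example, not something stated in the text: at constant voltage the dynamic power scales linearly with frequency, a fixed static (leakage) power is drawn whenever the core is active, and a deep idle state draws very little. All wattages and durations are hypothetical.

```python
def energy_joules(active_power_w, active_s, idle_power_w, idle_s):
    """Total energy over one window: active phase plus idle phase."""
    return active_power_w * active_s + idle_power_w * idle_s


WINDOW_S = 2.0        # time available to finish the job (assumed)
STATIC_W = 1.0        # leakage while the core is active (assumed)
DEEP_IDLE_W = 0.1     # power in a deep idle state (assumed)
DYNAMIC_FULL_W = 4.0  # switching power at full clock rate (assumed)

# Race to idle: finish in 1 s at full speed, then sleep for the rest.
race = energy_joules(DYNAMIC_FULL_W + STATIC_W, 1.0, DEEP_IDLE_W, WINDOW_S - 1.0)

# Half clock rate at the same voltage: half the dynamic power, but the
# job takes the whole window and leakage is paid for twice as long.
slow = energy_joules(DYNAMIC_FULL_W / 2 + STATIC_W, WINDOW_S, DEEP_IDLE_W, 0.0)

print(race, slow)  # about 5.1 J versus 6.0 J under these assumptions
```

In this model the dynamic energy for the job is the same either way, so the difference comes entirely from how long the static power is paid. Lowering the voltage together with the clock rate would shrink both terms and could reverse the result.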
6318-470: The World Community Grid servers can look at the points claimed by each of those computers. The WCG servers disregard statistical outliers, average the remaining values and award the resulting number of points to each computer. Within the grid, users may join teams that have been created by organizations, groups, or individuals. Teams allow for a heightened sense of community identity and can also inspire competition. As teams compete against each other, more work
6435-617: The World Community Grid to the Krembil Research Institute . As of January 8, 2023, World Community Grid had over 23,000 active user accounts, with over 57,000 active devices. Over the course of the project, more than 2,000,000 cumulative years of computing time have been donated, and over 6,000,000,000 work units have been completed. The World Community Grid software uses the unused computing time of Internet -connected devices to perform research calculations. Users install WCG client software onto their devices. This software works in
6552-498: The World Community Grid, researchers were able to calculate the electronic properties of tens of thousands of organic materials – many more than could ever be tested in a lab – and determine which candidates are most promising for developing affordable solar energy technology. Phase 2 was launched June 28, 2010, sponsored by the scientists of Harvard University 's Department of Chemistry and Chemical Biology. Further calculations about optical, electronic and other physical properties of
6669-403: The Y chromosome is quite small. Most human cells are diploid so they contain twice as much DNA (~6.2 billion base pairs). In 2023, a draft human pangenome reference was published. It is based on 47 genomes from persons of varied ethnicity. Plans are underway for an improved reference capturing still more biodiversity from a still wider sample. While there are significant differences among
6786-409: The ability of medical professionals to determine the best treatment options for patients with breast, head, or neck cancer. The project was launched on July 20, 2006, and completed in April 2007. The project worked by identifying visual patterns in large numbers of tissue microarrays taken from archived tissue samples. By correlating the pattern data with information about treatment and patient outcome,
6903-435: The accumulation of inactivating mutations. The number of pseudogenes in the human genome is on the order of 13,000, and in some chromosomes is nearly the same as the number of functional protein-coding genes. Gene duplication is a major mechanism through which new genetic material is generated during molecular evolution . For example, the olfactory receptor gene family is one of the best-documented examples of pseudogenes in
7020-435: The advent of genomic sequencing, the identification of these sequences could be inferred by evolutionary conservation. The evolutionary branch between the primates and mouse , for example, occurred 70–90 million years ago. So computer comparisons of gene sequences that identify conserved non-coding sequences will be an indication of their importance in duties such as gene regulation. Other genomes have been sequenced with
7137-432: The amount of idle resources available, contributions are usually measured in terms of points . Points are awarded for each workunit depending on the effort required to process it. Upon completing a workunit, the BOINC client will request the number of points it thinks it deserves based on software benchmarks ( see BOINC Credit System#Cobblestones ). Since multiple computers process the same workunit to ensure accuracy,
the analysis of the data generated by WCG. These include an OpenZika project paper on the discovery of a compound (FAM 3) that inhibits the NS3 Helicase protein of the Zika virus, thus reducing viral replication by up to 86%; a FightAIDS@home paper on the discovery of new vulnerabilities on the HIV-1 Capsid protein which may allow for a new drug target; and a FightAIDS@home paper on new computational drug discovery techniques for more refined and accurate results. In 2003, IBM and other research participants sponsored
7371-773: The application of such knowledge to the treatment of disease and in the medical field is only in its very beginnings. Exome sequencing has become increasingly popular as a tool to aid in diagnosis of genetic disease because the exome contributes only 1% of the genomic sequence but accounts for roughly 85% of mutations that contribute significantly to disease. In humans, gene knockouts naturally occur as heterozygous or homozygous loss-of-function gene knockouts. These knockouts are often difficult to distinguish, especially within heterogeneous genetic backgrounds. They are also difficult to find as they occur in low frequencies. Populations with high rates of consanguinity , such as countries with high rates of first-cousin marriages, display
7488-418: The average size of an intron is about 6 kb (6,000 bp). This means that the average size of a protein-coding gene is about 62 kb and these genes take up about 40% of the genome. Exon sequences consist of coding DNA and untranslated regions (UTRs) at either end of the mature mRNA. The total amount of coding DNA is about 1-2% of the genome. Many people divide the genome into coding and non-coding DNA based on
7605-494: The background, using spare system resources to process work for WCG. When a piece of work or workunit is completed, the client software sends it back to WCG over the Internet and downloads a new workunit. To ensure accuracy, the WCG servers send out multiple copies of each workunit. Then, when the results are received, they are collected and validated against each other. World Community Grid offers multiple humanitarian projects under
7722-594: The behaviour of these molecules to better understand how they offer protection to the bacteria. Launched in January 2017, the Smash Childhood Cancer project builds on the work from the Help Fight Childhood Cancer project by looking for drug candidates targeting additional childhood cancers. Upon Dr. Akira Nakagawara's retirement in March 2020, the principal investigator changed to Dr. Godfrey Chan, who
7839-464: The biological functions of their protein and RNA products. In 2000, scientists reported the sequencing of 88% of human genome, but as of 2020, at least 8% was still missing. In 2021, scientists reported sequencing a complete, female genome (i.e., without the Y chromosome). The human Y chromosome , consisting of 62,460,029 base pairs from a different cell line and found in all males, was sequenced completely in January 2022. The current version of
7956-593: The candidate materials are being conducted with the Q-Chem quantum chemistry software. Their findings have been submitted to the Energy & Environmental Science journal. Help Fight Childhood Cancer project (launched March 13, 2009) is sponsored by the scientists at Chiba Cancer Center Research Institute and Chiba University . The mission of the Help Fight Childhood Cancer project is to find drugs that can disable three particular proteins associated with neuroblastoma , one of
8073-525: The candidates that make it through Phase 1." The drug candidates that make it through Phase 2 will then be lab-tested. The mission of AfricanClimate@Home was to develop more accurate climate models of specific regions in Africa. It was intended to serve as a basis for understanding how the climate will change in the future so that measures designed to alleviate the adverse effects of climate change could be implemented. World Community Grid's tremendous computing power
8190-410: The diagnosis and treatment of diseases, and to new insights in many fields of biology, including human evolution . By 2018, the total number of genes had been raised to at least 46,831, plus another 2300 micro-RNA genes. A 2018 population survey found another 300 million bases of human genome that was not in the reference sequence. Prior to the acquisition of the full genome sequence, estimates of
8307-517: The dinucleotide repeat (AC) n ) are termed microsatellite sequences. Among the microsatellite sequences, trinucleotide repeats are of particular importance, as sometimes occur within coding regions of genes for proteins and may lead to genetic disorders. For example, Huntington's disease results from an expansion of the trinucleotide repeat (CAG) n within the Huntingtin gene on human chromosome 4. Telomeres (the ends of linear chromosomes) end with
8424-583: The disease. Say No to Schistosoma (launched February 22, 2012) was the 20th research project to be launched on World Community Grid. The researchers at Infórium University in Belo Horizonte and FIOCRUZ-Minas , Brazil , ran this project on World Community Grid to perform computer simulations of the interactions between millions of chemical compounds and certain target proteins in the hope of finding effective treatments for schistosomiasis . As of April 2015, subsequent analysis had been performed, and three of
8541-481: The end of the project, 44 strong treatment candidates had been identified. Based on the success of the Smallpox study, IBM announced the creation of World Community Grid on November 16, 2004, with the goal of creating a technical environment where other humanitarian research could be processed. World Community Grid initially only supported Windows, using the proprietary Grid MP software from United Devices which powered
8658-537: The exact number in the human genome is yet to be determined. Many RNAs are thought to be non-functional. Many ncRNAs are critical elements in gene regulation and expression. Noncoding RNA also contributes to epigenetics, transcription, RNA splicing, and the translational machinery. The role of RNA in genetic regulation and disease offers a new potential level of unexplored genomic complexity. Pseudogenes are inactive copies of protein-coding genes, often generated by gene duplication , that have become nonfunctional through
8775-490: The field of computational protein modeling. These results – which were only possible because of the massive amount of donated computing power they had available – are expected to guide future research and plant science efforts. The Clean Energy project is sponsored by the scientists of Harvard University 's Department of Chemistry and Chemical Biology. The mission of the Clean Energy Project is to find new materials for
8892-430: The first family sequenced as part of Illumina's Personal Genome Sequencing program. Since then hundreds of personal genome sequences have been released, including those of Desmond Tutu , and of a Paleo-Eskimo . In 2012, the whole genome sequences of two family trios among 1092 genomes was made public. In November 2013, a Spanish family made four personal exome datasets (about 1% of the genome) publicly available under
9009-488: The frequency at which the circuit is clocked, and can be reduced if the frequency is also reduced. Dynamic power alone does not account for the total power of the chip, however, as there is also static power, which is primarily because of various leakage currents. Due to static power consumption and asymptotic execution time it has been shown that the energy consumption of software shows convex energy behavior, i.e., there exists an optimal CPU frequency at which energy consumption
9126-542: The gene that has been knocked out. Dynamic frequency scaling Dynamic frequency scaling (also known as CPU throttling ) is a power management technique in computer architecture whereby the frequency of a microprocessor can be automatically adjusted "on the fly" depending on the actual needs, to conserve power and reduce the amount of heat generated by the chip. Dynamic frequency scaling helps preserve battery on mobile devices and decrease cooling cost and noise on quiet computing settings , or can be useful as
9243-427: The genome among people that range from a few thousand to a few million DNA bases; some are gains or losses of stretches of genome sequence and others appear as re-arrangements of stretches of sequence. These variations include differences in the number of copies individuals have of a particular gene, deletions, translocations and inversions. Structural variation refers to genetic variants that affect larger segments of
9360-446: The genome since geneticists, evolutionary biologists, and molecular biologists employ different definitions and methods. Due to the ambiguity in the terminology, different schools of thought have emerged. In evolutionary definitions, "functional" DNA, whether it is coding or non-coding, contributes to the fitness of the organism, and therefore is maintained by negative evolutionary pressure whereas "non-functional" DNA has no benefit to
9477-511: The genome, however extrapolations from the ENCODE project give that 20 or more of the genome is gene regulatory sequence. Some types of non-coding DNA are genetic "switches" that do not encode proteins, but do regulate when and where genes are expressed (called enhancers ). Regulatory sequences have been known since the late 1960s. The first identification of regulatory sequences in the human genome relied on recombinant DNA technology. Later with
9594-533: The genome. However, studies on SNPs are biased towards coding regions, the data generated from them are unlikely to reflect the overall distribution of SNPs throughout the genome. Therefore, the SNP Consortium protocol was designed to identify SNPs with no bias towards coding regions and the Consortium's 100,000 SNPs generally reflect sequence diversity across the human chromosomes. The SNP Consortium aims to expand
9711-409: The genomes of human individuals (on the order of 0.1% due to single-nucleotide variants and 0.6% when considering indels ), these are considerably smaller than the differences between humans and their closest living relatives, the bonobos and chimpanzees (~1.1% fixed single-nucleotide variants and 4% when including indels). The total length of the human reference genome does not represent
9828-406: The grid will increase the total/average power required to complete the same calculations. The BOINC client avoids slowing the computer by using a variety of limits that suspend computation when there are insufficient free resources. Unlike other BOINC projects, World Community Grid set the BOINC defaults conservatively, making the chances of computer damage extremely small. The default CPU throttle
9945-427: The highest frequencies of homozygous gene knockouts. Such populations include Pakistan, Iceland, and Amish populations. These populations with a high level of parental-relatedness have been subjects of human knock out research which has helped to determine the function of specific genes in humans. By distinguishing specific knockouts, researchers are able to use phenotypic analyses of these individuals to help characterize
10062-544: The highest mutation rate, presumably due to deamination. A personal genome sequence is a (nearly) complete sequence of the chemical base pairs that make up the DNA of a single person. Because medical treatments have different effects on different people due to genetic variations such as single-nucleotide polymorphisms (SNPs), the analysis of personal genomes may lead to personalized medical treatment based on individual genotypes. The first personal genome sequence to be determined
10179-692: The human genome, as opposed to point mutations . Often, structural variants (SVs) are defined as variants of 50 base pairs (bp) or greater, such as deletions, duplications, insertions, inversions and other rearrangements. About 90% of structural variants are noncoding deletions but most individuals have more than a thousand such deletions; the size of deletions ranges from dozens of base pairs to tens of thousands of bp. On average, individuals carry ~3 rare structural variants that alter coding regions, e.g. delete exons . About 2% of individuals carry ultra-rare megabase-scale structural variants, especially rearrangements. That is, millions of base pairs may be inverted within
10296-643: The human genome. More than 60 percent of the genes in this family are non-functional pseudogenes in humans. By comparison, only 20 percent of genes in the mouse olfactory receptor gene family are pseudogenes. Research suggests that this is a species-specific characteristic, as the most closely related primates all have proportionally fewer pseudogenes. This genetic discovery helps to explain the less acute sense of smell in humans relative to other mammals. The human genome has many different regulatory sequences which are crucial to controlling gene expression . Conservative estimates indicate that these sequences make up 8% of
10413-441: The human genome. These sequences ultimately lead to the production of all human proteins , although several biological processes (e.g. DNA rearrangements and alternative pre-mRNA splicing ) can lead to the production of many more unique proteins than the number of protein-coding genes. The human reference genome contains somewhere between 19,000 and 20,000 protein-coding genes. These genes contain an average of 10 introns and
10530-542: The human reference genome: The Genome Reference Consortium is responsible for updating the HRG. Version 38 was released in December 2013. Most studies of human genetic variation have focused on single-nucleotide polymorphisms (SNPs), which are substitutions in individual bases along a chromosome. Most analyses estimate that SNPs occur 1 in 1000 base pairs, on average, in the euchromatic human genome, although they do not occur at
10647-468: The idea that coding DNA is the most important functional component of the genome. About 98-99% of the human genome is non-coding DNA. Noncoding RNA molecules play many essential roles in cells, especially in the many reactions of protein synthesis and RNA processing . Noncoding genes include those for tRNAs , ribosomal RNAs, microRNAs , snRNAs and long non-coding RNAs (lncRNAs). The number of reported non-coding genes continues to rise slowly but
10764-582: The investigated cell type. Repetitive DNA sequences comprise approximately 50% of the human genome. About 8% of the human genome consists of tandem DNA arrays or tandem repeats, low complexity repeat sequences that have multiple adjacent copies (e.g. "CAGCAGCAG..."). The tandem sequences may be of variable lengths, from two nucleotides to tens of nucleotides. These sequences are highly variable, even among closely related individuals, and so are used for genealogical DNA testing and forensic DNA analysis . Repeated sequences of fewer than ten nucleotides (e.g.
10881-418: The late 1990s and early 2000s, such calculations were meant to reduce "wasted" CPU cycles. With modern CPUs, where dynamic frequency scaling is prevalent, increased usage makes the processor run at higher frequency, increasing power usage and heating counter to power management . Additionally, because of an increasing focus on power performance, or performance per watt , connecting old/inefficient computers to
10998-441: The most frequently occurring solid tumors in children. Identifying these drugs could potentially make the disease much more curable when combined with chemotherapy treatment . Influenza Antiviral Drug Search project is sponsored by Dr. Stan Watowich and his research team at The University of Texas Medical Branch ( Galveston , Texas , USA). The project was launched on May 5, 2009, and completed on October 22, 2009. The mission of
11115-552: The most promising candidate substances had been identified for in-vitro testing. Human genome The human genome is a complete set of nucleic acid sequences for humans, encoded as the DNA within each of the 24 distinct chromosomes in the cell nucleus. A small DNA molecule is found within individual mitochondria . These are usually treated separately as the nuclear genome and the mitochondrial genome . Human genomes include both protein-coding DNA sequences and various types of DNA that does not encode proteins . The latter
11232-402: The next generation of solar cells and later, energy storage devices. Researchers are employing molecular mechanics and electronic structure calculations to predict the optical and transport properties of molecules that could become the next generation of solar cell materials. Phase 1 was launched on December 5, 2008, and completed on October 13, 2009. By harnessing the computing power of
11349-423: The number of SNPs identified across the genome to 300 000 by the end of the first quarter of 2001. Changes in non-coding sequence and synonymous changes in coding sequence are generally more common than non-synonymous changes, reflecting greater selective pressure reducing diversity at positions dictating amino acid identity. Transitional changes are more common than transversions, with CpG dinucleotides showing
11466-462: The number of human genes ranged from 50,000 to 140,000 (with occasional vagueness about whether these estimates included non-protein coding genes). As genome sequence quality and the methods for identifying protein-coding genes improved, the count of recognized protein-coding genes dropped to 19,000–20,000. In 2022, the Telomere-to-Telomere (T2T) consortium reported the complete sequence of
11583-610: The organism and therefore is under neutral selective pressure. This type of DNA has been described as junk DNA . In genetic definitions, "functional" DNA is related to how DNA segments manifest by phenotype and "nonfunctional" is related to loss-of-function effects on the organism. In biochemical definitions, "functional" DNA relates to DNA sequences that specify molecular products (e.g. noncoding RNAs) and biochemical activities with mechanistic roles in gene or genome regulation (i.e. DNA sequences that impact cellular level activity such as cell type, condition, and molecular processes). There
11700-467: The past, notably in FightAIDS@Home projects. The project runs on CPUs and GPUs and will also serve to create a "fast-response, open source tool that will help all scientists quickly search for treatments for future pandemics." The project launched on May 14, 2020. Mapping Cancer Markers (launched November 8, 2013). The project aims to identify the markers associated with various types of cancer, and
11817-473: The researchers are performing post-processing on the results from Phase 1 and are preparing for Phase 2. In November 2012, the project's scientists stated that, given the fact that there is no immediate danger of an influenza outbreak, all of the project's results would be posted online and their resources would be refocused on the Dengue Project. World Community Grid and researchers supported by Decrypthon,
11934-479: The results of this project could help provide better targeted treatment options. The Genome Comparison project is sponsored by the Brazilian research institution Fiocruz . The project was launched on November 21, 2006, and completed on July 21, 2007. The project seeks to compare gene sequences of different organisms against each other in order to find similarities between them. Scientists hope to discover what purpose
12051-414: The run time accordingly. It also uses a shorter switching time of less than one second, resulting in less temperature change during switching. The contributions of each user are recorded and user contribution statistics are publicly available. Due to the fact that the processing time of each workunit varies from computer to computer, depending on the difficulty of the workunit, the speed of the computer, and
12168-514: The same intention of aiding conservation-guided methods, for exampled the pufferfish genome. However, regulatory sequences disappear and re-evolve during evolution at a high rate. As of 2012, the efforts have shifted toward finding interactions between DNA and regulatory proteins by the technique ChIP-Seq , or gaps where the DNA is not packaged by histones ( DNase hypersensitive sites ), both of which tell where there are active regulatory sequences in
12285-540: The scientific community. Since its launch, more than thirty projects have run in the World Community Grid. Some of the results include: On April 1, 2020, IBM announced OpenPandemics - COVID-19 . The project aims to identify possible treatments for the Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) which is responsible for the COVID-19 pandemic . WCG will partner with Scripps Research , with whom it has partnered in
12402-448: The sequence of any specific individual, nor does it represent the sequence of all of the DNA found within a cell. The human reference genome only includes one copy of each of the paired, homologous autosomes plus one copy of each of the two sex chromosomes (X and Y). The total amount of DNA in this reference genome is 3.1 billion base pairs (3.1 Gb). Protein-coding sequences represent the most widely studied and best understood component of
12519-511: The sequence, representing highly repetitive and other DNA that could not be sequenced with the technology available at the time. The human genome was the first of all vertebrates to be sequenced to such near-completion, and as of 2018, the diploid genomes of over a million individual humans had been determined using next-generation sequencing . These data are used worldwide in biomedical science , anthropology , forensics and other branches of science. Such genomic studies have led to advances in
12636-419: The standard reference genome is called GRCh38.p14 (July 2023). It consists of 22 autosomes plus one copy of the X chromosome and one copy of the Y chromosome. It contains approximately 3.1 billion base pairs (3.1 Gb or 3.1 x 10 bp). This represents the size of a composite genome based on data from multiple individuals but it is a good indication of the typical amount of DNA in a haploid set of chromosomes because
12753-504: The world, especially in regions where malnutrition is a critical concern. The project has been covered by more than 200 media outlets since its inception. On April 13, 2010, World Community Grid officially announced that the Nutritious Rice for the World project finished on April 6, 2010. In April 2014, an update was posted stating that the research team was able to publish structural information about thousands of proteins, and advance
12870-575: The yeast portion of HPF1 have been published. Human Proteome Folding - Phase 2 (HPF2) (launched June 23, 2006) was the third project to run on World Community Grid, and completed in 2013. This project, following on from HPF1, focused on human-secreted proteins , with special focus on biomarkers and the proteins on the surface of cells as well as Plasmodium , the organism that causes malaria. HPF2 generates higher-resolution protein models than HPF1. Though these higher-resolution models are more useful, they also require more processing power to generate. In
12987-502: Was also completed. In 2009, Stephen Quake published his own genome sequence derived from a sequencer of his own design, the Heliscope. A Stanford team led by Euan Ashley published a framework for the medical interpretation of human genomes implemented on Quake's genome and made whole genome-informed medical decisions for the first time. That team further extended the approach to the West family,
13104-587: Was one of the original members of the Smash Childhood Cancer team. Additionally, PRDM14 and Fox01 have been added as new targets for investigation. An inhibitor of the osteopontin protein was modeled. The Africa Rainfall Project (launched October 2019) will use the computing power of World Community Grid, data from The Weather Company, and other data to improve rainfall modelling, which can help farmers in sub-Saharan Africa successfully raise their crops. The amount of RAM that can be involved in calculations
13221-413: Was published. It is based on 47 genomes from persons of varied ethnicity. Plans are underway for an improved reference capturing still more biodiversity from a still wider sample. With the exception of identical twins, all humans show significant variation in genomic DNA sequences. The human reference genome (HRG) is used as a standard sequence reference. There are several important points concerning
13338-434: Was sponsored by scientists at the University of Texas and the University of Chicago and will run in two phases. Phase 1, launched August 21, 2007, used AutoDock 2007 (the same software used for FightAIDS@Home ) to test potential antiviral drugs (through NS3 protease inhibition) against viruses from the family flaviviridae and completed on August 11, 2009. Phase 2 "[uses] a more computationally intensive program to screen
13455-493: Was that of Craig Venter in 2007. Personal genomes had not been sequenced in the public Human Genome Project to protect the identity of volunteers who provided DNA samples. That sequence was derived from the DNA of several volunteers from a diverse population. However, early in the Venter-led Celera Genomics genome sequencing effort the decision was made to switch from sequencing a composite sample to using DNA from
13572-408: Was the first truly complete telomere-to-telomere sequence of a human chromosome determined, namely of the X chromosome . The first complete telomere-to-telomere sequence of a human autosomal chromosome, chromosome 8 , followed a year later. The complete human genome (without Y chromosome) was published in 2021, while with Y chromosome in January 2022. In 2023, a draft human pangenome reference
13689-578: Was used to understand and reduce the uncertainty with which climate processes were simulated over Africa. Phase 1 of African Climate@Home launched on September 3, 2007, and ended in July 2008. Help Conquer Cancer project (launched November 1, 2007) is sponsored by the Ontario Cancer Institute (OCI), Princess Margaret Hospital and University Health Network of Toronto, Canada. The project involves X-ray crystallography . The mission of Help Conquer Cancer