MMLU - Misplaced Pages

In artificial intelligence, Measuring Massive Multitask Language Understanding (MMLU) is a benchmark for evaluating the capabilities of large language models.


59-571: It consists of about 16,000 multiple-choice questions spanning 57 academic subjects including mathematics, philosophy, law, and medicine. It is one of the most commonly used benchmarks for comparing the capabilities of large language models, with over 100 million downloads as of July 2024. The MMLU was released by Dan Hendrycks and a team of researchers in 2020 and was designed to be more challenging than then-existing benchmarks such as General Language Understanding Evaluation (GLUE) on which new language models were achieving better-than-human accuracy. At

118-495: A "qualified definition of effective altruism" in which effective altruists try to do the most good "without violating constraints" such as any obligations that someone might have to help those nearby. William Schambra has criticized the impartial logic of effective altruism, arguing that benevolence arising from reciprocity and face-to-face interactions is stronger and more prevalent than charity based on impartial, detached altruism. Such community-based charitable giving, he wrote,

177-529: A collaborative spirit. To support people's ability to act altruistically on the basis of impartial reasoning, the effective altruism movement promotes values and actions such as a collaborative spirit, honesty, transparency, and publicly pledging to donate a certain percentage of income or other resources. Effective altruism aims to emphasize impartial reasoning in that everyone's well-being counts equally. Singer, in his 1972 essay "Famine, Affluence, and Morality", wrote: It makes no moral difference whether

236-457: A culture of sexual misconduct. Beginning in the latter half of the 2000s, several communities centered around altruist, rationalist, and futurological concerns started to converge, such as: In 2011, Giving What We Can and 80,000 Hours decided to incorporate into an umbrella organization and held a vote for their new name; the "Centre for Effective Altruism" was selected. The Effective Altruism Global conference has been held since 2013. As

295-572: A rather homogeneous movement of middle-class white men fighting poverty through largely conventional means, but it is at least in theory a broad church." Judith Lichtenberg in The New Republic said that effective altruists "neglect the kind of structural and political change that is ultimately necessary". An article in The Ecologist published in 2016 argued that effective altruism is an apolitical attempt to solve political problems, describing

354-478: A similar scenario of either saving a child from a burning building or saving a Picasso painting to sell and donate the proceeds to charity, MacAskill responded that the effective altruist should save and sell the Picasso. Psychologist Alan Jern called MacAskill's choice "unnatural, even distasteful, to many people", although Jern concluded that effective altruism raises questions "worth asking". MacAskill later endorsed

413-466: A specific cause or geography" and could resolve the conflict between local and global perspectives for some donors. Some charities are considered to be far more effective than others, as charities may spend different amounts of money to achieve the same goal, and some charities may not achieve the goal at all. Effective altruists seek to identify interventions that are highly cost-effective in expectation. Many interventions have uncertain benefits, and

472-421: A version of Singer's drowning child analogy, philosopher Kwame Anthony Appiah in 2006 asked whether the most effective action of a man in an expensive suit, confronted with a drowning child, would not be to save the child and ruin his suit—but rather, sell the suit and donate the proceeds to charity. Appiah believed that he "should save the drowning child and ruin my suit". In a 2015 debate, when presented with

531-485: Is "cause prioritization". Cause prioritization is based on the principle of cause neutrality, the idea that resources should be distributed to causes based on what will do the most good, irrespective of the identity of the beneficiary and the way in which they are helped. By contrast, many non-profits emphasize effectiveness and evidence with respect to a single cause such as education or climate change. One tool that EA-based organizations may use to prioritize cause areas

590-444: Is a think tank founded to expand the moral circle to other sentient beings. The ethical stance of longtermism, emphasizing the importance of positively influencing the long-term future, developed closely in relation to effective altruism. Longtermism argues that "distance in time is like distance in space", suggesting that the welfare of future individuals matters as much as the welfare of currently existing individuals. Given

649-505: Is a field. (A) 0 (B) 1 (C) 2 (D) 3 Would a reservation to the definition of torture in the ICCPR be acceptable in contemporary practice? (A) This is an acceptable reservation if the reserving country’s legislation employs a different definition (B) This is an unacceptable reservation because it contravenes the object and purpose of the ICCPR (C) This is an unacceptable reservation because


708-459: Is a moral duty to alleviate suffering through donations if other possible uses of those funds do not offer comparable benefits to oneself. Some lead a frugal lifestyle in order to donate more. Giving What We Can (GWWC) is an organization whose members pledge to donate at least 10% of their future income to the causes that they believe are the most effective. GWWC was founded in 2009 by Toby Ord, who lives on £18,000 ($27,000) per year and donates

767-411: Is also an advisor at Scale AI. In 2024 Hendrycks published a 568-page book entitled "Introduction to AI Safety, Ethics, and Society" based on courseware he had previously developed. Effective altruism Effective altruism (EA) is a 21st-century philosophical and social movement that advocates impartially calculating benefits and prioritizing causes to provide the greatest good. It

826-558: Is an organization that conducts research and gives advice on which careers have the largest positive impact. Some effective altruists start non-profit or for-profit organizations to implement cost-effective ways of doing good. On the non-profit side, for example, Michael Kremer and Rachel Glennerster conducted randomized controlled trials in Kenya to find out the best way to improve students' test scores. They tried new textbooks and flip charts, as well as smaller class sizes, but found that

885-461: Is correct, and that "when we are morally uncertain, we should act in a way that serves as a best compromise between different moral views". He also wrote that even from a purely consequentialist perspective, "naive calculations that justify some harmful action because it has good consequences are, in practice, almost never correct". The principles and goals of effective altruism are wide enough to support furthering any cause that allows people to do

944-462: Is difficult and sometimes impossible but often necessary. MacAskill argued that the more pernicious form of elitism was that of donating to art galleries (and like institutions) instead of charity. Ian David Moss suggested that the criticism of cause prioritization could be resolved by what he called "domain-specific effective altruism", which would encourage "that principles of effective altruism be followed within an area of philanthropic focus, such as

1003-688: Is foundational to civil society and, in turn, democracy. Larissa MacFarquhar said that people have diverse moral emotions, and she suggested that some effective altruists are not unemotional and detached but feel as much empathy for distant strangers as for people nearby. Richard Pettigrew concurred that many effective altruists "feel more profound dismay at the suffering of people unknown to them than many people feel", and he argued that impartiality in EA need not be dispassionate and "is not obviously in tension with much in care ethics" as some philosophers have argued. Ross Douthat of The New York Times criticized

1062-483: Is motivated by "using evidence and reason to figure out how to benefit others as much as possible, and taking action on that basis". People who pursue the goals of effective altruism, who are sometimes called effective altruists, follow a variety of approaches proposed by the movement, such as donating to selected charities and choosing careers with the aim of maximizing positive impact. The movement has achieved significant popularity outside of academia, spurring

1121-745: Is not a consensus on the answers, and there are also differences between effective altruists who believe that they should do the most good they possibly can with all of their resources and those who only try to do the most good they can within a defined budget. According to MacAskill, the view of effective altruism as doing the most good one can within a defined budget can be compatible with a wide variety of views on morality and meta-ethics, as well as traditional religious teachings on altruism such as in Christianity. Effective altruism can also be in tension with religion where religion emphasizes spending resources on worship and evangelism instead of causes that do

1180-598: Is the importance, tractability, and neglectedness framework. Importance is the amount of value that would be created if a problem were solved, tractability is the fraction of a problem that would be solved if additional resources were devoted to it, and neglectedness is the quantity of resources already committed to a cause. The information required for cause prioritization may involve data analysis, comparing possible outcomes with what would have happened under other conditions (counterfactual reasoning), and identifying uncertainty. The difficulty of these tasks has led to

1239-581: Is the question of which beings are deserving of moral consideration. Some effective altruists consider the well-being of non-human animals in addition to humans, and advocate for animal welfare issues such as ending factory farming. Those who subscribe to longtermism include future generations as possible beneficiaries and try to improve the moral value of the long-term future by, for example, reducing existential risks. The drowning child analogy in Singer's essay provoked philosophical debate. In response to


1298-462: The "measurement problem", with issues such as medical research or government reform worked on "one grinding step at a time", and results being hard to measure with controlled experiments. Gobry also argues that such interventions risk being undervalued by the effective altruism movement. As effective altruism emphasizes a data-centric approach, critics say principles which do not lend themselves to quantification (justice, fairness, equality) get left in

1357-690: The Estonian billionaire founder of Skype, is known for donating to some effective altruist causes. Sam Bankman-Fried launched a philanthropic organization called the FTX Foundation in February 2021, and it made contributions to a number of effective altruist organizations, but it was shut down in November 2022 when FTX collapsed. A number of books and articles related to effective altruism have been published that have codified, criticized, and brought more attention to

1416-785: The Fish Welfare Initiative works on improving animal welfare in fishing and aquaculture; and the Lead Exposure Elimination Project works on reducing lead poisoning in developing countries. While much of the initial focus of effective altruism was on direct strategies such as health interventions and cash transfers, more systematic social, economic, and political reforms have also attracted attention. Mathew Snow in Jacobin wrote that effective altruism "implores individuals to use their money to procure necessities for those who desperately need them, but says nothing about

1475-503: The Future in 2022. In 2023, Oxford University Press published the volume The Good it Promises, The Harm it Does: Critical Essays on Effective Altruism, edited by Carol J. Adams, Alice Crary, and Lori Gruen. Effective altruists focus on the many philosophical questions related to the most effective ways to benefit others. Such philosophical questions shift the starting point of reasoning from "what to do" to "why" and "how". There

1534-525: The US National Institute of Standards and Technology (NIST) to inform the management of risks from artificial intelligence. In September 2022, Hendrycks wrote a paper providing a framework for analyzing the impact of AI research on societal risks. He later published a paper in March 2023 examining how natural selection and competitive pressures could shape the goals of artificial agents. This

1593-431: The amount of good one candidate does to how much good the next-best candidate would do. According to this reasoning, the marginal impact of a career is likely to be smaller than the gross impact. Although EA aims for maximizing like utilitarianism, EA differs from utilitarianism in a few ways; for example, EA does not claim that people should always maximize the good regardless of the means, and EA does not claim that

1652-860: The balance of his income. In 2020, Ord said that people had donated over $100 million to date through the GWWC pledge. Founders Pledge is a similar initiative, founded out of the non-profit Founders Forum for Good, whereby entrepreneurs make a legally binding commitment to donate a percentage of their personal proceeds to charity in the event that they sell their business. As of April 2024, nearly 1,900 entrepreneurs had pledged around $10 billion and nearly $1.1 billion had been donated. EA has been used to argue that humans should donate organs, whilst alive or after death, and some effective altruists do. Effective altruists often consider using their career to do good, both by direct service and indirectly through their consumption, investment, and donation decisions. 80,000 Hours

1711-418: The benefits of mass deworming programs, with some studies finding long-term effects and others not. The Happier Lives Institute conducts research on the effectiveness of cognitive behavioral therapy (CBT) in developing countries; Canopie develops an app that provides cognitive behavioural therapy to women who are expecting or postpartum; Giving Green analyzes and ranks climate interventions for effectiveness;

1770-468: The best ways to do good". In 2019, Oxford University Press published the volume Effective Altruism: Philosophical Issues, edited by Hilary Greaves and Theron Pummer. More recent books have emphasized concerns for future generations. In 2020, the Australian moral philosopher Toby Ord published The Precipice: Existential Risk and the Future of Humanity, while MacAskill published What We Owe

1829-471: The concept as "pseudo-scientific". The Ethiopian-American AI scientist Timnit Gebru has condemned effective altruists "for acting as though their concerns are above structural issues as racism and colonialism", as Gideon Lewis-Kraus summarized her views in 2022. Philosophers such as Susan Dwyer, Joshua Stein, and Olúfẹ́mi O. Táíwò have criticized effective altruism for furthering the disproportionate influence of wealthy individuals in domains that should be


1888-482: The creation of organizations that specialize in researching the relative prioritization of causes. This practice of "weighing causes and beneficiaries against one another" was criticized by Ken Berger and Robert Penna of Charity Navigator for being "moralistic, in the worst sense of the word" and "elitist". William MacAskill responded to Berger and Penna, defending the rationale for comparing one beneficiary's interests against another and concluding that such comparison

1947-466: The creation of university-based institutes, research centers, advisory organizations and charities, which, collectively, have donated several hundreds of millions of dollars. Effective altruists emphasize impartiality and the global equal consideration of interests when choosing beneficiaries. Popular cause priorities within effective altruism include global health and development, social and economic inequality, animal welfare, and risks to

2006-455: The definition of torture in the ICCPR is consistent with customary international law (D) This is an acceptable reservation because under general international law States have the right to enter reservations to treaties Dan Hendrycks Dan Hendrycks (born 1994 or 1995) is an American machine learning researcher. He serves as the director of the Center for AI Safety. Hendrycks

2065-576: The elite universities in the United States and Britain, and Silicon Valley has become a key centre for the "longtermist" submovement, with a tight subculture there. The movement received mainstream attention and criticism with the bankruptcy of the cryptocurrency exchange FTX as founder Sam Bankman-Fried was a major funder of effective altruism causes prior to late 2022. Some in the San Francisco Bay Area criticized what they described as

2124-449: The expected value of one intervention can be higher than that of another if its benefits are larger, even if it has a smaller chance of succeeding. One metric effective altruists use to choose between health interventions is the estimated number of quality-adjusted life years (QALY) added per dollar. Some effective altruist organizations prefer randomized controlled trials as a primary form of evidence, as they are commonly considered
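The expected-value comparison described above can be illustrated with a minimal sketch. The two interventions, their success probabilities, QALY totals, and costs are hypothetical numbers chosen only for illustration; they are not taken from any charity evaluation, and real cost-effectiveness estimates involve far more uncertainty than a single probability.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intervention:
    """A hypothetical intervention with an uncertain payoff."""
    name: str
    success_probability: float   # assumed chance the intervention works at all
    qalys_if_successful: float   # assumed QALYs added if it works
    cost_dollars: float          # assumed total cost

    def expected_qalys(self) -> float:
        # Expected value = probability of success times the benefit on success.
        return self.success_probability * self.qalys_if_successful

    def expected_qalys_per_dollar(self) -> float:
        # The per-dollar metric mentioned in the text, in expectation.
        return self.expected_qalys() / self.cost_dollars


# Illustrative values only: a likely-but-modest option and an unlikely-but-large one.
safe_bet = Intervention("safe bet", 0.9, 100.0, 10_000.0)
long_shot = Intervention("long shot", 0.1, 2_000.0, 10_000.0)

for option in (safe_bet, long_shot):
    print(option.name, option.expected_qalys(), option.expected_qalys_per_dollar())
```

Under these assumed numbers the long shot has the higher expected value despite its smaller chance of succeeding, because its benefit on success is twenty times larger.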

2183-525: The good is the sum total of well-being. Toby Ord has described utilitarians as "number-crunching", compared with most effective altruists whom he called "guided by conventional wisdom tempered by an eye to the numbers". Other philosophers have argued that EA still retains some core ethical commitments that are essential and distinctive to utilitarianism, such as the principle of impartiality, welfarism and good-maximization. MacAskill has argued that one shouldn't be absolutely certain about which ethical view

2242-475: The highest level of evidence in healthcare research. Others have argued that requiring this stringent level of evidence unnecessarily narrows the focus to issues where the evidence can be developed. Kelsey Piper argues that uncertainty is not a good reason for effective altruists to avoid acting on their best understanding of the world, because most interventions have mixed evidence regarding their effectiveness. Pascal-Emmanuel Gobry and others have warned about

2301-1028: The long-term future, and have connections with the effective altruism community, are the Future of Humanity Institute at the University of Oxford, the Centre for the Study of Existential Risk at the University of Cambridge, and the Future of Life Institute. In addition, the Machine Intelligence Research Institute is focused on the more narrow mission of managing advanced artificial intelligence. Effective altruists pursue different approaches to doing good, such as donating to effective charitable organizations, using their career to make more money for donations or directly contributing their labor, and starting new non-profit or for-profit ventures. Many effective altruists engage in charitable donation. Some believe it

2360-440: The most good, while taking into account cause neutrality. Many people in the effective altruism movement have prioritized global health and development, animal welfare, and mitigating risks that threaten the future of humanity. The alleviation of global poverty and neglected tropical diseases has been a focus of some of the earliest and most prominent organizations associated with effective altruism. Charity evaluator GiveWell

2419-452: The most good. Other than Peter Singer and William MacAskill, philosophers associated with effective altruism include Nick Bostrom, Toby Ord, Hilary Greaves, and Derek Parfit. Economist Yew-Kwang Ng conducted similar research in welfare economics and moral philosophy. The Centre for Effective Altruism lists the following four principles that unite effective altruism: prioritization, impartial altruism, open truthseeking, and


2478-449: The movement formed, it attracted individuals who were not part of a specific community, but who had been following the Australian moral philosopher Peter Singer's work on applied ethics, particularly "Famine, Affluence, and Morality" (1972), Animal Liberation (1975), and The Life You Can Save (2009). Singer himself used the term in 2013, in a TED talk titled "The Why and How of Effective Altruism". An estimated $416 million

2537-441: The movement's "'telescopic philanthropy' aimed at distant populations" and envisioned "effective altruists sitting around in a San Francisco skyscraper calculating how to relieve suffering halfway around the world while the city decays beneath them", while he also praised the movement for providing "useful rebukes to the solipsism and anti-human pessimism that haunts the developed world today". A key component of effective altruism

2596-536: The movement. In 2015, philosopher Peter Singer published The Most Good You Can Do: How Effective Altruism Is Changing Ideas About Living Ethically. The same year, the Scottish philosopher and ethicist William MacAskill published Doing Good Better: How Effective Altruism Can Help You Make a Difference. In 2018, American news website Vox launched its Future Perfect section, led by journalist Dylan Matthews, which publishes articles and podcasts on "finding

2655-579: The only intervention that raised school attendance was treating intestinal worms in children. Based on their findings, they started the Deworm the World Initiative. From 2013 to August 2022, GiveWell designated Deworm the World (now run by nonprofit Evidence Action) as a top charity based on their assessment that mass deworming is "generally highly cost-effective"; however, there is substantial uncertainty about

2714-523: The person I can help is a neighbor's child ten yards away from me or a Bengali whose name I shall never know, ten thousand miles away ... The moral point of view requires us to look beyond the interests of our own society. Impartiality combined with seeking to do the most good leads to prioritizing benefits to those who are in a worse state, because anyone who happens to be worse off will benefit more from an improvement in their state, all other things being equal. One issue related to moral impartiality

2773-490: The potentially extremely high number of individuals that could exist in the future, longtermists seek to decrease the probability that an existential catastrophe irreversibly ruins it. Toby Ord has stated that "the people of the future may be even more powerless to protect themselves from the risks we impose than the dispossessed of our own time". Existential risks, such as dangers associated with biotechnology and advanced artificial intelligence, are often highlighted and

2832-619: The questions are wrong (either the question is not well-defined, or the given answer is wrong), which suggests that 90% is essentially the maximal achievable score. The following examples are taken from the "Abstract Algebra" and "International Law" tasks, respectively. The correct answers are marked in boldface: Find all $c$ in $\mathbb{Z}_3$ such that $\mathbb{Z}_3[x]/(x^{2}+c)$

2891-560: The responsibility of democratic governments and organizations. Arguments have been made that movements focused on systemic or institutional change, for example democratization, are compatible with effective altruism. Philosopher Elizabeth Ashford posits that people are obligated to both donate to effective aid charities and to reform the structures that are responsible for poverty. Open Philanthropy has given grants for progressive advocacy work in areas such as criminal justice, economic stabilization, and housing reform, despite pegging

2950-1129: The same name, works to alleviate global poverty by promoting evidence-backed charities, conducting philanthropy education, and changing the culture of giving in affluent countries. Improving animal welfare has been a focus of many effective altruists. Singer and Animal Charity Evaluators (ACE) have argued that effective altruists should prioritize changes to factory farming over pet welfare. 60 billion land animals are slaughtered and between 1 and 2.7 trillion individual fish are killed each year for human consumption. A number of non-profit organizations have been established that adopt an effective altruist approach toward animal welfare. ACE evaluates animal charities based on their cost-effectiveness and transparency, particularly those tackling factory farming. Faunalytics focuses on animal welfare research. Other animal initiatives affiliated with effective altruism include Animal Ethics' and Wild Animal Initiative's work on wild animal suffering, addressing farm animal suffering with cultured meat, and increasing concern for all kinds of animals. The Sentience Institute

3009-444: The sidelines. Counterfactual reasoning involves considering the possible outcomes of alternative choices. It has been employed by effective altruists in a number of contexts, including career choice. Many people assume that the best way to help others is through direct methods, such as working for a charity or providing social services. However, since there is a high supply of candidates for such positions, it makes sense to compare


3068-413: The subject of active research. Existential risks have such huge impacts that achieving a very small change in such a risk—say a 0.0001-percent reduction—"might be worth more than saving a billion people today", reported Gideon Lewis-Kraus in 2022, but he added that nobody in the EA community openly endorses such an extreme conclusion. Organizations that work actively on research and advocacy for improving

3127-463: The survival of humanity over the long-term future. EA has an especially influential status within animal advocacy. The movement developed during the 2000s, and the name effective altruism was coined in 2011. Philosophers influential to the movement include Peter Singer, Toby Ord, and William MacAskill. What began as a set of evaluation techniques advocated by a diffuse coalition evolved into an identity. Effective altruism has strong ties to

3186-475: The system that determines how those necessities are produced and distributed in the first place". Philosopher Amia Srinivasan criticized William MacAskill's Doing Good Better for a perceived lack of coverage of global inequality and oppression, while noting that effective altruism is in principle open to whichever means of doing good is most effective, including political advocacy aimed at systemic change. Srinivasan said, "Effective altruism has so far been

3245-532: The time of the MMLU's release, most existing language models performed around the level of random chance (25%), with the best performing GPT-3 model achieving 43.9% accuracy. The developers of the MMLU estimate that human domain-experts achieve around 89.8% accuracy. As of 2024, some of the most powerful language models, such as o1, Gemini and Claude 3, were reported to achieve scores around 90%. An expert review of 3,000 randomly sampled questions found that over 9% of
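The accuracy figures above are fractions of questions answered correctly, and the 25% chance level follows from each question having four answer options, as in the (A) to (D) examples. A minimal sketch of that scoring, assuming answers are recorded as single letters; the function names and the eight-question answer key below are hypothetical and are not part of the benchmark's own tooling or data.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], answers: Sequence[str]) -> float:
    """Fraction of questions where the predicted letter matches the answer key."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("need at least one question")
    correct = sum(p == a for p, a in zip(predictions, answers))
    return correct / len(answers)


def chance_baseline(num_choices: int = 4) -> float:
    """Expected accuracy of uniform random guessing among num_choices options."""
    return 1.0 / num_choices


# Hypothetical answer key and model outputs, for illustration only.
answers = ["B", "A", "D", "C", "B", "A", "C", "D"]
predictions = ["B", "A", "C", "C", "B", "D", "C", "A"]

print(f"accuracy: {accuracy(predictions, answers):.3f}")
print(f"chance:   {chance_baseline():.3f}")
```

A score is only informative relative to the baseline: a model at 25% on four-option questions is doing no better than guessing.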

3304-628: Was donated to effective charities identified by the movement in 2019, representing a 37% annual growth rate since 2015. Two of the largest donors in the effective altruism community, Dustin Moskovitz, who had become wealthy through co-founding Facebook, and his wife Cari Tuna, hope to donate most of their net worth of over $11 billion for effective altruist causes through the private foundation Good Ventures. Others influenced by effective altruism include Sam Bankman-Fried, as well as professional poker players Dan Smith and Liv Boeree. Jaan Tallinn,

3363-417: Was followed by "An Overview of Catastrophic AI Risks", which discusses four categories of risks: malicious use, AI race dynamics, organizational risks, and rogue AI agents. Hendrycks is the safety adviser of xAI, an AI startup company founded by Elon Musk in 2023. To avoid any potential conflicts of interest, he receives a symbolic one-dollar salary and holds no company equity. As of November 2024, he

3422-517: Was founded by Holden Karnofsky and Elie Hassenfeld in 2007 to address poverty, where they believe additional donations to be the most impactful. GiveWell's leading recommendations include: malaria prevention charities Against Malaria Foundation and Malaria Consortium, deworming charities Schistosomiasis Control Initiative and Deworm the World Initiative, and GiveDirectly for direct cash transfers to beneficiaries. The organization The Life You Can Save, which originated from Singer's book of

3481-754: Was raised in a Christian evangelical household in Marshfield, Missouri. He received a B.S. from the University of Chicago in 2018 and a Ph.D. from the University of California, Berkeley in Computer Science in 2022. Hendrycks' research focuses on topics that include machine learning safety, machine ethics, and robustness. He credits his participation in the effective altruism (EA) movement-linked 80,000 Hours program for his career focus towards AI safety, though he denied being an advocate for EA. In February 2022, Hendrycks co-authored recommendations for
