Misplaced Pages

Tatoeba

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license.

The Google Summer of Code, often abbreviated to GSoC, is an international annual program in which Google awards stipends to contributors who successfully complete a free and open-source software coding project during the summer. As of 2022, the program is open to anyone aged 18 or over, no longer just students and recent graduates. It was first held from May to August 2005. Participants get paid to write software, with the amount of their stipend depending on the purchasing power parity of the country where they are located. Project ideas are listed by host organizations involved in open-source software development, though students can also propose their own project ideas.

76-450: Tatoeba is a free collection of example sentences with translations geared towards foreign language learners. It is available in more than 400 languages. Its name comes from the Japanese phrase tatoeba (例えば), meaning 'for example'. It is written and maintained by a community of volunteers through a model of open collaboration. Individual contributors are known as "Tatoebans". It

152-508: A graph that has more than 23,700,000 links. 253 language pairs have over 10,000 translated sentences. Tatoeba received a grant from Mozilla Drumbeat in December 2010. Some work on the Tatoeba infrastructure was sponsored by Google Summer of Code, 2014 edition. In May 2018 they received a $25,000 Mozilla Open Source Support (MOSS) program grant. In August 2019 they received

228-635: A $15,000 Mozilla Open Source Support (MOSS) program grant. By default, the sentences of the Tatoeba Corpus are published under a CC BY license, freeing it for academic and other use. Users can also contribute sentences under CC0, though translations of those sentences currently can't share the same license. Audio recordings of the sentences use the speaker's choice of license, such as CC BY, CC BY-SA, CC BY-NC, or no public license at all. Visitors can download tab-delimited sentence pairs ready for import into Anki and similar Spaced Repetition Software at
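As a rough illustration of the tab-delimited sentence pairs mentioned above, the following sketch reads such text into Python tuples. It assumes each line holds a sentence, a tab, and its translation; the actual column layout of Tatoeba's downloads is not described here, so the function name `read_sentence_pairs` and the sample data are hypothetical.

```python
import csv
import io


def read_sentence_pairs(tsv_text):
    """Parse tab-delimited text into (sentence, translation) tuples.

    Assumed layout (hypothetical): column 0 is the sentence,
    column 1 is its translation; any further columns are ignored.
    """
    reader = csv.reader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    pairs = []
    for row in reader:
        # Skip blank or malformed rows rather than guessing at their meaning.
        if len(row) < 2:
            continue
        pairs.append((row[0].strip(), row[1].strip()))
    return pairs


# Hypothetical sample data, not taken from the Tatoeba Corpus.
sample = "Hello.\tBonjour.\nThank you.\tMerci.\n\n"
pairs = read_sentence_pairs(sample)
for sentence, translation in pairs:
    print(f"{sentence} -> {translation}")
```

Quoting is disabled so that quotation marks inside sentences are kept as literal characters instead of being treated as CSV field delimiters.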

304-449: A badge indicating that they are "approved for free cultural works". Repositories exist which exclusively feature free material and provide content such as photographs, clip art, music, and literature. While extensive reuse of free content from one website in another website is legal, it is usually not sensible because of the duplicate content problem. Wikipedia is amongst the most well-known databases of user-uploaded free content on

380-737: A base for exercises. Charles Kelly and Paul Raine, both EFL teachers in Japan, have developed language learning activities based on sentences curated from the Tatoeba Corpus. Clozemaster is a language self-study program that generates gamified cloze tests from Tatoeba sentence pairs. Some Anki users share flashcards that were created using Tatoeba. Some language digital activists contribute to open collaborative projects like Tatoeba, Wikipedia, and Common Voice to promote their minority language in digital spaces. Regional languages like Kabyle, Catalan, or Basque can register more than
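To make the idea of a cloze test built from a sentence pair concrete, here is a minimal sketch. It is not Clozemaster's method, which is not described here; the function `make_cloze`, the blank marker, and the example sentence are all assumptions chosen for illustration.

```python
import re


def make_cloze(sentence, target_word, blank="_____"):
    """Replace the first whole-word match of target_word with a blank.

    Returns (cloze_sentence, answer), or None if the word is absent.
    Matching is case-insensitive, and the answer keeps the original casing.
    """
    pattern = re.compile(r"\b" + re.escape(target_word) + r"\b", re.IGNORECASE)
    match = pattern.search(sentence)
    if match is None:
        return None
    cloze = sentence[:match.start()] + blank + sentence[match.end():]
    return cloze, match.group(0)


# Hypothetical example sentence, not taken from the Tatoeba Corpus.
result = make_cloze("The cat sat on the mat.", "sat")
print(result)
```

In a sentence pair, the translation could be shown next to the blanked sentence as a hint. Choosing which word to blank out is a separate design decision that this sketch leaves to the caller.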

456-608: A brief abstract message that is publicly viewable and completely separate from the content of the actual proposal that was submitted to Google. The Summer of Code 2006 ended on 2006-09-08. According to Google, 82% of the students received a positive evaluation at the end of the program. In 2007, Google accepted 131 organizations and over 900 students. Those 131 organizations had a total of nearly 1500 mentors. Students were allowed to submit up to 20 applications although only one could be accepted. Google received nearly 6,200 applications. To allow more students to apply, Google extended

532-402: A copyright holder's power to license their work, as copyleft which also utilizes copyright for such a purpose. The public domain is a range of creative works whose copyright has expired or was never established, as well as ideas and facts which are ineligible for copyright. A public domain work is a work whose author has either relinquished to the public or no longer can claim control over,

608-504: A definition would exclude the Open Content License because that license forbids charging for content; a right required by free and open-source software licenses. It has since come to describe a broader class of content without conventional copyright restrictions. The openness of content can be assessed under the '5Rs Framework' based on the extent to which it can be retained, reused, revised, remixed and redistributed by members of

684-537: A diverse set of high-quality example sentences from the English Tatoeba Corpus. Tatoeba datasets can power incidental learning experiences that blend the acquisition of a foreign language with the user's everyday activities like web browsing or book reading. A team at MIT Media Lab used example sentences from Tatoeba in WordSense, a mixed reality platform that enables " serendipitous language learning in

760-506: A final list of projects accepted into the program on the SoC website. The proposals themselves were visible to the public for a few hours, after which they were taken down in response to complaints by the participants about the "sensitive and private" information that their applications contained. However, Google has since resolved these issues by allowing each student involved in Summer of Code to provide

836-409: A free way of obtaining higher education that is "focused on collective knowledge and the sharing and reuse of learning and scholarly content." There are multiple projects and organizations that promote learning through open content, including OpenCourseWare and Khan Academy. Some universities, like MIT, Yale, and Tufts, are making their courses freely available on the internet. There are also

912-510: A hundred members on Tatoeba. Selected content from Tatoeba in Esperanto is available in the multilingual DVD Esperanto Elektronike published by E@I. As of November 2022, Esperanto is Tatoeba's fifth pivot language, with over 330,000 sentences translated into at least two languages. Other constructed languages like Toki Pona, Interlingua, Klingon, Lojban, and Ido also have a significant footprint. From 2008 to 2011, Francis Bond used

988-605: A number of organizations promoting the creation of openly licensed textbooks such as the University of Minnesota's Open Textbook Library, Connexions, OpenStax College, the Saylor Academy, Open Textbook Challenge, and Wikibooks. Every country has its own law and legal system, sustained by its legislation, which consists of documents. In a democratic country, laws are published as open content, in principle free content; but in general, there are no explicit licenses attributed for

1064-405: A student at SUPINFO — became a core developer of Tatoeba. Together with Trang Ho and other young developers, they made Tatoeba more social: sentence lists, user profiles, private messaging, and a Facebook-inspired Wall. They also introduced significant features like sentence linking, tagging, and "translation of translation" search. In November 2010, Tatoeba passed the 600,000 sentences mark. Within

1140-476: A work. The aim of copyleft is to use the legal framework of copyright to enable non-author parties to reuse and, in many licensing schemes, modify content that is created by an author. Unlike works in the public domain, the author still maintains copyright over the material; however, the author has granted a non-exclusive license to any person to distribute, and often modify, the work. Copyleft licenses require that any derivative works be distributed under

1216-760: A year, the number of sentences added daily had increased almost 50-fold. Between 2014 and 2016, a new team of developers formed around Trang Ho. They mentored students at the Google Summer of Code 2014 and added features to improve corpus quality. Over the 2018–2020 period, support from the Mozilla Foundation as part of the Common Voice project allowed Tatoeba to make its platform more open and user-friendly. Users can search for words to retrieve sentences that use them. Results can be filtered by language, number of words, tag, and other criteria. Each sentence

1292-430: Is any kind of creative work, such as a work of art, a book, a software program, or any other creative content for which there are very minimal copyright and other legal limitations on usage, modification and distribution. These are works or expressions which can be freely studied, applied, copied and modified by anyone for any purpose including, in some cases, commercial purposes. Free content encompasses all works in

1368-709: Is described as synonymous with the definitions of open/free in the Open Source Definition, the Free Software Definition, and the Definition of Free Cultural Works. A distinct difference is the focus given to the public domain, open access, and readable open formats. OKF recommends six conformant licenses: three of OKN's (Open Data Commons Public Domain Dedication and Licence, Open Data Commons Attribution License, Open Data Commons Open Database License) and

1444-546: Is displayed next to its translations and "translations of translations". A comment section facilitates feedback and corrections. Registered users can build downloadable lists of sentences, which are private, public or collaborative. Tatoebans are encouraged to contribute in their strongest language. They can add original sentences and translate existing ones. They can proofread or comment on other users' sentences, and "adopt" sentences without an owner. Advanced contributors are also allowed to tag, link, and unlink sentences. When

1520-522: Is distributed via Internet to the general public. Publication of such resources may be either by a formal institution-wide program, or informally, by individual academics or departments. Open content publication has been seen as a method of reducing costs associated with information retrieval in research, as universities typically pay to subscribe for access to content that is published through traditional means. Subscriptions for non-free content journals may be expensive for universities to purchase, though

1596-417: Is run by Association Tatoeba, a French non-profit organization funded through donations. In 2006, Trang Ho was frustrated that unlike some of their Japanese counterparts, German bilingual dictionaries didn't feature full-text search of usage examples with translations. It led her to imagine her ideal dictionary and to build a prototype hosted on SourceForge under the name "multilangdict." The main focus

1672-941: Is used by the Wikimedia Foundation. In 2009, the Attribution and Attribution-ShareAlike Creative Commons licenses were marked as "Approved for Free Cultural Works". Another successor project is the Open Knowledge Foundation, founded by Rufus Pollock in Cambridge in 2004 as a global non-profit network to promote and share open content and data. In 2007 the OKF gave an Open Knowledge Definition for "content such as music, films, books; data be it scientific, historical, geographic or otherwise; government and other administrative information". In October 2014 with version 2.0 Open Works and Open Licenses were defined and "open"

1748-436: Is very similar to open content. An analogy is the use of the rival terms free software and open-source, which describe ideological differences rather than legal ones. The term Open Source, by contrast, sought to encompass them all in one movement. For instance, the Open Knowledge Foundation's Open Definition describes "open" as synonymous with the definition of free in the "Definition of Free Cultural Works" (as also in

1824-474: The CC BY, CC BY-SA, and CC0 Creative Commons licenses. Google Summer of Code The idea for the Summer of Code came directly from Google's founders, Sergey Brin and Larry Page. From 2007 until 2009 Leslie Hawthorn, who has been involved in the project since 2006, was the program manager. From 2010 until 2015, Carol Smith was the program manager. In 2016, Stephanie Taylor took over management of

1900-548: The Open Content Project, describing works licensed under the Open Content License (a non-free share-alike license, see 'Free content' below) and other works licensed under similar terms. The website of the Open Content Project once defined open content as 'freely available for modification, use and redistribution under a license similar to those used by the open-source/free software community'. However, such

1976-523: The Open Source Definition and Free Software Definition). For such free/open content both movements recommend the same three Creative Commons licenses: CC BY, CC BY-SA, and CC0. Copyright is a legal concept, which gives the author or creator of a work legal control over the duplication and public performance of their work. In many jurisdictions, this is limited by a time period after which

2052-465: The public domain and also those copyrighted works whose licenses honor and uphold the definition of free cultural work. In most countries, the Berne Convention grants copyright holders control over their creations by default. Therefore, copyrighted content must be explicitly declared free by the authors, which is usually accomplished by referencing or including licensing statements from within

2128-532: The 10 Google-sponsored Mozilla projects survived after the event. However, the Gaim (now Pidgin) project was able to enlist enough coding support through the event to include the changes into Gaim (now Pidgin) 2.0; the Jabber Software Foundation (now XMPP Standards Foundation) and KDE project also counted a few surviving projects of their own from the event (KDE only counted 1 continuing project from out of

2204-427: The 24 projects which it sponsored). In 2006, around 6,000 applications were submitted, less than the previous year because all applicants were required to have Google Accounts which reduced the number of spam applications received. Google and most mentors found that the proposals were of much higher quality than 2005's applications. Also, the number of participating organizations more than doubled to 102. In addition to

2280-583: The Google Summer of Code 2012 on February 4, 2012. On April 23, 2012, Google announced that 1,212 proposals were accepted in 180 organizations. For the first time since inception, the highest number of GSoC participants came from India (227) followed by the USA (173) and Germany (72). The University of Moratuwa continued its dominance with 29 selections, followed by Dhirubhai Ambani Institute of Information and Communication Technology leading from India at 3rd rank. For

2356-402: The Google Summer of Code 2014 on February 3, 2014. On April 21, 2014, Google announced that 190 open source projects and organizations would take part that year. 1,307 student project proposals were accepted. The 2014 edition was the first in which students from Ethiopia, Honduras, Kenya, Malawi and Uganda were accepted to the program, with Kenya and Cameroon taking the lead with 3 students and

2432-410: The Summer of Code. Despite these criticisms there were 41 organizations involved, including FreeBSD, Apache, KDE, Ubuntu, Blender, Mozdev, and Google itself. According to a blog post by Chris DiBona, Google's open source program manager, "something like 30 percent of the students stuck with their groups past SoC [Summer of Code]." Mozilla developer Gervase Markham also commented that none of

2508-495: The Tanaka Corpus — a public-domain compilation released in 2001 by Hyogo University professor Yasuhito Tanaka and maintained by Jim Breen and Paul Blay — were imported into the Tatoeba Corpus. In December 2008, Trang Ho released the first version of the current codebase built around a more flexible data model. The following month, the website moved to the tatoeba.org domain. Over the 2009–2010 academic year, Allan Simon — then

2584-691: The Tatoeba Corpus for his research on the Japanese language. Since 2013, Jörg Tiedemann has been spreading Tatoeba parallel corpora more widely in the machine translation community by sharing them on the OPUS repository and organizing the "Tatoeba Translation Challenge". With the rise of deep learning, researchers increasingly use Tatoeba's data sets to train and evaluate their massively multilingual models in tasks like machine translation, language identification, semantic search, and speech recognition. Free content Free content, libre content, libre information, or free information

2660-413: The Tatoeba website. An unstable API is available for software developers. Tatoeba sentences can be used to build lexicographic references for language learners. The JMdict Japanese-English dictionary selects its example sentences from the Tatoeba Corpus. OpenRussian is a free Russian dictionary built primarily from the content of Wiktionary and Tatoeba. GoodExample tries to automatically extract

2736-767: The US National Institutes of Health, Research Councils UK (effective 2016) and the European Union (effective 2020). At an institutional level, some universities, such as the Massachusetts Institute of Technology, have adopted open access publishing by default by introducing their own mandates. Some mandates may permit delayed publication and may charge researchers for open access publishing. For teaching purposes, some universities, including MIT, provide freely available course content, such as lecture notes, video resources and tutorials. This content

2812-588: The USA (127) and Sri Lanka (58). Google announced the Google Summer of Code 2016 on February 9, 2016. The deadline for organization application was set to February 19, 2016. The student application period began on March 14, 2016, and student application deadline was set to March 25, 2016. 180 organizations were accepted. It saw 18,981 total registered students (up 36% from 2015) with 7,543 student proposals from 5,107 students in 142 countries. The accepted student proposals were announced on April 22, 2016, with 1,206 student proposals accepted. The number of organizations

2888-545: The application deadline from March 24 to March 26 at the last minute. It was then extended again to March 27. On April 11, the acceptance letters were delayed due to additional efforts involved in resolving duplicate submissions. At one point, the web interface changed each application to have a status of Not Selected. Google officials reported that only the acceptance email was the definitive indication of acceptance. In 2008, Google chose 174 open source organizations to participate in Summer of Code, greatly increased from 131

2964-466: The articles are written and peer-reviewed by academics themselves at no cost to the publisher. This has led to disputes between publishers and some universities over subscription costs, such as the one that occurred between the University of California and the Nature Publishing Group. Free and open content has been used to develop alternative routes towards higher education. Open content is

3040-567: The automotive industry, and even agricultural areas. Technologies such as distributed manufacturing can allow computer-aided manufacturing and computer-aided design techniques to be able to develop small-scale production of components for the development of new, or repair of existing, devices. Rapid fabrication technologies underpin these developments, which allow end-users of technology to be able to construct devices from pre-existing blueprints, using software and manufacturing hardware to convert information into physical objects. In academic work,

3116-403: The corresponding mentoring organization, with mentors and organizational administrators reviewing the applications and deciding how many "slots" to request from Google, and which proposals to accept. Google allocates slots to each organization, taking into account organizational capacity, mentoring history, and the number of applications the organization has received. Finally, organizations select

3192-555: The cost of publication and reduced the entry barrier sufficiently to allow for the production of widely disseminated materials by individuals or small groups. Projects to provide free literature and multimedia content have become increasingly prominent owing to the ease of dissemination of materials that are associated with the development of computer technology. Such dissemination may have been too costly prior to these technological developments. In media, which includes textual, audio, and visual content, free licensing schemes such as some of

3268-413: The distribution and usage of the work. As such, any person may manipulate, distribute, or otherwise use the work, without legal ramifications. A work in the public domain or released under a permissive license may be referred to as "copycenter". Copyleft is a play on the word copyright and describes the practice of using copyright law to remove restrictions on distributing copies and modified versions of

3344-428: The first time, Mauritius, an African country, participated in the Google Summer of Code. Google announced the Google Summer of Code 2013 on February 11, 2013. On April 8, 2013, Google announced that 177 open source projects and organizations would take part that year. 1,192 student project proposals were accepted. This was the first time that Cameroon was represented in the program by 2 students. Google announced

3420-537: The latter must be available for commercial use by the public. However, it is similar to several definitions for open educational resources, which include resources under noncommercial and verbatim licenses. In 2003, David Wiley announced that the Open Content Project had been succeeded by Creative Commons and their licenses; Wiley joined as "Director of Educational Licenses". In 2005, the Open Icecat project

3496-404: The licenses made by Creative Commons have allowed for the dissemination of works under a clear set of legal permissions. Not all Creative Commons licenses are entirely free; their permissions may range from very liberal general redistribution and modification of the work to a more restrictive redistribution-only licensing. Since February 2008, Creative Commons licenses which are entirely free carry

3572-501: The main one is that published by a government gazette. So, law-documents can eventually inherit the license expressed by the repository or by the gazette that contains them. The concept of applying free software licenses to content was introduced by Michael Stutz, who in 1997 wrote the paper "Applying Copyleft to Non-Software Information" for the GNU Project. The term "open content" was coined by David A. Wiley in 1998 and evangelized via

3648-826: The majority of works are not free, although the percentage of works that are open access is growing. Open access refers to online research outputs that are free of all restrictions to access and free of many restrictions on use (e.g. certain copyright and license restrictions). Authors may see open access publishing as a way of expanding the audience that is able to access their work to allow for greater impact, or support it for ideological reasons. Open access publishers such as PLOS and BioMed Central provide capacity for review and publishing of free works; such publications are currently more common in science than humanities. Various funding institutions and governing research bodies have mandated that academics must produce their works to be open-access, in order to qualify for funding, such as

3724-451: The number of awards received by students for the five-year period 2005–2009 securing 79 accepted students. In 2010 Google accepted 150 software projects and 1,026 students from 69 countries worldwide. The top ten countries by number of students accepted in 2010 are: United States (197), India (125), Germany (57), Brazil (50), Poland (46), Canada (40), China (39), United Kingdom (36), France (35), Sri Lanka (34). The number of organizations

3800-639: The number of consumers. In some cases, free software vendors may use peer-to-peer technology as a method of dissemination. Project hosting and code distribution are not a problem for most free projects as a number of providers offer these services free of charge. Free content principles have been translated into fields such as engineering, where designs and engineering knowledge can be readily shared and duplicated, in order to reduce overheads associated with project development. Open design principles can be applied in engineering and technological applications, with projects in mobile telephony, small-scale manufacture,

3876-537: The organizations that participated in 2005, organizations such as Debian, GNU, Gentoo, Adium, PHP, and ReactOS participated in 2006. Google had decided to sponsor around 600 projects. The student application deadline was extended until 2006-05-09, at 11:00 PDT. Although the results were to be declared by 5:00 PM PDT, there was considerable delay in publishing them as Google had not expected several students to be selected in more than one organization. Google allows one student to undertake only one project as part of

3952-442: The original author, to maintain the original license of the reused content) or restrictions (excluding commercial use, banning certain media) chosen by the author. There are a number of standardized licenses offering varied options that allow authors to choose the type of reuse of their work that they wish to authorize or forbid. There are a number of different definitions of free content in regular use. Legally, however, free content

4028-659: The other countries with one student. Google announced the Google Summer of Code 2015 on February 9, 2015. On March 2, 2015, Google announced that 137 open source projects and organizations would take part that year, some notable exceptions including Mozilla, the Linux Foundation, and the Tor Project. The student application period began on March 16, 2015. The accepted student proposals were announced on April 27, 2015, with 1,051 student proposals accepted. The highest number of GSoC participants came from India (335) followed by

4104-623: The owner of a sentence does not respond to a correction request, only a corpus maintainer has the power to update or delete the sentence. As founder of Tatoeba, Trang Ho has long been the project's BDFL. In 2011, she set up a nonprofit organization to oversee the project. In 2022, she decided to step aside in favor of a small group of experienced Tatoebans. As of February 2024, the Tatoeba Corpus has over 11,900,000 sentences in 422 languages. 59 of these languages have 10,000 or more sentences. Over 1 million sentences have audio recordings. The sentences are interrelated within

4180-445: The program to 419 positions. The mentoring organizations were responsible for reviewing and selecting proposals, and then providing guidance to those students to help them complete their proposal. Students who successfully completed their proposal to the satisfaction of their mentoring organization were awarded $4,500 and a Google Summer of Code T-shirt, while $500 per project was sent to the mentoring organization. Approximately 80% of

4256-424: The program. Each year, the program follows a timeline. First, open-source organizations apply to participate. If accepted, each organization provides a list of initial project ideas and invites contributors to their development communities. Contributors who meet the eligibility criteria then submit up to 3 proposals that detail the software-coding projects that interest them. These applications are then evaluated by

4332-525: The program. It took Google several hours to resolve the duplicate acceptances. The acceptance letters were sent out on May 24, at 3:13 AM PDT, but the letters were also sent out to some 1,600 applicants who had, in fact, not been accepted by Google's SoC committee. At 3:38 AM PDT, Chris DiBona posted an apology to the official mailing list, adding that "We're very deeply sorry for this. If you received two e-mails, one that said you were accepted and one that you were not, this means you were not." Google has released

4408-523: The projects were successfully completed in 2005, although completion rates varied by organization: Ubuntu, for example, reported a completion rate of only 64%, and KDE reported a 67% completion rate. Many projects were continued past summer, even though the SoC period was over, and some changed direction as they developed. For the first Summer of Code, Google was criticized for not giving sufficient time to open source organizations so they could plan projects for

4484-401: The public without violating copyright law. Unlike free content and content under open-source licenses , there is no clear threshold that a work must reach to qualify as 'open content'. The 5Rs are put forward on the Open Content Project website as a framework for assessing the extent to which content is open: This broader definition distinguishes open content from open-source software, since

4560-507: The same terms and that the original copyright notices be maintained. A symbol commonly associated with copyleft is a reversal of the copyright symbol, facing the other way; the opening of the C points left rather than right. Unlike the copyright symbol, the copyleft symbol does not have a codified meaning. Projects that provide free content exist in several areas of interest, such as software, academic literature, general literature, music, images, video, and engineering. Technology has reduced

4636-496: The second place in "2008 GSoC Accepted: Top 10 Universities" category, while Universidade Estadual de Campinas became second in "2008 GSoC Applicants: Top 10 Universities" category. For 2009 Google reduced the number of software projects to 150, and capped the number of student projects it would accept at 1,000, 85 percent of which were successfully completed. As of 2009, University of Moratuwa in Sri Lanka ranks first in terms of

4712-422: The social structures that result leading to decreased production costs. Given sufficient interest in a software component, by using peer-to-peer distribution methods, distribution costs may be reduced, easing the burden of infrastructure maintenance on developers. As distribution is simultaneously provided by consumers, these software distribution models are scalable; that is, the method is feasible regardless of

4788-478: The text of each law, so the license must be assumed as an implied license. Only a few countries have explicit licenses in their law-documents, such as the UK's Open Government Licence (a CC BY compatible license). In the other countries, the implied license comes from its proper rules (general laws and rules about copyright in government works). The automatic protection provided by the Berne Convention does not apply to

4864-411: The texts of laws: Article 2.4 excludes the official texts from the automatic protection. It is also possible to "inherit" the license from context. The set of a country's law-documents is made available through national repositories. Examples of law-document open repositories: LexML Brazil, Legislation.gov.uk, and N-Lex. In general, a law-document is offered in more than one (open) official version, but

4940-425: The top proposals to fill their slots and Google verifies eligibility before announcing accepted contributors. In the event of a single contributor being selected by more than one organization, the organization which allocates a slot to the student first is given priority. In 2005, more than 8,740 project proposals were submitted for the 200 available student positions. Due to the overwhelming response, Google expanded

5016-694: The web. While the vast majority of content on Wikipedia is free content, some copyrighted material is hosted under fair-use criteria. Free and open-source software, which is often referred to as open source software and free software, is a maturing technology, with companies using it to provide services and technology to both end-users and technical consumers. The ease of dissemination increases modularity, which allows smaller groups to contribute to projects and simplifies collaboration. Some claim that open source development models offer peer-recognition and collaborative-benefit incentives similar to those in more classical fields such as scientific research, with

5092-598: The wild." More recently, Japanese researchers implemented a Tatoeba search feature in an integrated writing assistance environment. Although the sentences in the Tatoeba Corpus are not all authentic, they are sometimes used to build data-driven learning applications. BES (Basic English Sentence) Search is a non-commercial tool for finding beginner-level English sentences for use in teaching materials. It has over 1 million sentences, most of them from Tatoeba. Reverso uses Tatoeba parallel corpora in its commercial bilingual concordancer. Example sentences are also used as

5168-445: The work of the author to those who either pay royalties to the author for usage of the author's content or limit their use to fair use. Secondly, it limits the use of content whose author cannot be found. Finally, it creates a perceived barrier between authors by limiting derivative works, such as mashups and collaborative content. Although open content has been described as a counterbalance to copyright, open content licenses rely on

5244-403: The work. The right to reuse such a work is granted by the authors in a license known as a free license, a free distribution license, or an open license, depending on the rights assigned. These freedoms given to users in the reuse of works (that is, the right to freely use, study, modify or distribute these works, possibly also for commercial purposes) are often associated with obligations (to cite

5320-406: The works then enter the public domain. Copyright laws are a balance between the rights of creators of intellectual and artistic works and the rights of others to build upon those works. During the time period of copyright the author's work may only be copied, modified, or publicly performed with the consent of the author, unless the use is a fair use. Traditional copyright control limits the use of

5396-658: The year before and 102 in 2006. Each organization was chosen based on a number of criteria, such as the virtue of the projects, the ideas given for students to work on, and the ability of the mentors to ensure students successfully completed projects. Nearly 7,100 proposals were received for the 2008 Summer of Code, of which 1,125 were selected. The university results were announced on May 8, 2008, on Google's "Open Source at Google" blog. According to the announcement, University of Moratuwa came first in both the "Top 10 Universities of 2008 GSoC applicants" and "Top 10 accepted universities 2008 GSoC" categories. Wrocław University of Technology was able to secure

5472-405: Was already the crowdsourcing of translated sentences: "A Wikipedia type of thing, except people add sentences, not articles." Alongside her studies at the University of Technology of Compiègne, Trang Ho gradually improved her website with a few classmates. She rebuilt the project from scratch twice and rebranded it as Tatoeba. In September 2007, about 150,000 English-Japanese sentence pairs from

5548-468: Was increased to 175, of which 50 were new. 1,115 students were accepted. A total of 595 different universities participated in the program, 160 of which were new to the program. The 13 universities with the highest number of students accepted into the 2011 Google Summer of Code account for 14.5% of the students. University of Moratuwa, Sri Lanka secured first position in 2011's program with 27 accepted students. Polytechnic University of Bucharest, Romania

5624-648: Was increased to 201, of which 39 were new. The 1,318 students accepted into the program hailed from 575 universities, of which 142 had students participating for the first time. Over 20,651 students from 144 countries registered for the program, which is an 8.8% increase over the previous high for the program. 4,764 students from 108 countries submitted a total of 7,089 project proposals. 212 organizations were accepted in 2018. 207 organizations were accepted in 2019. 199 organizations and 1,199 student projects were accepted in 2020. 202 organizations and 1,292 student projects were accepted in 2021. Google announced

5700-603: Was launched, in which product information for e-commerce applications was created and published under the Open Content License. It was embraced by the tech sector, which was already quite open source minded. In 2006, a Creative Commons successor project, the Definition of Free Cultural Works, was introduced for free content. It was put forth by Erik Möller, Richard Stallman, Lawrence Lessig, Benjamin Mako Hill, Angela Beesley, and others. The Definition of Free Cultural Works

5776-423: Was second with 23 accepted students, while Indian Institute of Technology, Kharagpur, India placed third with 14 students. The breakdown of college degrees for the 2011 Google Summer of Code program was as follows: 55% of the students were undergraduates, 23.3% were pursuing their master's degrees, 10.2% were working on their PhDs and 11.5% did not specify which degree they were working toward. Google announced
