SUMMARY

Languages of science

Scientific languages are vehicular languages used by one or several scientific communities for international communication. According to Michael Gordin, they are “either specific forms of a given language that are used in conducting science, or they are the set of distinct languages in which science is done”.

Until the 19th century, classical languages such as Latin, Classical Arabic, Sanskrit, or Classical Chinese were commonly used across Eurasia for the purpose of international scientific communication. A combination of structural factors, the emergence of nation-states in Europe, the Industrial Revolution and the expansion of colonization entailed the global use of three European national languages: French, German and English. Yet new languages of science such as Russian or Italian had started to emerge by the end of the 19th century, to the point that international scientific organizations started to promote the use of constructed languages like Esperanto as a non-national global standard.

After the First World War, English gradually outpaced French and German and became the leading language of science, but not the only international standard. Research in the Soviet Union expanded rapidly in the years following the Second World War and access to Russian journals became a major policy issue in the United States, prompting the early development of machine translation. In the last decades of the 20th century, an increasing number of scientific publications relied primarily on English, in part due to the preeminence of English-speaking scientific infrastructures, indexes and metrics like the Science Citation Index.

The development of open science has revived the debate over linguistic diversity in science, as social and local impact has become an important objective of open science infrastructures and platforms. In 2019, 120 international research organizations co-signed the Helsinki Initiative on Multilingualism in Scholarly Communication and called for supporting multilingualism and the development of “infrastructure of scholarly communication in national languages”. The 2021 UNESCO Recommendation on Open Science includes linguistic diversity as one of the core features of open science, as it aims to “make multilingual scientific knowledge openly available, accessible and reusable for everyone”.

This article is published on this website and as an independently updated Wikipedia article.

Books & theses

  • Bourne, Charles P.; Hahn, Trudi Bellardo (2003-08-01). A History of Online Information Services, 1963-1976. MIT Press. ISBN 978-0-262-26175-3.
  • Wouters, P. F. (1999). The citation culture (Thesis). Retrieved 2018-09-09.
  • Andriesse, Cornelis D. (2008-09-15). Dutch Messengers: A History of Science Publishing, 1930–1980. Leiden ; Boston: Brill. ISBN 978-90-04-17084-1.
  • Behrens, Julia; Fischer, Lars; Minks, Karl-Heinz; Rösler, Lena (2010). Die internationale Positionierung der Geisteswissenschaften in Deutschland. Hannover: HIS:Projektbericht.
  • Montgomery, Scott L. (2013-05-06). Does Science Need a Global Language?: English and the Future of Research. University of Chicago Press. ISBN 978-0-226-01004-5.
  • Gordin, Michael D. (2015-04-13). Scientific Babel: How Science Was Done Before and After Global English. University of Chicago Press. ISBN 978-0-226-00032-9.
  • Wächter, Bernd; Maiworm, Friedhelm (2014). English-taught Programmes in European Higher Education: The State of Play in 2014. Lemmens Medien GmbH. ISBN 978-3-86856-017-6.
  • Alperin, Juan Pablo (2015). The public impact of Latin America’s approach to open access (Thesis). Stanford University.
  • Olechnicka, Agnieszka; Ploszaj, Adam; Celińska-Janowicz, Dorota (2018-10-08). The Geography of Scientific Collaboration. Routledge. ISBN 978-1-315-47192-1.
  • Bowker, Lynne; Ciro, Jairo Buitrago (2019-05-01). Machine Translation and Global Research: Towards Improved Machine Translation Literacy in the Scholarly Community. Emerald Group Publishing. ISBN 978-1-78756-721-4.
  • Moore, Samuel (2019-05-02). Common Struggles: Policy-based vs. scholar-led approaches to open access in the humanities (Thesis). Retrieved 2021-12-11.
  • Poibeau, Thierry (2019-05-09). Babel 2.0: Où va la traduction automatique ?. Odile Jacob. ISBN 978-2-7381-4850-6.

Resources

Open science multilingual

Three core definitions of Open Science according to the UNESCO Recommendation on Open Science - Multilingualism is one of the core features of Open Science according to UNESCO

CC BY-SA 3.0

UNESCO

Evolution of languages in the Open Library Dataset for all editions (top) or discarding editions from the US and the UK (bottom)

No license

Pierre-Carl Langlais

Languages of science

Langlais, Pierre-Carl
CC BY 4.0
published on 1 May 2024

Langlais, Pierre-Carl, « Languages of science », Petite encyclopédie de la science ouverte / Small encyclopedia of Open Science, published on 1 May 2024.
DOI : https://doi.org/10.52949/71
URL : https://encyclo.ouvrirlascience.fr/articles/languages-of-science/

ARTICLE

Scientific languages are vehicular languages used by one or several scientific communities for international communication. According to Michael Gordin, they are “either specific forms of a given language that are used in conducting science, or they are the set of distinct languages in which science is done.”[footnote “Gordin 2015, p. 24”]

Until the 19th century, classical languages such as Latin, Classical Arabic, Sanskrit, or Classical Chinese were commonly used across Eurasia for the purpose of international scientific communication. A combination of structural factors, the emergence of nation-states in Europe, the Industrial Revolution and the expansion of colonization entailed the global use of three European national languages: French, German and English. Yet new languages of science such as Russian or Italian had started to emerge by the end of the 19th century, to the point that international scientific organizations started to promote the use of constructed languages like Esperanto as a non-national global standard.

After the First World War, English gradually outpaced French and German and became the leading language of science, but not the only international standard. Research in the Soviet Union expanded rapidly in the years following the Second World War and access to Russian journals became a major policy issue in the United States, prompting the early development of machine translation. In the last decades of the 20th century, an increasing number of scientific publications relied primarily on English, in part due to the preeminence of English-speaking scientific infrastructures, indexes and metrics like the Science Citation Index. Local languages remain largely relevant in major countries and world regions (China, Latin America, Indonesia) as well as in disciplines and fields of study with a significant degree of public engagement (social sciences, environmental studies, medicine).

The development of open science has revived the debate over linguistic diversity in science, as social and local impact has become an important objective of open science infrastructures and platforms. In 2019, 120 international research organizations co-signed the Helsinki Initiative on Multilingualism in Scholarly Communication and called for supporting multilingualism and the development of “infrastructure of scholarly communication in national languages”.[footnote “Helsinki Initiative on Multilingualism in Scholarly Communication”] The 2021 UNESCO Recommendation on Open Science includes linguistic diversity as one of the core features of open science, as it aims to “make multilingual scientific knowledge openly available, accessible and reusable for everyone.”[footnote “UNESCO Recommendation on Open Science, 2021, CL/4363”] In 2022, the Council of the European Union officially supported “initiatives to promote multilingualism” in science, such as the Helsinki Initiative.[footnote “Council of the European Union 2022, p. 11.”]

History

From classical languages to vernaculars

Until the 19th century, classical languages played an instrumental role in the diffusion of scientific knowledge in Europe, Asia and North Africa.

In Europe, Latin had remained the vehicular language of religion, law and administration until the Early Modern period. It became a language of science “through its encounter with Arabic”: during the Renaissance of the 12th century, a large corpus of Arabic scholarly texts was translated into Latin, in order to become available in the emerging network of European universities and centers of knowledge.[footnote “Gordin 2015, p. 34”] In this process, the Latin language changed and acquired the specific features of scholastic Latin, through numerous lexical and even syntactic borrowings from Greek and Arabic. The practice of scientific Latin persisted long after the replacement of Latin by vernacular languages in most European administrations: “Latin’s status as a language of science rested on the contrast it made with the use of the vernacular in other contexts” and created “a European community of learning” entirely distinct from the local communities where the scholars lived.[footnote “Gordin 2015, p. 35”] Latin was never the sole language of science and education. Beyond local publications, vernaculars very early attained the status of international scientific languages that could be expected to be understood and translated across Europe. In the mid-16th century, a significant amount of printed output in France was in Italian.

In the Indian and South Asian region, Sanskrit was a leading vehicular language for science. Sanskrit was remodeled even more radically than Latin for the purpose of scientific communication as it shifted “toward ever more complex noun forms to encompass the kinds of abstractions demanded by scientific and mathematical thinking.”[footnote “Gordin 2015, p. 37”] Classical Chinese held a similarly prestigious position in East Asia, being largely adopted by scientific and Buddhist communities beyond the Chinese Empire, notably in Japan and Korea.[footnote “Gordin 2015, p. 38”]

The decline of classical languages occurred throughout Eurasia during the 2nd millennium. Sanskrit was increasingly marginalized after the 13th century.[footnote “Hock, Hans Henrich (1983). Kachru, Braj B. (ed.). Language-death phenomena in Sanskrit: grammatical evidence for attrition in contemporary spoken Sanskrit. Studies in the Linguistic Sciences. 13:2.”] Until the end of the 17th century, there was no clear trend of displacement of Latin in Europe by vernacular languages: while medical books started to use French as well in the 16th century, this trend was reversed after 1597 and most medical literature in France remained accessible only in Latin until the 1680s.[footnote “Gordin 2015, p. 42”] In 1670, as many books were printed in Latin as in German in the German states; in 1787, Latin books accounted for no more than 10%.[footnote “Gordin 2015, p. 167”] At this point, the decline became irreversible: since fewer and fewer European scholars were conversant in Latin, publications dwindled and there was less incentive to maintain linguistic training in Latin.

The emergence of scientific journals was both a symptom and an accelerating factor of the declining use of a classical language. The first modern scientific journals were published simultaneously in 1665: the Journal des Sçavants in France and the Philosophical Transactions of the Royal Society in England. They both used the local vernacular, which “made perfect historical sense” as both the Kingdom of France and the Kingdom of England were engaged in an active policy of linguistic promotion of the language standard.[footnote “Montgomery 2013, p. 70”]

French, English, German and the quest for an auxiliary language (1800-1920)

The gradual disuse of Latin opened an uneasy transition period as more and more works were only accessible in local languages. Every national European language held the potential to become a language of science within a specific research field: some scholars “took measures to learn Swedish so they could follow the work of [the Swedish chemist] Bergman and his compatriots.”[footnote “Gordin 2015, p. 45”]

Language preferences and use across scientific communities were gradually consolidated into a triumvirate or triad of dominant languages of science: French, English and German. While each language would be expected to be understood for the purpose of international scientific communication, they also held “different functional distributions evident in various scientific fields”.[footnote “Goranka 2020, p. 79”] French had almost been acknowledged as the international standard of European science in the late 18th century and remained “essential” throughout the 19th century.[footnote “Montgomery 2013, p. 73”] German became a major vehicular language after 1800 and “covered portions of the physical sciences, particularly physics and chemistry, plus mathematics and medicine.”[footnote “Montgomery 2013, p. 73”] English was largely used by researchers and engineers due to the seminal contribution of English technology to the Industrial Revolution.[footnote “Montgomery 2013, p. 73”]

In the years preceding the First World War, the linguistic diversity of scientific publications increased significantly. The emergence of nationalities and early decolonization movements created new incentives to publish scientific knowledge in the local language.[footnote “Gordin 2015, p. 106”] Russian was one of the most successful developments of a new language of science. In the 1860s and 1870s, Russian researchers in chemistry and other physical sciences ceased to publish in German in favor of local periodicals, following a major work of adaptation and creation of names for scientific concepts or elements (such as chemical compounds).[footnote “Gordin 2015, p. 73”] A controversy over the meaning of the periodic table of Dmitri Mendeleev helped original publications in Russian gain recognition in the global scientific debate: the original version was deemed more authoritative than its first imperfect translation into German.[footnote “Gordin 2015, p. 75”]

Linguistic diversity became framed as a structural problem that ultimately limited the spread of scientific knowledge. In 1924, the linguist Roland Grubb Kent underlined that scientific communication could be significantly disrupted in the near future by the use of as many as “twenty” languages of science:

Today with the recrudescence of certain minor linguistic units and the increased nationalistic spirit of certain larger ones, we face a time when scientific publications of value may appear in perhaps twenty languages [and] be facing an era in which important publications will appear in Finnish, Lithuanian, Hungarian, Serbian, Irish, Turkish, Hebrew, Arabic, Hindustani, Japanese, Chinese.[footnote “Kent, Roland G. (1924-06-20). The Scientist and an International Language. Science. 59 (1538): 554–555. doi:10.1126/science.59.1538.554.b. PMID 17818586. S2CID 239785051. Retrieved 2022-01-08.”]

The definition of an auxiliary language for science became a major issue discussed in the emerging international scientific institutions. On January 17, 1901, the newly established International Association of Academies created a Delegation for the Adoption of an International Auxiliary Language “with support from 310 member organizations”.[footnote “Gordin 2015, p. 128”] The Delegation was tasked with finding an auxiliary language that could be used for “scientific and philosophical exchanges” and could not be any “national language”.[footnote “Délégation pour l’adoption d’une langue auxiliaire internationale, Publications de la Société Linnéenne de Lyon, 1903, p. 10-13”] In a context of increased nationalistic tensions, any of the dominant languages of science would appear as a non-neutral choice.[footnote “Gordin 2015, p. 110”] The Delegation consequently had a limited set of options that included the unlikely revival of a classical language like Latin[footnote “Gordin 2015, p. 111”] or a new constructed language such as Volapük, Idiom Neutral or Esperanto.

Throughout the first part of the 20th century, Esperanto was seriously considered as a potential international language of science. As late as 1954, UNESCO passed a recommendation to promote the use of Esperanto for scientific communication.[footnote “Gordin 2015, p. 218”] In contrast with Idiom Neutral or Interlingua, a simplified version of Latin, Esperanto was not primarily conceived as a scientific language. Yet, by the early 1900s, it was by far the most successful constructed language, with a large international community as well as numerous dedicated publications. Starting in 1904, the Internacia Scienca Revuo aimed to adapt Esperanto to the specific needs of scientific communication.[footnote “Gordin 2015, p. 124”] The development of a specialized technical vocabulary was a challenging task, as the extensive system of derivation of Esperanto made it complicated to directly import words commonly used in German, French or English scientific publications.[footnote “Gordin 2015, p. 127”] In 1907, the Delegation for the Adoption of an International Auxiliary Language seemed close to retaining Esperanto as its preferred language. Significant criticisms were nevertheless still addressed at a few remaining complexities of the language as well as its lack of scientific purpose and technical vocabulary. Unexpectedly, the Delegation supported a new variant of Esperanto, Ido, that was submitted very late in the process by an unknown contributor. While it was framed as a compromise between the Esperantist and the anti-Esperantist factions, this decision ultimately disappointed all the proponents of an international medium for scientific communication and durably harmed the adoption of constructed languages in academic circles.[footnote “Gordin 2015, p. 145”]

A transition period: English, new competitors and machine translation (1920-1965)

The two world conflicts had a lasting impact on scientific languages. A combination of political, economic and social factors durably weakened the triumvirate of the three main languages of science of the 19th century and paved the way for the dominance of English in the later part of the 20th century. There is still an ongoing debate over whether the world conflicts accelerated a structural tendency toward English predominance or created the conditions for it. For Ulrich Ammon, “even without the World Wars the English language community would have gained economic and, consequently, scientific superiority and, thus, preference of its language for international scientific communication.”[footnote “Ammon 2012, p. 337”] In contrast, Michael Gordin underlines that until the 1960s the privileged status of English was far from settled.

The First World War had an immediate impact on the global use of German in academic settings.[footnote “Montgomery 2013, p. 73”] For nearly a decade after the First World War, German researchers were boycotted at international scientific events. The German scientific communities had been compromised by the nationalistic propaganda in favor of German science during the war, as well as by the exploitation of scientific research for war crimes. German was no longer acknowledged as a global scientific language. While the boycott did not last, its effects did. In 1919, the International Research Council was created to replace the International Association of Academies: it only used French and English as working languages.[footnote “Gordin 2015, p. 176”] In 1932, 98.5% of international scientific conferences admitted contributions in French, 83.5% in English and only 60% in German.[footnote “Gordin 2015, p. 180”] In parallel, the focus of German periodicals and conferences had become increasingly local and they included research from non-Germanic countries less and less frequently.[footnote “Gordin 2015, p. 180”] German never recovered its privileged status as a leading language of science in the United States, and due to the lack of alternatives beyond French, American education became “increasingly monoglot” and isolationist.[footnote “Gordin 2015, p. 183”] Not affected by the international boycott, the use of French reached “a plateau between the 1920s and 1940s”: while it did not decline, it did not profit from the marginalization of German either, and it decreased relative to the expansion of English.[footnote “Montgomery 2013, p. 73”]

The rise of totalitarianism in the 1930s reinforced the status of English as the leading scientific language. While German publications remained significant in absolute terms, German scientific research was structurally weakened by antisemitic and political purges, rejection of international collaborations and emigration.[footnote “Gordin 2015, p. 202”] The German language was not boycotted again in international scientific conferences after the Second World War, as its use had quickly become marginal, even in Germany itself: even after the end of the occupation, English in the West and Russian in the East became major vehicular languages for higher education.[footnote “Gordin 2015, p. 278”]

In the two decades following the Second World War, English became the leading language of science. Yet a large share of global research remained published in other languages, and language diversity even seemed to increase until the 1960s. Russian publications in numerous fields, especially chemistry and astronomy, had grown very rapidly after the war: “in 1948, more than 33% of all technical data published in a foreign language now appeared in Russian.”[footnote “Gordin 2015, p. 217”] In 1962, Christopher Wharton Hanson still raised doubts about the future of English as the leading language in science, with Russian and Japanese rising as major languages of science and the newly decolonized states seeming poised to favor local languages:

It seems wise to assume that in the long run the number of significant contributions to scientific knowledge by different countries will be roughly proportional to their populations, and that except where populations are very small contributions will normally be published in native languages.[footnote “Gordin 2015, p. 307”]

The expansion of Russian scientific publication became a source of recurring tensions in the United States during the first decades of the Cold War. Very few American researchers were able to read Russian, which contrasted with a still widespread familiarity with the two oldest languages of science, French and German: “In a 1958 survey, 49% of American scientific and technical personnel claimed they could read at least one foreign language, yet only 1.2% could handle Russian.”[footnote “Gordin 2015, p. 218”] Science administrators and funders had recurring fears that they were not able to efficiently track the progress of academic research in the USSR. This ongoing anxiety became an overt crisis after the successful launch of Sputnik in 1957, as the decentralized American research system seemed for a time outpaced by the efficiency of Soviet planning.

Although the Sputnik crisis did not last long, one of its outcomes had far-reaching consequences for linguistic practices in science: the development of machine translation. Research in this area emerged very early: automated translation appeared as a natural extension of the initial purpose of the first computers, code-breaking.[footnote “Gordin 2015, p. 232”] Despite the initial reluctance of leading figures in computing like Norbert Wiener, several well-connected science administrators like Warren Weaver and Léon Dostert set up a series of major conferences and experiments in the nascent field, out of a concern that “translation was vital to national security”.[footnote “Gordin 2015, p. 232”] On January 7, 1954, Dostert coordinated the Georgetown–IBM experiment, which aimed to demonstrate that the technique was sufficiently mature despite the significant shortcomings of the computing infrastructure of the time: a selection of sentences from Russian scientific articles was automatically translated using a dictionary of 250 words and 6 basic syntax rules.[footnote “Gordin 2015, p. 237”] It was not made clear at the time that the sentences had been purposely selected for their fitness for automated translation. At most, Dostert argued that scientific Russian was easier to translate since it was more formulaic and less grammatically diverse than everyday Russian.

Machine translation became a major priority in federal research funding in 1956 due to an emerging arms race with Soviet researchers. While the Georgetown–IBM experiment did not have a large impact at first in the United States, it was immediately noticed in the USSR. The first articles in the field appeared in 1955 and only one year later, a major conference was held, attracting 340 representatives.[footnote “Gordin 2015, p. 242”] In 1956, Léon Dostert secured substantial funding with the support of the CIA and had enough resources to overcome the technical limitations of existing computing infrastructure: in 1957, automated translation from Russian to English could run on a vastly expanded dictionary of 24,000 words and rely on hundreds of predefined syntax rules.[footnote “Gordin 2015, p. 246”] At this scale, automated translation remained costly as it had to be processed by numerous computer operators using thousands of punch cards.[footnote “Gordin 2015, p. 246”] Yet the quality of the output did not progress significantly: in 1964, the automated translation of the few sentences submitted during the Georgetown–IBM experiment yielded a much less readable output, as it was no longer possible to tweak the rules on a predefined corpus.[footnote “Gordin 2015, p. 263”]

English as a global standard (1965-…)

During the 1960s and the 1970s, English was no longer merely the majority language of science but a scientific lingua franca. The transformation had more wide-ranging consequences than the substitution of two or three main languages of science by one language: it marked “the transition from a triumvirate that valued, at least in a limited way, the expression of identity within science, to an overwhelming emphasis on communication and thus a single vehicular language.”[footnote “Gordin 2015, p. 263”] Ulrich Ammon characterizes English as an “asymmetrical lingua franca”, as it is “the native tongue and the national language of the most influential segment of the global scientific community, but a foreign language for the rest of the world.”[footnote “Ammon 2012, p. 335”] This paradigm is usually connected with the globalization of American and English-speaking culture in the later part of the 20th century.[footnote “Ammon 2012, p. 335”]

No specific event accounts for the entire shift, although numerous transformations highlight an accelerated conversion to English-language science in the later part of the 1960s. On June 11, 1965, President Lyndon B. Johnson declared that the English language had become a lingua franca that opened “doors to scientific and technical knowledge” and whose promotion should be a “major policy” of the United States.[footnote “US Government Policy on English Language Teaching Abroad, declaration of Lyndon B. Johnson on June 11, 1965”] In 1969, the most prestigious abstract collection in chemistry of the early 20th century, the German Chemisches Zentralblatt, disappeared: this polyglot compilation in 36 languages could no longer compete with the English-focused Chemical Abstracts, as more than 65% of publications in the field were in English.[footnote “Gordin 2015, p. 282”] By 1982, the Compte-rendu of the Académie des Sciences admitted that “English is by now the international standard language of science and it could very nearly become its unique language” and was already the main “means of communication” in European countries with a long-standing tradition of publication in the local language, like Germany and Italy.[footnote “Rapport de l’Académie des sciences sur la langue française et le rayonnement de la science française, Compte rendu de l’académie des sciences, 1982, p. 133”] In the European Union, the Bologna Declaration of 1999 “obliged universities throughout Europe and beyond to align their systems with that of the United Kingdom” and created strong incentives to publish academic results in English.[footnote “Bowker Ciro 2019, p. 10”] From 1999 to 2014, the number of English-taught courses in European universities increased ten-fold.[footnote “Wächter Maiworm 2014, p. 16”]

Machine translation, which had been booming since 1954 thanks to Soviet-American competition, was immediately affected by the new paradigm. In 1964, the National Science Foundation underlined that “there is no emergency in the field of translation” and that translators were easily up to the task of making foreign research accessible.[footnote “Gordin 2015, p. 263”] Funding stopped simultaneously in the United States and the Soviet Union, and machine translation did not recover from this research winter until the 1980s; by then, the translation of scientific publications was no longer the main incentive. Research in this area was still pursued in a few countries where bilingualism was an important political and cultural issue: in Canada, a METEO system was successfully set up to “translate weather forecasts from English into French”.[footnote “Bowker Ciro 2019, p. 38”]

English content gradually became prevalent in originally non-English journals, first as an additional language and then as the default language. In 1998, seven leading European journals published in their local languages (Acta Physica Hungarica, Anales de Física, Il Nuovo Cimento, Journal de Physique, Portugaliae Physica and Zeitschrift für Physik) merged to become the European Physical Journal, an international journal only accepting English submissions. The same process occurred repeatedly in less prestigious publications:

The pattern has become so routine as to be almost cliché: first, a periodical publishes only in a particular ethnic language (French, German, Italian); then, it permits publication in that language and also a foreign tongue, always including English but sometimes also others; finally, the journal excludes all other languages but English and becomes purely Anglophone.[footnote “Gordin 2015, p. 299”]

Early scientific infrastructures were a leading factor in the conversion to a single vehicular language. Critical developments in applied scientific computing and information retrieval systems occurred in the United States after the 1960s.[footnote “Bourne Hahn 2003, p. 12”] The Sputnik crisis was the main incentive, as it “turned the librarians’ problem of bibliographic control into a national information crisis”[footnote “Wouters 1999, p. 62”] and favored ambitious research plans like SCITEL (an ultimately failed proposal to create a centrally planned system of electronic publication in the early 1960s), MEDLINE (for medical journals) or NASA/RECON (for astronomy and engineering). In contrast with the decline of machine translation, scientific infrastructures and databases became a profitable business in the 1970s. Even before the emergence of global networks like the World Wide Web, “it was estimated in 1986 that fully 85% of the information available in worldwide networks was already in English.”[footnote “Gordin 2015, p. 308”]

The predominant use of English was not limited to the architecture of networks and infrastructures but affected the content as well. The Science Citation Index, created by Eugene Garfield on the ruins of SCITEL, had a massive and lasting influence on the structure of global scientific publication in the last decades of the 20th century, as its most important metric, the Journal Impact Factor, “ultimately came to provide the metric tool needed to structure a competitive market among journals.”[footnote “Future of scholarly publishing 2019, p. 15”] The Science Citation Index had better coverage of English-speaking journals, which gave them a stronger Journal Impact Factor and created incentives to publish in English: “Publishing in English placed the lowest barriers toward making one’s work “detectable” to researchers.”[footnote “Gordin 2015, p. 309”] Due to the convenience of dealing with a monolingual corpus, Eugene Garfield called for acknowledging English as the only international language of science:

Since Current Contents has an international audience, one might say that the ideal publication would be multi-lingual, listing all titles in five languages – one or more of which is read by most of our subscribers, including German, French, Russian and Japanese, as well as English. This is, of course, impractical since it would quadruple the size of Current Contents (…) the only reasonable solution is to publish as many contents pages in English as is economically and technically feasible. To do this we need the cooperation of publishers and authors.[footnote “Garfield 1967”]

Current trends

English standardization

Nearly all the scientific publications indexed by the leading commercial academic search engines are in English. In 2022, this was the case for 95.86% of the 28,142,849 references indexed in the Web of Science and 84.35% of the 20,600,733 references indexed in Scopus.[footnote “Beigel 2022.”]

The lack of coverage of non-English languages creates a feedback loop: non-English publications can be regarded as less valuable because they are not indexed in international rankings and fare poorly in evaluation metrics. As many as 75,000 articles, book titles and book reviews from Germany were excluded from Biological Abstracts from 1970 to 1996.[footnote “Ammon 2012, p. 344”] In 2009, at least 6555 journals were published in Spanish and Portuguese on a global scale and “only a small fraction are included in the Scopus and Web of Science indices.”[footnote “Montgomery 2013, p. 83”]

Criteria for inclusion in commercial databases not only favor English-language journals but also incentivize non-English journals to give up their local language. They “demand that articles be in English, have abstracts in English, or at least have their references in English”.[footnote “Montgomery 2013, p. 82”] In 2012, the Web of Science was explicitly committed to the anglicization (and romanization) of published knowledge:

English is the universal language of science. For this reason Thomson Reuters focuses on journals that publish full text in English, or at very least, bibliographic information in English. There are many journals covered in Web of Science that publish articles with bibliographic information in English and full text in another language. However, going forward, it is clear that the journals most important to the international research community will publish full text in English. This is especially true in the natural sciences. There are notable exceptions to this rule in the Arts & Humanities and in Social Sciences topics.[footnote “The Thomson Reuters Journal Selection Process, archived on Internet Archive, also quoted in Montgomery 2013, p. 82”]

This commitment to English-language science has a significant performative effect. The power that commercial databases “now wield on the international stage is considerable and works very much in favor of English”, as they provide a wide range of indicators of research quality.[footnote “Montgomery 2013, p. 83”] They contributed to “large-scale inequality, notably between Northern and Southern countries”.[footnote “OA Diamond Study 2021, p. 102”] While leading scientific publishers had initially “failed to grasp the significance of electronic publishing”,[footnote “Andriesse 2008, p. 257-258”] they had successfully pivoted to a “data analytics business” by the 2010s. Actors like Elsevier or Springer are increasingly able to control “all aspects of the research lifecycle, from submission to publication and beyond”.[footnote “Moore 2019, p. 156”] Due to this vertical integration, commercial metrics are no longer restricted to journal article metadata but can include a wide range of individual and social data extracted from scientific communities.

National databases of scientific publications show that the use of English continued to expand in the 2000s and the 2010s at the expense of local languages. A comparison of seven national databases in Europe from 2011 to 2014 shows that in “all countries, there was a growth in the proportion of English publications”.[footnote “Kulczycki et al. 2018, p. 476.”] In France, data from the Open Science Barometer show that the share of publications in French shrank from 23% in 2013 to 12-16% by 2019-2020.[footnote “Baromètre de la Science ouverte, data.enseignementsup-recherche.gouv.fr”]

For Ulrich Ammon, the predominance of English has created a hierarchy and a “central-peripheral dimension” within the global scientific publication landscape that negatively affects the reception of research published in a non-English language.[footnote “Ammon 2012, p. 336”] The exclusive use of English has a discriminatory effect on scholars who are not sufficiently conversant in the language: in a survey conducted in Germany in 1991, 30% of researchers in all disciplines gave up on publication whenever English was the only option.[footnote “Ammon 2012, p. 341”] In this context, the emergence of new scientific powers is no longer linked with the appearance of a new language of science, as used to be the case until the 1960s. China has fast become a major player in international research, ranking second behind the United States in numerous rankings and disciplines.[footnote “Superpowered science: charting China’s research rise, Nature, May 26, 2021”] Yet most of this research is published in English and abides by the linguistic norms set by commercial indexes.

The dominant position of English has also been strengthened by the “lexical deficit” accumulated over the past decades by alternative languages of science: after the 1960s “new terms were being coined in English at a much faster rate than they were being created in French.”[footnote “Bowker Ciro 2019, p. 8”]

Persistence of linguistic diversity

Several languages have kept a secondary status as international languages of science, either due to the extent of local scientific production or to their continued use as a vehicular language in specific contexts. These generally include “Chinese, French, German, Italian, Japanese, Russian, and Spanish”.[footnote “Ammon 2012, p. 336”] Local languages have remained prevalent in major scientific countries: “most scientific publications are still published in Chinese in China”.[footnote “Zhang Sivertsen 2020.”]

Empirical studies of the use of languages in scientific publications have long been constrained by structural bias in the most readily accessible sources: commercial databases like the Web of Science.[footnote “Larivière 2018, p. 341”] Unprecedented access to larger corpora not covered by global indexes has shown that multilingualism remains non-negligible, although it is still little studied: by 2022 there were “few examples of analyses at scale” of multilingualism in science.[footnote “Kramer Neylon 2022.”] In seven European countries where the local language has a limited international reach, one third of researchers in the social sciences and the humanities publish in two or more languages: “research is international, but multilingual publishing keeps locally relevant research alive with the added potential for creating impact.”[footnote “Kulczycki et al. 2020, p. 13”] Due to the discrepancy between actual practices and their visibility, multilingualism has been described as a “hidden norm of academic publication”.[footnote “Curry Lillis 2022.”]

Overall, the social sciences and the humanities have preserved more diverse linguistic practices: “while natural scientists of any linguistic background have largely shifted to English as their language of publication, social scientists and scholars of the humanities have not done so to the same extent.”[footnote “Ammon 2012, p. 339”] In these disciplines, the need for global communication is balanced by an involvement in local culture: “the SSH are typically collaborating with, influencing and improving culture and society. To achieve this, their scholarly publishing is partly in the native languages.”[footnote “Sivertsen 2018”] Yet the specificity of the social sciences and the humanities has been increasingly reduced since 2000: by the 2010s, a large proportion of German and French articles in the Arts and Humanities indexed in the Web of Science were in English.[footnote “Larivière 2018, p. 348”] While German has been outpaced by English even in German-speaking countries since the Second World War, it has continued to be used marginally as a vehicular scientific language in specific disciplines or research fields (the Nischenfächer or “niche-disciplines”).[footnote “Goranka 2020.”] Linguistic diversity is not specific to the social sciences, but its persistence may be obscured by the high prestige attached to international commercial databases: in the Earth sciences, “the proportion of English-language documents in the regional or national databases (KCI, RSCI, SciELO) was approximately 26%, whereas virtually all the documents (approximately 98%) in Scopus and WoS were in English”.[footnote “Irawan et al. 2021.”]

Beyond the generic distinction between social sciences and natural sciences, there are finer-grained distributions of language practices. In 2018, a bibliometric analysis of the publications of eight European countries in the social sciences and the humanities (SSH) highlighted that “patterns in the language and type of SSH publications are related not only to the norms, culture, and expectations of each SSH discipline but also to each country’s specific cultural and historic heritage”.[footnote “Kulczycki et al. 2018, p. 465.”] Use of English was more prevalent in Northern Europe than in Eastern Europe, and publication in the local language remains especially significant in Poland due to a large “‘local’ market of academic output”.[footnote “Kulczycki et al. 2018, p. 481.”] Local research policies may have a significant impact: a preference for international commercial databases like Scopus or the Web of Science may account for a steeper decline of publications in the local language in the Czech Republic than in Poland.[footnote “Kulczycki et al. 2018, p. 482.”] Additional factors include the distribution of economic models among journals: non-commercial publications have a much stronger “language diversity” than commercial publications.[footnote “OA Diamond Study 2021, p. 48”]

Since the 2000s, the expansion of digital collections has contributed to a relative increase in linguistic diversity in academic indexes and search engines.[footnote “Larivière 2018, p. 341”] The Web of Science enhanced its regional coverage during the 2005-2010 period, which served to “increase the number of non-English papers such as Spanish papers”.[footnote “Liu 2017, p. 121”] In the Portuguese research community, there was a steep rise in Portuguese-language papers in commercial indexes during the 2007-2018 period, which is indicative both of remaining “spaces of resilience and contestation of some hegemonic practices” and of a potential new paradigm of scientific publishing “steered towards plurilingual diversity”.[footnote “Solovova et al. 2018, p. 12”] Multilingualism as a practice and competency has also increased: in 2022, 65% of early career researchers in Poland had published in two or more languages, whereas only 54% of the older generations had done so.[footnote “Kulczycki, Engels Pölönen 2022.”]

In 2022, Bianca Kramer and Cameron Neylon led a large-scale analysis of the metadata available for 122 million Crossref objects indexed by a DOI.[footnote “Kramer Neylon 2022.”] Overall, non-English publications make up “less than 20%”, although their share may be underestimated due to a lower adoption rate of DOIs or the use of local DOIs (like those of the Chinese National Knowledge Infrastructure).[footnote “Kramer Neylon 2022.”] Yet multilingualism seems to have improved over the past 20 years, with a significant growth of publications in Portuguese, Spanish and Indonesian.[footnote “Kramer Neylon 2022.”]

Monographs

While quantitative studies of scientific languages have focused on periodicals, monographs have remained an important academic medium, especially in the disciplines most likely to maintain some degree of linguistic diversity (the humanities and the social sciences). Large bibliographic datasets for books include the Open Library, a project supported by the Internet Archive that documents more than 34,734,959 unique editions. Even though the collection is not focused on scientific production, it is possible to extract a subset of 2 million monographs with Dewey classification codes matching a scientific discipline. Available data show that the collection is heavily biased in favor of editions printed in the United States (45%) and the United Kingdom (18%), which unsurprisingly results in a predominance of English. Differences between disciplines are still significant and echo observations already made on periodical sources.

If editions produced in the United States and the United Kingdom are excluded, the landscape becomes very different: a majority of monographs in the humanities, and half of those in the social sciences, are published in a non-English language. Overall, in non-English-speaking countries, monographs have continued to rely on the local language, possibly because they could more easily be sold to an expanded “local market” beyond professional academics.

The Open Library dataset also includes time data that deliver a very different historical perspective from the temporal analysis of periodical publications. Rather than a growing share of English publications, it highlights alternating periods of rise and decline. Counter-intuitively, the 1970-1990 period can be characterized as a strong comeback of multilingualism, with important growth in the humanities and the social sciences and even, to a lesser extent, in STM (science, technology and medicine). Given the bias of the data collection, it is possible that this phenomenon is due to changes in the acquisition policies of North American libraries, which could have become more open to the integration of international works. Otherwise, the trend stands in stark contrast with the tendencies observed for periodical publications in the same period, as leading commercial publishers and indexes had by this point definitively secured the position of English as the default language of science.

Evolution of languages in the Open Library dataset for all editions (top) or discarding editions from the US and the UK (bottom). No license. Credit: Pierre-Carl Langlais.

Trends for periodicals and monographs converge after the 2000s, with a continuous decline of non-English publications in both. The emerging landscape of open access monographs has seen a similar pattern. The complete data of the Directory of Open Access Books show that the share of English-language publications has strongly increased since 2010 and represents more than 80% of new publications.

Machine Translation

Scientific publication was the first major use case of Machine Translation, with early experiments going back to 1954. Developments in this area slowed after 1965, due to the increasing domination of English, the limitations of the computing infrastructure, and the shortcomings of the leading approach, rule-based machine translation. By design, rule-based methods favored translation between a few major languages (English, Russian, French, German…), as a “transfer module” had to be developed for “each pair of languages”, which quickly led to a combinatorial explosion whenever more languages were contemplated.[footnote “Bowker Ciro 2019, p. 41”] After the 1980s, the field of Machine Translation was revived as it underwent a “full-scale paradigm shift”: explicit rules were replaced by statistical and machine learning methods applied to large aligned corpora.[footnote “Ramati Pinchevski 2018, p. 2556”][footnote “Bowker Ciro 2019, p. 41”] By then, most of the demand no longer stemmed from scientific publication but from commercial translation, such as technical and engineering manuals.[footnote “Hutchins 2007”] A second paradigm shift occurred in the 2010s with the development of deep learning methods, which can be partially trained on non-aligned corpora (“zero-shot translation”). Requiring little supervision, deep learning models make it possible to incorporate a wider diversity of languages, but also a wider diversity of linguistic contexts within one language.[footnote “Ramati Pinchevski 2018, p. 2562”] The results are significantly more accurate: after 2018, the automated translation of PubMed abstracts was deemed better than human translation for a few language pairs (like English to Portuguese).[footnote “Soares et al. 2020, p. 53”] Scientific publications are a rather fitting use case for neural-network translation models, since such a model works best “in restricted fields for which it has a lot of training data.”[footnote “Bowker Ciro 2019, p. 45”]
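
The combinatorial growth that constrained rule-based systems can be made concrete with a short count. The sketch below is illustrative only: it assumes one transfer module per ordered (source, target) pair, which is one reading of “each pair of languages” rather than a detail given by the cited source, and the function name `transfer_modules` is hypothetical.

```python
from itertools import permutations


def transfer_modules(languages):
    """List every ordered (source, target) pair of languages.

    Assumption: each direction of translation needs its own transfer module.
    """
    return list(permutations(languages, 2))


# The four major languages named in the text.
major = ["English", "Russian", "French", "German"]
print(len(transfer_modules(major)))  # 4 * 3 = 12 modules

# Under this assumption the count grows quadratically: n * (n - 1).
for n in (4, 8, 16):
    print(n, len(transfer_modules(range(n))))
```

Under this assumption, doubling the number of languages roughly quadruples the number of modules to build and maintain, which is consistent with early systems concentrating on a handful of major languages.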

As of 2021, there were “few in-depth studies on the efficiency of Machine Translation in social science and the humanities” as “most research in translation studies are focused on technical, commercial or law texts”.[footnote “Anglaret Sofio 2021”] Uses of Machine Translation are especially difficult to estimate and ascertain, as freely accessible tools like Google Translate have become ubiquitous: “There is an emerging yet rapidly increasing need for machine translation literacy among members of the scientific research and scholarly communication communities. Yet in spite of this, there are very few resources to help these community members acquire and teach this type of literacy.”[footnote “Bowker Ciro 2019, p. 1”]

In an academic setting, Machine Translation covers a variety of uses. The production of written translations remains constrained by a lack of accuracy and, consequently, of efficiency, as the post-editing of an imperfect translation needs to take less time than human translation.[footnote “Bowker Ciro 2019, p. 26”] Automated translation of foreign-language text in the context of a literature survey, or “information assimilation”, is more widespread, as the quality requirements are generally lower and a global understanding of a text is sufficient.[footnote “Bowker Ciro 2019, p. 32”] The impact of Machine Translation on linguistic diversity in science depends on these uses:

If machine translation for assimilation purposes makes it possible, in principle, for researchers to publish in their own language and still reach a wide audience, then machine translation for dissemination purposes could be seen to favor the opposite and to support the use of a common language for research publication.[footnote “Bowker Ciro 2019, p. 80”]

Increased use of machine translation has raised concerns about “uniform multilingualism”. Research in the field has largely focused on English and a few major European languages: “While we live in a multilingual world, this is paradoxically not taken into account by machine translation”.[footnote “Poibeau 2019, p. 182”] English has frequently been used as a “pivot” language, serving as a hidden intermediary step in translation between two non-English languages.[footnote “Kaplan 2014.”] Probabilistic methods tend to favor the most expected translation from the training corpus and to rule out more unusual alternatives: “A common argument against the statistical methods in translation is that when the algorithm suggests the most probable translation, it eliminates alternative options and makes the language of the text so produced conform to well-documented modes of expression.”[footnote “Ramati Pinchevski 2018, p. 2560”] While deep learning models are able to deal with a wider diversity of language constructs, they can still be limited by the collection bias of the original corpus: “the translation of a word can be affected by the prevailing theories or paradigms in the corpus harvested to train the AI”.[footnote “Anglaret Sofio 2021”]
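
The use of English as a hidden intermediary can be pictured with a small routing sketch. It is hypothetical and does not describe any particular translation system: the function name `translation_route` and the default pivot are assumptions made for illustration.

```python
def translation_route(source, target, pivot="English"):
    """Return the chain of languages a text passes through.

    Assumption: direct systems exist only to and from the pivot language,
    so any other pair is translated in two hops through the pivot.
    """
    if source == target:
        return [source]
    if pivot in (source, target):
        return [source, target]
    return [source, pivot, target]


print(translation_route("French", "Russian"))
# ['French', 'English', 'Russian']
```

Under this assumption, a French to Russian translation is really a French to English translation followed by an English to Russian one, so choices made in the English step carry over to the final text.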

In its 2022 research assessment of Open Science, the Council of the European Union welcomed the “promising developments that have recently emerged in the area of automatic translation” and supported a more widespread use of “semi-automatic translation of scholarly publications within Europe” due to its “major potential in terms of market creation”.[footnote “Council of the European Union 2022, p. 11.”]

Open Science and multilingualism

Open science infrastructures

The development of open science infrastructure, or “community-controlled infrastructure”, has become a major policy issue of the open science movement. In the 2010s, the expansion of commercial scientific infrastructure led to widespread acknowledgment of the fragility of open scholarly publishing and open archives.[footnote “Joseph 2018, p. 1”] The concept of open science infrastructure emerged in 2015 with the publication of the Principles for Open Scholarly Infrastructures. In November 2021, the UNESCO Recommendation acknowledged open science infrastructure as one of the four pillars of open science, along with open scientific knowledge, open engagement of societal actors and open dialog with other knowledge systems, and called for sustained investment and funding: “open science infrastructures are often the result of community-building efforts, which are crucial for their longterm sustainability and therefore should be not-for-profit and guarantee permanent and unrestricted access to all public to the largest extent possible.”[footnote “UNESCO Recommendation on Open Science, 2021, CL/4363”] Examples of open science infrastructure include indexes, publishing platforms, shared databases or computer grids.

Open infrastructures have supported linguistic diversity in science. The leading free software for scientific publishing, Open Journal Systems, is available in 50 languages[footnote “Language Dashboard of the Open Journal System”] and is widespread among non-commercial open access journals.[footnote “OA Diamond Study 2021, p. 93”] A landscape study by SPARC in 2021 shows that European open science infrastructures “provide access to a range of language content of local and international significance.”[footnote “Ficarra et al. 2020, p. 20”] In 2019, leading open science infrastructures endorsed the Helsinki Initiative on Multilingualism in Scholarly Communication and thus committed to “protect national infrastructures for publishing locally relevant research.”[footnote “Helsinki Initiative on Multilingualism in Scholarly Communication”] Signatories include DOAJ, DARIAH, LATINDEX, OpenEdition, OPERAS and SPARC Europe.[footnote “Signatories of the Helsinki Initiative on Multilingualism in Scholarly Communication”]

In contrast with commercial indexes, the Directory of Open Access Journals does not prescribe the use of English. Consequently, only half of the journals indexed are primarily published in English, in stark contrast with the large prevalence of English in commercial indexes like the Web of Science (> 95%). Six languages are represented by more than 500 journals: Spanish (2776 journals, or 19.3%), Portuguese (1917 journals), Indonesian (1329 journals), French (993 journals), Russian (733 journals) and Italian (529 journals).[footnote “OA Diamond Study 2021, p. 42”] Most of the language diversity is due to non-commercial journals (or diamond open access): 25.7% of these publications accept contributions in Spanish vs. only 2.4% of APC-based journals.[footnote “OA Diamond Study 2021, p. 42”] Over the 2020-2022 period, “for English articles in DOAJ journals, 21% are in non-APC journals, but for articles in languages other than English, this percentage is a massive 86%.”[footnote “Kramer Neylon 2022.”]

Non-English open infrastructures have experienced significant growth: in 2022, “national repositories and databases are growing everywhere (see the databases such as Latindex in Latin America, or the new repositories in Asia, China, Russia, India)”.[footnote “Milia, Giralt Arvanitis 2022, p. 13.”] This development opens up new research opportunities for the study of multilingualism in a scientific context: it will become increasingly feasible to study “differences between locally published research in non-English speaking contexts and English-speaking international authors”.[footnote “Milia, Giralt Arvanitis 2022, p. 13.”]

Multilingualism and social impact

Publication on open access platforms has created new incentives for publishing in a local language. In commercial indexes, non-English publications were penalized by the lack of international reception and had a significantly lower impact factor.[footnote “Larivière 2018, p. 350”] Without a paywall, local-language publications can find their own specific audience among a large non-academic public that may be less proficient in English.

In the 2010s, quantitative studies started to highlight the positive impact of local languages on the reuse of open access resources in varied national contexts such as Finland[footnote “Pölönen et al. 2021b.”], Québec[footnote “Cameron-Pesant 2018.”], Croatia[footnote “Stojanovski, Petrak Macan 2009.”] or Mexico. A study of the Finnish platform Journal.fi shows that the audience of Finnish-language articles is significantly more diverse: “in case of the national language publications students (42%) are clearly the largest group, and besides researchers (25%), also private citizens (12%) and other experts (11%)”.[footnote “Pölönen et al. 2021b”] Comparatively, English-language publications attract mostly professional researchers. Due to the ease of access, open science platforms in a local language can also attain a more global reach. The French-Canadian journal consortium Érudit has a mostly international audience, with less than one third of its readers coming from Canada.[footnote “Cameron-Pesant 2018, p. 372.”]

The development of a strong network of open science infrastructures in Latin America (such as SciELO or Redalyc) and the Iberian region has contributed to the resurgence of the Spanish and Portuguese languages in international scientific communication: regional growth “may also be associated with the boom in open access publishing. Both Portuguese and Spanish (as well as Brazil and Spain) play important roles in open access publishing”.[footnote “Liu 2017, p. 121”]

While multilingualism has been either neglected or even discriminated against in commercial databases, it has been valued as a significant component of the social impact of open science platforms and infrastructure. In 2015, Juan Pablo Alperin introduced a systematic measure of social impact that highlighted the relevance of scientific content for local communities: “By looking at a broad range of indicators of impact and reach, far beyond the typical measures of one article citing another, I argue, it is possible to gain a sense of the people that are using Latin American research, thereby opening the door for others to see the ways in which it has touched those individuals and communities.”[footnote “Alperin 2015, p. 4.”] In this context, new indicators for linguistic diversity have been proposed, including the PLOTE-index[footnote “Dahler-Larsen 2018.”] and the Linguistic Diversity Index.[footnote “Linkov et al. 2021.”] Yet, as of 2022, they have had “limited traction in the scholarly anglophone literature”.[footnote “Kramer Neylon 2022.”] Comprehensive indicators for the local impact of research remain largely non-existent: “many aspects of research cannot be measured quantitatively, especially its socio-cultural impact.”[footnote “Irawan et al. 2021, p. 6.”]

Policies in favor of multilingualism

A new scientific and policy debate over linguistic diversity emerged after 2015[footnote “Kulczycki et al. 2020, p. 2.”]: “in recent years, policies for Responsible Research and Innovation (RRI) and Open Science call for increasing access to research, interaction between science and society and public understanding of science”.[footnote “Kulczycki, Engels Pölönen 2022, p. 9.”] It initially stemmed from a wider discussion over the evaluation of open science and the limitations of commercial metrics: in 2015, the Leiden Manifesto issued ten principles to “guide research evaluation” that included a call to “protect excellence in locally relevant research”.[footnote “Hicks et al. 2015”] Building on empirical data showing the persistence of non-English research communities in Europe, Gunnar Sivertsen theorized in 2018 the need for a balanced multilingualism: “to consider all the communication purposes in all different areas of research, and all the languages needed to fulfil these purposes, in a holistic manner without exclusions or priorities.”[footnote “Sivertsen 2018”] In 2016, Sivertsen contributed to the “Norwegian model” of scientific evaluation by proposing a flat hierarchy between a few large international journals and a wide selection of journals that would not discriminate against local publications, and by encouraging journals in the social sciences and the humanities to favor Norwegian publications.[footnote “Sivertsen 2018”]

These local initiatives developed into a new international movement in favor of multilingualism. In 2019, 120 research organizations and several hundred individual researchers co-signed the Helsinki Initiative on Multilingualism in Scholarly Communication. The declaration includes three principles:

  • “Support dissemination of research results for the full benefit of the society”, which implies that they should be available “in a variety of languages”.
  • “Protect national infrastructures for publishing locally relevant research” through specific support for the non-commercial/diamond model: “make sure not-for-profit journals and book publishers have both sufficient resources”. Non-commercial journals are more likely to be published in a local language.[footnote “OA Diamond Study 2021, p. 48”]
  • “Promote language diversity in research assessment, evaluation, and funding systems”, in line with the third recommendation of the Leiden Manifesto.

In the wake of the Helsinki Initiative, multilingualism has been increasingly associated with Open Science. This trend accelerated in the context of the COVID pandemic, which “saw a widespread need for multilingual scholarly communication, not only between researchers, but to enable research to reach decision-makers, professionals and citizens”.[footnote “Pölönen et al. 2021a”] Multilingualism has also re-emerged as a topic of debate beyond the social sciences: in 2022, the Journal of Science Policy and Governance published a “Call to Diversify the Lingua Franca of Academic STEM Communities”, which stressed that “cross-cultural solutions are necessary to prevent critical information from being missed by English-speaking researchers.”[footnote “Henry et al. 2021, p. 1.”]

In November 2021, the UNESCO Recommendation for Open Science included multilingualism at the core of its definition of Open Science: “For the purpose of this Recommendation, open science is defined as an inclusive construct that combines various movements and practices aiming to make multilingual scientific knowledge openly available, accessible and reusable for everyone.”[footnote “UNESCO Recommendation on Open Science, 2021, CL/4363”]

In the early 2020s, the European Union started to officially support language diversity in science, as a continuation of its general policies in favor of multilingualism. In December 2021, an important report of the European Commission on the future of scientific assessment in European countries still overlooked the issue of linguistic diversity: “Multilingualism is the most notable omission”.[footnote “Pölönen et al. 2021a”] In June 2022, the Council of the European Union included a detailed recommendation on “Development of multilingualism for European scholarly publications” in its research assessment of open science. The declaration acknowledges the “important role of multilingualism in the context of science communication with society” and welcomes “initiatives to promote multilingualism, such as the Helsinki initiative on multilingualism in scholarly communication.”[footnote “Council of the European Union 2022, p. 11.”] While the declaration is not binding, it invites experimentation with multilingualism “on a voluntary basis” and an assessment of the need for further action by the end of 2023.[footnote “Council of the European Union 2022, p. 12.”]