Wikipedia:Wikipedia Signpost/2026-05-22/Recent research - Wikipedia

A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.

"Wikilambda the ultimate: the Wikimedia foundation’s search for the perfect language"

Reviewed by User:e_mln_e

This paper[1] by Michael Falk (of the WikiHistories project) uses Critical Code Studies methods to examine Wikilambda, the extension of the MediaWiki software that underlies Wikifunctions and Abstract Wikipedia.

"Wikifunctions – Top-level architectural model" (from 2021, by the Wikimedia Foundation, reproduced as figure 1 in the paper)

Wikifunctions, a collaboratively edited library of computer functions, is the newest Wikimedia project, launched in 2023. Abstract Wikipedia, a language-independent version of Wikipedia that the Wikimedia Foundation has been developing since 2020, relies on Wikifunctions and thereby Wikilambda to convert structured data from Wikidata into natural language. In other words, Wikilambda is the programming language that uses Wikifunctions to fetch structured data and facts from Abstract Wikipedia and translate them into written languages. Published in the journal AI & Society, the paper argues that Wikilambda is an attempt to create a 'perfect language'. Comparing it to previous attempts to create perfect languages, the paper suggests that Wikilambda cannot meet its stated goals, and points to assumptions about its potential users that are likely incorrect.

Definitions

What does the author mean by a perfect language? The article refers to Umberto Eco's 1995 book The Search for the Perfect Language, which looks at various attempts in history to create ideal languages. Umberto Eco (1995, 73) distinguishes two kinds of ideal language: the "perfect" and the "universal". As described in the article:

A perfect language is one that is “capable of mirroring the true nature of objects”. Such a language must analyse the world into its constituent parts, and provide means to build it back up again. Each word must correspond to a real component of nature, and each syntactic rule must correspond to a way that nature combines primitive elements into complex entities. A universal language is ideal in a different way: it is a language which everyone might, or ought to, speak. Esperanto is an example among the spoken languages.

Among programming languages, BASIC, Logo, Python and Scratch are examples of languages that are intended to be universally accessible. Umberto Eco's book describes many such projects that have failed in the past, either because language is not easily severed from symbolism, or because the new language demanded a significant learning effort without delivering the advantages of connection it promised. For instance, Esperanto didn't grow to become a lingua franca. Researchers[supp 1] note that:

Despite the logical concept and intellectual appeal of a standard language, Esperanto has not evolved into a dominant worldwide language. Instead, English, with all its idiosyncrasies, is closest to an international lingua franca. Like Zamenhof, standards committees in medical informatics have recognized communication chaos and have tried to establish working models, with mixed results. In some cases, previously shunned proprietary systems have become the standard. A proposed standard, no matter how simple, logical, and well designed, may have difficulty displacing an imperfect but functional "real life" system.

Overall argument

Falk argues Wikilambda is an attempt to create two ideal languages:

The proposed "template language" for Abstract Wikipedia is intended to be both perfect and universal: it will be perfectly able to express any fact, and universally accessible by writers all over the world. To implement this "template language", the Abstract Wikipedia team has gone about developing another perfect and universal language: Wikilambda. This programming language will enable the people of the world to collaborate to build the constructors and renderers that will define and express the sum of human knowledge. According to the Wikilambda developers, Wikilambda is universal because it breaks the hegemony of English; it is perfect because it is not actually a language.

If Wikilambda is indeed an attempt to create ideal languages, it follows that it faces the same risk of failure as the many other such projects documented by Umberto Eco. The article analyzes why.

Article summary

The article opens with a reference to The Signpost's 2023 coverage of an evaluation of WikiLambda, which found the project "at substantial risk of failure".[supp 2] The article includes four sections.

After the introduction in Section 1, Section 2 describes Wikilambda and its relationship to Wikifunctions and Abstract Wikipedia (see above), and how it treats language as a conduit, i.e. that "when we speak or write, we pack 'content' into a sentence, which is then delivered to a speaker or reader who unpacks the content at the other end." Falk argues that language is not reducible in this way, because of our use of metaphors, and different constructs to understand the world. In Section 3, Falk discusses Wikilambda itself.

The main argument for Wikilambda's universality is that it will break the hegemony of English. Most programming languages, observe Wikilambda's creators, use English as a source of vocabulary. JavaScript has objects, functions and if-statements, rather than Objekte, Funktionen and wenn-statements. Since languages like JavaScript use English words, they force budding programmers to "learn English first" before they learn to program, which is unfair ("Wikifunctions:Vision" 2023). To solve this problem, Wikilambda does not use words to denote parts of a computation. Instead, each part of the computation is assigned a Z-number or Z-key in the Wikifunctions database. When a person visits a function in the Wikifunctions interface, they are presented with a translation of these Z-numbers and Z-keys into their preferred language.

Falk notes that Wikilambda developers justify this as preventing the system from reproducing imperialist, Western thinking,[supp 3] which directly contradicts their other beliefs about language as a simple conduit for facts. Further, he points out that because English is the de facto lingua franca, developer communities turn to it to communicate across languages. In Section 4, Falk turns to the function orchestrator, examining "What abstractions have the Wikilambda developers invented to describe their new language? What can these abstractions tell us about the nature and intent of their project?" Falk notes that the first metaphor is that of orchestration:

The orchestrate function takes as its input a piece of Wikilambda code (a ZObject), some configuration settings (invariants) and an ImplementationSelector. Its task is to run the given Wikilambda code, using the ImplementationSelector to choose between available "implementations" in the Wikifunctions database.

It is this ImplementationSelector that most clearly virtualises the "orchestration" metaphor. Normally, a programming language will have just one way of doing each action: one function for addition, one for integer division, one for instantiating an array, and so on. If there are two ways of doing something, it would normally be up to the programmer to decide: perhaps there are two division routines, one that is fast and approximate and one that is slow but exact, and the programmer can select which one is appropriate for their task. The Wikilambda language is different, because there may be many ways of performing each operation, and it is the orchestrator's job rather than the programmer's to choose between them.

He then dives into the specifics of language design, arguing that Wikilambda developers are working to carve out new abstractions so that the language escapes traditional programming metaphors and constructs. He also notes that the language often fails because of its high level of abstraction, and has to fall back on standard programming conventions.
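The orchestration idea quoted above, together with the earlier point about language-neutral identifiers, can be illustrated with a small Python sketch. This is only a toy model under stated assumptions, not the Wikilambda code that Falk analyzes: the identifier "Z90001", the label table, the `Implementation` class and both selector functions are invented for illustration, a plain dictionary stands in for a ZObject, and the "invariants" argument mentioned in the quote is left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical label table: a language-neutral identifier mapped to
# display names per language. "Z90001" is invented for this sketch.
LABELS: Dict[str, Dict[str, str]] = {
    "Z90001": {"en": "divide", "de": "dividieren"},
}


def label_for(zid: str, lang: str) -> str:
    """Return the label in the reader's language, or the bare identifier."""
    return LABELS.get(zid, {}).get(lang, zid)


@dataclass
class Implementation:
    """One of several interchangeable ways to compute the same function."""
    name: str
    run: Callable[[float, float], float]
    exact: bool


# Two implementations of the same (hypothetical) division function,
# mirroring the fast/approximate versus slow/exact example in the quote.
IMPLEMENTATIONS: Dict[str, List[Implementation]] = {
    "Z90001": [
        Implementation("fast_approximate", lambda a, b: round(a / b, 2), False),
        Implementation("slow_exact", lambda a, b: a / b, True),
    ],
}

# A selector picks one implementation from the available candidates.
Selector = Callable[[List[Implementation]], Implementation]


def first_available(candidates: List[Implementation]) -> Implementation:
    """Selector that takes whichever implementation is listed first."""
    return candidates[0]


def prefer_exact(candidates: List[Implementation]) -> Implementation:
    """Selector that prefers an exact implementation when one exists."""
    for candidate in candidates:
        if candidate.exact:
            return candidate
    return candidates[0]


def orchestrate(call: dict, selector: Selector) -> float:
    """Run a function call; the selector, not the caller, picks the code."""
    candidates = IMPLEMENTATIONS[call["function"]]
    chosen = selector(candidates)
    return chosen.run(*call["args"])
```

In this sketch the caller only names the function and its arguments, for example `orchestrate({"function": "Z90001", "args": [1, 3]}, prefer_exact)`. Swapping in `first_available` changes which routine runs (and here also the precision of the result) without any change to the call itself, which is the shift of responsibility from programmer to orchestrator that the quoted passage describes.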

Where does that leave us?

The article is a good introduction to the full Wikilambda project, and a convincing analytical examination of the project's potential failure points. It situates Wikilambda in the history of programming languages, and provides a useful case study of developers' use of metaphors and understanding of language. It also points to contradictions in the project that we should be mindful of. The article concludes with the irony that Wikilambda developers explicitly criticized "One ring to rule them all" approaches,[supp 3] yet implement one such solution. It also highlights the moral commitments made by the team: they make the entire translation process (structured data, functions, interpreter) transparent, contestable and modifiable by humans. "If nothing else, Wikilambda is a thundering critique of corporate AI hype."

See also

Background on Critical Code Studies Methods: https://stunlaw.blogspot.com/2024/12/reflections-on-method-for-critical-code.html

Briefly

See the page of the monthly Wikimedia Research Showcase for videos and slides of past presentations.

The Wikimedia Foundation has published the findings from the 2025 edition of its annual global survey of Wikipedia readers.

Among other results, it reports that "Looking at usage trends across all 11 surveyed Wikipedias from 2024–2025, it's clear that Google and YouTube are again consistently the most-frequently named platforms across survey waves and Wikipedia language editions. However, it is also clear that ChatGPT use for learning and accessing knowledge has grown considerably among Wikipedia readers from 2024–2025, particularly on arwiki, jawiki, kowiki, ptwiki, and ruwiki." Alongside Google and YouTube, ChatGPT also received the highest favorability ratings among these other sources. Other findings concern reader demographics, e.g. gender and age:

Consistent with previous findings from 2023 and 2024, Wikipedia readers skew young overall, although this can vary substantially by project. German Wikipedia readers in particular tend to skew older.

Share of Wikipedia readers identifying solely as men, by project (from the survey; compare also our earlier coverage: "Global Gender Differences in Wikipedia Readership")

Other recent publications

Other recent publications that could not be covered in time for this issue include the items listed below. Contributions, whether reviewing or summarizing newly published research, are always welcome.

Compiled by Tilman Bayer

"Generic Geonyms: Exploring Wikidata for Crosslinguistic Prototypical Semantics"

From the abstract:[2]

"[...] data extracted from Wikidata can be interesting for working on geonyms, classifying nouns in place names (e.g., English alley) and their content similarity across languages, e.g., whether Italian piazza and Chinese guǎng chǎng both express the concept ‘square.’ In this paper we explore the use of Wikidata entries to represent the semantic content of geonyms and compare cross-linguistic representations, and thus Wikidata’s potential as a novel, powerful resource for geo-semantic, cross-linguistic research."