‘Wikipedia consensus is that an unedited machine translation, left as a Wikipedia article, is worse than nothing.’
Wikipedia’s founding goal is to make knowledge freely available online in as many languages as possible. To date, this has mostly been in English. Different languages on Wikipedia are called “editions,” and the English edition recently surpassed 6 million articles. Having over a million articles is a feat that only 16 of the 309 editions have accomplished.
The Cebuano Wikipedia is the second-largest edition of Wikipedia, trailing the English edition by just over 630,000 articles and leading the Swedish and German editions by over 1.64 million and 2.98 million articles, respectively. Its ranking is peculiar given that, according to the Encyclopedia Britannica, there are only approximately 16.5 million speakers of the language in the Philippines. Despite having over 5.37 million articles, it has only 6 administrators and 14 active users. The English edition, by comparison, had 1,143 administrators and 137,368 active users for over 6 million articles at the time of writing.
According to research by Motherboard and comments by several global administrators (highly trusted users who specialize in combating vandalism across Wikipedia editions), this is due to the use of bots: automated tools that primarily carry out repetitive and mundane tasks, but can also be used to generate Wikipedia entries. According to a paper published in the journal Proceedings of the ACM on Human-Computer Interaction, there are approximately 1,601 of these bots in existence across Wikipedia editions. While the English Wikipedia and other editions limit these tools to routine maintenance, some editions have taken to using them to write content.
While this may not seem like an issue, when the majority of an edition’s content is written by a single bot, it can negatively impact the quality of the edition. The particular bot writing the Cebuano edition is called “Lsjbot” and was created by the Swedish physicist Sverker Johansson. His creation is responsible for over 24 million of the edition’s 29.5 million edits and, according to research done by Guilherme Morandini, another global administrator, has created 5,331,028 of the edition’s 5,378,570 articles, or 99.12 percent of its article creations. According to that same research, all but five of the edition’s top 35 editors are bots, with no human editors in the top 10. Based on this, Morandini argued that bots have taken over the Cebuano edition from human editors.
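For readers who want to check the arithmetic behind that percentage, here is a minimal sketch in Python. It uses only the two article counts quoted from Morandini’s research; the variable names are ours, chosen for illustration.

```python
# Article counts quoted above from Morandini's research.
bot_created = 5_331_028      # articles created by Lsjbot
total_articles = 5_378_570   # all articles in the Cebuano edition

# Share of article creations attributed to Lsjbot, as a percentage.
share = bot_created / total_articles * 100

# Articles created by anyone (or anything) other than Lsjbot.
not_by_lsjbot = total_articles - bot_created

print(f"Lsjbot share of article creations: {share:.2f}%")
print(f"Articles not created by Lsjbot: {not_by_lsjbot:,}")
```

The share rounds to the 99.12 percent cited, which leaves fewer than 48,000 articles created by every other contributor combined.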
“Bots are the product of people,” Vermont, a long-time global administrator who asked to be referred to by their Wikipedia username, said. “They have not taken over any project; rather, they have simply disincentivized article creation with vast amounts of stub [articles].” Vermont also pointed out that Lsjbot has made “more edits…than there are speakers of Cebuano.”
Riley Huntley, a new global administrator, compiled a sample of 1,000 random articles that Lsjbot created. Of the random subset of these 1,000 articles that Motherboard reviewed, the majority were surprisingly well constructed.
According to Johansson, his bot operates using the following basic principles. To begin, he selects a semantic domain, that is, an area of meaning and the words used to describe it. For instance, the domain “body” would include “foot,” “hand,” “face,” and so on. The next step in the process [ … ]