After Wikipedia

In 20 years Wikipedia will be forgotten. It will then be remembered only by some specialist in the Department of Information Archaeology. This latest re-staging of the dream of a Universal Encyclopaedia, created for and by the people, will be shown to have been the same kind of failed grandstanding as the encyclopaedias Wikipedia replaced and helped to bury. In 20 years, Wikipedia will be seen as the same perplexingly fruitless endeavour as the encyclopaedia of D’Alembert and Diderot – its idealism in covering and structuring “all human knowledge” to be adored and applauded, but its realization to be pitied as hampered by the technical and conceptual limitations of its time.

An encyclopaedia is only as good as the technical form that underlies its realization. Its benefits and deficits are bound less to its content, or to the motivation and capabilities of those who create it, than to that form. A technical form has its advantages and its limits, and what is true of an encyclopaedia in the format of tomes of printed books is true as well of an encyclopaedia that is spread out over servers in different countries, accessible from any entry point to the Big Now.

Wikipedia’s basic features were to be a wiki and to collect information provided under CC-/GNU-licenses. Both wiki and licenses made the innovative concept of a “content donation” possible in the first place: people donate a piece of content – be it content proper, a commentary, a fact- or grammar-check, a source, a quote, etc. – that, being under a CC-license, becomes open to combination, alteration, and elaboration.

Whereas the wiki technically allowed for the collection of tiny donations at one “place” (the lemma), the CC-/GNU-licenses legally allowed for these tiny donations to be gathered, re-used, connected, built upon. People could donate bits of content in much the same way they would donate a small sum of money to their beloved charity. The lasting innovation of Jimmy Wales was to take the feature of a donation from the world of charities and non-profits and put it to work in the area of information, its collection and processing. [1]

This approach relies heavily on human beings for content donations. As with all donations, people give more, and on a more regular basis, if they are asked only for tiny contributions and if there are easy ways to make them. Big donations, on the other hand, are rather rare, an exception in charities as in collaborative text production. But whereas ordinary charities manage to acquire money donations on a constant basis, Wikipedia faces a natural limit: at some point in time all the easy edits are done, and the only thing that remains is to donate heavily, i.e., to invest much time and work in a new edit or in the advancement of a given edit or article. In Wikipedia, contributing a content donation thus becomes more difficult over time, whereas in regular charities giving money always stays the same. That’s the main structural reason why Wikipedia loses editors and is not able to recruit new ones, not some alleged “cold” or “harsh” atmosphere, nor gender or regional biases.

Given Wikipedia’s dependence on human beings for input, and given that content donations have to be tiny and easy to give, it’s no wonder that Wikipedia is diversifying its outreach to the “Global South”, to indigenous people, to oral knowledge. True, the motivation is to let more people take part in that D’Alembert / Diderot hallucination (regardless of whether those people “outside” the western hemisphere even want it). But looking a bit closer we find that this serves mainly as an evasion. As most simple edits (i.e., tiny content donations) are already done (i.e., collected) in the “Global North,” Wikipedia needs to find new territories with new donors willing to make new content donations. As success is measured in rates of clicks and in the growth of subject matters covered, diversifying horizontally becomes the option left when diversifying vertically reaches its limit, because the remaining work can no longer be covered by easy edits. [2]

The recourse to and dependence on myriads of tiny content donations under CC-/GNU-licence, structured in articles and provided by human beings, form the technical and structural limits of Wikipedia as a concept and an idea. Today’s Wikipedia can only work along such lines, because that is what its technical format and infrastructure were built to make real: enhancing the retrieval of content donations made by people via a wiki on a specific website. The result is articles that, regardless of quality, are often too long for an encyclopaedia and too short for a monographic essay.

*

In 20 years we will be in a better position to recognize this limit in the approach to collecting, structuring, and referencing encyclopaedic information. This will be a time in which the need for human beings to provide content donations will be long gone. We can see this trend already in the making – be it Wikidata itself, with its aim of readability and writability not only for humans but for machines as well; be it developments in data journalism; be it pushes to automate information gathering and text production (particularly for economic reasons in the news industries). [3]

In a very few years we will see query functionalities that no longer make use of complex articles about subject matters placed on a website.

Answering queries (by humans as well as machines) won’t depend on fixed information chunks as they are offered today in encyclopaedic articles. Articles in Wikipedia are usually too long for an encyclopaedia and too short for a monographic essay. To dump all information into some such article may be a contemporary way of collecting the information, [4] but it is not the way people use it when answering a query – they don’t read the whole article, only the parts that seem relevant to the question at hand. This “break-up” of information into tiny parts that are sufficient to answer the query will be extrapolated and generalized to all treatment of search queries. The very concept of an article, an entry in an encyclopaedia, will become hopelessly obsolete as search and text-production tools automatically deliver the information “asked for” in the query, without the bulk of additional information that an encyclopaedic article usually carries but that doesn’t answer the query at hand.

Information will no longer be stored in and retrieved from stable chunks (such as articles and other text corpora), as search and mining algorithms will be able to sense the point of the query at hand and answer it accordingly. Such algorithms will be able to answer the question by simultaneously strolling through repositories on the web, combining the best pieces from different sources into one retrieval. [5] The vast data repositories at hand to provide the basis for the reply, fashioned to the likes and dislikes of the person asking, will be combed through in a moment. The category of an article or even an essay will become grossly antique.
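To make that “combining the best pieces from different sources into one retrieval” a bit more concrete, here is a minimal sketch in Python. Everything in it – the Repository and Passage classes, the toy term-count relevance score, the answer function – is a hypothetical illustration under my own assumptions, not a description of any existing search engine or of Wikipedia’s own tooling.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    source: str   # repository the snippet came from
    text: str     # the snippet itself
    score: float  # crude relevance of the snippet to the query


class Repository:
    """Stands in for any passage-searchable corpus (a wiki dump, a news archive, ...)."""

    def __init__(self, name, passages):
        self.name = name
        self.passages = passages

    def search(self, query, limit=3):
        # Toy relevance: how many query terms the passage contains.
        terms = query.lower().split()
        scored = [
            Passage(self.name, p, float(sum(t in p.lower() for t in terms)))
            for p in self.passages
        ]
        return sorted(scored, key=lambda p: p.score, reverse=True)[:limit]


def answer(query, repositories, limit=3):
    """Comb several repositories at once and stitch the best snippets from
    different sources into one reply -- no single "article" is involved."""
    candidates = [p for repo in repositories for p in repo.search(query, limit)]
    best = sorted(candidates, key=lambda p: p.score, reverse=True)[:limit]
    return " ".join(f"{p.text} [{p.source}]" for p in best if p.score > 0)


if __name__ == "__main__":
    repos = [
        Repository("encyclopaedia", ["Diderot and D'Alembert edited the Encyclopédie.",
                                     "A wiki collects many small edits."]),
        Repository("news-archive", ["Automated text production is spreading in the news industries."]),
    ]
    print(answer("automated text production in the news", repos))
```

A real system would of course rank with something better than term counts; but the structural point survives the simplification: no single article is retrieved, only snippets stitched together per query.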

For this approach, the very concept of a website with some text from which to retrieve answers to a question will be useless. And the main reason is that it doesn’t even work today. Or put differently: it works, but only insofar as the retrieval of information in search of an answer to a question is mainly done by the human person who makes the query, not by the machine.

When we try to find something out, we usually form some more or less precise question (open to refinement) that we hope to answer by thinking about the best search phrases or even questions we might put into the search box of the engine or the online encyclopaedia. To answer our question we phrase keywords that lead us to articles and text blocks that, we hope, may contain some information that answers our question. In fact, we “translate” our question into clusters of keywords that we put into the search box, which delivers results in the form of text blocks from which we then try to abstract (or extract) something like an answer to our question. This extraction of an answer usually does not happen just by reading an article or two. Usually, when approaching a text, we zoom in and zoom out, becoming absorbed in details here and ignoring others there. We find points of interest, feel that we don’t know enough yet, or find ourselves going astray. We read some portions more intensely than others, and all this to and fro happens in one article or between various ones. Zooming in and out is our way to gather information, and the prerequisite for that are texts and articles that are general enough to allow for multiple readings. Such texts are extensive enough to be used just partially, be it by one person or by many. For that, those texts usually provide more information than one person would or could use on one or more occasions.

Wikipedia articles share the same characteristic. They are written to provide the means for partial information gathering, done by zooming in and out. These zoomings in and out are performed by the person who looks up the article, skims over the text, and stops reading when the “feel” of the question being answered sets in. Today it’s the human being that makes these assessments, and the text corpus is provided to serve as a basis for such a multitude of different angles of zooming in and out. But in the near future, this “zooming” will be part of the automated search query, done by algorithms that will produce just the amount of text that, for the moment, seems sufficient to answer the query at hand – the question, its angle, its depth of zoom. [6]
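As a minimal sketch of what “zooming” as part of the query could mean, assume a hypothetical helper that already holds snippets ranked for a given question; a depth parameter then decides how much of them the reader ever sees. The names and cut-off values are illustrative only.

```python
def zoomed_answer(ranked_snippets, depth=1):
    """Return just enough text for the requested depth of zoom:
    depth 1 -> the single best snippet, depth 2 -> the top three, and so on.
    The cut-off is made by the tool, not by a reader skimming an article."""
    cutoff = {1: 1, 2: 3, 3: 10}.get(depth, 1)
    return " ".join(ranked_snippets[:cutoff])


snippets = [
    "Wikipedia is a freely licensed online encyclopaedia.",
    "It is written collaboratively by volunteers who donate small edits.",
    "Its articles gather those donations under a lemma.",
]
print(zoomed_answer(snippets, depth=1))  # terse first reply
print(zoomed_answer(snippets, depth=2))  # "elaborated" reply, as in note [6]
```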

So what is changing is this: the need for encyclopaedic articles will vanish as algorithms take on the task of zooming in and out. [7]

In the coming years information will be so abundant and at the same time so constantly in flux – due to its topicality and to the way people frame questions – that Wikipedia as a “place” for storage and certification of quality will no longer be of any use. With developments in data mining, automated text production, and search algorithms that are capable of directly processing the question asked, Wikipedia (and Wikidata, of course) may offer some of the repositories from which automated text-production tools retrieve information to produce an answer. But those tools will neither use only Wikipedia / Wikidata material nor feature Wikipedia / Wikidata articles, in whole or in part, in the machine-produced answers. As myriads of repositories will be at hand, the method of using content donations to accumulate material under a lemma will simply be as outdated and labour-intensive as the bygone annual print updates of the Encyclopaedia Britannica. There will no longer be a demand for such ways of producing updates of encyclopaedic information.

In the end, the concept of a “place” where information resides will be obsolete. As automated text production and data mining take off, human beings will no longer be indispensable as sources of information either. Content donation, the charity flair that motivated so many people to engage in Wikipedia in the first place, will be meaningless, as all the content will already be around, readable by the search algorithms and reproduced by automated text-producing tools.

In former times the printed book, as the material carrier of information, by its very nature preserved a specific state of that information. In a similar way, Wikipedia, with its ongoing editor discussions, is far too slow for the speed of change information will acquire. Ironically, Wikipedia shares very much the same problems its advocates ridicule in the Old Media: veracity as place- and author-bound; the hierarchisation of sources; the problem of knowledge vs. belief expressed as problems of authority; problems in enhancing quality.

Wikipedia will soon share the fate of the Old Media, and for very much the same reasons. It relies on concepts and procedures for information gathering and structuring that belong to the Old Ages. In 3 to 5 years automated content aggregation and text production will be all around. In 20 years, no one will even remember that there once was a shift. Everything will seem as if it had always been that way. Because this is how older things look in retrospect: the same as now, just more primitive.

 

[1]  I credit Jimmy Wales for finding a way to make “content donations” on a grand scale possible. This is different from the conceptual and methodological innovations around Creative Commons, open access, free software etc. that stem from and enshrine the collaborative mood of the hacker scenes that go back to the 1980s.
[2] Austin Gibbons, David Vetrano, and Susan Biancani reach a similar conclusion in “Wikipedia: Nowhere to Grow”.
[3] See the discussion around the “Journatic byline scandal”: Matthew Ingram, “The uncomfortable truth behind the Journatic byline scandal”, in: GigaOm, July 4, 2012, 9:07 AM PT. | Matthew Ingram, “Journatic CEO: We are creating a better future for journalism”, in: GigaOm, April 27, 2012, 10:48 AM PT. | Matthew Ingram, “Are robots and content farms the future of the news?”, in: GigaOm, April 25, 2012, 2:17 PM PT. | Steven Levy, “Can an Algorithm Write a Better News Story Than a Human Reporter?”, in: Gadget Lab (Wired), April 24, 2012, 4:46.
[4] As it primarily adds content donations to existing ones and receives fewer donations with regard to editing, Wikipedia in fact mirrors the enumerating and adding process Isidore of Seville used in his Etymologiae (and that was in the 6th and 7th centuries AD). Entries thus vary in quality, completeness, and balance.
[5] Take Google’s current research on knowledge graphs as one example. That Google is a co-sponsor of Wikidata has a slightly juicy air: Wikimedia develops a tool under CC-licence that Google may later use in revenue-generating search functionalities that enhance its market monopoly.
[6] This automation of zooming in queries will be greeted by many with great relief. As we all suffer from an overload not only of information in general but of text skimmed in order to find the relevant information in particular, every tool that helps us ignore information will be seen as great progress, particularly when the tool can “elaborate” on the answer it gave at first by showing more details (“zooming” as well).
[7] Not that I cherish these developments. In fact, I think they are contrary to how we should deal with questions, answers, aspects, ambiguity, knowledge, error, etc. But the economic realities of the web, the prevalent belief that information is some kind of building block in the “Logical Structure of the World” (Rudolf Carnap), the data fetishism, and the worship of the literal will combine to accelerate those tendencies in the coming years.

 
