Sunday 8 June 2014


Collective Memory in Wikipedia




SIMON DeDEO
Indiana University
Santa Fe Institute 

VIDEO




OVERVIEW: In an analysis of a range of social systems, from online collaboration in Wikipedia to revolutionary activity in the Arab Spring, we find a common structure to social reasoning that crucially involves the formation of long-term memories and dispositions. No individual member serves as the system memory or reasoner; these dispositions are, instead, collective states of the group as a whole. The underlying computational structure appears to make use of at least one (formally) unbounded resource. We provide a game-theoretic account of group-level strategies based on a simple belief-formation mechanism, and show the challenges that arise in connecting these group-level phenomena to the beliefs and desires of the underlying individuals.

READINGS:
    DeDeo, S. (2013). Collective Phenomena and Non-Finite State Computation in a Human Social System. PLoS ONE, 8(10), e75818.
    Klingenstein, S., Hitchcock, T., & DeDeo, S. (2014). The civilizing process in London's Old Bailey. Proceedings of the National Academy of Sciences.
    Hooper, P. L., DeDeo, S., Caldwell Hooper, A. E., Gurven, M., & Kaplan, H. S. (2013). Dynamical Structure of a Traditional Amazonian Social Network. Entropy, 15(11), 4933-4955.
    DeDeo, S. (2014). Group Minds and the Case of Wikipedia.


32 comments:

  1. Wikipedia analyzed in this way, WOW!! Terrific presentation. Thank you, Simon DeDeo. Just one question: I would like to know more about the relation between human beliefs and desires. Could you please share some links about it?

    Replies
    1. Many—but far from all—accounts of human behavior make reference to "Beliefs" and "Desires". As a simple example, our "folk psychology"—the way we naturally talk about each other—says things like, "I want food [desire], I think there's food in the kitchen [belief], I want to go there [desire]..."

      Mathematical theories do better with beliefs than desires; game theorists, e.g., work with utility (https://en.wikipedia.org/wiki/Utility) as a way to quantify preferences (desiring X more than Y), but this leaves a lot of complexity out.

  2. So, if I understand correctly, the more polemical and popular a topic is, the more neutral the Wikipedia article will end up being? That's a very interesting effect: collective controversy somehow turning into a neutral outcome. But how hard is it to keep articles neutral? How often does an edit, undo, re-edit, undo loop happen? And how is this avoided or identified?

    Replies
    1. I think my question is currently being answered during the talk. :)

    2. "Neutrality" is a difficult concept; a computer (or a scientist) can't look at an article and say "this is neutral"—that's a judgement people in the system have to make. We can't track it!

      Meanwhile, we see a broad range of timescales, from seconds to years, for cooperative streaks! See http://arxiv.org/abs/1407.2210

  3. Dear Simon, what a fascinating presentation! One question: since Wikipedia is the biggest knowledge base available, what's the most important technical and/or theoretical challenge for game theory to automatically reason on Wikipedia as an ontology, and to predict how social behaviors in Wikipedia contribute to improving the accessibility of knowledge and science?

    Replies
    1. I don't have deep insight into the ontology question, or into the direct problem of accessibility and Wikipedia. But let me answer more broadly.

      We have a social process, say, but we might want it to *do* something (and hopefully: do something good). My sense is that cooperation is a necessary, but not sufficient, criterion for a system to produce good outcomes.

      Indeed, we should be careful not to link these things too closely: many social systems are highly adversarial, but we believe they have good outcomes. Examples include the Common Law legal system that the British exported to the US and Canada (excepting Quebec, of course), peer review for academics, Oxford-style debating clubs. My sense is that these are also cooperative, meaning that they have agreement on a deeper set of norms, but they certainly appear pretty adversarial to an outsider!

      Conversely, cooperation is also a feature of organized crime, of political systems we now commonly believe are profoundly unfair or immoral. You can't maintain a hereditary aristocracy without a great deal of cooperation (in the game theory sense).

  4. Interesting talk by Professor DeDeo. I was particularly interested in the discussion of simple, tractable mechanisms in undergraduates and parakeets (i.e., keeping track of those I peck and who they peck) leading to complex social structures with hierarchical organization over a short amount of time, and all this without the individual agents explicitly aiming at a definite hierarchical structure. My question for Professor DeDeo is how well he thinks this result generalizes to social phenomena in general. I would also like to know the extent to which he thinks we can explain group-level social facts as emerging from simpler dynamics of this kind.

    Replies
    1. We think it generalizes a great deal—beyond systems like Wikipedia (already large, tens of thousands) to cultural and political systems on even longer timescales and sizes.

    2. Thank you! And thanks again for the great talk.

  5. I have a question about the recursive formula for the social fact and prestige, Ac = BR(BR(Ac)): I would like to know the "stop" criterion for this formula (when do we stop seeking prestige, for example)?

    Replies
    1. This comment has been removed by the author.

    2. When we gave the "PageRank" or "Eigenvector Centrality" part of the talk, we were implicitly talking about the long-time limit of the process. If you remember the analogy about passing around the trophy, "prestige of group X" corresponds to "the probability of group X holding the trophy" if I wait "a long time".

      This is just a model, of course; we find the fixed point by reference to the infinite-time limit of a process. It doesn't necessarily describe what actually happens; only that the system is in a stationary, stable state.

      As John Maynard Keynes said, "in the long run, we are all dead"—it is likely that we learn and react to prestige on short timescales, and estimate it using limited data. How limited? We don't know—how quickly (for example) do high school students figure out who is "cool"? In the parakeets case, it takes about a week.

      It's important to remember that the extent to which these sorts of approximations match the reality of prestige is an empirical question.
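
      The trophy-passing analogy can be made concrete with a short sketch. The three groups and the passing probabilities below are invented purely for illustration (they are not taken from the talk or the papers). The code applies the passing rule over and over and stops when the probability of each group holding the trophy no longer changes; that fixed point plays the role of "prestige" in the sense described above.

```python
# Hypothetical passing rule: pass_prob[a][b] is the probability that group a
# hands the trophy to group b on the next step. These numbers are made up.
pass_prob = {
    "X": {"X": 0.5, "Y": 0.3, "Z": 0.2},
    "Y": {"X": 0.6, "Y": 0.2, "Z": 0.2},
    "Z": {"X": 0.7, "Y": 0.2, "Z": 0.1},
}


def step(holding):
    """One round of passing: new probability that each group holds the trophy."""
    groups = list(pass_prob)
    return {
        b: sum(holding[a] * pass_prob[a][b] for a in groups)
        for b in groups
    }


def prestige(tol=1e-12, max_steps=10_000):
    """Iterate from a uniform start until the distribution stops changing."""
    groups = list(pass_prob)
    holding = {g: 1.0 / len(groups) for g in groups}
    for _ in range(max_steps):
        updated = step(holding)
        if max(abs(updated[g] - holding[g]) for g in groups) < tol:
            return updated
        holding = updated
    return holding


stationary = prestige()
for group, p in sorted(stationary.items(), key=lambda kv: -kv[1]):
    print(f"{group}: {p:.3f}")
```

      In this toy setup the "stop" criterion is simply a numerical tolerance. How many rounds real participants need before their estimates settle is, as noted above, an empirical question.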

  6. Simon deDeo said « probability of revert (non-cooperation) declines as the square-root of the number of cooperative steps ». Could we say that this is due to some tendency to imitate others?
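
    One way to read the quoted statement, assuming it means that the revert probability after k consecutive cooperative steps scales as 1/sqrt(k), is sketched below. The functional form, the starting probability `p_first`, and the function names are illustrative assumptions, not values taken from the talk or the paper.

```python
import math


def revert_probability(k, p_first=0.5):
    """Assumed form: chance the next step is a revert after k cooperative steps."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return p_first / math.sqrt(k)


def streak_survival(n, p_first=0.5):
    """Probability that a cooperative streak of length 1 grows to length n."""
    prob = 1.0
    for k in range(1, n):
        prob *= 1.0 - revert_probability(k, p_first)
    return prob


for k in (1, 4, 16, 64):
    print(k, round(revert_probability(k), 4), round(streak_survival(k), 4))
```

    Under this assumed form, quadrupling the length of a cooperative streak halves the chance that the next step is a revert.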

    Replies
    1. I think he said it was rather to be understood in terms of "payoff" expectation, the payoff being that your contribution actually contributes to the article you're trying to contribute to. But indeed, he might want to control for the possibility that it's rather imitation. (Did he?)

    2. That said, I struggle to understand this part.

    3. This comment has been removed by the author.

    4. This comment has been removed by the author.

    5. Argh! I keep putting replies in the wrong places.

      Ms. Sève has a nice suggestion here: that part of what is driving the overall system towards cooperative behavior is an imitation effect. Put another way, "my desire to cooperate increases when I see others cooperating". This mechanism would naturally lead to a shift from the Prisoner's Dilemma structure towards a mutualistic/norm-conformant one.

      It's a nice suggestion that doesn't come naturally to game theorists, because it doesn't specify why it's "in my interests to imitate"! But tendencies to imitate others are widely observed in laboratory experiments (and in our informal observations of our social worlds).

  7. Ok, here's where I struggled.

    If I understood correctly, there were people who interpreted Wikipedia editing as a prisoner's dilemma, and those who didn't and simply collaborated. Yet I feel that you could do both: indeed, if you adopt the tit-for-tat strategy (the most popular and most effective in the multi-round prisoner's dilemma), you won't need to defect if you find someone who adopts the same strategy.

    So, what exactly are you talking about when you're talking about people who interpret the situation as a prisoner's dilemma, and those who don't?

    Replies
    1. Terrific question, Mr. Chartrand. We model the system as a stage game without memory; what we then find is the need to refer to a social-level "memory effect", where repeated cooperation drives the system towards a mutualistic (as opposed to antagonistic) state.

      As you note, Prisoner's Dilemma games can be "solved" in a mutualistic fashion if we allow reputation effects to play a role (tit-for-tat being a very simple example of how reputation matters—if I defect against you, I have a bad reputation that lasts one round).

      Once you allow memory into the system, and give people beliefs about (the possibility of) strategies like tit-for-tat, then the effective payoff matrix changes a great deal.
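
      To illustrate the point about tit-for-tat and one-round reputation, here is a minimal sketch of a repeated two-player Prisoner's Dilemma. The payoff values are the conventional textbook ones and are assumed here for illustration only; the function names are likewise hypothetical and do not come from the Wikipedia analysis.

```python
# Payoffs for (my move, your move); "C" = cooperate, "D" = defect.
# Conventional textbook values, assumed here for illustration.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's previous move."""
    return opponent_history[-1] if opponent_history else "C"


def always_defect(opponent_history):
    """Defect regardless of what the opponent has done."""
    return "D"


def play(strategy_a, strategy_b, rounds=10):
    """Return total payoffs for two strategies over repeated rounds."""
    moves_a, moves_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strategy_a(moves_b)  # each player sees only the other's past moves
        b = strategy_b(moves_a)
        score_a += PAYOFF[(a, b)]
        score_b += PAYOFF[(b, a)]
        moves_a.append(a)
        moves_b.append(b)
    return score_a, score_b


print(play(tit_for_tat, tit_for_tat))
print(play(tit_for_tat, always_defect))
print(play(always_defect, always_defect))
```

      Two tit-for-tat players never defect against each other, so each ends up better off than a pair of unconditional defectors, even though the single-round payoffs are unchanged. This is the sense in which memory changes the effective game.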

      So you can see what we're doing as detecting a system-wide memory that has some similarity with the individual-level memories that are so important for tit-for-tat. At the system level, we have more and more individuals that believe that they're better off playing "as if" a solution such as tit-for-tat has been discovered.

      So (to examine the question from a different angle): why don't we start with tit-for-tat? My sense early on was that cooperation in Wikipedia was not a pairwise, or person-to-person, problem; what we tended to see were many people interacting with each other at the same time (see, e.g., Fig. 3 of http://arxiv.org/pdf/1407.2210v1.pdf which shows the distribution of the number of people involved in cooperative runs of different lengths).

      So rather than try to build a multiplayer tit-for-tat, I decided to look for the effects of those kinds of memories (and resulting strategies) on system behavior.

  8. I hope these responses were helpful; do note that these are off the cuff at the end of a lovely day. When in doubt, students and colleagues should refer to our published work on these questions!

    Thank you for your questions and for coming to the talk; it was a great deal of fun trying to put this together for Stevan's meeting.

  9. To link some of the ideas from the beginning of your talk, I was wondering if there are any effects due to "prestige" in the editing of Wikipedia pages. For example, if a user has more "prestige" in the eyes of other users (perhaps because they're prolific contributors), are their changes less likely to be reverted by others? Or if a user, A, has their contribution reverted by another user, B, will a third user C (who observed that B reverted A's contribution) be more likely to revert user A's subsequent contributions?

    Replies
    1. Informally, prestige and reputation both appear to play a role in editor-editor interactions. Records of behavior are easy to access, and users have many different ways to signal outcomes of prior interactions; names are usually pseudonyms, and many users (most, under some ways of counting) choose to retain the same pseudonym for long periods.

      We are keen to look at prestige effects in Wikipedia. They are harder to study there, because many individuals interact only rarely, if at all. This is distinct from the cases we described in the talk, where the undergraduates were asked to rate everyone, and the parakeets were in a small group of about 20 members and, even if they didn't aggress against everyone, certainly had the potential to. Meanwhile, some users might appear on a page only very briefly before disappearing forever.

  10. I have a few questions:

    1) How does the content of a Wikipedia page compare to the most recent high-profile academic review on the same topic? Is there a formal system in place to evaluate the accuracy of a Wikipedia entry?

    2) Are Wikipedia users familiar with each other? If user X constantly reverts changes, do other editors notice user X’s behaviour and in return revert his edits? Does it go the opposite way too: if a user edits and never reverts, are other editors more likely to edit his posts?

    3) How many of the edits are just partial reverts? If someone writes 4 new sentences and someone else deletes 3 of them, is this considered editing and not reverting? Nonetheless, that square root rule still applies.

    4) I observe more and more people referring to Wikipedia as a factual all-knowing source. Are issues developing from a sort of singularity of knowledge? Or, do debates between Wikipedia editors resemble anything like academic debates (they probably occur much faster, as it doesn’t take so long to publish)? Do controversial entries more often present both sides of an argument, or do they get watered down and emotionless so that conflict remains low?

    Replies
    1. 1) I have difficulty with these kinds of questions, because they are more deeply "normative" than the observational studies I usually think about. Asking whether an article is accurate is quite distinct from studying the ways in which users come to alignment on endogenous norms within the system. It's an excellent question, of course, but one we should clearly distinguish.

      2) Informally, some users have reputations that are well-known; see my reply to Ms. Telids above for some of the issues that come up when considering these effects in Wikipedia.

      3) Tracking partial reverts is more than possible. The norm on Wikipedia treats full reverts differently than partial reverts—strange as it sounds, taking out 3/4ths of what you write is considered different from taking out all of what you write. But this gray area, where I'm being profoundly uncharitable as an editor without crossing the line, is something that Stevan mentioned in the lecture, and it's a good topic.

      4) Accuracy is a difficult topic, made even more difficult by the conflict between how Wikipedia runs (constant, pseudonymous updates) and how we are used to evaluating knowledge claims (fixed document that experts argue over). A famous article in 2005 studied Wikipedia by sending out printouts to academics for peer review; this is commonly cited, though note that it was a journalistic project, and not an academic/peer reviewed work itself.

      You mention the role of emotions in conflict; this is a great question. It would be interesting to use sentiment analysis on the encyclopedia and track the evolution of emotion over time, particularly as correlated with revert behavior in the edit history. Drop me a line if you are interested in pursuing this and would like to talk!
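
      Returning to point 3), the difference between a full and a partial revert can be sketched in a few lines. This is a simplified illustration, not the method used in the published work: it assumes a full revert is an edit that restores the exact text of an earlier revision (detected by hashing), and it measures a partial revert crudely, as the share of newly added lines that the next edit removes. The function names and the line-based comparison are assumptions made for the example.

```python
import hashlib


def fingerprint(text):
    """Hash of a revision's full text, used to spot exact restorations."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def classify_edit(history, new_text):
    """Label new_text relative to earlier revisions (oldest first).

    Returns ("full_revert", 1.0) if new_text exactly restores a revision from
    before the most recent edit, otherwise ("edit", fraction), where fraction
    is the share of lines added by the most recent edit that new_text removes.
    """
    if len(history) < 2:
        return ("edit", 0.0)
    earlier = {fingerprint(t) for t in history[:-1]}
    if fingerprint(new_text) in earlier:
        return ("full_revert", 1.0)
    before = set(history[-2].splitlines())
    added = set(history[-1].splitlines()) - before
    if not added:
        return ("edit", 0.0)
    removed = added - set(new_text.splitlines())
    return ("edit", len(removed) / len(added))


base = "Intro."
expanded = "Intro.\nSentence 1\nSentence 2\nSentence 3\nSentence 4"
trimmed = "Intro.\nSentence 1"

print(classify_edit([base, expanded], trimmed))
print(classify_edit([base, expanded], base))
```

      In the toy history, deleting three of four new sentences is labeled an ordinary edit with a removal fraction of 0.75, while restoring the original text is labeled a full revert, which mirrors the gray area described in 3).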

    2. Thanks for the quick reply.

      Tracking the evolution of emotional and opinionated text in Wikipedia articles is a very interesting research prospect. I wonder how the evolution of sentiment in Wikipedia compares to that in successive editions of standard encyclopedias or peer-reviewed academic reviews.

      For now, I'll leave these projects open for someone else. Nonetheless, thanks for offering to help me pursue these ideas!

  11. Thanks for an outstanding talk! As we already started to discuss after the talk, I am interested in the reflexive effects of coarse graining as applied to social media and society at large. Considering social dynamics in complex adaptive systems, we have a certain state space that the particular trajectories of agents can explore. When we greatly reduce the dimensionality of the state space, we do lose the fine-grained effects, with the possible benefit of gaining important insights about systemic patterns. However, social media is a living laboratory, and our models do affect the behavior of individual agents. We cannot assume here a position where our observations, hypotheses and theories do not significantly affect the systems we observe. This of course raises quite a few questions about the complex relations between this kind of science and society.

    Interestingly, in the description of how Wikipedia editors treat their interactions, it was made clear that their beliefs are highly significant to the dynamics of the whole system. Once they become aware of these beliefs, their strategy may, with high probability, change, with unpredictable consequences for the overall dynamics. I am interested to hear more of your thoughts on the issue.

    Replies
    1. This is a lovely question, and one somewhat at the border between science and more normative or ethical questions.

      To be clear, we are still building evidence towards the role that coarse-grained concepts play in social systems. While it seems intuitively plausible that these simplifications affect how we interact with others (and this has been a theme of the qualitative literature for many years), in my opinion we do not have the quantitative smoking gun that we desire.

      We can show the existence of the information and the use of this information, but we can't rule out that it might, for example, be supplemented by more fine-grained information in the right way.

      Your second question is another lovely topic. We looked at the stock market in a similar fashion—in particular, we looked at the phenomenon of "rallies" in stocks such as IBM and Apple that have long-term stability. These seem to be completely without pattern: as you would expect when there is a great deal of money on the line and many competitive predictors.

  12. Really excellent and informative talk. I was hoping I could watch your lecture again before I posted, but the video doesn't seem to be online.

    I'm curious whether you think coarse graining in sociology may fall toward oversimplification. In physics, I see how such aggregations can be immensely helpful. Particles are not self-actualizing, motivated, subjective, feeling beings; for the most part, they simply follow the laws of nature. When we abstract "social facts" from individual actions and then conjecture various individual mental states of self-actualizing, motivated, subjective, feeling beings, we leave room for assumption and personal bias. Obviously, approximation can be helpful, but I think when it comes to people, there is a big danger in being too prescriptive; at best, we conclude a social fact which is true for the majority (but may further isolate communities that already feel marginalized). What do you think?

    I'm not sure how familiar you are with anti-oppression (1, 2). It is a tool for viewing the world that emphasizes individual identity, oppression, and privilege. I am curious if you think societal coarse graining counteracts such a framework or if the two can be compatible.

    Replies
    1. "Approximations can be helpful": this is true. Indeed, I would go further, and say that approximations are necessary. If we tried to know our social worlds at their most fine-grained level, we would likely be incapable of action. Even worse, social facts would change before we got around to learning them all.

      The question of how we choose to construct those approximations, however, is critical. Both you and Rob Goldstone have brought this up separately. Informally, the city of Montreal provides an excellent example. The particular way we choose to "summarize" Montreal will make various features more or less salient: we might describe it, for example, using categories that wash out some aspects of the culture compared to others.

      This goes back to an old remark from Plato's Phaedrus: we want to "carve nature at the joints". The problem is doubly hard in the case of social systems, because we are not only figuring out the "best" way to coarse-grain the system, but also what the participants in the system itself are using.

      These can be distinct; indeed, a lot of social critique is directed at pointing out what an author thinks is the wrong coarse-graining. A classic example is the (presumably caricatured) view of Marx as suggesting that individuals should see each other in terms of social class, as opposed to nationality.

      As scientists, we're interested in figuring out both "what good descriptions are", and "what the descriptions are that people use"; the relationship between the two is of course itself a fascinating topic.

      Meanwhile, I think you are right to point out the relationship between these (to me) fascinating scientific questions, and ethical questions about how we should (as opposed to actually do) relate to others.

      Speaking informally, thinking about social systems in terms of coarse-graining gives us a new perspective on how our beliefs about others (and ourselves) create our social worlds. And that has to have ethical implications.

      Thinking about those implications carefully—while, I think, keeping in mind that we *necessarily* coarse-grain the world, we can't have a totally one-to-one relationship with it—is without question an important thing to do. I should emphasise a second time that when I'm talking about ethics here, I am very much speaking informally.

  13. The time-dependent relationship between revision and collaboration is interesting. I wonder if it applies in other circumstances, like social networking for example. Perhaps a more nuanced examination of the negative and positive emotion Facebook finding would reveal a similar underlying mechanism with negativity being like a revert and positivity like collaboration.
