Sunday 8 June 2014

Open Memetrics: 
Monitoring, Measuring and Mapping Memes 

STEVAN HARNAD
UQAM & U Southampton

VIDEO


Closing Overview and Discussion of Summer Institute 


Overview: Memes are the practices and products that we copy from one another and pass on from generation to generation. Memes began with analog mimicry, but with language they became digital. Natural language is a code that subsumes all other codes (maths, logic, programming languages). With language you can say, and understand, and convey anything that can be said. The first arbitrarily shaped word spoken (or, more likely, gestured) with the intention to convey a true/false proposition was the first digital meme. Words are almost all the names of categories (things we need to know what to do with). Language evolved as a means of sharing our categories through verbal instruction instead of having to learn them through slow, risky trial-and-error induction from direct experience. The Web has now become our “Cognitive Commons” — the global repository of our digital memes, and the means of monitoring, measuring, mapping and maximizing our categories: the quotidian ones as well as the scholarly and scientific ones. How many words were needed to “initialize” this whole process? Why were we so quick to use the Web for chatter and commerce, but slower to use it to share research findings? And can cognizing — the felt mental state of thinking — be extended beyond individual cognizing minds to collective and even “global” minds?
READINGS
Blondin-Massé, A., Harnad, S., Picard, O. & St-Louis, B. (2013) Symbol Grounding and the Origin of Language: From Show to Tell. In: Lefebvre, C, Comrie, B & Cohen, H (Eds.) Current Perspectives on the Origins of Language. John Benjamins. http://eprints.ecs.soton.ac.uk/21438/

Harnad, Stevan (2013) The Postgutenberg Open Access Journal (revised). In: Cope, B & Phillips, A (Eds.) The Future of the Academic Journal (2nd edition). Chandos. http://eprints.soton.ac.uk/353991/

33 comments:

  1. Would it be correct to say that the application of categories, or categorical perception, is equivalent to selecting what is relevant out of the incoming flow of information? (This also includes selecting the relevant actions given a perceived situation.)

    Replies
    1. The most likely reason that within-category differences become compressed and between-category differences become enhanced by (difficult) category-learning is that in learning the category we learn to ignore the irrelevant properties and focus on the ones that distinguish the members from the non-members. This acts like a filter in a multi-dimensional property space, suppressing the irrelevant dimensions, thereby changing the distances between members and non-members in this lower-dimensional space.

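A toy numerical sketch of the filtering idea in the reply above. The data, the "learned" attention weights, and the weighting scheme are illustrative assumptions, not a model from the talk; the point is only that down-weighting the irrelevant dimensions shrinks within-category distances far more than between-category distances, so the categories separate.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 20

    # Toy items in a 3-dimensional property space. Only dimension 0
    # distinguishes members from non-members; dimensions 1 and 2 vary freely.
    members     = np.column_stack([rng.normal(0.0, 0.05, n), rng.uniform(0, 1, (n, 2))])
    non_members = np.column_stack([rng.normal(1.0, 0.05, n), rng.uniform(0, 1, (n, 2))])

    def mean_dist(a, b, w):
        """Mean weighted Euclidean distance over all pairs drawn from a and b."""
        diffs = a[:, None, :] - b[None, :, :]
        return np.sqrt(((w * diffs) ** 2).sum(-1)).mean()

    before = np.ones(3)                 # before learning: all dimensions count equally
    after  = np.array([1.0, 0.1, 0.1])  # after learning: irrelevant dimensions suppressed

    for label, w in [("before learning", before), ("after learning", after)]:
        within  = (mean_dist(members, members, w) + mean_dist(non_members, non_members, w)) / 2
        between = mean_dist(members, non_members, w)
        print(f"{label}: within = {within:.3f}, between = {between:.3f}, ratio = {between / within:.1f}")
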
  2. Dear Stevan, thank you very much for your concluding presentation and for organizing this very rich event! I have three questions: (1) How do you define and situate the notion of ‘relevance’ in your theory of categorization (“Doing the right thing with the right kind of thing”)? (2) Do you make a distinction between categorization and classification? (3) How do you think Wikipedia can be improved to have a consistent taxonomy (system of categories) and to improve its use as an ontology, with more concepts? Which computational and cognitive strategies would you recommend for this goal?

    Replies
      (1) How do you define and situate the notion of ‘relevance’ in your theory of categorization (“Doing the right thing with the right kind of thing”)?

      Your (unconscious) feature learning mechanism finds which features distinguish members from non-members and which do not. The ones that do not, it filters out, as irrelevant.

      (2) Do you make a distinction between categorization and classification?

      Classification is a particular case of categorization.

      (3a) How do you think Wikipedia can be improved to have a consistent taxonomy (system of categories) and to improve its use as an ontology, with more concepts?

      Crowd-sourcing may not be the best way to get fine-tuned categories. Automated text-analysis algorithms may eventually help, but they are not that good yet.

      An "ontology" is just a taxonomy.

      (3b) Which computational and cognitive strategies would you recommend for this goal?

      See above.

  3. Thank you for this very interesting talk. My question for Professor Harnad concerns Open Access and working as a young scholar. What are young scholars, who typically have little to no financial resources to acquire the rights to their intellectual production, to do in order to publish with reputable institutions while providing Open Access to those publications? Should we all be switching to publication in Open Access journals, or is there something else we can do?

    Replies
    1. Publish in the best journal whose peer-review quality standards you can meet. When the paper is accepted for publication, deposit it in your institutional open-access repository (at UQÀM: http://www.archipel.uqam.ca ). This is called "Green Open Access."

      No need to pay a penny to publish in ("Gold") Open Access journals ("Fool's Gold") today. The time for Gold OA journals is only after Green OA has reached 100%. Then all subscription journals can be cancelled, which will force them to convert to Gold OA, and that can be paid for out of just a fraction of each institution's annual windfall savings from the subscription cancellations. This post-Green "Fair Gold" will be much cheaper because the print edition and online edition and their expenses will be gone, all access provision will be done by the worldwide network of OA institutional repositories, and the only remaining cost will be managing peer review (since the peers review for free). That will be paid on a "no fault" basis, per round of peer review, irrespective of whether the paper is accepted or rejected. That way the cost of refereeing accepted papers will not include the cost of rejected papers, as it does now.

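A toy back-of-the-envelope illustration of the "no fault" accounting described in the reply above. All figures are hypothetical assumptions chosen only to show the arithmetic, not estimates from the talk.

    # Hypothetical figures -- for illustrating the accounting only.
    cost_per_round   = 200     # cost of managing one round of peer review
    rounds_per_paper = 1.5     # average rounds of review per submission
    acceptance_rate  = 0.25    # one acceptance per four submissions

    # Today: only accepted papers carry the bill, so each one also absorbs
    # the refereeing cost of the rejected submissions.
    submissions_per_acceptance = 1 / acceptance_rate
    bundled_fee = submissions_per_acceptance * rounds_per_paper * cost_per_round

    # Post-Green "no fault" model: every submission pays for its own rounds,
    # whether it is accepted or rejected.
    no_fault_fee = rounds_per_paper * cost_per_round

    print(f"Per accepted paper, rejections bundled in: ${bundled_fee:.0f}")
    print(f"No-fault fee per submission:               ${no_fault_fee:.0f}")
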
  4. If fear is the reason people do not make their publications OA, how is it that people who know the system well, like Alain Deneault or Francis Dupuis-Déri, do not free up their work?

    I have the impression that there is a kind of psychological gap between producing research articles and the dynamics of dissemination -- as if researchers wrote articles for the article's own sake, without ever thinking that it has to be disseminated.

    Replies
    1. I don't know AD and FD-D: perhaps they write books rather than articles. But if they write articles and do not self-archive them, it is either fear or laziness.

  5. You said that only humans are able to mind-read through language, but what about non-human animals? Do you think they are able to mind-read? If so, what about non-human animals that share some codes that we could compare to language? I guess it will depend on your definition of language. How do you define it?

    Replies
    1. Yes non-human animals can mind-read (from movements, vocalisations, facial expressions, and inborn and learned signals) but not via language.

      No, neither the inborn nor the learned communicative signals of nonhuman animals are language.

      Language is a code in which you are able to express any and every proposition. Nonhuman animals do not possess, and cannot learn, such a code, mainly because they do not have the capacity or the inclination to understand or make propositions (subject/predicate statements with truth-value True or False) at all.

  6. Thank you, Stevan Harnad, for everything, and thank you for organizing this event with your team. A comment: I think language is in constant evolution. In addition, there are languages that are currently being mixed together or are disappearing. So the kernel of a language will always be dynamic; the core won't be the same forever. What do you think?

    Replies
    1. You are right: languages are dynamic systems that evolve over time. The exact words in each component differ, even across dictionaries. What we are interested in right now is whether or not the MinSets/Core/Satellites/Kernel/Rest differ from one another. We tested this by comparing their psycholinguistic profiles, and they do! (At least for the four English dictionaries we used.)

    2. As Philippe said, the kernel and core differ (somewhat -- we don't yet know how much) from dictionary to dictionary within the same language. They no doubt also change with time, and differ (somewhat) between languages. Moreover, each dictionary has a huge number of different Minimal Grounding Sets, each of the same size, part-Core and part-Satellites, and each is capable of defining all the rest of the dictionary. So MinSets are not unique (though the differences between many of them can be tiny). But each can do the whole job within a given dictionary.

  7. Thank you for this nice talk.
    My question is about the core that contains a minimal set of words that can define 90% of the other words. (1) Can you explain what their origin is (their nature, and when they appeared in the human timeline), and can we expect to find the same words in other languages?

    Replies
    1. It is the Kernel that can define the rest. There is only one Kernel, and its words are learned earlier, more frequent, and more concrete than the rest of the dictionary. But the Kernel is not the smallest number of words from which all the rest can be defined. That is the Minimal Grounding Set, and it is not unique: there are a huge number of them, all consisting of different subsets of the Kernel words. Every MinSet is part Core words and part Satellite words. We don't yet know the functional role each kind of word plays. The Core words seem to be the ones that are learned earliest of all.

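A toy sketch of how a dictionary's Kernel can be extracted, in the spirit of the replies above and of Blondin-Massé et al. (2013). The mini-dictionary and the exact pruning rule are simplifying assumptions for illustration, not data or code from the study: words that are not used in any remaining definition are removed, over and over, and what is left is a Kernel-like set from which all the removed words can still be defined.

    # word -> set of words used in its definition (an invented toy dictionary)
    toy_dict = {
        "good":  {"not", "bad"},
        "bad":   {"not", "good"},
        "not":   {"good", "bad"},
        "light": {"not", "dark"},
        "dark":  {"not", "light"},
        "thing": {"good", "bad"},
        "nice":  {"good", "thing"},
        "ugly":  {"bad", "thing"},
        "day":   {"light", "thing"},
        "night": {"dark", "not"},
    }

    def kernel(dictionary):
        """Repeatedly strip words not used in any remaining definition.
        Each stripped word is definable (directly, or via words stripped
        later) from what remains, so the fixed point still defines everything."""
        remaining = dict(dictionary)
        while True:
            used = set().union(*remaining.values())
            removable = [w for w in remaining if w not in used]
            if not removable:
                return set(remaining)
            for w in removable:
                del remaining[w]

    print(sorted(kernel(toy_dict)))   # -> ['bad', 'dark', 'good', 'light', 'not']

A Minimal Grounding Set, as described in the reply above, would be a smallest subset of such a Kernel from which everything else can still be reached by definition alone; finding one is a harder combinatorial problem and is not attempted in this sketch.
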
  8. Stevan Harnad is a Tier 1 Canada Research Chair of Cognitive Schmience.

  9. Do you think that any of the technologies described in this summer school (information extraction, data mining, network analysis) can be compared to human categorization processes? Could we argue that the difference between the processes carried out by machines (artificial-intelligence analysis of data and metadata to obtain patterns and information) and human categorization lies only in the symbol grounding problem, or in the fact that they are not felt? Or is there anything else intrinsic to the human categorization process that cannot be achieved by a machine?

    Replies
    1. Some automated processes work better than human categorization, some worse (so far). There is still something that the brain can do with its data that these algorithms cannot do (or cannot yet do). But I certainly could not say that that is because humans feel, because no one has the faintest idea how or why humans feel!

  10. Professor Simon discussed big data in the humanities & social sciences and in biology. She said that in biology scientists need to de-contextualize the data and do other things before they are able to re-use them, so there is little incentive for data donation. Do you advocate open access to data in addition to open access to articles? If so, what are some possible solutions to incentivize data sharing in biology and other fields?

    Replies
    1. There is a big difference between article-sharing and data-sharing. Articles are published, hence already "shared" -- but only with those users who are at institutions that can afford to subscribe to the journal in which they were published. All article-authors give their articles free to their publishers (and seek and get no royalties). All they want in return is peer review, and then users who read, apply, build on and cite their findings.

      But the reason researchers gather data is not to publish them but to mine them (analyze, hypothesis-test). They are researchers, not data-gatherers. It varies from field to field, but in general researchers need time -- sometimes a long time -- with exclusive rights to mine the data they have gathered; otherwise, why should they bother gathering the data at all, rather than letting someone else do the work and then mining their data as soon as they are available?

      So the access-barriers for articles are publishers' copyright-transfer agreements (which authors don't like, but are afraid to refuse or violate), whereas the access-barrier for data is the authors' incentive for gathering the data in the first place, for which they expect a period of exclusive data-mining rights.

    2. What about sharing the raw data once they have published their articles?

    3. Who promised the data would all be analyzed in one article? (The fair period will vary from field to field and project to project.)

    4. I am uncertain about the ethical implications of sharing raw data. In fields where you are working with human participants, such as psychology, the raw data can contain information that is very sensitive. I know that in the past, when I have applied for research ethics approval for a study, I had to be very stringent about who would be able to access the data and where they could work with them, even for non-nominal data (with no identifying information). This strikes me as another hurdle for raw-data sharing.

  11. It would be interesting to transcribe a famous piece of literature using only words in the minimal grounding set. I wonder what it would read like and if we would be able to understand the text in the same way.

    Replies
    1. There are a huge number of minimal grounding sets (each of the minimal size) in the Kernel. It's already a challenge to define all words using just the words in the Kernel (10%).

      But the exercise you suggest might be interesting, Nicole -- if for nothing else than to show why the growth of language was not just a matter of grounding the minimal number of words through direct experience and then using only those words, without coining any more. It is an open question when and why we lexicalize new words. They name new categories, to be sure, but when and why do new categories call for a new name instead of just a description (made up of old words)? It probably has something to do with keeping our sentences from becoming too long, but also with "chunking" (creating bigger composite units and then giving them a name for short-hand) in our thinking, as well as with the need to put some direct sensorimotor flesh on the formal bones of new categories learned purely by verbal instruction (formal definition).

  12. "Nothing in Biology Makes Sense Except in the Light of Evolution" Theodosius Dobzhansky.

    In the context of the above quote, how would you describe the evolutionary value of "felt experience" for the fitness of the organism? Assuming that the production of mental states requires resources such as energy and computation time, it must confer certain advantages in order to make evolutionary sense. What would these advantages be, and how are they manifested?

    Thanks for the talk and the whole organization of the summer school!

    Replies
    1. That's why it's called the "hard problem"...

      Harnad, S. (2002) Turing Indistinguishability and the Blind Watchmaker. In: J. Fetzer (ed.) Evolving Consciousness Amsterdam: John Benjamins. Pp. 3-18. http://cogprints.org/1615/

  13. I was thinking about the 'feeling' faculty of the mind, which you emphasized many times, in the same way that you explained the concept of the 'kernel' of the dictionary. As I understood it, the kernel is made of words that are not described by other words, but only grounded in experience. Feeling is similar in the sense that it cannot be explained by words.

    The other note is that I looked at the article by Picard et al. (http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.186.1709), at the picture of the toy dictionary, and it seems that the words in the kernel correspond to some sort of basic distinctions (good / bad; no / not; light / dark), which can be seen as fundamental for any kind of cognition but can only be grounded in the immediate experience of the cognitive agent.

    Note #2: the kernel depicted in the picture painfully reminds me of the concept of the semiotic square (http://en.wikipedia.org/wiki/Semiotic_square), which is very interesting with respect to the language/cognition relationship.

  14. Yes, at ground level, feeling cannot be expressed in words. But even a Zombie robot would have a symbol-grounding problem if all it had was words. And explaining how and why sensorimotor experience is felt experience is the "hard problem." (See the 2012 summer school: http://turingc.blogspot.ca )

    As I warned, the toy dictionary was merely invented as an example. It is not a real dictionary in any respect.

    I hope the Kernel has nothing to do with semiotics, which as far as I can tell is mostly cultish gibberish (though I'm ready to be corrected if someone can show me an example of where semiotics actually explains and predicts, rather than just interprets...)

  15. Thanks for the presentation and the summer school Stevan. Since I agree with most of the things you say, I'll bring up the main point where we differ in opinion.

    I believe everything I sense (see, touch, hear...) is part of my cognition. You argue that if we remove the moon, we still feel something (albeit without seeing the moon), and therefore the moon is not part of our cognition. You use the same argument for any other stimulus.

    Even if we instantaneously remove all stimuli, we would still feel based on our previous experiences, feelings, and thoughts. However, if we never receive any stimuli, we would never feel. Thus, I think we need to include all stimuli (our environment) as part of our cognition. Yes, if we remove individual stimuli we would feel, but if we remove all stimuli ever, then we would not.

    To compare with the brain: If we remove single areas of the brain, we can still feel. If we remove the entire brain, however, we would no longer feel. I think we can extend this argument to our environment.

    Replies
    1. Robert, I think the disagreement is about what is meant by cognition: Cognition is thinking (which is a state that a thinker can be in, and a process that is going on -- somewhere -- that is the thinking itself), and cognitive science is trying to figure out what thinking is by reverse-engineering the mechanism that actually implements the process and state of thinking. Of course the inputs to that mechanism (the things I see and hear) influence the thinking, but are they part of the mechanism that implements the process and the state of thinking? A camera also has states and processes: Is the moon a part of the state or process of a camera filming the moon? Or is it just the input to the camera? (I am not saying the brain is like a camera; it's just an analogy...)

      On the other hand, if by "cognition" you don't mean the state or process of thinking, and the physical mechanism that implements or generates thinking, but rather what it is that we might be thinking about, then the moon is a part of cognition when we are thinking of the moon...
