[from guest blogger and podcast co-host Julia Galef]
Scientists know that a crucial test of whether you've understood a phenomenon is whether your understanding helps you accurately predict new information. (Some people argue that refining our predictions about the world is in fact all science actually does, but that's a topic for another day.) So it occurred to me that a good test of whether you've understood a postmodernist text -- as opposed to merely thinking you've understood it -- might be whether your alleged understanding of a string of pomo text helps you predict the next word in that string.
That's where information theory enters the picture. The information-theoretic definition of entropy is a measure of the uncertainty associated with a random variable, or how reliably you can predict its value based on the information you already know. In 1950, the father of information theory Claude Shannon measured the entropy rate of English-language texts by how predictable each successive letter was, based on the letters preceding it. The less predictable each letter is, the higher the entropy rate of the entire sequence of letters. As you might guess, English has a lot of unpredictability (if I say "This letter is an e, what comes next?" it's hard to guess with much confidence) but also some predictability (If I say "This letter is a q, what comes next?" you can probably make a pretty confident guess).
Measuring a text's entropy rate at the letter-level the way Shannon did probably wouldn't tell you much about whether it contains any coherent ideas. But what about entropy at the word-level? The unpredictability of a word based on the preceding words is a measure of the information content of a text and also of its meaning. Too predictable (entropy too low) and you've essentially got repeated mantras; too unpredictable (entropy too high) and the words have literally no relation to each other. Somewhere in between those two extremes is a range of entropy rates in which meaningful fields fall, and I'd be willing to bet that pomo would be a high-entropy outlier.
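Just to make the proposal concrete, here is a minimal sketch (in Python) of the kind of word-level calculation I have in mind. It uses only a one-word context and no smoothing, and "corpus.txt" is just a placeholder; a real measurement would need a much larger, carefully chosen sample.

    # Rough sketch: estimate H(next word | previous word) from bigram counts.
    # This naive estimator badly underestimates entropy on small samples.
    import math
    from collections import Counter

    def word_entropy_rate(text):
        words = text.lower().split()
        bigrams = Counter(zip(words, words[1:]))
        unigrams = Counter(words[:-1])
        total = sum(bigrams.values())
        h = 0.0
        for (prev, nxt), n in bigrams.items():
            p_pair = n / total            # P(previous word, next word)
            p_cond = n / unigrams[prev]   # P(next word | previous word)
            h -= p_pair * math.log2(p_cond)
        return h  # bits per word, given one preceding word

    # Usage: compare a large sample of ordinary expository prose
    # with a comparably sized sample of pomo writing.
    # print(word_entropy_rate(open("corpus.txt").read()))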
Let's return to the quote from Deleuze I used in my previous post:
“In the first place, singularities-events correspond to heterogeneous series which are organized into a system which is neither stable nor unstable, but rather ‘metastable,’ endowed with a potential energy wherein the differences between series are distributed ... In the second place, singularities possess a process of auto-unification, always mobile and displaced to the extent that a paradoxical element traverses the series and makes them resonate, enveloping the corresponding singular points in a single aleatory point and all the emissions, all dice throws, in a single cast.”
This text obeys the rules of the English language, but so does the sentence "Colorless green ideas sleep furiously," linguist Noam Chomsky's quintessential example of a grammatically correct but meaningless statement. In meaningful writing, there is more logic to the placement of words than just the logic of grammar, and that's the difference which I think an entropy rate calculation would capture. So when Deleuze talks about singularities that "possess a process" and of "differences" being "distributed" in an "energy," those are arrangements of words which you would not normally see in coherent English writing. Phrases like these are a large part of what makes pomo writing incoherent, and I suspect they would also make its entropy rate very high relative to other expository writing.
Falsifiability
Finally, before I stop picking on the poor postmodernists, I want to mention a great, intuitive test proposed by Tony Lloyd in the comment thread of the previous post. Pick a statement in a pomo text and ask an expert, "What evidence could conceivably falsify this claim (or at least cast doubt on its veracity)?" If no evidence which could ever be obtained, even in principle, has any bearing on the claim's veracity, then the claim is consistent with literally all conceivable states of the world and therefore meaningless. So, Deleuze: How would we know if singularities do NOT possess a process of auto-unification?
The vagueness of pomo writing also makes it very difficult to falsify (conveniently, some might add). What conceivable evidence could falsify Deleuze's description of a system being "neither stable nor unstable"? It’s the same trick that astrologers, and other people who make pseudoscientific or mystical predictions, use to inoculate themselves from disproof. If your horoscope predicts that you are "extroverted but can also be withdrawn," or that you will experience "financial success but watch out for setbacks," it will never be wrong. But that doesn’t make it right.
~Julia Galef
Galef ignores the possibility that Deleuze's text itself constitutes a metapolyhomorphism with culture. Culture is necessary to actualize the text - indeed, it sublimates it - but since identity is not possible outside of culture, we are forced to locate Galef's conception of modernity within the structural aegis of hegemony.
Ritchie,
did you study with Deleuze? Because what you just wrote makes absolutely no sense to me...
Massimo, I suspect Ritchie's comment was an attempt at satire, trying to fool us into thinking that what he says means something.
Perhaps that's not an uncommon strategy?
I'm not defending Deleuze here, but metastability is actually a well-defined concept: http://en.wikipedia.org/wiki/Metastability
Essentially it means that a system has multiple stable states. So it's neither stable, in the sense of absolutely unchanging, nor unstable, in the sense of constantly in flux.
Also, 4-to-1 odds that Ritchie the Bear is being facetious.
Excellent post. The comparison between pseudo-science and post-modernism seems so obvious now that you've suggested it...
I still don't understand why elite academic circles maintain the validity of this tripe.
Julia,
I used to be sympathetic to what you are saying here until a couple of years ago. In grad school I became good friends with a fellow who was studying a lot of continental philosophy - all the names you mentioned in the last post. He was a smart guy and not the type of person to BS. Anyway, I showed him Dawkins' article one day and he basically said it was trash. He gave me a good analogy that made sense to me. He said that if you went into a random scientific article and extracted a random paragraph from the middle of the article, very few people would understand what it says. It would look like gibberish because scientists use highly technical language that non-scientists can't understand. After some time they might be able to, but on a casual reading it makes no sense. He said it is no different with these pomo philosophers. It makes sense if you are familiar with the terms and how they use them but otherwise it seems like nonsense.
Anyway, I know you are using Deleuze as a representative example, but if you wanted to know what that small paragraph out of his book meant, did it ever occur to you to ask an expert on Deleuze? It would probably take just a minute or two to use google to find a Deleuze expert and email him or her for an explanation.
Julia, great idea. Why don't you write a simple program to do the calculation? Is there an online archive of postmo writings that can be used as a corpus? You might need to do the analysis in French, though.
This is probably going to sound offensive in this forum, but to a non-philosopher like me, analytical philosophy writings appear to be as dense and as unpredictable as that thing by Deleuze. I wouldn't be too surprised if, when you do the analysis you proposed on a large sample, postmo entropy isn't all that different from other branches of philosophy.
In asking whether I study with Deleuze, Pigliucci demarcates sexuality as a variant of metatextuality. According to Barthes, sexual metatextuality is not normalizable except as a byproduct of capital. Thus, a variety of discourses, "discourses", and metadiscourses can be inferred, all of which allow the "Other" to be instantiated by the signifier itself.
Iah,
you said:
-"I'm not defending Deleuze here, but metastability is actually a well-defined concept: "-
But the criticism of postmodernism is that the fact that you use actual words in a grammatically correct way does not mean you have said anything with actual meaning.
(Pssst! Ritchie! ... I think they might be on to you!)
Julia,
If I understand you correctly you are suggesting that we can tell the difference between meaningless post modernist text strings and meaningful text strings by looking at the text entropy at the word level. If so I disagree.
First there is no difference in the entropy between the word level and the letter level. You can reconstruct the word order from the letter order and you can reconstruct the letter order from the word order. You just have two different ways of encoding the same information. Since no information is gained or lost in the translation there can be no change in entropy.
I can generate a meaningless string with any entropy characteristic you want. Therefore there cannot be any necessary connection between the entropy of a string and meaning.
Also I can take a meaningful string with low entropy and compress it into a shorter string with much higher entropy. Again there is no necessary connection between entropy and meaning.
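Something like this toy sketch (Python's standard zlib; the repeated phrase and the numbers are only illustrative) makes the point: a repetitive, low entropy string compresses to a short byte string with a much flatter symbol distribution, yet it decompresses back to exactly the same meaningful content.

    import math
    import zlib
    from collections import Counter

    def bits_per_symbol(data):
        # Zero-order (symbol frequency) entropy of a byte string.
        counts = Counter(data)
        n = len(data)
        return sum((c / n) * math.log2(n / c) for c in counts.values())

    text = ("the cat sat on the mat. " * 200).encode()
    packed = zlib.compress(text)

    print(len(text), round(bits_per_symbol(text), 2))      # long, low bits per byte
    print(len(packed), round(bits_per_symbol(packed), 2))  # much shorter, higher bits per byte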
The higher the entropy of the message, the more information it contains, by definition. Does that mean it contains more meaning? Maybe, but the problem is there is no metric for meaning. In fact the whole point of Shannon information is that he separated the statistical properties of the string from the meaning of the string.
There have been several bad attempts to define a metric for meaning. The most infamous is the intelligent design movement where they claim to be able to detect the meaningful intent of a creator in biological design.
This is just a continuation of the human need to inject meaning and purpose out into the world. It is no different in principle from an island native who sees an angry god as the meaning behind an erupting volcano. We humans experience ourselves as active agents who act with meaning and purpose and we project that out in the world.
So what is the meaning of meaning? I don't know. It is something we humans experience like the color red or the smell of fresh bread. A text string can only refer to meaning or evoke a sense of meaning in an observer. It can never contain meaning entirely in itself.
Julia, isn't it in general true that philosophers don't care about falsifiability? What do you think will falsify the conceivability of zombies or the Chinese room? Are those also considered postmodernist philosophy?
Ppnl, I understand the thrust of her argument--my point is that the excerpt she chooses, "...neither stable nor unstable, but metastable..." is not a good example of a non-falsifiable assertion because metastability is actually well-defined enough to be quantifiable. Yes, whether the concept of metastability is meaningful in relation to "singularities-events" depends on what is meant by the latter--but one jumps to conclusions by already declaring meaninglessness without even defining the terms at hand. One must be ever vigilant against xenophobia.
Anyway, on a broader note, I don't think that postmodern texts are meaningless. In fact, the more I read that excerpt, the more I am able to begin producing a set of abstract visualizations in my mind's eye that correspond to the excerpt. (This doesn't happen with Ritchie; too many obvious non-sequiturs.)
I do however suspect that in the case of examples like the one you've chosen, their meanings (including the aforementioned abstract visualization) are extremely abstruse, highly subjective, susceptible to organization based on poetic/emotional rather than "rational" criteria, and of dubious use to people who are not scholars of postmodernism. Not that I have a problem with any of that, mind you.
Also, +1 to Brad's point.
Some questions I have in mind.
Would you think that pomo texts are theories? Or observations?
Were you taking a positivist stand or a Popperian falsificationist stand when you wrote the post? Seems like a (conflicted) mix of both to me.
I'm also uncertain why the claim is meaningless because it is consistent with literally all conceivable states of the world. Are you invoking the logic of predictability as a value when you made the claim? Which means that pomo texts are taken to be theories?
Anyway, interesting posts! I hope you'll continue posting, even if it's not on this topic. =)
It seems to me that the problem is not with the concept of "metastability," but with the nonsensical way Deleuze applies it.
Similarly, you can say that he uses words like "heterogeneous" and "series," which are well defined in the English language, but become nonsense when joined the way he does.
Will you all just leave Deleuze alone and let him do his job, which is to figure out how stuff works.
He is clearly doing that, and its not his fault if others do not understand what he is saying.
Don't have many of the definitions down, so if somebody armed with wikipedia and/or good dictionaries gets me all the definitions in that paragraph, I promise within a month to turn around a reasonable analysis in vanilla English of what he is saying.
.....errrr...let him do his job in his tobacco-filled heaven
I think the "word test" for understanding is extremely weak, as others have mentioned.
1.) Such a test would require a "normalizing," as you say, of what is to come next. Such a normalizing is going to depend on the date, region, subject, etc., of the standards you wish to use. If you set the standards by postmodern texts, then I think your rate of entropy for any tested postmodern work would be very low in comparison to what you think would be appropriate for a meaningful work.
In other words, if I were to set the standards of the test by Mark Twain writings, and then test an additional Twain paragraph, again, we would expect that entropy rate to be low. Whatever "normalizing" procedure you use is going to determine the "meaningfulness" that the test finds. I would argue there is as much repeatability (not that that implies meaning) within postmodernism as you will find within any narrow subject.
If you take as your "normalizing" baseline any and all writing of today in Western, "nonpostmodernish" academia, and then test a postmodern paragraph against that, the entropy rate will be higher, but I am not sure that is saying much. Postmodernism believes it upsets the "normal" discourse that you believe all work should aspire to, and it finds great use in doing so.
This test would likely discount any theory that makes a serious revision to our ideas. For example, a scientist who legitimately found that the sun is made of fireflies would probably score high on the test, but his theory and writing could be the most intelligible, meaningful account of the sun, even though we thought it was gibberish. He would score high on the test until a significant number of people started writing that the "sun buzzes," and such a phrase entered our standard word order.
Iah,
Actually I think it is a very good excerpt to make the point. It is true that "metastable" is an actual word. It is even true that the word combination "...neither stable nor unstable, but metastable..." could have some meaning in some context. But even in a meaningful context it is a level of pretentiousness worthy of a horse whipping. Who talks like that except postmodernists?
And what is it that is neither stable nor unstable? "singularities-events correspond to heterogeneous series which are organized into a system"?
Yeah, ok. I think the king is naked and his sqrt(-1) is flapping in the wind.
In any case a theory has to make clear predictions in order for falsifiability to have any meaning. In that sense The Sokal hoax can be seen as a prediction and an experiment to test that prediction.
It almost looks like Deleuze is describing "complexity". In this way he is simply talking about self-organization. It is not all things at the same time, just the nature of the beast. I'm reminded of Gould's, Wonderful Life, and the impact this had on folks like Stuart Kauffman.
It appears to fall in line with discussions on necessity and contingency. Basically what I read is we have a steady (stable) state that is derived from unpredictable variables which were necessary for the steady state. This gives the appearance of an almost inherent nature of being stable and unstable. We are left without great predictive power due to the contingent nature.
Like the sand pile (not the paradox about when we call it a pile of sand, that's a different puzzle). The realization is that the pile will only stand so tall (it is self-organized), but we don't know precisely when that point will be reached; we seem to know that we end up where the "next" grain of sand does not add to the height of the pile, yet we can't have foreknowledge of which grain.
The complexity view of things makes falsification difficult.
Frankly, to assume that Deleuze is talking about Kauffman-style complexity theory is giving him way too much credit.
Having read your comment I thought I would do some quick poking around. It does appear that Deleuze is expounding on "complexity theory"; his main thrust seems to be to use it in a social context. Your book, Fashionable Nonsense (through Google books) popped up as well and it shows a fuller reading of the paragraph, though he draws a correlate to biology. I must say, having read through part of that chapter of your book it appears you only say it's nonsense, but give no real reason why this is so.
Admittedly I've only given myself 20 minutes or so of looking up Deleuze's connection to "complexity theory", but if you Google around as I have you'll see it's fairly easy to find.
Just a clarification. I think this blogger Raima Larter (don't know her work) does a good job of describing the sandpile I mentioned (see also, self-organized criticality). I was going on a long ago memory of how this works.
"The grains tumble slowly down the sides of the sand pile, coming to rest at various points along the slope. The pile slowly grows as sand is dribbled onto the top until, suddenly, the system becomes unstable and a small avalanche occurs. A rush of sand tumbles to the bottom, returning the remaining pile to a situation where additional sand can be added--but only for awhile. Small avalanches occur at random intervals and it is never entirely predictable when they will happen.
The sand pile's SOC dynamics exist because this system has an attractor that is simultaneously stable and unstable. This type of attractor is known as a critical point."
I meant Alan Sokal and Jean Bricmont's book, Fashionable Nonsense. Don't know why I thought it was yours.
Hi Weiye -
To address your first point, from an informational point of view that is human-centric (no gods or other unknown influences, forward motion of time, etc...)
Scientist X
- receives info and processes it into an observation(s) in brain.
- sends to others a hodgepodge of words evocative of the observation
the hodgepodge will be taken as an observation by others. Call it art, if you like. If the observation has predictive utility, it can also be called a theory.
When one opens the field to non-human agents, (neutral to exactly where they hang their hats), you have a more interesting situation. The observation formed in the brain can create the 'scientific law' by virtue of its existence and dissemination. I guess this kind of thing already exists in philosophy, but basically in science it works over time like this
1700s - invent some kind of mathematics
1800s - find useful applications built from understanding those mathematics
1900s - the 'natural world' (whatever that means) reacts to the applications of mathematics and CHANGES in response
So one of X's ancestors, not that good in the chemistry lab but OK in math class, came up with a mathematical theory. Where did it come from? He OBSERVED it. (probably a He back then, unless his sister or wife came up with it and she wasn't allowed to publish).
Therefore it was transmitted.
The mathematics could have been sent from the unknown, and transmitted using a medium not generally discussed in such blogs.
They could have been carried by symbols.
You know, letters and numbers, beliefs that carry well across broad spaces and lengths of time.
ppnl: Sure, call it the use of "metastable" here pretentious if you like. (Going to let that one pass; tired.) But I was only concerned with its *falsifiability*, which was the original reason Julia highlighted it.
Massimo: I don't know what evidence you're drawing upon to flatly call this use of "metastable" nonsensical. I'll cheerfully admit that I don't know what he means, because I don't know what he means by "singularities-events" and therefore whether or not metastability is meaningful relative to them. But that just means we lack information.
Likewise, what's wrong with "heterogeneous series"? It would signify a series of elements that vary over some characteristic. Size, shape, duration, flavor, whatever. It's under-specified, not meaningless.
I wrote, "one jumps to conclusions by already declaring meaninglessness without even defining the terms at hand." Do you disagree with this?
DaveS: Do your own legwork. :)
ppnl: You know, your line "I think the king is naked and his sqrt(-1) is flapping in the wind" is actually quite interesting in the context of our discussion. I understand it, but my understanding is entirely dependent on recognizing your allusions to the parable of the emperor's new clothes and the excerpt from Dawkins' article, linked in Julia's previous post, in which sqrt(-1) is equated to a penis. For a reader who does not have these referents, your statement is... meaningless?
Dave, you said:
"...its not his fault if others do not understand what he is saying."
Actually, it is. If you're writing something, it's because you want other people to read it. And if you want other people to read it, you presumably want them to understand it too. Thus, if no one understands it, then you have failed to convey your message - it is entirely your fault.
Several people here and in response to Julia's previous post mentioned that it is not surprising that postmodernist writings are often incomprehensible, because after all they are technical, so it takes someone with a technical background to really evaluate them.
Ok, I have a legitimate PhD in philosophy, and I still can't make much sense out of this stuff (while I can of even rather abstruse writings in analytic philosophy).
Moreover, as a biologist I could not only read papers in evolutionary biology (my specialty), but also those in molecular biology. A further hint that much (though not all) continental philosophy is, indeed, fashionable nonsense.
Dr. Pigliucci,
I don't know if that's as big a hint as you seem to think. There are in fact people who can make sense of these philosophers, so perhaps the education you received didn't train you to read these philosophers. My question still remains, have you or Julia tried to contact experts on these difficult to understand texts?
I have, on repeated occasions, and they still don't make much sense to me.
kpharri:
He is not writing for everyone, he is writing for some. And because some do understand it, he has succeeded. By 'understanding' I mean his thoughts have resonated with the reader to the extent that the reader can forward an image of the paragraph to a 3rd person, using other words.
lah: OK - thought I was going to do my share of the work by hacking out an explanation in return for a wikipedia definition of each troublesome phrase or word, but I hear you.
Massimo, Julia and other anti-deLuzionals: OK, call it nonsense (again: that which makes no sense to you). Are you willing to listen to possible explanations? I think you are. Otherwise you would be no different than an anti-intellectual.
I spout what others call gibberish on facebook. Here is an admittedly tame example from Summer09:
" explored the Black River / Raritan R. headwaters region (in a car) and was trying to figure out events (when something happens, whats really happening?). Nailed neither events nor the region's mystique, but did come up with a cool visualization of "orthogonal forces" at work. The horiz axis contains peer objects (u,me, this rock, that rock). The vert axis contains a hierarchy of objects (...,atom...planet...) "
I am trying to graph all 'physical' (whatever that means) objects with axes radiating from any single point. It is a bad example and will not work, because the friction along the horizontal axis (rubbing of the two stones, interpersonal relationships) is not replicated along the vertical axis, with the possible exception of friction due to context switching. But the goal was to create a picture of an event, which is influenced by everything around it.
Will eat Magritte's hat if Deleuze is not doing something similar, he is just at a more advanced stage. Partly art, partly conjecture, a picture is being drawn.
I find the idea of falsification of pomo writing to be misguided. Falsification is about predicting the results of experiments. If I or anyone can do that then you would do well to look closely at what I write no matter how obscure. If I cannot do that then you are entirely justified in doubting me when I suggest that sex organs are equal to complex numbers.
Science is stapled to the real world by the idea of falsification. Without that connection reality is lost. For example forget about pomo and look at the Bogdanov Affair. It is a kind of inverted Sokal hoax. String theory is so distant from reality that there are few predictions to keep it connected. As a result nonsense was published as if it made sense.
In the end there is no magic procedure to tell us if a text is talking sense. The best we can do is check if it tells us something testable about the real world. Most pomo texts don't do that. Falsifiability then becomes irrelevant.
Still, if someone tells you that your dick is equal to the square root of negative one...
Massimo,
The gulf between analytic philosophy and "postmodern" philosophy may be a little wider than you are giving it credit for. If you had gotten your degree by reading nothing but Deleuze, Derrida, Foucault, etc., you would "get" more (if there really is something there to get) from a random paragraph by Deleuze than if you had had the background you did.
In my own experience, anyone here would also make a whole lot more of that paragraph (I made very little of it) if they had been reading the whole work in context, or had just read some other work thoroughly dealing with the issue and working in the language that Deleuze and PM's tend to work in. This is certainly true of any writing.
We would probably all be thoroughly baffled by a random paragraph of "scientology," but if we had spent a couple months reading L. Ron Hubbard it might make sense to us (we could understand what it was trying to get at), even if it ultimately makes no sense.
Sorry if that last sentence dirtied this blog.
Lyndon
"Partly art, partly conjecture, a picture is being drawn."
...and that's pretty much all that needs to be said about that.
I'm surprised that no one has followed up on ppnl's point re: Shannon information theory. I'm particularly annoyed that in an essay attacking postmodern philosophy (whatever that is) as "meaningless" Julia would get such absolutely fundamental aspects of information theory completely wrong. If you are going to accuse others of being sloppy, it behooves you to at least try to get the basics right...
As ppnl points out, Shannon information measures are a measure of *redundancy* NOT of meaningfulness. "Colorless green ideas sleep furiously" would be pretty much indistinguishable, at the word-level test, from "Unfounded suspicions of jealousy often return unbidden." Information theory, more generally, is about the relationship between three things: senders, receivers, and channels. What meaning (if any) is supposed to be conveyed by what the sender is sending is strictly irrelevant...
The suggestion that Julia write a little program to do the math is telling -- if she actually thought this information-metric was going to get her anywhere, she should at least have done the basic research to see what implementing it would have taken, and how different word-strings of ordinary English in fact compare.
Consider again why we are convinced that random paragraphs drawn from the middle of physics papers are meaningful. NOT because an information-theoretic approach would show them to be 'like' ordinary sentences of English (or other spoken languages). Rather, we're convinced (with good reason) that the terms are well-defined, and the uses of the terms standard in those fields, etc.
We may not be convinced of this with respect to some particular philosopher. We may be suspicious and think that the terms are never defined carefully and that the claims cannot be made clear. We may be correct in our suspicions. But to show that requires more than grabbing random paragraphs from texts and whining about TAs from one's past. It requires, at the very least, making an honest effort to understand the texts, and being willing to talk seriously to people that think they do.
Jonathan
Jonathan,
fair enough, but I'm also getting a bit annoyed by the recurrent defense of postmodernism: "if you really try to understand it, you'll see it's not garbage." Really?
I've tried to read some of these texts, and have talked to people who are allegedly knowledgeable, and it is hard to escape the sensation that one is talking to a bunch of new agers...
I still think that Alan Sokal's characterization of a lot (though not all) that is going on here is actually perfectly fair:
"When one analyzes [postmodernist] writings [on science], one often finds radical-sounding assertions whose meaning is ambiguous and that can be given two alternative readings: one as interesting, radical, and grossly false; the other as boring and trivially true."
"..the other as boring and trivially true."
That's partly how I read the Deleuze paragraph. However, lumping "postmodernism" in with pseudoscience without specifically showing how the example qualifies is somewhat troubling.
As I've shown, it indeed appears Deleuze is following from "complexity theory".
Next, lets look at the words and offer a possible answer to Julia.
I think the above takes care of Julia's question (see my posts): "What conceivable evidence could falsify Deleuze's description of a system being "neither stable nor unstable"?"
It's not so much a question of falsifiability and I don't think Julia was completely aware of the descriptions in "complexity", otherwise she would have refined her question. It is also a bit disconcerting since Massimo has said on this blog that philosophers have moved beyond falsification where some scientists lag behind, said confidently and in a positive accepting context.
Deleuze is using the word 'Singularity' to mean: "the specificity of a particular component or assemblage, its special, distinctive quality, as well as its infinite potential."
So, Julia's question: "How would we know if singularities do NOT possess a process of auto-unification?"
Again, what Deleuze is doing in this instance is descriptive. It is not a claim to knowledge beyond what is understood in a specific domain.
I think pseudoscience is dangerous, but so far this exercise appears unnecessary and lacks basic research ability.
Jonathan:
I am also suspicious about drawing grand conclusions based on a test like that one with the "entropy" of a text. Nevertheless, I do not understand Julia to see things as simply as you interpreted it.
Her idea that there may be one end of the range where you have random gibberish, and the other where you have repeated mantras, with sensible discussions and development of ideas in the middle, sounds reasonable to me. The problem is largely that this middle would be a very wide field with large areas of gray intergrading especially into the random gibberish end - it would be hard to draw a line. It would be different for different languages and areas of writing (think kitchen recipes vs. novels, philosophy vs. pharmacology), so that it would have to be carefully calibrated. It would also clearly be possible to artfully produce a text that falls into the seemingly meaningful middle without carrying meaning, and if PM manages exactly that, we have shown nothing.
To put it another way, this would be a one-way-test: it would be easy to show something as repetitive mantra (ever read the Quran? all those god-is-all-generous, god-is-all-wise'es all over the place...) or random noise, but if the test fails, that does still not mean that the text does contain meaningful ideas. But this asymmetry is a common situation in the natural sciences.
You are of course right that in the end, you will have to try to understand the texts if you want to be sure. I understand that Julia knows (and tried). But what harm is in speculating about shortcuts? I at least enjoyed these thought experiments and wonder about writing such a small program if time ever allows, which it admittedly does not at the moment, to compare different sciences or novelists for their word predictability or suchlike. At the very least, it could easily show if somebody uses the same formulaic phrases all the time.
Massimo's "The Limitations of Falsification":
http://tinyurl.com/ydmarbz
Even though this was established to some extent, it is possibly worth noting.
Metastable: continuing in its present state of equilibrium unless sufficiently disturbed to pass to a more stable state of equilibrium
In a way I suppose we could ask Julia how she would falsify "complexity theory". Maybe she would find an answer to her questions.
I think Deleuze may suffer from a healthy dose of "physics envy".
Jonathan, you're right when you say that "'Colorless green ideas sleep furiously' would be pretty much indistinguishable, at the word-level test, from 'Unfounded suspicions of jealousy often return unbidden.'"
But the test I was proposing would be using a large sample of postmodernist texts taken together -- not a single sentence. The point was that in meaningful English, certain combinations of words tend to occur and others don't (e.g., you would mostly find the word 'green' modifying concrete nouns, not abstract nouns like 'idea')... and my hypothesis was that in the body of pomo texts, there is much less order in which combinations of words appear together. That's why I predicted that the body of pomo texts would have an entropy level closer to randomly generated gibberish than other English-language texts.
What does "combinations of words appearing together" have to do with the meaning of those words? In order to determine if something is gibberish (aka, meaningless) you can't use information theory, which doesn't care one cent about the meaning of a message.
Julia,
--"The point was that in meaningful English, certain combinations of words tend to occur and others don't (e.g., you would mostly find the word 'green' modifying concrete nouns, not abstract nouns like 'idea')... and my hypothesis was that in the body of pomo texts, there is much less order in which combinations of words appear together."--
Except you have just used such a combination to make an intelligible point.
The thing is any narrow field tends to use words and word combinations that do not appear in more normal texts.
Also I would suggest that finding -less- order in pomo texts is also consistent with your thesis. After all if someone is talking sense the complexity of the text may loosely depend on the complexity of the ideas being expressed. If they are talking nonsense they have to fake randomness to achieve the same complexity. Randomness is hard to fake. Patterns tend to appear when humans try to fake randomness. This may tend to make nonsense more ordered.
So as I said I don't think you can draw any conclusion at all.
"In meaningful English, certain combinations of words tend to occur and others don't".
You can't just baldly assert this, just from your feeling that it is so. Is there any mathematical rigor behind your notion of "tend to"? Are you expecting word co-occurrence patterns in academic papers on mathematics to resemble those of blogs about pet dogs?
"I'm willing to bet that pomo would be a high-entropy outlier." I suggest that an actual bet be established, but going through with some exchange of money is not important. Just attempting to write a program to determine the outcome of the bet will reveal numerous weaknesses in this alleged connection between information-theoretic entropy and meaningfulness, as has now been pointed out by pyridine, ppnl, Lyndon, Jonathan, Mintman, and Eric.
Jonathan,
It's cool that you get my point about information theory. But I really have to hang with Massimo on post modernism.
The fact that you cannot use either Shannon information or falsifiability to prove that pomo is nonsense does not rescue it from the ash heap of history.
It is a fool's errand to try to formally prove a text is meaningless. It is a bit like trying to prove that a sequence of numbers is random. It simply can't be done in the general case.
It is up to those who defend pomo to positively show its worth. They have failed to do that. They have failed to honestly try.
I have a finite amount of time. I cannot spend it trying to understand every crackpot idea. I need a kind of spam filter so I can concentrate on what is more likely to be useful and important. Pomo jams in my spam filter worse than porn. I think the probability of a 411 scam actually being legit is better than the probability that pomo has meaning.
Pomo is a verbal inkblot. It is form without substance. It satisfies our instinct to look for patterns but does not use those patterns to make specific points. I could accept and even admire it as a work of art. But as a statement about something real it fails. I cannot prove that it is nonsense but I have better sense than to try.
In my opinion anyway.
Ppnl, you wrote: "I can generate a meaningless string with any entropy characteristic you want. Therefore there cannot be any necessary connection between the entropy of a string and meaning."
I agree with the first sentence, but it does not logically imply the second. Just because you can be meaningLESS at any entropy level doesn't imply you can be meaningFUL at any entropy level. Clearly, at the maximum and minimum entropy levels, a text cannot have any meaning (words would be either perfectly repeated or randomly distributed, as I explained).
Ppnl also claimed that entropy at the word and letter levels is the same. I don't think that's the case. You can think of word-level entropy in a text as the average number of bits of information learned from each additional word in the text, which is clearly different than the average number of bits learned from each additional letter in the text (there's much more uncertainty about what word will come next, since there are thousands of possible words that could come next, as opposed to just 26 letters).
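The ceilings alone make the point (the 50,000-word vocabulary here is just an illustrative assumption):

    import math
    print(math.log2(26))     # upper bound on bits per letter for a 26-letter alphabet, ~4.7
    print(math.log2(50000))  # upper bound on bits per word for a 50,000-word vocabulary, ~15.6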
All that said, I do agree with everyone that it's hard to draw firm conclusions from an entropy level in between the two extremes. I think an unusually high entropy level would be an "approaching gibberish!" red-flag, but I'll readily acknowledge that it wouldn't be any kind of airtight proof.
Lah objected to my assertion that "In meaningful English, certain combinations of words tend to occur and others don't," and replied, "You can't just baldly assert this, just from your feeling that it is so. Is there any mathematical rigor behind your notion of 'tend to'?"
Well, no, I haven't done the calculation with a large sample of English-language text yet. But do you really not expect the words "tall" and "building" to appear together more often than "tall" and "lake"? (You'd have to control for the unconditional frequencies of "building" and "lake," of course.)
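For what it's worth, here is a rough sketch of that "control for the unconditional frequencies" idea: an observed-versus-expected ratio for a word pair, computed from whatever corpus you feed it ("corpus.txt" is just a placeholder). Ratios well above 1 mean the pair occurs far more often than chance would predict.

    from collections import Counter

    def pair_surprise(words, w1, w2):
        unigrams = Counter(words)
        bigrams = Counter(zip(words, words[1:]))
        observed = bigrams[(w1, w2)] / (len(words) - 1)
        expected = (unigrams[w1] / len(words)) * (unigrams[w2] / len(words))
        return observed / expected if expected else 0.0

    # words = open("corpus.txt").read().lower().split()
    # print(pair_surprise(words, "tall", "building"))
    # print(pair_surprise(words, "tall", "lake"))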
"Clearly, at the maximum and minimum entropy levels, a text cannot have any meaning"
I think you're mistaken here. A well-compressed text file will come very close to maximum entropy. So by the above logic, compressing a file would make it less meaningful.
I do expect "tall building" to occur more often than "tall lake." But I also expect "tall building" to occur more often (controlling for absolute frequencies) than "consensus building", or "hideous building", or "tall Swede", or "tall bike". At what point do you draw a threshold of meaning based on frequency of occurrence?
It does your argument no good to pick a toy example like "tall lake"; by picking a nonsensical phrase to begin with, you're begging the question.
It may be interesting to note here that Amazon.com uses "statistically improbable phrases" from books to give a quick indication of what they're about (e.g. the Inside This Book section at http://bit.ly/8zktsu). This would appear to be in direct contradiction to your assertions about word occurrence.
"I agree with the first sentence, but it does not logically imply the second. Just because you can be meaningLESS at any entropy level doesn't imply you can be meaningFUL at any entropy level. Clearly, at the maximum and minimum entropy levels, a text cannot have any meaning (words would be either perfectly repeated or randomly distributed, as I explained)."
These are bald (and false) assertions. Of course you can produce meaningful texts of any entropy level. And why would you suggest that perfectly repeated and randomly distributed texts can't be meaningful? It simply isn't true.
Eric,
that strikes me as a bold assertion. Would you care to give an example?
Scott gave an example of a text with high entropy and meaning: a compressed file. Here's an example of a low entropy text with meaning. Let's say I'm trying to determine if a coin is fair or not. I flip it 10,000 times and record the data. Let's say that it is completely biased - every single toss is heads. This low entropy message ("HHHHH...H") is meaningful in that it tells you the coin is biased. If the message was high entropy ("HTHTHTHTHT...HT"), then it would still be meaningful since it would tell you your coin is fair. You could do the same thing for any entropy value you want and it would be meaningful.
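A toy calculation of the two cases (Python; the random module just stands in for a fair coin):

    import math
    import random
    from collections import Counter

    def bits_per_flip(seq):
        counts = Counter(seq)
        n = len(seq)
        return sum((c / n) * math.log2(n / c) for c in counts.values())

    biased = "H" * 10000
    fair = "".join(random.choice("HT") for _ in range(10000))

    print(bits_per_flip(biased))  # 0.0 bits: minimal entropy, yet it settles the question
    print(bits_per_flip(fair))    # ~1.0 bits: near-maximal entropy, and just as informative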
A clarification to my last comment: Obviously ("HTHTHTHTHT...HT") would have low entropy. I wrote it that way to show it was a fair coin. I assume a fair coin won't alternate heads and tails.
Jonathan wrote: "Consider again why we are convinced that random paragraphs drawn from the middle of physics papers are meaningful. NOT because an information-theoretic approach would show them to be 'like' ordinary sentences of English (or other spoken languages). Rather, we're convinced (with good reason) that the terms are well-defined, and the uses of the terms standard in those fields, etc."
Right, but what I'm suggesting is that if a term is well-defined -- i.e., it has a meaning that is widely agreed-upon -- then it won't make sense paired with just any other word. A noun in physics -- like "wave" -- has a specific meaning and so it wouldn't make sense to precede it with just any old adjective or verb (i.e., it wouldn't make sense to talk about a "transcendent wave" or to say "the wave darkened.")
My hypothesis is that if your terms have well-defined meanings, they will make more sense paired with some words than with others, and so in a meaningful text you should be able to make better-than-chance predictions about what other words surround a given word.
I think Eric is right. The problem with this conversation is that any string of symbols is meaningful if there exists a code with which to interpret it.
Furthermore, in the absence of such a code, we can easily invent one by fiat. "Agleblarg," for example.
Maybe what we need here is a more precise word than "meaning."
Julia:
But where in your attempt at analyzing texts strictly statistically is there room for the notions of "widely agreed-upon" or "in physics"?
Your examples are not nonsensical if you allow broader definitions of "wave": a "transcendent wave" could be a wave of transcendent feeling, and "the wave darkened" could indicate that a group of people in stadium seating doing the wave flipped over their placards from a bright color to a dark one.
You cannot simply assume that there is some easily quantifiable body of English standard language by which to judge texts. Hand-waving about doing quantitative corpus analysis is no substitute for actually doing so; in fact, you're opening yourself up to accusations of the same abuse of scientific terminology with which you charge the postmodernists.
Eric, that example doesn't convince me. The coin flipping series may carry information, but not 'meaning' in any sense that is, well, meaningful in the context of this discussion...
Scott wrote (and Eric cited): "A well-compressed text file will come very close to maximum entropy. So by the above logic, compressing a file would make it less meaningful."
It's true that compressing a text will increase its entropy, but that's equivalent to translating it into a different language -- it wouldn't be English anymore. You can compare an English-language pomo text to other English-language texts... or, you can compare a compressed pomo text to other compressed texts. The "language" would have to remain constant for the comparison to be useful, right?
Julia:
I would bet a lot of money that if you compared a compressed pomo text to a compressed "meaningful" text, you'd get two strings of symbols with practically indistinguishable entropy.
So if you're right that comparing the entropy of two English-language texts would tell us something about how much meaning they contain, then there must be something special about English (and presumably other human languages) that generates a correlation between meaning and entropy. That would be interesting. But I'm skeptical.
Eric:
Sorry but I'm with Massimo regarding your example above. Low entropy texts can't contain a lot of meaning, because they don't contain a lot of information. But I don't think the argument is (or could be) that pomo texts have low entropy.
That said, even low-entropy texts can be meaningful because we human beings are endowed with certain skills, such as making up definitions for previously undefined strings of symbols.
Julia,
Look at how compression algorithms work. You start with a text with an alphabet. Initially each letter is encoded with the same number of bits. With ASCII it is eight bits for example. But some letters occur with greater frequency than others. This lets you squeeze out redundancy by using fewer bits to encode common letters and more bits to encode rare letters.
But also some pairs of letters occur more often than others. You can recode each pair of letters as a single character. Then you repeat the trick where common characters are recoded with fewer bits and rare characters are recoded with more bits. Ditto three letter combination, four letter combination and so on.
Now think, a word is just a combination of letters that occur more frequently. Our algorithm will naturally detect and compress information at the word level while acting on the character level. There is no difference.
Look up Huffman encoding and see how it will naturally pick up words and even word combinations as statistically more likely letter combinations. It does this without being told that there are words.
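For the curious, here is a bare-bones character-level Huffman code builder, only the first step of what I described (no pair merging), just to show frequent symbols getting shorter codes:

    import heapq
    from collections import Counter
    from itertools import count

    def huffman_codes(text):
        freq = Counter(text)
        tiebreak = count()  # keeps the heap from ever comparing two dicts
        heap = [(f, next(tiebreak), {sym: ""}) for sym, f in freq.items()]
        heapq.heapify(heap)
        while len(heap) > 1:
            f1, _, left = heapq.heappop(heap)
            f2, _, right = heapq.heappop(heap)
            merged = {s: "0" + c for s, c in left.items()}
            merged.update({s: "1" + c for s, c in right.items()})
            heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
        return heap[0][2]

    codes = huffman_codes("the the the cat sat on the mat")
    # Common characters (space, 't', 'h', 'e') come out with shorter codes than rare ones.
    for sym, code in sorted(codes.items(), key=lambda kv: len(kv[1])):
        print(repr(sym), code)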
Now imagine I'm trying to make myself understood over a noisy phone connection. One way to do this is simply to repeat myself three or four times. OTOH if the phone connection is very clear I can speak rapidly skipping some words trusting that the meaning will be clear from context. Both low and high entropy messages carry the same information and presumably the same meaning. You can infer nothing from entropy alone.
Again, the standard for determining how often words appear together is going to be an odd indicator of meaning/understanding . . .
Take for example: "Hot December"
An analysis of Australian text would imply under this analysis more meaning than an analysis using European and American English texts. Only an example, but the extension of this problem can go across the board.
The problem in the entropy analysis is claiming greater meaning based on what we "normally see in coherent English writing."
I think one of the main goals of some Pomo is to show that the "normalizing" of mainstream beliefs (2010 American/Euro centered) can be problematic to certain people, and also entrapping us in a way, a culture, that we cannot see nor get out of. Trying to brush off PMism in such a simple entropy test would only reconfirm why they believe the norms of the West must be destabilized, even through obscure writing, if others are so dogmatically persistent in reproducing such norms and believing that their singular system defines truth or "proper" meaning or how things should be written in order to be understood.
(For the record, I consider myself a social constructionist, (I like Berger and Luckmann and Judith Butler) but not necessarily a PMist, although I do think there are important things we can find in them. At the same time, I am an ardent Naturalist, finding this blog from Naturalism.org)
Massimo,
I will gladly consider any arguments Julia puts forth for her idea that max and min entropy texts have no meaning. I gave one example showing she is wrong and feel no need to respond further until she makes an argument.
Julia,
Your claim was that at max and min entropy, texts have no meaning. That is blatantly false and Scott's and my example show that. A "well-compressed text file" is a "text", it is close to maximum entropy and it has meaning. It doesn't matter if it is in english or not. If your claim is that english texts with max and min entropy have no meaning, I'd like to see your argument for that.
Eric,
sorry but again I really don't buy the example of the coin sequence, you'll have to come up with something better.
As for compression algorithms, people, remember that they change the entropy, but probably not dramatically (not without distorting the original meaning), and that not all texts are equally compressible. While I don't necessarily buy Julia's original argument, it doesn't seem to me that she is arguing that there is a linear correlation, or even a simple monotonic function, relating meaning and entropy.
I find the claim by ppnl that entropy has *nothing* to do with meaning hard to swallow. After all, entropy is related to information, are we going to say that information and meaning are unrelated?
Should we even be talking about meaning without having established a concrete definition for it?
Scott,
"Low entropy texts can't contain a lot of meaning, because they don't contain a lot of information. "
Meaning is irrelevant to information theory. Do I really have to quote Shannon on this? Very well. This is from the 2nd paragraph of his seminal article:
"The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point. Frequently the messages have meaning; that is they refer to or are correlated according to some system with certain physical or conceptual entities. These semantic aspects of communication are irrelevant to the engineering problem."
You will find absolutely no mention of meaning in the standard text books on information theory (Cover and Thomas, for example). You guys and gals can protest all you want, but it just shows your ignorance of the subject.
Shannon information theory is concerned with the transmission of signals. The content of those signals is irrelevant to information theory.
Yes, I think information and meaning are unrelated. I think so because entropy is a purely intrinsic measure. The entropy of a string of symbols is always the same under any circumstance, but the meaning of a sentence depends on circumstances. For example, if I say "shut the door" to you, I probably want you to shut the door nearest to you, be it the door of an oven, a car or an apartment; and if there's no door anywhere near you, then I've uttered a meaningless phrase, or at most a metaphor. But under all those circumstances the entropy of the phrase "shut the door" remains the same.
I'm realizing, though, that this kind of reasoning calls for careful consideration of the term "information." We've been using "entropy" and "information" somewhat interchangeably, but maybe what Massimo means by "information" in the above post isn't the same thing as entropy. The Stanford Encyclopedia of Philosophy has something to say about the distinction I'm talking about, although in my opinion it's not their best article overall.
Finally clicked last night as I was trying to fall asleep.
Key error: The entropy under discussion here is a measure of the predictability of word variation. This has nothing to do with which words are being used.
Simple illustration: Take any English text. Transform it by replacing each letter with the one that follows it in the alphabet (g->h, h->i, z->a, etc). The text has exactly the same entropy before and after transformation, but completely different meaning as English.
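A concrete version of that illustration (Python; zero-order letter statistics are enough to make the point):

    import math
    import string
    from collections import Counter

    def letter_entropy(text):
        letters = [ch for ch in text.lower() if ch in string.ascii_lowercase]
        n = len(letters)
        counts = Counter(letters)
        return sum((c / n) * math.log2(n / c) for c in counts.values())

    shift = str.maketrans(string.ascii_lowercase, string.ascii_lowercase[1:] + "a")
    original = "colorless green ideas sleep furiously"
    garbled = original.translate(shift)

    print(garbled)  # "dpmpsmftt hsffo jefbt tmffq gvsjpvtmz"
    print(letter_entropy(original), letter_entropy(garbled))  # identical values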
Massimo,
English text can be compressed by about 60%-70% if I remember correctly. That's pretty substantial.
You said:
--"I find the claim by ppnl that entropy has *nothing* to do with meaning hard to swallow. After all, entropy is related to information, are we going to say that information and meaning are unrelated?"--
Until you define meaning the question has no meaning. You feel that there must be a definition to the term meaning out there for us to discover. But you are being Platonistic here. Meaning is a word that we define for some purpose. There may be no useful definition that fits your intended purpose here at all. Or maybe there is. Unless you invent such a definition the question is ill-formed.
You depend far too much on intuition for things like meaning, information, data processing and such. This is where philosophy goes wrong. Without precise definitions we can't even agree on what we disagree on.
I guess Julia made a mistake in mentioning Shannon and information theory without writing something along the lines of "I know that this is not the same, but it might be comparable" in flashing, bold letters. Maybe she does not know enough about information theory, fine, neither do I, admittedly, but again, please consider carefully the following things:
1. This is about words, not about letters.
2. This would be a one-way test (however you say that in English, technically) allowing you to say "this is definitely noise", but if it does not show up as noise, it could still be noise, only you cannot show it with this test. There are other examples in "real science", where you can say some successful test shows a certain genetic structure to be hybridization, but if the test fails, it can still be either hybridization or incomplete lineage sorting. So all who say that this would not be decisive: yes, we know, but at least it would be in one direction.
3. And yes, in one direction it would be. We are not talking about a hypothetical number of bits that could be compressed, we are talking about an English (or French) text. This whole thread about compressibility is irrelevant. This is clearly a less meaningful text than this one, and the same goes at the other (in this discussion more interesting) end where we only have randomly selected terms stuffed into an obfuscating sentence, and this would have to show up in word entropy if the text sample is sufficiently long*. The question whether you can compress English or whether a toss of a coin is meaningful under certain circumstances simply does not apply, because we would be comparing different uncompressed English texts supposedly meant to convey ideas.
* For the decision on what is long enough, I would suggest a calibration with saturation curves selecting random parts of varying lengths from a larger text to see at which lengths the predictability is close enough to the overall value for the text in its entirety. It seems obvious that one paragraph is not long enough for that kind of analysis.
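To make that calibration idea concrete, here is a rough Python sketch (the corpus filename is a placeholder, the corpus is assumed to be comfortably longer than the largest excerpt, and the unigram estimator is deliberately crude; any better estimator could be swapped in):

import math
import random
from collections import Counter

def entropy_per_word(words):
    # naive word-frequency (unigram) entropy estimate, in bits per word
    counts = Counter(words)
    n = len(words)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def saturation_curve(words, lengths, samples=20):
    # average entropy estimate over random excerpts of each length
    curve = {}
    for length in lengths:
        estimates = []
        for _ in range(samples):
            start = random.randrange(len(words) - length)
            estimates.append(entropy_per_word(words[start:start + length]))
        curve[length] = sum(estimates) / samples
    return curve

words = open("large_text.txt").read().lower().split()   # placeholder corpus
whole = entropy_per_word(words)
for length, est in saturation_curve(words, [100, 500, 2000, 10000]).items():
    print(length, round(est, 3), "vs whole text:", round(whole, 3))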
From the sidelines, I notice that the arguments against high correlation of info entropy and meaninglessness appear to be stronger than those in favor. As expected.
So when Deleuze talks about singularities that "possess a process" and of "differences" being "distributed" in an "energy," those are arrangements of words which you would not normally see in coherent English writing.
Julia, this proposal is, for philosophy, downright dangerous. That is because -- assuming it does what you say it will do -- it will privilege as "meaningful" only those texts which literally say the same sorts of things that previous texts have.
Forget the "pomos": suppose an author comes along who thinks that our patterns of speech are inculcating false and contradictory sets of ideas in our minds. The author concludes that a new way of speaking is precisely what is needed. Such a work would undoubtedly count as relatively meaningless under this mode of analysis.
Yet, relative to its own conventions and definitions, it may well be both incredibly rich and philosophically fruitful.
For this reason alone, I think the idea that philosophy should be shackled to linguistic convention is badly mistaken.
Eric:
I'm actually quite aware of Shannon's opinions on the subject, and I agree with both him and you: the measure of the entropy of a sentence is independent of its meaning. What I'm saying is that there's a case to be made that a high-entropy sentence can deliver more information, and therefore potentially more "meaning" -- as long as all the information delivered is meaningful.
It's like saying a high-bandwidth connection can deliver more content, faster, although of course it can also just deliver more noise.
Still, a low-entropy sentence can carry some meaning.
My objection to your heads/tails example had more to do with the fact that the string "THTHTHTH" isn't really about anything, in the way we think of meaningful sentences as being about other things. Perhaps you could say that "THTHTHTH," placed in the correct context, is "about" the coin; I think that might be accurate. So in that sense, a low-entropy sentence could have meaning.
But in your original example you used "meaning" in a slightly different, and I think rather more metaphorical, sense -- as in: "The palm trees are swaying, which means it's windy." I think that's a more metaphorical use of the term -- related, certainly, but not identical to the use we're talking about here.
ppnl:
ReplyDelete"Until you define meaning the question has no meaning."
I half-agree with your above post, but I don't think "meaning" is the problem. Although it's hard to pin down precisely, I think we have pretty good intuitions about what "meaning" is: a meaningful sentence is a sentence that is about something else -- a sentence that has a definite referent.
That's not to say that we'll all agree on whether a given sentence is meaningful! But that's because meaning is very context dependent.
I'm more skeptical of our intuitions about information and entropy...
Scott,
First, most of my last comment wasn't directed at you specifically; it was aimed more at Dr. Pigliucci than anybody.
And I agree with you that high entropy sentences deliver more information (they have more uncertainty), but whether those sentences are meaningful has nothing to do with information theory.
"Perhaps you could say that "THTHTHTHT," placed in the correct context, is "about" the coin; I think that might be accurate. "
But how is that different from any other text (in English or any other language)? In the sequence "HTHTHT...", "H" means "I flipped a coin and it landed heads" and "T" means "I flipped a coin and it landed tails." But this is no different from what we do in the English language. In the sentence "The tiger is running", "tiger" means "a mammal with orange and black stripes and a tail and ...". You could go on and define all the other words too.
I guess I just don't see the distinction you are making.
It is very odd for someone trained, as I was, as an analytic philosopher, to be in the position of "defending postmodern philosophy." And the funny thing is, I'm really not. I have no temptation to read and try to understand Deleuze - I agree with ppnl that my time is limited and valuable, and I'm not going to use it trying to understand texts that seem to me to be unclear and confused.
So I'm not arguing that pomo philosophy is coherent, or meaningful, or deep, or anything like that. Rather, I'm suggesting that there really isn't any such thing as "post-modern philosophy" and that lumping together a bunch of French post-structuralists, a bunch of strong-sociology of science people, and a bunch of random philosophers & historians who you happen to disagree with, is intellectually dishonest.
And again, not everyone tarred with the "post-modern" brush deserves the criticism. Is Baudrillard often a bit of a prick? Sure. Does he argue for points that are neither trivial nor obviously false? Sometimes at least, yes. Lyotard? Similarly. Habermas? Not only isn't he pomo, I think he's a full-blown modernist. I may not love his writing style, but he's making clear, focused arguments (his exchange w/ John Rawls is illuminating in this respect). Foucault? As I already mentioned, lots of arguments that are for conclusions that are neither trivial nor crazy.
The main point I want to make is that there isn't any short-cut to figuring out if a text is meaningful, let alone important. I am inclined to agree that if you make a 'good faith' effort, and no one can convince you that something important is going on (and they are making a good faith effort), there probably isn't anything important going on. But that's not a "test" of meaningfulness.
Sigh.
Jonathan
Hello everyone, I'm a mathematician who has spent some time studying information theory in the past, and I wanted to share with you my thoughts about this post. Unfortunately, I think a number of commenters have misunderstood Julia's proposal. I would like to try to clarify (for anyone who is interested) why texts with maximal entropy, and texts with 0 entropy are both in some loose sense "meaningless" (in the latter case, "informationless" would be more accurate), in accordance with her claims.
Real world languages are tricky to deal with for a variety of reasons, so for illustration purposes I will limit myself to a fictitious language called Moof, which contains only ten possible words (in other respects though, I will assume it is like other ordinary languages).
Now, let's say that we analyze a very large text written in Moof (of, say, a million words) and discover that this text has (to close approximation) maximal entropy per word (i.e. the number of bits we learn about what is in the actual text with each word that we are shown is as large as it can possibly be). In this case, that implies that the text was (to close approximation) generated by picking words one by one completely (uniformly) at random, and that each new word was selected (to close approximation) without regard for what words came before it. The reason this is so is that the only random process (outputting a word at each time step, say) that has maximum entropy is one that assigns equal probability to each word, and for which each word is independent of what came before. Note that this would require a decent method (for very large texts... it's essentially impossible to do well for small texts) of estimating entropy per word (of the underlying generating process).
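(To see those numbers come out of actual code, here is a toy version of the Moof setup in Python; the word names and text size are of course arbitrary. For a long text of uniform, independent draws, both the estimated entropy per word and the estimated entropy of each word given the previous one land near log2(10), about 3.32 bits.)

import math
import random
from collections import Counter

vocab = ["moof%d" % i for i in range(10)]                 # a ten-word vocabulary
text = [random.choice(vocab) for _ in range(1_000_000)]   # words chosen uniformly and independently

n = len(text)
counts = Counter(text)
unigram = -sum((c / n) * math.log2(c / n) for c in counts.values())

# conditional entropy of the next word given the previous word; for a
# genuinely independent process this should come out nearly the same
bigrams = Counter(zip(text, text[1:]))
prev = Counter(text[:-1])
conditional = -sum((c / (n - 1)) * math.log2(c / prev[p])
                   for (p, w), c in bigrams.items())

print(unigram, conditional, "maximum:", math.log2(len(vocab)))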
Okay, but what does that REALLY mean? Does it mean that every sentence in the text is meaningless? No, absolutely not, because even randomly generated sentences occasionally have meaning. In fact, it may even be the case that a great many of the sentences are meaningful to human beings (perhaps as bizarre, poetic statements, like "lizards free frantic dead"). The important conclusion that we can draw from the fact that it has maximum entropy per word is that the text (taken as a whole) was not composed with intentional meaning. There are a few ways one can think about why this is the case:
(see my next comment for continuation…)
(…continued from last comment)
1. Since each word that was chosen while generating the text is independent from the words that were just written before, it is as if the author got amnesia after writing each word (or, might as well have). How in any intentionally meaningful writing could it possibly be the case that the next word an author writes does not require knowledge of the word that he just wrote beforehand?
2. Since words are independent of each other, that implies that the author would have been just as likely to write the given text with the order of its words completely scrambled, as he would be to write the original text. But since word order genuinely matters in languages, and since most word orders lead to gibberish (try scrambling the word order of a few English sentences), that means that the author would be just as likely to write any particular gibberish reordering of the given text as they would be to write the original text itself. But since there are many more gibberish reorderings than meaningful ones, we should expect the author's process to generate mostly gibberish.
3. Since the author was essentially selecting words uniformly at random, they were as likely to write any particular sentence as they were to write any other particular sentence. How intentionally meaningful could a writer's work be if the writer could be effectively replaced with an incredibly simple algorithm that just strings together words at random?
Hopefully, you now believe me that if a large real world English text truly had maximum entropy per word, it would indeed be meaningless (just like maximum entropy Moof texts). On the other hand though, maximum entropy per word English also violates the rules of grammar. To use such a technique to evaluate grammatically correct (but possibly still "meaningless") texts, the technique would need modification. One would have to somehow normalize the entropy to deal with the extra structure imposed by grammar. Then, you would be measuring how much entropy a text has compared to that of grammatically correct (but not intentionally meaningful) text. One thought about how to do this is to compare the entropy per word of the text under consideration to the entropy that would be achieved if you generated random grammatically correct sentence structures (according to the frequency with which they occurred in the text) that had words with the proper parts of speech placed within them like a madlib (according to the frequencies with which each of these words occurred in the text, but without ever considering what came before a given word). The technical details of how one would actually pull this off, and of how one would estimate the actual entropy of the process underlying a large text (i.e. essentially the entropy per word of the text... I've been sloppy with terminology throughout this comment for convenience) are tricky and would require some serious thought.
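(For anyone curious what that normalization might look like in practice, here is a rough Python sketch. It assumes some part-of-speech tagger is available; the pos_tag call below is a stand-in for whatever tagger you use, e.g. NLTK's. The real text's sentence skeletons, i.e. tag sequences, are kept and refilled madlib-style with words sampled from the text by part of speech, and the two entropy-per-word estimates are then compared.)

import math
import random
from collections import Counter, defaultdict

def entropy_per_word(words):
    counts = Counter(words)
    n = len(words)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def madlib_baseline(tagged_sentences):
    # pool of words observed for each part-of-speech tag, kept with repeats
    # so that sampling reflects each word's frequency in the original text
    pools = defaultdict(list)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            pools[tag].append(word)
    # refill each sentence skeleton with random draws from the matching pools
    generated = []
    for sentence in tagged_sentences:
        for _, tag in sentence:
            generated.append(random.choice(pools[tag]))
    return generated

# tagged_sentences = [pos_tag(tokens) for tokens in tokenized_sentences]   # stand-in for a real tagger
# real_words = [word for sentence in tagged_sentences for word, _ in sentence]
# print(entropy_per_word(real_words), entropy_per_word(madlib_baseline(tagged_sentences)))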
Now, on the flip side, why would a 0 entropy per word Moof (or English) text be uninteresting? Well, if according to person X a text has 0 entropy that implies that before reading each word, X can predict exactly what that word is going to be. That doesn't imply that the text is truly meaningless (in the sense of actually having no meaning), but it does mean that it conveys no information to X (it tells X nothing that X didn't already know). Essentially, this is only the case if each word in the text is completely determined by the words that came before it (from X's perspective). A real world example of this would be if the text consisted of a poem that person X already knew by heart.
I hope that this helps to clarify things!
Jonathan,
while I agree with you that "postmodernism" is a vague term, frankly so is "analytic philosophy," and one can find gibberish, or at least irrelevance, in both camps.
But I don't think it is fair to say that there is *no* postmodernism, or that some authors aren't clearly more representative than others of that way of thinking, or that some of these authors aren't quasi-nonsensical and/or irrelevantly obfuscatory.
Yes, Foucault has written plenty of interesting things. He has also written things that make little sense, and he has done both using language that rarely should be seen in a philosophical essay.
As Witty put it: "Philosophy is a battle against the bewitchment of our intelligence by means of language." (Of course, he himself was rarely a good example of clear language, but that's another story...)
Scott,
You should be skeptical about intuition about information and entropy. Fortunately we don't need intuition about them. We have exact mathematical definitions. And the implications of those definitions are very counterintuitive.
The problem with meaning is that we have no quantitative definition. All we have is intuition and that is often very very wrong.
Machine code is an example of very high entropy information. If by meaning we mean the complexity of the logical connections that determine what a given bit does then arguably machine code is much more meaningful than any English text of a similar size.
But then a random string of binary digits would have at least as much information. But meaning? I just don't see correlation between entropy and meaning.
If anything I would argue that pomo texts have lower entropy than expected. Take the example:
"...neither stable nor unstable, but rather metastable..."
You have "stable appearing three times with two different prefixes and some connecting words. You could replace the whole mess with one word "metastable" and the sentence would have the same meaning. If it means anything at all. This verbal excess reduces the entropy of the text. It has no effect on meaning. And this kind of verbal excess is pretty much what defines pomo texts.
Another way to think about it is to think of a pomo text as a verbal ink blot used in a kind of linguistic Rorschach test. You don't need to draw a high resolution picture of something specific. You only need to use a little bit of information and a great deal of style to make it seem like you said something profound.
Yet I don't think low entropy is diagnostic of meaningless crap any more than high entropy is. You can say something meaningful with low entropy text. It just takes longer.
Jonathan,
Dude I think you have nailed everything that needs to be said on this subject.
ClockBackward,
It seems to me that your argument is moving in the wrong direction. Your three propositions read, to me, as an argument that a randomly-generated text will not have much meaning. I agree! The question is whether it is possible to non-randomly generate a high-entropy text with meaning. This would be a "pseudo-random" text.
1) "...each word that was chosen while generating the text is independent from the words that were just written before..."
I assume you mean statistically independent, right?
"...it is as if the author got amnesia after writing each word (or, might as well have)..."
Quite so. But say the author didn't get amnesia. Let's say (for argument's sake) that the author has decided to defeat our vacuity-testing algorithm. After every word, she gets a list of those words that are statistically independent of the previous words she has written and intentionally chooses one that generates a meaningful sentence.
2) "Since words are independent of each other, that implies that the author would have been just as likely to write the given text with the order of its words completely scrambled, as he would be to write the original text."
Again, I agree -- if the text is randomly generated! But you still haven't ruled out the possibility of a high-entropy text being generated in a non-random way.
3) "How intentionally meaningful could a writer's work be if the writer could be effectively replaced with an incredibly simple algorithm that just strings together words at random?"
Again, this is a question-begging formulation. There's no reason to imagine that the writer could be effectively replaced with a random algorithm. What we have here is a situation in which, yes, there are many possible random orderings of words, but a few are meaningful as a whole, and our author manages to find one of them.
The matter of English being inherently redundant complicates these issues somewhat. But I'm still fairly sure that the process I've described here would generate an English text that would approach the theoretical maximum entropy for an English text.
People might feel that the constraint of having to choose only statistically independent words would be too severe to allow for meaning, but I would be surprised if that were so. There's a novel (Georges Perec's La Disparition, written first in French and translated into English as A Void) that contains not a single 'e'! I understand it reads quite smoothly.
My final remark is on statistical independence. As I understand it, that is what an entropy estimate is really tracking. So the sequences generated by pseudo-random number generators are still high-entropy in this statistical sense. The difference is that they have low algorithmic entropy (Kolmogorov complexity), which is a horse of a different color altogether.
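(A small Python sketch of that last distinction, for the curious: the bit stream below comes from a few lines of deterministic code, a linear congruential generator, so its algorithmic description is tiny, yet a simple frequency-based Shannon entropy estimate for it is essentially maximal. The unigram estimate is crude, but it is enough to show the gap between the two notions.)

import math
from collections import Counter

def lcg_bits(seed, n):
    # classic LCG parameters (Numerical Recipes); emit the top bit of the state
    state = seed
    for _ in range(n):
        state = (1664525 * state + 1013904223) % 2**32
        yield (state >> 31) & 1

bits = list(lcg_bits(seed=42, n=100_000))
counts = Counter(bits)
n = len(bits)
estimate = -sum((c / n) * math.log2(c / n) for c in counts.values())
print(estimate)   # close to 1 bit per symbol, even though the generator is a handful of lines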
ClockBackward:
ReplyDelete"The technical details of how one would actually [normalize entropy to account for grammar], and of how one would estimate the actual entropy of the process underlying a large text... are tricky and would require some serious thought."
Yes, that's the point. The obvious cases on the ends of the spectrum are insufficient to prove anything about the messy middle. The trivial disproof I wrote earlier still holds against this grammar restriction, with minor modification:
Take any "meaningful" text, and swap all instances of a given word or phrase for another which does not violate grammar rules but renders the text obviously "meaningless". Replace "text" with "camel" in your own post, for instance. Now we have two texts, one "meaningful" and one "meaningless", with identical entropy.
-----
ppnl: "This verbal excess reduces the entropy of the text. It has no effect on meaning." I disagree, on the grounds that because text is descended from spoken language, cadence, rhyme, emphasis, cadence, repetition and other such characteristics which are not strictly information-carrying ("information" here in the conventional sense) still affect how the text will be received by a reader. I know this has nothing to do with the main thrust of the argument, but I couldn't resist a quick defense of rhetoric. Please excuse my pedantry. :)
-----
Jonathan: "There isn't any short-cut to figuring out if a text is meaningful, let alone important." Yep.
Hi Scott,
In response to your criticism of my explanation of why maximum entropy per word texts do not have intentional meaning:
"Your three propositions read, to me, as an argument that a randomly-generated text will not have much meaning. I agree! The question is whether it is possible to non-randomly generate a high-entropy text with meaning. This would be a "pseudo-random" text."
Unfortunately, responding to this requires delving a little bit into the subtleties of information theory.
While it can be convenient to use the shorthand phrase "the entropy of a body of text", you cannot actually measure the entropy of text, you can only measure the entropy of a probability distribution, or more generally, of a random process (for example, one that generates words, one at a time). What you can do though is use a text to estimate the characteristics of the random process from which that text was created, and then use that knowledge to estimate the entropy of the random process (which, for convenience, we'll call the "entropy of the text").
If the random process from which a text is drawn has maximum entropy, then the text really was generated uniformly at random, and the generator of that text had amnesia (or might as well have had it) in the sense that each word written required no knowledge of the words written beforehand. It is absolutely impossible to generate a text using a random process that truly has maximum entropy, and have the text have intentional meaning.
However, as mentioned, since all we can do from any particular text is ESTIMATE the random process used to generate the text, at best all we could produce is entropy per word estimates (hopefully with confidence intervals). This procedure would require a choice of estimation algorithm (used to estimate the entropy of the underlying process from the text itself), and hence we might get slightly different results depending on our choice of algorithm. Fortunately this may well still be sufficient to perform a test like Julia proposed.
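(For what it's worth, here is one very crude instance of that estimation step in Python: a first-order, bigram Markov estimate of the entropy per word. Real estimators use longer contexts, smoothing, and confidence intervals, but the shape of the procedure is the same: fit a model of the generating process to the text, then compute the entropy of that model. The corpus filename is a placeholder.)

import math
from collections import Counter

def bigram_entropy_per_word(words):
    # conditional entropy of the next word given the previous word,
    # under the empirical bigram model fitted to this text
    bigrams = Counter(zip(words, words[1:]))
    prev = Counter(words[:-1])
    total = len(words) - 1
    return -sum((c / total) * math.log2(c / prev[p])
                for (p, w), c in bigrams.items())

words = open("large_text.txt").read().lower().split()   # placeholder corpus
print(bigram_entropy_per_word(words))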
I agree with you in the sense that, for a given fixed estimation algorithm, it may be possible (though extremely difficult and tedious) to design a meaningful text that appears to that algorithm to have maximum entropy. But this is not actually as damning as it seems. If the text under consideration is very large, and the estimation algorithm used is good, it is EXCEEDINGLY unlikely that a genuine English text created with intentional meaning would produce an estimated entropy per word very close to the maximum possible (even if it is indeed hypothetically possible). For that to happen, the author's use of each word would have to appear to be independent of the words that came before that word, which is (approximately) never true of texts with intentional meaning.
Hence, Julia’s test as I interpret it (when made specific and formal) is a one-way, statistical test for very large texts (or bodies of them). If a text produces an estimated entropy per word that is almost maximal, then you can conclude that (the majority of) the text was almost certainly not created with intentional meaning (though a few intentionally meaningful sentences could always have been thrown in here or there, and perhaps an incredibly crafty foe could have purposely defeated your entropy estimation algorithm). On the other hand, if the test shows that the estimated entropy per word is similar to that of other large texts drawn from sources known to have intentional meaning, then there is not much that can be concluded (hence the reason I said the test was “one way”).
Hey, exactly what I thought, and I do not even have a whiff of education in information theory. Thanks for the explanation, CB.
ClockBackward,
ReplyDelete"It is absolutely impossible to generate a text using a random process that truly has maximum entropy, and have the text have intentional meaning."
Of course! This seems like a tautology to me. How could any random process generate a text with "intentional" meaning? At best, a random process could generate accidental meaning, but as you say, that's not very likely. But what if the process isn't random? What if it's pseudo-random? Here again, you seem to be assuming the very fact we're trying to test for. Or is there something I'm misunderstanding about the term "random process"?
"I agree with you in the sense that, for a given fixed estimation algorithm, it may be possible (though extremely difficult and tedious) to design a meaningful text that appears to that algorithm to have maximum entropy."
I'm glad I'm not entirely off-base then. But would it really be more difficult and tedious than composing, say, a sestina or a sonnet cycle? Many poetic forms involve manipulating the entropy of the text, usually lowering it by using predictable rhyme or meter, repeating words, and so on. Moving in the opposite direction might actually be easier. This seems like a proposition that would need to be empirically tested.
In any case, I take your point to be that an unusually high-entropy text would be more likely to be meaningless, in which case this test would provide evidence, but not quite proof -- which was, after all, Julia's original ambition.
Jonathan, well put.
I think the problem of the unsympathetic reading of post-modernism extends beyond the initial blog post into Julia and Massimo's responses in the comment thread. The problem isn't just the first unsympathetic reading, it's the fact that y'all are completely uninterested in what anyone else has to say.
What's the point of communicating your ideas if you (think you) are never wrong?
And just to clarify, I don't mean that you're completely wrong. But surely other people have made good points that clash with the points you've made...