Biol Philos (2014) 29:395–413 DOI 10.1007/s10539-013-9410-2
Getting the most out of Shannon information Oliver M. Lean
Received: 14 July 2013 / Accepted: 10 December 2013 / Published online: 25 December 2013 Springer Science+Business Media Dordrecht 2013
Abstract  Shannon information is commonly assumed to be the wrong way in which to conceive of information in most biological contexts. Since the theory deals only in correlations between systems, the argument goes, it can apply to any and all causal interactions that affect a biological outcome. Since informational language is generally confined to only certain kinds of biological process, such as gene expression and hormone signalling, Shannon information is thought to be unable to account for this restriction. It is often concluded that a richer, teleosemantic sense of information is needed. I argue against this view, and show that a coherent and sufficiently restrictive theory of biological information can be constructed with Shannon information at its core. This can be done by paying due attention to some crucial distinctions: between information quantity and its fitness value, and between carrying information and having the function of doing so. From this I construct an account of how informational functions arise, and show that the "subject matter" of these functions can easily be seen as the natural information dealt with by Shannon's theory.

Keywords  Biological information · Teleosemantics · Shannon information · Biological function
O. M. Lean, Department of Philosophy, University of Bristol, Cotham House, Bristol BS6 6JL, UK. e-mail: [email protected]

Introduction

When biologists talk about biological entities carrying or transmitting information, what kind of information are they talking about? This question is the subject of an ongoing debate among biologists and philosophers. Some (e.g. Sarkar 1996a, b) believe that informational concepts are a historical artefact that no longer play any
useful role in biology. Others (e.g. Levy 2011) see such language as fictional, but usefully so. Even those that remain—those that believe biological information to be both useful and non-fictional—are divided over how this information should be defined. An oft-cited candidate for this definition is the mathematical sense of information famously established by Shannon (1948). Shannon information is a statistical measure of the uncertainty in a system: the negative logarithm of the probability of each possible state of that system, weighted by that state's probability:[1]

$$H(X) = -\sum_{x \in X} p(x) \log_2 p(x) \qquad (1)$$
Two systems share mutual information if their states are correlated in some way; that is, if observing the state of one system alters the probability distribution of the other. The amount of mutual information between two systems X and Y is defined as the average of the statistical associations between each pair of states:

$$I(X;Y) = \sum_{y \in Y} \sum_{x \in X} p(x,y) \log \frac{p(x,y)}{p(x)\,p(y)} \qquad (2)$$

In short, two systems carry mutual information about each other if their states are in some way statistically associated; if one can learn something about one of the systems by attending to the other (see Dretske 1981). How much information exists between the two depends on how much, on average, the probabilities are changed by this observation. Henceforth, I will refer to this as "natural", "mutual" or "correlational" information, as largely stylistic alternatives.

This sense of information, although mathematically rigorous, is almost universally rejected as a suitable way in which to conceive of information in a biological sense. Reasons commonly given for this rejection are as follows. Firstly, since mutual information applies to any two systems that are statistically associated, any system that causally affects an outcome qualifies as carrying information about that outcome. So although genes qualify as carrying information about the outcome of development, for example, so too does the temperature of the embryonic environment, the nutrients provided, any teratogenic toxins that may be present, and so on—anything whose variation would change the developmental outcome. I will refer to this criticism as the causal parity argument.[2] But biologists tend not to treat most of these as "sources of information"; they restrict this to only certain factors in the causal milieu, such as DNA and (for some) environmental cues that govern adaptive phenotypic plasticity (West-Eberhard 2003). A purely statistical sense of information cannot account for this difference, so the argument goes.
[1] This measure is analogous to the thermodynamic measure of entropy (Bais and Farmer 2007).
[2] This isn't to be confused with the differently-motivated "parity principle", which can refer to an argument by proponents of Developmental Systems Theory (DST) that any sensible conditions met by DNA as an "information source" in development are also met by other resources (see e.g. Griffiths and Knight 1998). This is typically a criticism of chauvinism about DNA's role in the developmental process, at the expense of other non-genetic developmental factors. Here I use "causal parity" to refer to arguments that a statistical definition of information fails to filter any "information-carriers" in biology from causes in general.
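To make these two quantities concrete, here is a minimal Python sketch of Eqs. (1) and (2). The function names and the toy distributions are my own illustrative choices, not part of the original text; logs are taken base 2, so both quantities are in bits.

```python
import math

def entropy(p):
    """Shannon entropy H(X) in bits, as in Eq. (1): -sum p(x) log2 p(x)."""
    return -sum(px * math.log2(px) for px in p if px > 0)

def mutual_information(joint):
    """Mutual information I(X;Y) in bits, as in Eq. (2).
    `joint[i][j]` is the joint probability p(x_i, y_j)."""
    px = [sum(row) for row in joint]        # marginal p(x)
    py = [sum(col) for col in zip(*joint)]  # marginal p(y)
    return sum(
        pxy * math.log2(pxy / (px[i] * py[j]))
        for i, row in enumerate(joint)
        for j, pxy in enumerate(row)
        if pxy > 0
    )

# A fair coin carries one bit of uncertainty:
print(entropy([0.5, 0.5]))                     # → 1.0

# Perfectly correlated systems share all of that uncertainty...
print(mutual_information([[0.5, 0.0],
                          [0.0, 0.5]]))        # → 1.0

# ...while independent systems share none:
print(mutual_information([[0.25, 0.25],
                          [0.25, 0.25]]))      # → 0.0
```

The last two calls illustrate the point in the text: observing one of two perfectly correlated systems removes all uncertainty about the other, while observing an independent system changes no probabilities and so carries no mutual information.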
Secondly, in a distinct but related objection, correlational measures of information explicitly ignore information's content. That is, information in a Shannon sense does not constitute anything like a proposition about the world that may or may not be true. Because of this, it cannot account for notions of misreading of content, or of misrepresentation of the world. But this language is common in biology: a developmental system can fail to produce a phenotypic feature "correctly", and an animal can send a "false" signal to another. In the Shannon sense, however, "right" and "wrong" outcomes caused by another system are information all the same.

These criticisms constitute part of the case for a definition of biological information that incorporates teleosemantics (Maynard-Smith 2000a; Jablonka 2002; Shea 2007, 2012b), in which carrying information is restricted to, and its content determined by, a certain selective history. In this view, only some things in biology carry information about an outcome because they have a history of selection for producing that outcome. What's more, because information content is determined by past selective environment and past outcomes or behaviours that may not be tokened in particular instances, it offers a clear account of error. This reliance on selective history means that teleosemantic information does not apply to any and all causes, and is therefore thought to account for biological information better than a correlational kind.

Causal parity and the problem of error, as criticisms of a Shannon-centric theory of biological information, are widely cited in various forms in the literature (e.g. Jablonka 2002; Godfrey-Smith and Sterelny 2008; Shea 2012a). The primary focus of this work is to defend a Shannon-based theory of information against these criticisms.
In doing so I will develop a clear treatment of what it means for a biological entity to have an informational function—crucially, a property not shared by all causes in a biological process. I show that once these functions are clarified, there is no need to elaborate on our definition of information beyond the Shannon sense in order to avoid causal parity. The "subject matter" of informational functions can comfortably and usefully be seen as natural information. It is the failure to appreciate the difference between information per se and informational functions that has fuelled the aforementioned criticisms and, in part, motivated the search for a richer, teleosemantic definition.
The mark of biological information

A conceptual analysis

There is no particular reason to suppose that a suitably unpacked notion of biological information will admit every process (and only those processes) to which the term is typically applied. Informational language is largely informal in nature throughout much of the field, so a more formal treatment may well lead to different conclusions about some individual cases. However, for the theory to be at least recognisable as capturing the key features of informational language, rather than something else entirely, the overlap between cases of information in casual usage
and those in a more rigorous treatment should be considerable. To that end, we will engage in a brief conceptual analysis regarding the kinds of process which are typically considered to qualify, and those that aren't.

Processes that aren't usefully spoken about in terms of information are typically accounted for by causation alone (hence why a causal notion of information is typically dismissed). For example, when an organism takes in nutrients and uses them for its normal biological functioning, or when a toxin disrupts any of those processes. While the former is in some sense adaptive, and the latter is pathological, both can be viewed as variations in the underlying causal factors that affect the outcome of some process, whether positive or negative in terms of the organism's interests.

Processes generally discussed in terms of "information" include the response of a certain tissue to a hormone released by another tissue, or a phenotypic change in response to some change in the environment. The term "response" is key to the difference here: information as typically conceived by biologists is something the organism uses for something resembling an adaptive decision. In other words, information is something that bears on which of a number of alternative phenotypes will lead to the best performance in a given state of affairs. So hormones lead to certain responses that are adaptive to states of the organism associated with those hormones being released, for example. On the other hand, although using nutrients in bodily functions is clearly preferable to starvation, starvation per se is never a preferable alternative.[3] The continuing functioning of the organism enabled by the food is therefore not a choice from among reasonable alternatives. This difference is analogous to a car drifting to a halt when it runs out of fuel, versus some "fuel-saving" mode the car can adopt when fuel is running low.
Mutual information and fitness value

The above suggests that a key defining feature of processes in which information-talk seems appropriate is some semblance of deliberation on the part of the organism, in response to something that provides information about a state of the environment that has a bearing on the organism's fitness. From this it seems understandable that the type of information being dealt with should be considered intentional in nature, as something like purposefulness seems integral to the concept.

However, this isn't necessarily the case. Although it seems that a useful distinction between the informational and the non-informational must contain an element of intentionality, this needn't take the form of a re-definition of what kind of information is being dealt with in these cases. By analogy, both my radiator and my laptop produce heat, but only one of them is for doing so. Yet the definition of "heat" in both cases is the same, whether or not its presence is deliberate in any sense. Similarly, it may be possible to develop a language for information in biological systems, with their added element of design, without elaborating on the commodity of information that is being exchanged.
[3] Although a "starvation state" intended to limit the damage caused by starvation might be.
Towards this goal, I will begin to develop a more formal expression of the intuition-priming from the "A conceptual analysis" section, which will involve a closer look at the relationship between the amount of information and its value. This mathematical treatment is a simplified abstraction of more in-depth work on this relationship done in recent years (Donaldson-Matasci et al. 2010; McNamara and Dall 2010; Bergstrom and Lachmann 2004), which has shown interesting connections between Shannon information and the decision-theoretic concept of information value. A clear idea of the relationship between the two will show that although information value is necessary to explain selection on these systems, mutual information can do a significant portion of the explanatory work, and can still be considered the subject matter of these processes.

Here we explore an abstract case of an organism and a nutrient, N, that is essential to its survival. I also introduce a "deprivation state" that can be triggered when N is scarce in the diet. We assign the following fitness values:

                            e1 (N available)   e2 (N unavailable)
  φ1 (normal state)                10                   1
  φ2 (deprivation state)            6                   5
For the purpose of clarity I exclude varying levels of N and treat its presence or absence as discrete. These values capture the idea that the austere deprivation state is highly advantageous when N is unavailable in the environment, but sub-optimal otherwise due to the consequent reduction in various biological functions. They also demonstrate the idea that the deprivation state is more than just the direct, causal-mechanistic consequences of adding or removing the nutrient. If that were all that φ1 and φ2 amounted to, one couldn't make sense of either the top-right or bottom-left values. These states would make as little sense as burning in the absence of oxygen. Instead, the deprivation state can be seen as, say, release of hormones leading to altered gene expression in various tissues, and so on.

A brief aside seems necessary here: given the exact constitution of the organism as it is, these physiological consequences of altered gene expression, etc. could also be seen as causally inevitable in at least some sense. But there is clearly a far more contingent element to these effects, in that it is possible for these changes not to be enacted if the organism's constitution were realistically different. It may seem tempting at this point to state the difference in terms of an evolved response. But for now I will avoid appeal to selection, on the grounds that the above matrix can and should exist independently of whether any selective process has so far occurred. Instead, all that is required for the distinction between "pure" causal inevitability and a more contingent kind of consequence is that variation in the outcome is possible within reasonable bounds of constancy in the organism's biological characteristics.
Abstracting further from our above example, we can express the various expected payoffs, in terms of fitness, of two different phenotypes in two different states of the environment as follows:

           e1          e2
  φ1   F(φ1|e1)    F(φ1|e2)
  φ2   F(φ2|e1)    F(φ2|e2)

where F(φi|ej) means the expected fitness of an organism possessing the ith phenotype in the jth environment. Needless to say, if the payoffs are such that one phenotype or the other is always preferable, the fittest strategy will be one which always adopts that phenotype. Things are less simple when, as above, the preferable phenotype depends on the state of the environment. In cases like this, the fittest strategy will be the one with the highest expected payoff[4] F(φi), which is a function both of the payoffs in each environment, and the probability of their occurrence:

$$F(\phi_i) = \sum_{j=1}^{n} p(e_j)\,F(\phi_i \mid e_j) \qquad (3)$$
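As a concrete illustration of Eq. (3), the following Python sketch computes the expected payoffs for the nutrient example above. The prior probabilities are hypothetical values chosen purely for illustration, and the function and variable names are my own:

```python
# Fitness payoffs from the earlier table: F[phenotype][environment],
# with phi1 = normal state, phi2 = deprivation state,
# e1 = N available, e2 = N unavailable.
F = {"phi1": {"e1": 10, "e2": 1},
     "phi2": {"e1": 6,  "e2": 5}}

def expected_fitness(phi, p_env):
    """Eq. (3): F(phi_i) = sum_j p(e_j) F(phi_i | e_j)."""
    return sum(p_env[e] * F[phi][e] for e in p_env)

# Hypothetical priors: suppose the nutrient is usually available.
priors = {"e1": 0.8, "e2": 0.2}

print(expected_fitness("phi1", priors))   # ≈ 8.2
print(expected_fitness("phi2", priors))   # ≈ 5.8
```

Under these priors the normal state φ1 has the higher expected payoff, so a lineage with no access to any cue would do best to adopt φ1 unconditionally.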
So far there doesn't appear to be any particular reason to invoke the concept of information. We have two possible states of the environment with different consequences for fitness, and similarly we have two possible phenotypes whose fitnesses are different in the two possible environments. The optimum phenotype to adopt depends partly on the probabilities of each environmental state, but in the absence of any clues about which state obtains the only consideration is their prior probabilities.[5] In a long-term selection scenario, one phenotype will win out over the other.[6]

However, the situation takes on a much more interesting character when we introduce a cue, C, that carries natural information about the state of the environment. Within Shannon information theory, anything whose presence is in some way statistically correlated with the state of the environment meets this criterion, regardless of the underlying causal story of how that correlation is achieved. The quantity of mutual information between C and the state of the environment can be seen as the amount by which uncertainty about the environment
[4] This is true in the case of "idiosyncratic risk", in which every individual in the population experiences a different randomly-generated environmental state. However, when all individuals experience the same randomly-selected environment the fittest strategy will be the one with the highest geometric mean payoff, or expected log payoff. Thanks to an anonymous referee for pointing this out.
[5] The question of how to define prior probabilities in biological cases is a thorny issue. Since the question here is which of two phenotypes will do the best in the long-term, a frequentist interpretation based simply on how often each environmental state will be realised in the organism's environment seems uncontroversial.
[6] Donaldson-Matasci et al. (2010) observe that some "proportional betting" strategy would be ideal in some circumstances, such as when one environment is lethal to one phenotype. For clarity purposes I deal with pure strategies alone.
is reduced by observation of C: the greater the change in probabilities, the greater the amount of information. Crucially, a cue that carries information about the environment has the potential to change the optimum phenotype to adopt, by altering the expected payoffs of each phenotype via the change in probabilities accompanied by the cue. These new expected payoffs are calculated by factoring the updated probabilities into Eq. (3) above:

$$F(\phi_i) = \sum_{j=1}^{n} p(e_j \mid C)\,F(\phi_i \mid e_j) \qquad (4)$$
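Continuing the sketch from the nutrient example, Eq. (4) can be illustrated by replacing the priors with posterior probabilities conditioned on the cue. The posterior values here are hypothetical, chosen only to show how a sufficiently informative cue can flip which phenotype is optimal:

```python
# Payoffs as before: phi1 = normal state, phi2 = deprivation state.
F = {"phi1": {"e1": 10, "e2": 1},
     "phi2": {"e1": 6,  "e2": 5}}

def expected_fitness(phi, p_env):
    """Eq. (4) when p_env is the posterior p(e_j | C); Eq. (3) otherwise."""
    return sum(p_env[e] * F[phi][e] for e in p_env)

priors = {"e1": 0.8, "e2": 0.2}              # hypothetical priors, as before

# Suppose (hypothetically) that C is strongly but imperfectly correlated
# with nutrient scarcity, so observing C shifts the probabilities towards e2:
posterior_given_C = {"e1": 0.1, "e2": 0.9}

phenotypes = ["phi1", "phi2"]
best_without_cue = max(phenotypes, key=lambda p: expected_fitness(p, priors))
best_given_cue = max(phenotypes, key=lambda p: expected_fitness(p, posterior_given_C))

print(best_without_cue)   # → phi1 (normal state)
print(best_given_cue)     # → phi2 (deprivation state)
```

Without the cue, φ1 is optimal (expected payoff ≈ 8.2 vs ≈ 5.8); given the cue, the updated payoffs (≈ 1.9 vs ≈ 5.1) make φ2 optimal. This is the sense in which the cue's information "has the potential to change the optimum phenotype".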
So if the cue carries enough information about the environment, it can change the expected payoffs to the point where a different phenotype is optimal. However, the mere existence of C is not enough to actually change the phenotype of the organism unless it wields some kind of causal influence over which phenotype is adopted. Indeed, there will be countless features of the world that carry natural information about the environmental state in question, but which have no way of affecting the phenotype. What's more, even if a cue does causally affect that phenotype there is no a priori reason why its influence should be in the direction of a more adaptive strategy—it may hinder the expedient phenotypic change instead. But if there is variation in the population in terms of how C affects the phenotype that's adopted, there is scope for selection of those variants that are changed in the right direction ("right", that is, simply in the sense of fitness). The result of this selection would be organisms that are causally affected by C in a way that biases the phenotype towards the one that is optimal in the environmental state with which C correlates. In other words, selection would produce organisms that respond adaptively to causes that carry mutual information about selectively relevant states of the environment.

A working definition

The "Mutual information and fitness value" section tells an abstract story of how selection can produce organisms that respond adaptively to environmental cues. That this is possible is uncontroversial. Where disagreement arises, however, is in what can usefully be said to be the result of this selective process. Some advocates of a richer sense of biological information argue that selection produces what we can reasonably call teleosemantic information, and that it is this kind of information that justifies our speaking of such processes in informational terms at all.
My contention is that this approach creates some unnecessary conceptual problems, as I will show in the "Comparisons" section. In the meantime, I continue with what we might call a "ground-up" account of how the correlational information that is ubiquitous in the world produces a particular type of function, without creating the need for a new sense of information to explain how such functions work and how they arise.

The exact definition of an etiological function varies to an extent, but all carry the general condition that the function of a biological object is the role which contributed to the reproduction of previous tokens of that object, leading up to the current one. A crucial point about etiological functions, as Buller (1998) notes, is
that even the strong theory of functions relies in some sense on the non-historical, causal-mechanistic notion of function discussed by Cummins (1975). This notion of function is needed to make sense of the causal contribution to the organism's fecundity that selection favours in the first place. Etiological functions therefore entail a Cummins function, albeit one that was performed by past tokens of some trait, rather than the current token.

However, merely having a function isn't enough to justify our discussing those functions in informational terms. For this we need a further criterion. I suggest that this criterion lies in what explains the fitness of the response. The fitness consequences of the change in phenotype caused by the cue don't make sense unless we appeal to the information that the cue carries about the relevant environmental state: variants whose phenotype was changed by the cue in that way were fitter than those that didn't change (or changed differently) because the cue correlated with the state of the environment for which that phenotype was appropriate. This is not true of all causes that affect a phenotype: any trait will only develop within a given temperature range, for example, but the temperature at which it develops will not necessarily correlate with the environmental state responsible for the trait's fitness (see Fig. 1). The correlation between temperature and phenotype may qualify as information, but the information carried by the environmental temperature may not change which phenotype is preferable. Only a subset (possibly none) of the variables that contribute to a trait's development will carry correlational information in this sense.

From the above discussion, we can now introduce a tentative set of conditions for what defines an informational function in biology. If it is true that all etiological functions are a kind of past-tense version of a causal-mechanistic function, the first thing to do is to specify the latter.
Henceforth, for the purposes of clarity, I will use the term "function" only in the etiological sense. Non-historical, causal-mechanistic functions I will refer to as "roles".

C1  A biological entity A has an informational role if:

a. A has two or more different states which differentially affect a phenotype φ, which has different fitness optima given different states of system E,
b. the state of A is affected by some causal factor C, and
c. C carries natural (correlational) information about the state of E.

See Fig. 1 for a simple schematic outlining these conditions. The above conditions specify that both the phenotype and the environmental state E[7] must have possible alternatives, and that at least some possible phenotypes map onto some possible environments in terms of which φ is fittest in which E. This obviously rules out states of φ and E that are impossible, and those that are irrelevant to the "decision" about which would be the best phenotype to adopt (as they are never the optimum phenotype). It also specifies that A's state—and consequently φ's state—is causally
[7] E needn't necessarily refer to an aspect of the external, abiotic environment. It may refer to any system whose state has a bearing on the fitness value of the phenotypic state in question, such as a state of another organism, a tissue or organ elsewhere in the same body, or even in the same cell.
Fig. 1  Schematic showing the conditions for an informational role, as per C1. Biological entity A has a causal role in the determination of phenotype φ, in that the state of A affects the state of φ. A is affected by a causal factor C. C carries mutual (correlational) information about E, a system whose states correspond with different fitness values of different phenotypes. δ and β represent other causal factors affecting A and φ, respectively. A has an informational role if it is affected by such a cue C, which bears this informational relationship with E. This will not always be the case: none of the causes affecting A necessarily correlate with the environmental state that explains the fitness of different states of φ
affected by something that correlates with E, thus excluding causes of a phenotype with no relevant correlation with E. Notice that there is no appeal to selective history at this point. So far, informational roles have been distinguished from most causes of the phenotype by the fact that they behave differently (at least generally) depending on the state of E. What's more, condition (b) doesn't require that the phenotypic change as a result of C is in the direction of greater fitness: so far we are including entities that effect the "wrong" response (again, wrong simply in terms of fitness) to the information carried by the cue.

Now that we have a working definition of the causal-mechanistic role, we can specify how this role becomes an etiological function through the process of selection.

F1  A has an informational function if:

a. past tokens of A played an informational role (i.e. satisfied the criteria of C1 above), in that they differentially affected the development of phenotype φ in response to a cue C that carried information about an environmental state relevant to φ's fitness,
b. the effect that C had on development via its effect on A was in the direction of increased fitness, in that the phenotype it caused was relatively fit given the environmental state indicated by C, and
c. organisms possessing A tended to be fitter than those without as a consequence of its interaction with C and its subsequent downstream effects, leading to selection for A and its spread in the population.

F1 introduces the condition that the effect that the cue has on the outcome of development is in the direction of increased fitness, establishing a correspondence between an environmental state indicated by the cue (read: "about which the cue carries information") and the "correct" phenotype with respect to that environmental state. This was excluded from the above conditions for an informational role because causal-mechanistic actions are in themselves blind to fitness, and exist whether or not they are selected for. Only a small subset of Cummins functions go on to become etiological functions, one condition of which is that the role played contributes to fitness. Similarly, in the Shannon-centric notion of information I'm proposing, informational roles may or may not be advantageous. But whether or not
C's effect contributes positively to fitness doesn't change whether or not it carries information about E.

The intention of the preceding analysis has been to show that it is possible to make sense of the distinction between informational and non-informational processes (as commonly understood) without the need to appeal to a richer sense of information. In the framework I propose, information amounts to no more than correlations between different systems, which are ubiquitous in the world, and which may or may not be exploited by biological systems in an attempt to maximise fitness. Whether or not this happens depends not just on the existence of such information, but on whether that information has any bearing on the organism's fitness and, crucially, whether variations arise that are able to make use of that information. In this sense, "making use" of information amounts to being causally affected by an information-carrying cue in the direction of increased (expected) fitness. Crucially, this cannot be said of all causal influences on all phenotypes, thus avoiding the criticism that a Shannon-centric theory of information fails to identify what is special about informational processes beyond cause-and-effect. As the above treatment shows, this causal parity criticism misunderstands how Shannon information is to be applied to the problem. Although all correlations qualify as information in some sense, this doesn't prevent us from distinguishing processes on the basis of the quantity and fitness value of the information being discussed.

Real-world examples

The intention of the above discussion has been to capture the circumstances under which basic causality takes on an informational character, without the need to redefine what is meant by "information".
The applicability of these conditions to the real world can be illustrated with examples, such as the case of insulin release by pancreatic beta-cells. This occurs when blood glucose levels are high, as the increased influx of glucose into the beta-cells results in increased rates of glycolysis (the anaerobic stage of glucose metabolism) and a subsequent increase in intracellular ATP concentrations. ATP binds to and stimulates the closing of ATP-sensitive potassium ion channels, blocking the efflux of potassium and increasing the voltage across the cell membrane. This change in voltage stimulates the opening of calcium ion channels, resulting in an influx of calcium, the fusing of insulin-containing vesicles with the cell membrane, and secretion of the hormone into the bloodstream (Bowen 1999; Keizer and Magnus 1989).

The causal chain leading from increased blood glucose to insulin secretion evidently consists of several elements, not all of which may play their causal role in this process as a matter of function. For example, the process of glycolysis occurs in virtually all cells, and the ATP (and other high-energy compounds) produced is required for metabolic processes in general. There might conceivably have been some selection on the fine details of this activity in the beta-cells in question, but this isn't necessary for it to qualify for an informational role in insulin release. The enzymes that take part in glycolysis nevertheless meet the C1 conditions: they causally affect the release of insulin (and consequently its downstream effects on
other tissues), which varies in expediency depending on blood glucose levels, and they are causally affected by those levels of blood glucose. On the other hand, some factors in insulin regulation may also meet the conditions of having the function of responding to the information borne by the intracellular glucose. The potassium and calcium channels, for example, have informational roles in the process because their open and closed states are affected by upstream correlates of increased glucose (increased ATP and voltage across the cell membrane, respectively), and in turn affect the downstream insulin secretion. If they have been subject to selection for this activity, they also perform this activity as an informational function.

Equally illuminating are cases that don't meet these conditions. The effect of thalidomide on development is a typical example in arguing that mere correlations do not account for the way informational terms are used. Since the presence or absence of thalidomide correlates with the success or failure of normal limb development, it is argued, its effects would qualify as a transfer of information under a Shannon-centric theory. But this argument assumes that correlational information must be doing all of the explanatory work without the help of other concepts, and there is no reason why this should be so. In this example, thalidomide's effects on limb growth probably result from inhibiting the growth of capillaries at the limb buds (D'Amato et al. 1994). Let us suppose that this effect results from inhibiting the activity of a protein involved with stimulating capillary growth. Do the protein, thalidomide, and the developing limbs qualify as our A, C and φ, respectively? No, because thalidomide doesn't carry information about any state of the environment relevant to the fitness of normal versus underdeveloped limbs. There is no environmental state in which the limb development caused by thalidomide is an optimum phenotype.
And even if there were, there is no reason to assume that the presence or absence of thalidomide correlates with this state. This situation therefore fails to qualify as informational, not because thalidomide doesn't carry any correlational information at all, but because none of it is about an environmental state in which thalidomide's phenotypic consequences are advantageous.

It may be noticed that the conditions offered above don't suit the behaviour of DNA quite so obviously, but closer inspection shows that they can be applied to the DNA case too. DNA (or, more specifically, the DNA-protein complex chromatin) is malleable during development in ways that are recognisable as informational functions in the sense stated above: gene expression both affects, and is affected by, changes in the extragenetic environment, through DNA methylation, histone acetylation, binding of transcription factors, and so on (see "Sender-receiver models" section for further discussion). These types of process can be likened, in the relevant sense, to the variable states of protein channels and the like in the above example of insulin regulation, and don't seem to require elaboration beyond the above sets of conditions. However, the effect of DNA sequence is somewhat different: DNA sequence is not normally altered in the course of a single life cycle, so although variations in gene sequence affect phenotypes, those variations aren't themselves affected in the course of development. There is thus no identifiable "C" that changes the sequence and thereby its subsequent effect
on phenotype. Instead, DNA sequence can itself be seen as the cue that carries information about the environment. It acquires this information not in the course of development but over phylogenetic time: if genotypes vary within a population, and some variants spread due to their advantageous effects on phenotype, then those alleles carry information about the environment that selected them [see Shea (2007) for a similar argument]. Gene sequences therefore play the part of the information-carrying cue that affects phenotypes via the machinery of gene expression. What is distinctive about DNA as a cue is that its correlations with the environment are achieved through selection, rather than through proximal physical mechanisms.

The above discussion shows that hormone systems, gene expression, adaptive phenotypic plasticity and so on can be identified as qualitatively different from non-informational biological systems and processes without any reliance on semantic content. Rather than appeal to some indicative or imperative content instilled in these processes by selection, the only kind of content required is the "natural meaning" (Grice 1957; Godfrey-Smith and Sterelny 2008) of correlations between systems. As well as this notion of meaning, we also retain two unproblematic senses of "correctness": the decision-theoretic notion of the optimum phenotype (as per the fitness matrices above), and the correct performance of an etiological function. An outcome can be called "wrong" if it is a suboptimal phenotype in the given environment, or if it resulted from some developmental factor failing to perform the action for which it was selected, depending on the kind of answer being sought in a given case.
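The decision-theoretic sense of the optimum phenotype, and the fitness value of responding to a correlated cue, can be made concrete with a small calculation. The following sketch is a toy model in the spirit of the fitness matrices above and of Donaldson-Matasci et al. (2010); the phenotypes, cue names, and numbers are all my own illustrative assumptions, not an example drawn from this paper:

```python
# Toy fitness matrix: fitness[phenotype][environment].
# All names and numbers here are illustrative assumptions, not the paper's.
fitness = {"thick hair": {"cold": 2.0, "warm": 0.5},
           "thin hair":  {"cold": 0.5, "warm": 2.0}}

p_env = {"cold": 0.5, "warm": 0.5}  # prior probability of each environment

# A cue (the information-carrying causal factor) that correlates
# imperfectly with the environment: P(cue | environment).
p_cue_given_env = {"cold": {"low temp": 0.9, "high temp": 0.1},
                   "warm": {"low temp": 0.2, "high temp": 0.8}}
cues = ["low temp", "high temp"]

def expected_fitness(phenotype, env_dist):
    return sum(p * fitness[phenotype][e] for e, p in env_dist.items())

# Without the cue: commit to the single best phenotype against the prior.
no_cue = max(expected_fitness(ph, p_env) for ph in fitness)

# With the cue: update on each cue state (Bayes) and respond optimally.
with_cue = 0.0
for c in cues:
    p_c = sum(p_env[e] * p_cue_given_env[e][c] for e in p_env)
    posterior = {e: p_env[e] * p_cue_given_env[e][c] / p_c for e in p_env}
    with_cue += p_c * max(expected_fitness(ph, posterior) for ph in fitness)

print(no_cue, with_cue)  # roughly 1.25 vs 1.775
```

The gap between the two values is the fitness value of the cue's correlational information. Note that nothing in the calculation requires the cue to have been selected for carrying information; selection on the capacity to respond to the cue is what would turn that value into an informational function.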
Comparisons

Even if it is accepted that Shannon information, in concert with function and fitness, can do the same explanatory work as richer informational concepts, it remains to be shown that the former is preferable. With this in mind, we now explore key elements of some existing accounts to assess what is and is not useful about them. There is much of value in these treatments of the problem, and for that reason the present work should be seen as an attempt to extract what is valuable from these discussions, while questioning the necessity of an extended vocabulary for dealing with biological information.

Information as a product of selection

John Maynard Smith (2000a, b) suggested that the privileged position genes are given in developmental explanations is due to the fact that, unlike environmental factors, genes have been shaped by selection. This fact, he argues, justifies the distinction usually drawn between "nature" and "nurture", with informational language generally reserved for the former:

I will argue that the distinction can be justified only if the concept of information is used in biology only for causes that have the property of intentionality. In biology, the statement that A carries information about B
implies that A has the form it does because it carries that information. A DNA molecule has a particular sequence because it specifies a particular protein, but a cloud is not black because it predicts rain. This element of intentionality comes from natural selection. (Maynard-Smith 2000a, pp. 189–190)

Maynard Smith's comparison of DNA with clouds reflects his view that the information contained in the former is of a more specific kind than that contained in the latter. Its selective history gives DNA (and possibly other inherited elements) its semantic content, and determines what that semantic content is. He cites the example of the mouse gene eyeless, which codes for a protein that switches on genes involved in construction of the eye: due to being selected for this effect, it can be considered a signal with the content "make an eye here". This, however, is only possible because the target sequence to which the protein binds has similarly evolved for the same common effect.

Jablonka (2002) adopts a similar view, except that she proposes relaxing the condition of selective history on the part of the sender, restricting that requirement to the receiver. This permits un-evolved "cues", such as features of the abiotic environment or accidental giveaways from other organisms, to be considered carriers of information, alongside the "signals" for which both sender and receiver have this selective history. Jablonka's account is therefore more permissive than Maynard Smith's in terms of the kinds of entity or process that can be said to carry information. However, Jablonka states that "according to my definition, information is something that can exist only when there are living (or more generally, designed) systems. Only living systems make a source into an informational input" (Jablonka 2002, p. 588).
On both Maynard Smith's and Jablonka's accounts, then, informational language does not apply either before or independently of the requisite selection, so there remains the question of how to explain what it is that selection acts on to produce these systems in the first place. Does a cue carry information because the receiver has evolved to respond to it, or did that response evolve because of the information the cue carried? If the former is true, as Jablonka argues, we still need a selection-independent sense of information to explain the fitness value of that response, and the only likely candidate appears to be natural information. The upshot is that building selection into a definition of biological information does not eliminate the need for a correlational sense as well. This is implied, though unacknowledged, when Maynard Smith says that "the statement that A carries [1] information about B implies that A has the form it does because it carries that [2] information". Here the term "information" is implicitly employed in both the semantic sense [1] that is the product of selection, and the correlational sense [2] that explains why selection occurred.

This pluralistic account may be less parsimonious in its conceptual toolkit than a Shannon-centric theory, but it might still be preferable if the latter could not distinguish informational processes from general causality. I have shown above that this is not the case: biological processes recognisable as information-related can be defined by the fitness value of the correlational information that a causal factor carries, rather than by defining information itself as a product of selection. This approach carries
the advantage of helping itself to only one, uncontroversial, sense of information: one that can explain the evolution of biological communication systems without circularity, and without dualism in the operative definition of information. If this is accepted, the role of selection can be restricted simply to creating the function of responding to correlational information, rather than defining a new kind of information altogether.

Shea's "infotel" semantics

Shea (2007, 2012a, b) offers a formal account of how DNA can be said to carry representational content. The conditions for representation that are met by DNA, he says, are that genes constitute a range of intermediates linking environments with phenotypes, thereby satisfying both input and output conditions that qualify as standards of "correctness". For example, an allele that causes the development of thick hair is likely to spread in a population if the environment is cold, outcompeting alleles that correlate with thinner hair. Consequently, the allele's prevalence at that locus is due in part to the phenotypic outcome it affects, and to the environmental state responsible for that phenotype's relative fitness. The result, for Shea, is a genetic representation with the content "the environment is cold, grow thick hair".

Shea is explicit about his reliance on correlational information in accounting for the emergence of genetic representations: one condition given for some R having content C is that "Rs carry the correlational information that condition C obtains" (Shea 2012b, p. 5). With this in mind, however, it isn't obvious what explanatory work the concept of representation is doing that cannot be accounted for by the correlational information produced by selection. His objection to the usefulness of this reduction is in the vein of the causal parity argument mentioned above.
For example, in the case of its applicability to "poverty of the stimulus" arguments,[8] he argues as follows:

If poverty of the stimulus arguments are couched in terms of the bare correlational information available in the environment, they tend to miss their mark. Whenever the development of a trait is causally dependent on an aspect of the environment, the environment will carry information—in the bare correlational sense—about the trait (Griffiths and Gray 1994). But these are not examples of the mechanisms of development reading or consuming a representation. […] Only when the way development reacts to a piece of correlational information in the environment is a matter of evolutionary design, with the function of producing a variable outcome depending upon the detected state of the environment, is it right to think of development as reading or consuming a representation in the environment. (Shea 2012a, p. 16)
[8] These are adaptive claims in which the organism's developmental environment is said to lack sufficient information to specify the state to which the trait is adapted, thus necessitating explanation in terms of inherited biases towards that trait acquired by selection, rather than adaptive phenotypic plasticity.
This line of argument fails to distinguish between correlations between developmental cause and phenotype on the one hand, and between developmental cause and the adaptively relevant environmental state on the other. It is true that all the causal factors in development, evolved or otherwise, carry correlational information about the phenotype (by virtue of their having an effect on it). However, as discussed above, merely correlating with the phenotypic outcome doesn't qualify a causal factor as a source of information about the environmental state against which the adaptive value of a trait is measured. It is this adaptive value that is not shared by the mutual information carried by every causal factor.

Shea's mention of "the function of producing a variable outcome depending upon the detected state of the environment" comes close to appreciating the distinction between information and informational function, but his claimed necessity of representation suggests that it isn't acknowledged in full. Since he accepts that genes correlate with their environment, it is a small step to say that the developmental system has the function of responding to that correlation. The only role played by the intentional term "representation", it seems, is in explaining how that correlation arose, i.e. through selection of genes that correlate with phenotypes that are fit in their given environment. At this point, it may be a matter of taste whether to say that genes "represent" their environments, or merely correlate with them as a result of selection.

Shea himself accepts that representations can in principle be replaced with the concepts they reduce to, but argues that they gain important explanatory purchase in some areas. For example, he notes an interesting symmetry between the use of environmental information in determining phenotype and the extraction of this information from the genome.
On the contrary, I argue that phenomena like this illustrate why concepts like representation over-complicate the issue: representations are the product of selection on Shea's account, so any symmetry between genetic and environmental information is obscured when the former is couched in a sense that requires a selective history. The symmetry is far clearer when both are simply seen as correlations with an adaptive environment, to which the organism has evolved the capacity to respond.

The transmission sense of information

Mention must be made of Bergstrom and Rosvall's (2011) "transmission sense of information", which shares the aim of returning to a Shannon-centric sense of biological information. Its starting point is the fact that, despite being explicitly meaning-free, Shannon's information theory was in fact designed to deal with human communication systems that do have content. But content can be usefully black-boxed in order to analyse quantitative notions such as the carrying capacity of an information storage system or the capacity of an information channel. According to Bergstrom and Rosvall, these factors are relevant to the decision-theoretic question of how to transmit information efficiently from one generation to another, and therefore have explanatory value for why DNA has its peculiar properties. They offer a definition:
Transmission view of information: An object X conveys information if the function of X is to reduce, by virtue of its sequence properties, uncertainty on the part of an agent who observes X. (Bergstrom and Rosvall 2011, p. 165)

This gets close to the idea that informational processes in biology are characterised by having a certain kind of function. However, the distinction between "conveying information" and "having the function of conveying information" is missed in the above definition. It therefore carries the undesired consequence of excluding information-carriers that aren't themselves evolved for that purpose, as Stegmann (2013) observes. It is also unclear what the "reduction of uncertainty" amounts to in this sense. It is unlikely to mean simply reducing the range of possible outcomes, as this would apply to all causes. If, however, it is interpreted as the reduction of uncertainty about something relevant to phenotypic fitness, then it can be seen as similar to the framework I propose. The similarities would be more significant if the aforementioned allowance were made, i.e. that merely conveying information is a selection- and fitness-neutral quality. What would remain to be explained is under what circumstances that quality becomes part of a biological function, as I have attempted above.

Sender-receiver models

Conceiving of information in terms of transmission between 'senders' and 'receivers' has been developed for biological and evolutionary purposes, particularly by Skyrms (2010). Bergstrom and Rosvall's approach evidently frames the transmission of information largely in this way, i.e. as the coordination of behaviour between a sender and a receiver—'receivers' including other spatially separated individuals, or the same individual at a later time (thus including forms of 'memory'), or the sender's offspring 'receiving' a vertically transmitted genetic message for use in development.
Godfrey-Smith (2013a, b) elaborates on the potential for sender-receiver models to illuminate these non-classical cases of behaviour coordination within an organism, between generations, and so on. There are, for example, a great many cases in which certain parts of an organism need to perform actions conditional on states or events outside themselves. This is typically achieved through the activity of structures whose function is to produce a sign which (a) is conditional on that state or event, and (b) produces the appropriate effects in other parts of the organism; hormone signalling and intracellular cascades are examples. Sender-receiver models comfortably fit these cases when we view sender and receiver as, respectively, the sensory mechanism and the eventual response mechanism.

Lewis's original work treated signalling interactions as a solution to the problem of coordinating behaviour between individuals with a common interest (Lewis 1969). Cases of partially or fully conflicting interest have been discussed (Godfrey-Smith 2013a, b); however, since different parts of the same reproducing individual will (usually) have fully aligned interests, these complications won't usually apply.

How might the above characterisation of informational roles and functions fit within a sender-receiver framework? One notable observation is that the definitions
introduced in the "A working definition" section apply equally to senders and receivers in these models. Informational functions, i.e. F1 above, cover any biological entities that, as a matter of evolved function, differentially affect a phenotype depending on a causal factor that correlates with a state relevant to the fitness of that phenotype. Crucially, the information-carrying causal factor could be an abiotic cue, a 'purpose-built' signal,[9] or anything in between. What's more, the definition doesn't specify the nature of the entity's variable effect on the eventual phenotype, meaning that its role could equally be a causally upstream signalling event, or even a feature of the final phenotype itself. The conditions for informational roles and functions therefore hold whether the entity in question is treated as the sender or the receiver. This is to its advantage, since the relative nature of sender and receiver roles is well known: the sending of a signal, for example, could itself be seen as a receiver act, depending on perspective.

Planer (2013) argues that genes (as structures whose expression varies developmentally according to cis-regulatory elements, etc.) can be usefully viewed as agents in signalling games. He demonstrates this by framing a gene's conditions of expression—the conditions under which it is successfully transcribed and translated—as a 'strategy' akin to the state-sign and sign-response functions that define the strategies of agents in signalling interactions. Depending on a gene's position in a gene expression cascade, its conditional behaviour can therefore be modelled (and, from an evolutionary perspective, explained) in terms of the effects of that behaviour on fitness outcomes, relative to upstream and downstream events and conditions, in line with the way sender-receiver strategies evolve between individuals.
As with the above definitions of informational roles and functions, this work fully acknowledges and makes explicit the relativity of sender and receiver roles.

In summary, the present work is by no means in conflict with sender-receiver frameworks. In fact, within such a framework it might be seen as an explicit account of what exactly justifies assigning biological entities the role of either sender or receiver in a signalling interaction. What's more, it is particularly suited to doing so when the focus is on explaining the system's evolution in the first place, since the C1 conditions for informational roles carry no prerequisite of evolved function, yet still successfully identify the kinds of entity that are capable of being selected as senders or receivers. In short, it makes explicit the effect of selection in terms of turning an un-evolved behaviour into an etiological function, i.e. an informational function according to the F1 conditions.
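The coordination at issue can be illustrated with a minimal Lewis signalling game. The following sketch is a toy model in the spirit of Lewis (1969) and Skyrms (2010); the strategy encoding, payoff, and enumeration are my own illustrative choices, not a reconstruction of any specific model cited above. It enumerates pure sender and receiver strategies and shows that the payoff-maximising pairs are exactly those in which the signal carries maximal Shannon information about the state:

```python
import itertools
import math
from collections import Counter

# A minimal Lewis signalling game: two states, two signals, two acts.
states = signals = acts = [0, 1]

def payoff(state, act):
    """Common interest: the receiver's act must match the state."""
    return 1.0 if state == act else 0.0

def expected_payoff(sender, receiver):
    """Average payoff over equiprobable states, for pure strategies
    given as dicts state -> signal and signal -> act."""
    return sum(payoff(s, receiver[sender[s]]) for s in states) / len(states)

def mutual_information(sender):
    """I(state; signal) in bits for a pure sender and equiprobable states.
    H(signal | state) = 0 for a pure strategy, so I = H(signal)."""
    p_sig = Counter(sender[s] for s in states)
    return -sum((n / len(states)) * math.log2(n / len(states))
                for n in p_sig.values())

# Enumerate all pure strategy pairs and pick the best-coordinated one.
best = max(
    ((dict(zip(states, sp)), dict(zip(signals, rp)))
     for sp in itertools.product(signals, repeat=2)
     for rp in itertools.product(acts, repeat=2)),
    key=lambda pair: expected_payoff(*pair))

sender, receiver = best
print(expected_payoff(sender, receiver))  # 1.0: a signalling system
print(mutual_information(sender))         # 1.0 bit: signal fully tracks state
```

On this picture, selection on strategies is what brings the signal into correlation with the state, but the information the signal then carries is the plain correlational, Shannon kind, which is the point made throughout this paper.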
Conclusion

The teleosemantic project draws heavily on Millikan's (1989a, b) consumer-based theory of mental representation, which is explicit in defining semantic content in a manner equivalent to etiological functions. From this it is clear that there are
[9] It is uncontroversial that even signals typically considered to have semantic or intentional content also correlate with whatever they are intended to indicate. Aside from any debate about how words like "fire" come to mean fire in a semantic sense, it remains true that hearing someone shout the word should make the presence of fire more likely, at least if the shout is to be worth acting on by listeners.
considerable points of similarity between the teleosemantic approach and the one advanced here. The key difference is that in the present view, the notion of function plays a far more modest role. Although the teleosemantic view defines representational or semantic content in terms of historical function, representation is a stronger claim than function in general. Legs, for example, have the function of walking, but they do not represent walking (Godfrey-Smith 2003). Which further conditions must be met for representation is disputed, but the present view benefits from avoiding that debate entirely, relying only on function in general. Coupled with the above-mentioned advantage of avoiding definitional pluralism, this frugality should make the view easier to accept for those who see use in informational language but are sceptical of the commitments made by the teleosemantic project.

The intention of the present work has not been to exorcise semantic concepts from biology, but to show that a Shannon-centric theory of biological information can be taken much further than is commonly supposed. The idea that semantic notions are a necessary characteristic of biological information is, at best, a synecdochic mistake: it takes a special case of biological information and treats it as a general defining feature. Although semantic information and representation might well have a role to play in biology, the point at which this phenomenon arose is not some watershed moment before which information-talk is vapid or superfluous. Instead, I advocate what might be called a "ground-up" view: taking a fundamental sense of information from physics and seeing how far it can usefully be taken as an explanatory ingredient in higher-level biological processes.
There are good reasons to favour Shannon information over more elaborate notions: the information dealt with in Shannon's theory is a relatively uncontroversial mathematical quantity, existing prior to, and independently of, any process of selection. Because of this, it can play an indispensable role in explaining why selection should favour certain responses to certain physical stimuli in the first place. What arises from this selection, I argue, can still be expressed in terms of correlational information, albeit as the subject matter of a certain kind of function. Theories of biological information with "built-in" historical conditions cannot track the evolution of informational processes without referring back to a simpler concept of information. This is because they embody a more "top-down" approach: selecting a sense of information suited to human communication systems and looking for similar characteristics in lower-level biological processes. That approach is appropriate if the aim is a natural history of human, psychological intentionality. However, much of biology is concerned with processes far removed from these emergent phenomena, but for which correlational information is a vital explanatory ingredient.

Acknowledgments

I am grateful to Samir Okasha for advice and guidance through every stage of this research, and to Nicholas Shea, Christopher Burr, Kim Sterelny, and one anonymous referee for helpful comments on various versions of this paper. This work was supported by the European Research Council Seventh Framework Program (FP7/2007-2013), ERC Grant Agreement No. 295449.
References

Bais F, Farmer J (2007) The physics of information. arXiv:0708.2837 [physics.class-ph]
Bergstrom C, Rosvall M (2011) The transmission sense of information. Biol Philos 26(2):159–176
Bergstrom CT, Lachmann M (2004) Shannon information and biological fitness. In: Information theory workshop, 2004. IEEE, pp 50–54
Bowen R (1999) Insulin synthesis and secretion. http://www.vivo.colostate.edu/hbooks/pathphys/endocrine/pancreas/insulin.html
Buller DJ (1998) Etiological theories of function: a geographical survey. Biol Philos 13:505–527
Cummins R (1975) Functional analysis. J Philos 72:741–765
D'Amato R, Loughnan M, Flynn E, Folkman J (1994) Thalidomide is an inhibitor of angiogenesis. Proc Natl Acad Sci 91(9):4082–4085
Donaldson-Matasci M, Bergstrom CT, Lachmann M (2010) The fitness value of information. Oikos 119:219–230
Dretske FI (1981) Knowledge and the flow of information. MIT Press, Cambridge
Godfrey-Smith P (2003) Genes do not encode information for phenotypic traits. In: Hitchcock C (ed) Contemporary debates in philosophy of science. Blackwell, London
Godfrey-Smith P (2013a) Information and influence in sender-receiver models, with applications to animal behaviour. In: Stegmann U (ed) Animal communication theory: information and influence. Cambridge University Press, Cambridge
Godfrey-Smith P (2013b) Sender-receiver systems within and between organisms. http://philsci-archive.pitt.edu/9513/
Godfrey-Smith P, Sterelny K (2008) Biological information. In: Zalta EN (ed) The Stanford encyclopedia of philosophy, fall 2008 edn
Grice HP (1957) Meaning. Philos Rev 66(3):377–388
Griffiths P, Gray R (1994) Developmental systems and evolutionary explanation. J Philos 91(6):277–304
Griffiths P, Knight R (1998) What is the developmentalist challenge? Philos Sci 65(2):253–258
Jablonka E (2002) Information: its interpretation, its inheritance, and its sharing. Philos Sci 69(4):578–605
Keizer J, Magnus G (1989) ATP-sensitive potassium channel and bursting in the pancreatic beta cell: a theoretical study. Biophys J 56(2):229–242
Levy A (2011) Information in biology: a fictionalist account. Noûs 45:640–657
Lewis D (1969) Convention. Blackwell Publishing Ltd, Oxford
Maynard-Smith J (2000a) The concept of information in biology. Philos Sci 67(2):177–194
Maynard-Smith J (2000b) Reply to commentaries. Philos Sci 67:214–218
McNamara JM, Dall S (2010) Information is a fitness-enhancing resource. Oikos 119:231–236
Millikan RG (1989a) Biosemantics. J Philos 86:281–297
Millikan RG (1989b) In defense of proper functions. Philos Sci 56(2):288–302
Planer R (2013) Replacement of the "genetic program" program. Biol Philos. doi:10.1007/s10539-013-9388-9
Sarkar S (1996a) Biological information: a skeptical look at some central dogmas of molecular biology. In: Sarkar S (ed) The philosophy and history of molecular biology: new perspectives. Kluwer, Dordrecht, pp 187–231
Sarkar S (1996b) Decoding "coding": information and DNA. BioScience 46:857–864
Shannon C (1948a) A mathematical theory of communication. Bell Syst Tech J 27:379–423
Shannon C (1948b) A mathematical theory of communication. Bell Syst Tech J 27:623–656
Shea N (2007) Representation in the genome and in other inheritance systems. Biol Philos 22:313–331
Shea N (2012a) Genetic representation explains the cluster of innateness-related properties. Mind Lang 27(4):466–493
Shea N (2012b) Inherited representations are read in development. Br J Philos Sci, online first, pp 1–31
Skyrms B (2010) Signals: evolution, learning, and information. Oxford University Press, Oxford
Stegmann UE (2013) On the transmission sense of information. Biol Philos 28(1):141–144. doi:10.1007/s10539-012-9310-x
West-Eberhard MJ (2003) Developmental plasticity and evolution. Oxford University Press, Oxford