This essay originally appeared in the Journal of Speculative Philosophy, 2, 2 (1988) 89-119. For this HTML edition, I have corrected a few typographical errors and altered the wording of one paragraph. Copyright © 1988, Peter Suber.
What is Software? Peter Suber, Philosophy Department, Earlham College
- Introduction
- Digital and Analog Patterns
- First Formulation
- Executability
- Readability
- Pattern Per Se Again
- Liftability
- The Softness of Software
- Software Versus Data
- Software Identity
- Software Versus Data Again
- Church's Thesis, Instruction, Causation
- Stones Left Unturned
- Index of Principles
Abstract In defining the concept of software, I try at first to distinguish software from data, noise, and abstract patterns of information with no material embodiment. But serious objections prevent any of these distinctions from remaining stable. The strong thesis that software is pattern per se, or syntactical form, is initially refined to overcome obvious difficulties; but further arguments show that the refinements are trivial and that the strong thesis is defensible.
1. Introduction The computer revolution will affect philosophy most profoundly by providing a powerful new set of models and metaphors for thinking about thinking. Can thinking be reproduced by hardware running software? Is our brain hardware? Are neural patterns software? Can the interaction of pattern and patterned substance create thought? Can thought and intelligence derive from the complex interactions of unthinking and unintelligent parts?
If the answer to any of these questions is yes, then one will naturally want to know the logic of the necessary, complex interactions. But that is a distinct inquiry not to be expected here: an inquiry not into the question "what is software?" but into the question "which software makes us intelligent?"
Even if these models and metaphors were not alluring, philosophers might well pay close attention to the distinction between hardware and software, for it raises an exceptionally difficult, far-reaching, and important set of problems.
At first there appears to be little problem with the concepts of hardware and software. Hardware is the tangible machine and software is the set of instructions that makes the machine operate in specific ways. But difficulties quickly set in. Does the distinction apply to computers only or to any machine? Or will we call anything a computer if it seems to take instructions? For example, is knob-turning the software of a clock? Are tracks and their switches the software of trains? Is Bach's written score to the Art of the Fugue, perhaps with a human interpreter thrown in, the software of an organ?
In any case, what are "instructions" to a machine? If they are merely thought, or written in English with pencil on paper, they will not (yet) direct the behavior of machines. But if they are given a material form such as punched cards or magnetic tape and enter the causal web of the machine's physical operation, why do we not call them hardware?
The question of this essay is unabashedly metaphysical. It may be that metaphysics cannot be done, or can only be done badly. In a short essay, certainty cannot be expected on the basic notions. (See Section 13.) If this incompleteness characterizes all inquiries, it is not always admitted; and if it does not characterize all inquiries, I am not prepared to remedy it here. This essay in metaphysics is therefore modest.
2. Digital and Analog Patterns Let us begin with a clear-cut case of hardware, say, a desktop ("personal") microcomputer. The example is meant to be non-controversial and familiar, not paradigmatic. It is dangerous to define paradigm cases in an area that is changing so rapidly. Indeed, we do not want to beg the question whether very unfamiliar things should be counted as software or hardware.
Most software for a personal computer comes on disks that are inserted into the machine, read by the computer, and executed. If we zoom down, the disk may be conceived as a magnetic grid whose cells may be set to one magnetic pole or the other (north or south) by a magnetic head much like those used on tape recorders. In computer terms, each cell holds one "bit" of information. The north-south orientations of the cells can obviously support any binary code (alphabet of two characters) and any message that can be translated into a binary code. If magnetism had three poles, or if the material medium of the instructions had three states per cell, then the grid of bits could support a code in base three, not just one in base two, and so on with higher bases.
When a program is recorded on a disk, we may imagine that an enormously complicated pattern of norths and souths has been painted on the magnetic medium. The information on the disk could as well be captured by graph paper filled in with 1's and 0's, pigeon holes filled with black and white counters, or a bank of lights turned on or off. Anything encoded on a finite grid in this way can be represented in a single string, e.g. if each row is taken off in order and put end-to-end with its predecessor. The information in a program, then, can be represented in a long string of any two characters. While the same information can be represented many other ways, this is perhaps the easiest way to conceive it.
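The step from grid to string can be made concrete in a few lines. What follows is only a minimal sketch: the function names, the letters "N" and "S", and the tiny 3 x 4 grid are assumptions chosen for illustration, not features of any real disk format.

```python
def grid_to_string(grid):
    """Take each row off in order and join it end-to-end with its predecessor."""
    return "".join("".join(row) for row in grid)


def string_to_grid(symbols, width):
    """Fold a string back into rows of the given width."""
    if len(symbols) % width != 0:
        raise ValueError("string length must be a multiple of the row width")
    return [list(symbols[i:i + width]) for i in range(0, len(symbols), width)]


# A hypothetical 3 x 4 patch of disk; "N" and "S" stand for the two poles.
grid = [
    ["N", "S", "S", "N"],
    ["S", "S", "N", "N"],
    ["N", "N", "S", "S"],
]

flat = grid_to_string(grid)          # one long string of two characters
restored = string_to_grid(flat, 4)   # the same information, folded again
```

Nothing is lost in either direction: the string and the grid are two arrangements of the same pattern.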
Now let us zoom back up toward generality. Since the string can be spatially folded on itself any number of ways, the pattern may take the form of a two-dimensional grid, three-dimensional lattice, or higher-dimensional organization. Just as base two can express all the numbers expressed in other bases, a two-symbol alphabet suffices to express all that can be expressed by any alphabet. Software patterns, then, are only contingently expressed in binary codes, but are essentially expressed as arrays of symbols or texts.
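The claim that a two-symbol alphabet suffices can be illustrated with a short sketch. It assumes one particular convention (UTF-8 bytes written as eight binary digits each); the function names are hypothetical, and any other reversible convention would serve as well.

```python
def to_bits(text):
    """Express text in a two-symbol alphabet: 8 binary digits per UTF-8 byte."""
    return "".join(format(byte, "08b") for byte in text.encode("utf-8"))


def from_bits(bits):
    """Recover the text from its two-symbol expression."""
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


message = "Four score"
bits = to_bits(message)        # contains only the characters "0" and "1"
round_trip = from_bits(bits)   # identical to the original message
```

The binary form is longer than the original but expresses nothing less, which is the sense in which software patterns are only contingently binary.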
Let us call arrays of symbols, or texts, digital patterns. This term emphasizes the cellular or granular nature of the patterns. Digital patterns are arrays of cells or locations that contain one or another symbol. What makes them "digital" is the individuality and discontinuity of the separate bits. Digital patterns may be contrasted with analog patterns, such as line drawings and facial expressions, in which the information comprising the pattern is continuous, not discontinuously individuated into cellular bits.
"Pattern" is taken in a broad sense to signify any definite structure, not in the narrow sense that requires some recurrence, regularity, or symmetry.
The important feature of digital patterns here is their complexity, their formal articulation of parts, their exact internal differentiation. To summarize, paraphrase, or average one is to destroy it by blurring or omitting some of the structure that defines it. Each joint of articulation carries information for any machine designed to read it. These joints may take innumerable physical forms. Information, in Gregory Bateson's famous definition, is any difference that makes a difference. A digital pattern is an organized array of differences.
It is important to note that an analog pattern can be reproduced by a digital pattern to an arbitrary degree of accuracy. For later reference, let us call this the Digital Principle. For example, a photograph of a face may be printed in a newspaper as a pattern of small dots. The analog image or pattern has been converted to a digital one. For a more accurate picture, imagine the dots becoming smaller and packed more densely (and of course, arranged so as to make the image more accurate). This approximation process is infinite, approaching the original analog (continuous) pattern as a limit. The Digital Principle is an "information theory" version of the familiar mathematical fact that every irrational number may be expressed as a convergent series of rational numbers, showing one sense in which discrete numbers constitute the continuum.
If there is a complementary Analog Principle, as I believe there is, then it is not relevant to this essay except perhaps as a reminder that the Digital Principle does not imply that digital patterns are more fundamental or explanatory than analog patterns.
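A rough numerical sketch may make the Digital Principle tangible. It assumes a smooth curve (one cycle of a sine wave) as a stand-in for the analog pattern, and treats "smaller dots" as more sample cells and finer rounding. The names digitize, max_error, and wave are illustrative assumptions only.

```python
import math


def digitize(f, cells, levels):
    """Sample f at `cells` evenly spaced points in [0, 1) and round each
    sample to the nearest multiple of 1/levels."""
    step = 1.0 / levels
    return [round(f(i / cells) / step) * step for i in range(cells)]


def max_error(f, cells, levels, probes=10_000):
    """Worst gap found between f and its piecewise-constant digital copy."""
    samples = digitize(f, cells, levels)
    worst = 0.0
    for k in range(probes):
        x = k / probes
        cell = min(int(x * cells), cells - 1)
        worst = max(worst, abs(f(x) - samples[cell]))
    return worst


def wave(x):
    """The stand-in analog pattern: one cycle of a sine wave."""
    return math.sin(2 * math.pi * x)


# Finer and finer digital copies of the same analog pattern.
errors = [max_error(wave, n, n) for n in (8, 64, 512)]
```

Each refinement shrinks the worst-case gap, though no finite refinement closes it. The analog pattern is approached only as a limit.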
Under the Digital Principle, software may take the form of analog or digital patterns, and is not limited to the latter. It is often thought that digital computers require digital patterns as software, while only analog computers may take analog patterns. But even a digital computer may take instructions from an analog pattern, provided it is first translated into a digital form, a translation that might itself be performed by the digital computer or by a peripheral device, such as a scanner or digitizer, plugged into it.
Under the Digital Principle, we see that any digital or analog pattern is at least eligible to serve as software. If there are no other kinds of pattern, then we would be able to say that any kind of pattern may serve as software. Let us call the thesis that digital and analog patterns exhaust the domain of pattern, the Exhaustion Principle. Even if we do not assert the Exhaustion Principle, neural patterns (the array of synaptic switches in the brain and their on/off states) may easily be read as digital patterns and hence may serve as software.
I do not assert the Exhaustion Principle, but suppose it for what follows. This will allow me to speak of pattern per se as potential or actual software, without worrying about the existence of an odd type of pattern (neither digital nor analog) that might not qualify. If it turns out that there are such odd patterns, then they will not weaken my thesis so much as require a more prolix exposition of it that specifies some kinds of pattern and excepts others.
3. First Formulation One might want to stop here and attempt a definition: "software is pattern per se." This is a bold beginning that at least captures one of the necessary conditions of software. A moment's reflection, however, will reveal several deficiencies in it. But since we have only just started, we should not be discouraged, but should learn what we can from its shortcomings.
First, this definition does not say whether the pattern has a material expression. Must it be written on paper, recorded on a disk, or given another physical representation? Or is a pattern software when it is merely thought? Or, if we are Platonists in mathematics, may a pattern be software even before it is thought?
Second, the definition does not help us distinguish between software and data. This will be a crucial, but slippery, distinction that will play an important part in elucidating the nature of software. The text of the Gettysburg Address, when typed on a word processor, is also recorded as a pattern of magnetic norths and souths. But we would not want to call it software. Why? When the magnetic text of the Gettysburg Address is read by the magnetic head, is it not "instructing" the machine to do something, if only to print the appropriate Roman letters on the screen? If this work is actually being done under the commands of the word processing software, reading the Gettysburg Address as data, then how did one pattern attain the role of active operator and the other the role of passive object?
That is, even if pattern is both natura naturans (software) and natura naturata (data), can it be both simultaneously or only successively? And either way, how are the two roles distinguished and how do given patterns acquire either role?
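The puzzle of the two roles can be exhibited, though not solved, in a few lines. In this sketch one and the same string of characters is first worked upon and then allowed to work. The particular string is an assumption chosen for illustration, and Python's built-in eval stands in for "the machine".

```python
pattern = "sum(range(1, 11))"

# Role 1: passive object. Another program (here, ordinary string methods)
# works on the pattern without obeying it.
as_data = pattern.upper()
length = len(pattern)

# Role 2: active operator. The interpreter reads the very same characters
# as instructions and carries them out.
as_software = eval(pattern)
```

Nothing in the characters themselves settles which role they play. The difference lies entirely in what the surrounding program does with them.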
Third, the definition does not help us distinguish between software and noise. A random spray of norths and souths across the disk will produce a digital pattern. But we would not call it software for the same reason that we would not call data software, whatever that reason turns out to be. And there may well be additional reasons to deny noise the name of software, for, unlike data, noise is not even meaningful under the language conventions used by our machine or by the other software, such as a word processor, that stands between the machine and the pattern on the disk.
The problem is aggravated when we note that a random spray of norths and souths, and the Gettysburg Address, could become software under a devisable programming language and for a devisable machine. A very clever person working backwards from an arbitrary series of bits could create language conventions that would make the string a meaningful program that did something interesting. (By "meaningful" and "interesting" I intend only to eliminate the trivial case of a program that performs the same act regardless of its input, for any noise could easily be converted to such a program.) This thesis is just the software version of a principle well known in other fields: some order may be made of any set of data points; every formal expression has at least one interpretation. Let us call it the Noiseless Principle, because it asserts that no pattern is noise to all possible machines and languages. (The Noiseless Principle is asserted here only for finite strings or patterns.)
If the Noiseless Principle is true, then patterns would be "noisy" only relative to language conventions which did not fit them. No string would be noise from all perspectives. Every string would be software from some attainable perspective.
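A toy case may suggest how language conventions can leave no string out in the cold. The sketch below invents a hypothetical four-instruction machine in which every pair of bits is a valid instruction, so that every finite string of bits is a program for it. The opcode table and the name run are assumptions made up for this illustration. It shows something weaker than the Noiseless Principle: one fixed language to which nothing is noise, not a language tailored to make a given string do interesting work.

```python
import random

# Every two-bit chunk names an operation on a single integer accumulator.
OPCODES = {
    "00": lambda acc: acc + 1,
    "01": lambda acc: acc * 2,
    "10": lambda acc: acc - 3,
    "11": lambda acc: -acc,
}


def run(bits, value):
    """Execute any string of 0s and 1s as a program on an integer input.
    A trailing unpaired bit is padded with "0", so no string is rejected."""
    if len(bits) % 2:
        bits += "0"
    for i in range(0, len(bits), 2):
        value = OPCODES[bits[i:i + 2]](value)
    return value


# A random spray of bits, offered to the machine as a program.
random.seed(1988)
noise = "".join(random.choice("01") for _ in range(24))
outputs = [run(noise, n) for n in range(5)]
```

Because each instruction is a reversible operation on the accumulator, distinct inputs always yield distinct outputs, so no string is the trivial program that performs the same act regardless of its input.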
Finally, restating the last two deficiencies from another perspective, our initial definition gives us no sense of why, and when, a given pattern is software, data, or noise. How does pattern come to be in a position of control, as it were, where it is read as giving instructions? What determines that software patterns "work", while data patterns are "worked upon", and noise patterns are neither?
It is well to note here that the word "software" may not name a coherent concept. Not all words can be presumed to do so, and this one names something that is undergoing qualitative change even now. But this will not suffice to dismiss our problem. For there are questions enough about the programs that currently exist, even if we do not try to unify all the things that legitimately go under the same name or to fix the concept for the future. The task of this essay is to elucidate the concept of software, more or less as it exists today, as opposed to giving all and only the legitimate senses of the word. I hope the concept can be articulated in a way that is (1) fair to the contemporary phenomenon, both at the user's level and at the lowest machine level, and (2) accommodating to future developments, especially in massively parallel processing, non-procedural languages, higher and higher level languages, and in the quantitative leaps in sheer size that create qualitative changes. If this concept applies to neural patterns, or to pattern per se, then it will not matter that the word typically does not.
It is also well to note here that I use "program" and "software" interchangeably. No gain in clarity is intended by saying that software is a kind of program or vice versa.
4. Executability What if we refined our first definition by saying that software is a pattern that is readable and executable by a machine? We insert "readable" to require a material expression for the software pattern, and we insert "executable" in order to distinguish software from data and noise.
Let us take the second qualification first. (The first will be treated in Section 5.) Does it really distinguish software from data and noise? I think not, and for two reasons. First, to require that the code be "executable" simply shifts the problem of distinguishing software, data, and noise, from the nature of software to the meaning of "executable". If executable code does not include data or noise, then what is it about executable code that sorts these things in this way? The problem is not solved but replicated in another place, though perhaps in a place more amenable to analysis.
Before we get to the second reason, note the definition's commitment on the important question whether a binary code is software when it is not being executed. If we take the suffix in "executable" seriously, then the new formulation implies that code is software if it can be executed, regardless of whether it is presently being executed. This conforms to present usage, which calls a fully functional word processor, for example, software when it is being advertised, purchased, copied, and erased, not only when it is being run.
But does it matter why a piece of code is not running? We happily consider a thing to be software if it could run successfully on an existing machine, but happens not to be needed at the moment. But what if it isn't running at the moment because it is too full of bugs? What if it is so full of bugs that it is indistinguishable from noise? What if it isn't running because it is not "readable" by a machine? It might be a digitized picture of Ed McMahon's face. It might be well-written "raw code" that hasn't been compiled or interpreted. It might be a perfect bit-by-bit expression of some useful design, but written with pencil and paper. (More on this in Section 5.)
What if it requires more memory or more parallel processors than any machine now in existence? What if it is incompatible with present technology (hardware) in ways that we do not even understand? In Carl Sagan's novel, Contact, a pattern of numbers following the 10^20th base-10 digit of pi is an encoded message from the creator. What if a piece of code is not being executed at the moment because there is no programming language that makes it software rather than noise? Or because humans have never apprehended it, even qua pattern?
Patterns that cannot run as software in the present could run in the future, if they were compiled, or debugged, given a certain magnetic (or other physical) representation, interpreted under suitable language conventions, matched with the appropriate machine, or some necessary combination of these. The Noiseless Principle assures us of this.
For some patterns the distance between their present state and the conditions under which they could run as software is very large, so large that perhaps we cannot even describe the conditions they require to be fulfilled. But in this they differ only in degree, not in kind, from the "programs" that are only one step from this goal, and will run perfectly when compiled or interpreted. Therefore, these patterns might as well be considered software, as long as we are clear that not all patterns can run as software under existing, or immediately available, or even currently conceivable, conditions.
The second and most important problem with the requirement of executability is that, in a rough and ready sense of "executable", noise will execute. Noisy code will execute badly. A program with no syntactic errors (so that it will compile) but plenty of semantic errors (so that it doesn't do what the programmer wanted) will be "noise" in one sense, but it will execute. If this looks more like stupidity than noise, then consider a random spray of norths and souths again. If a collection of random bits were packaged (in a file with a suitable extension and so on) so that we could try to execute it as a program, then why shouldn't we call the instant crash an "execution"?
Before we dismiss this question as a trick, consider the complexity of "crashing" and the vagueness of "execution". Some poorly written software can only be exited, or turned off, by contriving to crash. Crashing, then, is chosen by the programmer and user, engineered into the "proper execution" of the program. Some programs even have menu options that trigger crashes for the purpose of exiting. Even if we do not call these exits "crashes", despite the fact that they frequently freeze the system and never drop us gracefully where we left off before running the program, they show that there is enough breadth and depth to the notion of crashing to create borderline cases.
Apart from the crashes that occur only after a period of actual running, and deliberate crashes, there is the very nature of crashing to consider. A crash is an abrupt halt to processing; but to cause a halt, some pattern in the "software" must have an operational effect on the machine parts of the computer. Often this effect is manifold, and consists of keyboard lock, certain screen displays, and sound effects. While some of these effects are called up by non-crashing software resident in the machine that has detected the crash, this does not differ in principle from normal operation when "good" software has its effects with the aid of machine-resident software.
The Noiseless Principle tells us that any digital pattern can be read as a pattern that, qua software, could do interesting work. If a pattern crashes when run on a machine M as if written in language L, then the principle tells us to change M or L or both. If a key is jammed into the wrong lock, it does not follow that it will not open another lock; moreover, it is still a key.
We should be careful not to define bad software as non-software. This mistake (an illicit prescription arising during a task of description) has plagued aesthetics and jurisprudence for centuries, showing otherwise sensitive philosophers unable to distinguish bad art from non-art and bad law from non-law. But to avoid this mistake, we must not hastily call bad software that crashes non-software that does not execute. If we remember that pattern may be software even if it does not execute, provided it is executable, then we will never use non-execution as a sufficient criterion of non-software.
To distinguish crashes and bad executions from good executions, it appears that we must introduce the element of the programmer's purpose. Software executes the programmer's will, while semantically flawed, random, and crashing code does not. This suggests that to understand software we must understand intentions, purposes, goals, or will, which enlarges the problem far more than we originally anticipated.
Perhaps we must live with this enlargement. We should not be surprised if human compositions that are meant to make machines do useful work should require us to posit and understand human purposiveness. After all, to distinguish literature from noise requires a similar undertaking.
But although this complicates matters considerably, the matter is not this simple. For why should we assume that software is a "human composition meant to make machines do useful work"? Do all programs have programmers? (This is the software version of the physico-theological "clockmaker" question.)
Even if we dispense with programmers for some programs, can we dispense with purpose? For example, if neural patterns constitute a program with no programmer, we still want to recognize cases of flawed execution (brain damage). Must we appeal to purpose to do so?
If we limit software to patterns that human beings make on purpose, then we beg a question central to the inquiry, whether neural patterns are software. Can some software be purposeless, or as problematically purposive as the universe itself?
These questions would not be begged, and could even be dismissed, if we limited ourselves to our exemplary hardware, the personal computer. For all software written for such a machine is human-made and is meant to serve human purposes. But to do this would beg a different question, the question what is hardware. Rather than let our point of departure proscribe these questions, we should let these difficulties invite us to question our point of departure.
We cannot, then, resolve the question of purpose by appealing to a prior limitation on the notion of software or hardware. We cannot assume that software and hardware exclude the purposeless or the "natural". They might range over what makes us human, not only what is made by humans. But if we cannot make these exclusions, then we must find another way to distinguish crashes and bad executions from good ones, unless we are willing to let the "executability" condition in our definition reduce to triviality.
5. Readability Let us now return to the qualification that software is a pattern that must be "readable" by a machine. There are two requirements of machine-readability. First, the pattern must have the proper material form, which is a function of the machine to read it. If a given machine reads magnetic fields, as opposed to punched cards, hand-thrown switches, or pits and lands on a reflective surface, then the pattern must be represented in magnetic norths and souths. Second, the pattern must be in "machine language". While English can be digitized, no machine can yet do anything with digitized English (qua software, as opposed to data). Programs that are written in "high level languages" must be translated, not merely digitized, so that their digital form is compatible with language conventions at the machine level. English or the high level programming language, by contrast, is compatible with language conventions only at the human level.
The first of these requirements (let us call it the physical requirement) is for a physical representation in general, and for one of a particular kind that suits the machine that is to read it. The second requirement (let us call it the grammatical requirement) is for a pattern to the pattern, or for certain syntactic structures within the pattern. The first is not intrinsic to the pattern while the second is.
Under the Digital Principle we can always make a digital pattern from an analog pattern. If we add the Exhaustion Principle, we can make a digital pattern from a pattern of any other kind. Our instructions to the machine, once expressed in any form, then, can be digitized. Once digitized we can meet the physical requirement. It is simply a technical matter of using norths and souths instead of 1's and 0's, or whatever.
To meet the grammatical requirement we must translate instructions that we understand into a language that the machine can "understand". We can write our instructions in the first instance in the machine language of the computer, or we can instruct the computer to do this itself. Programs that translate higher level languages into machine language are called "compilers". If we express our instructions in the strict and uniform syntax of a programming language, then a compiler takes us the rest of the way to meet the grammatical requirement of machine readability.
We might note here that the higher level, uncompiled software pattern may be translated not only into the lower level machine language, but also into other equally high level programming languages. Any kind of translation will change the pattern qua pattern. That is, a different series of 1's and 0's will result. The original and the translation need not be the same length, and even if they are, need not map each other at the bit level. Different patterns that are translations of one another may do exactly the same work. So may certain patterns that are not even translations of each other. (More on this in Section 10.)
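A small sketch can show translation changing the pattern while preserving the work. It relies on Python's built-in compile, which translates source text into bytecode for the Python virtual machine. That bytecode is only a stand-in here for true machine language, and the two source strings are assumptions chosen for illustration.

```python
source = "sum(n * n for n in range(4))"

# Translate the high-level text into bytecode for the Python virtual machine.
code = compile(source, "<example>", "eval")

source_pattern = source.encode("utf-8")   # the pattern as written
compiled_pattern = code.co_code           # the pattern the interpreter runs

# A second pattern that is not a translation of the first
# but does the same work.
other_source = "0 + 1 + 4 + 9"

result_from_translation = eval(code)
result_from_other = eval(other_source)
```

The original and its translation differ byte for byte, yet all three patterns arrive at the same result.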
If a pattern meets the physical but not the grammatical requirements, then we will have an instant "crash". Here is where failure to read and failure to execute are indistinguishable, and where minimal performance and simple non-performance also converge. A pattern that is impeccable software in a higher level language is still noise to its intended machine until it is properly compiled or translated. It cannot be read or executed until it meets the grammatical requirement.
Conversely, to meet the grammatical requirement is to meet most, if not all, of what we originally intended by the executability requirement. Patterns that are physically and grammatically readable by a machine, will "execute" in every sense necessary to define software. They may fail to do the programmer's bidding because they are semantically buggy. But this is to execute badly, not to fail to execute. Something that gets far enough to add 2 and 2 to get 5 is unquestionably software.
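The difference between failing to be read and executing badly can be sketched with an interpreter that reports how far a pattern gets. The function classify and its three labels are hypothetical, and Python's own grammar stands in for the machine-level language conventions discussed above.

```python
def classify(text):
    """Report how far a pattern gets when offered to the Python interpreter."""
    try:
        code = compile(text, "<pattern>", "eval")
    except (SyntaxError, ValueError):
        # The pattern fails the grammatical requirement: it cannot be read.
        return ("unreadable", None)
    try:
        return ("executed", eval(code))
    except Exception:
        # Readable, but it halts abruptly once running.
        return ("crashed", None)


buggy = classify("2 + 2 + 1")       # meant to add 2 and 2, but gets 5
crashing = classify("1 / 0")        # grammatical, yet halts with an error
noise = classify("N S ) S N (")     # never gets past the reader
```

Only the last pattern is stopped at the threshold. The buggy and the crashing patterns both meet the grammatical requirement and so execute, however badly.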
Let us focus more closely on the physical requirement. To be read by a machine, software must have a material expression. Software may pass through stages where it has either an unsuitable material form or none at all. It may be written in English, spoken in English, thought in English, or thought in a more abstract mathematical language. The physical requirement means that it must be capable of assuming a suitable material form, even if it never does so.
I said above that any pattern can be embodied. Why should we believe this? First, patterns that can be imagined can be drawn. Patterns that are conceivable but not imaginable (like Descartes' chiliagon or 1000-sided polygon) can be described in a notation that provides a complete recipe for conception; and the notation can be drawn. If something cannot be conceived, it probably does not deserve the name of pattern. And what is drawn is thereby given a physical representation that can be read or decoded by a suitably designed machine. Let us call this thesis, that any pattern can be physically embodied, the Sensible Principle.
Now what is it for a machine to "read" an embodied pattern? If we want a train to go straight, and express our desire in the material form of straight track, has the train "read our software"?
As with "executable", we must take the suffix in "readable" seriously, and we find that the term is entirely acceptable when we do so. Taking the formula strictly, a pattern may be software if it is readable by a machine, even though it is not being read at the moment. If a program meets both the physical and grammatical requirements, but is written on a disk sitting idle in a drawer, it is still software. As for the physical requirement alone, there is no reason to deny the name of software to programs that exist (so far) only in thought, provided they can be completely articulated in principle, hence expressed in a uniform language, hence digitized (under the Digital Principle), hence embodied (under the Sensible Principle). This formula, then, permits us to classify as software both the immaterial idea and the material expression that is read.
I have spoken as if ideas were immaterial. For present purposes this is only for contrast with the material substratum that supports the machine readable pattern. The formulation does not imply the immateriality of ideas; it implies that ideas that can be made readable by machines, but that are not yet so, are just as much software as those that are already readable. The formula cares not whether both kinds are material objects or just one.
(Throughout this essay, the adjective "ideal" and noun "ideality" refer to the metaphysical standing of ideas and pattern, whether it is best explained by an idealist, materialist, or other metaphysical theory. An idealist theory is not presupposed, but neither is it excluded. A materialist may have no more difficulty with idea or pattern than an idealist has with hardware. While I make no commitment here, I suspect that the problem of software is just what is needed to overcome the oversimple opposition of materialism and idealism.)
This outcome permits us to call something software if it is inchoate or nascent, still at the stage of mere idea, provided that it is capable of taking a machine-readable expression at some future time. This enables us to identify as software the separately non-executable fragments written by different teams working on a gigantic programming project, and the programs written (in English, shorthand, and mathematical notation) on the back of an envelope. It also allows us to call something software if it is already in a form that can be read and executed but, in that form, happens to be sitting in a drawer. We may say that it is software that is not being read or executed; we need not say that in the drawer the pattern reverts to data or noise.
6. Pattern Per Se Again Note what has become of our two qualifications, that patterns be readable and executable by a machine. The suffixes make the qualifications plausible, but when we take them seriously the qualifications no longer qualify the universe of pattern. Taking readable and executable strongly, pattern is software even if it is not currently in a form that a machine can read, and is not being executed, provided that it can in principle take such a form and be executed. This permits us to recognize software as pattern that is still without physical embodiment and awaiting profound technical advances as conditions of its actual reading and execution.
But this means that the qualifications do not limit the patterns that may be software. We are left with the original formulation that software is pattern per se. We now assert, however, that it must in principle be capable of meeting the physical and grammatical conditions of readability and the requirement of executability. But the Sensible Principle says that all pattern meets the physical condition. The Digital Principle says that we can digitize any pattern, and from there translation is all that is required to meet the grammatical requirement. And, as noted, the grammatical requirement carries the burden of the executability requirement; or, all pattern executes, at least by crashing, and executes better, though perhaps badly, as soon as it meets the grammatical requirement. Under these principles, then, all patterns already meet the readability and executability requirements.
The implications are at first astonishing. Not only is a random spray of bits software, or the second 10^20 digits of pi, or the curves of the lips of the Mona Lisa; so is the arrangement of papers on my desk, trees in the Yukon, and stars in the Milky Way galaxy. So is the arrangement of neurons in the brain, of the molecules comprising the neurons, the atoms comprising the molecules, and of any bits at any lower level we care to contemplate. The universe is software insofar as it is sufficiently determinate to show pattern or bear information rather than, as Hegel said, the spectacle of contingency losing itself in vagueness.
These are patterns that can in principle be read and executed by a machine. If we cared to devise the language conventions and machine to do so, and to specify the differences that should make a difference, then we could read the information of a dirty ashtray as software in a non-trivial way and execute it toward some interesting end. The end is by no means inherent in the pattern; we have infinite freedom, when we devise our languages and machines, to interpret the unmeaning pattern as we like. We are not saying, then, that concrete objects, because they display patterns, are really mere forms or intrinsically interpreted texts awaiting decipherment. We are saying that concrete objects, because they display patterns, can in principle be read and executed as software. They may be so read and executed only with the aid of a language and suitable hardware, which are other patterns. Hardware, in short, is also software, but only because everything is.
Let us call the freedom of interpretation the flexibility to read intrinsically unmeaning formal patterns as holding a given meaning, and the freedom of formalization the complementary flexibility to create a formalism or code to express a given meaning. It does not matter here that both these freedoms are actually infinite. (To claim the least I need for my theses, I will say that these freedoms are only indefinitely large in scope.) The conclusion that everything is software is a product of both. The freedom of interpretation enables us to read the pattern of commas in the first edition of the Critique of Pure Reason as a definite structure holding an indefinitely various body of information, depending on what language conventions we will use to interpret it. The freedom of formalization allows us to construct the pattern that, when physically expressed as hardware, will execute the patterns we have chosen as software.
The freedom of interpretation strengthens the Noiseless Principle and allows us to assert that a random array of information will not only be meaningful under at least one set of language conventions, but will be meaningful under indefinitely many. The freedom of formalization strengthens the Digital Principle and allows us to assert not only that a non-digital pattern can be digitized to an arbitrary degree of accuracy, but that it can be captured to the same accuracy by an indefinite number of digital patterns (using different language conventions).
I explicate the conclusion through these freedoms in order to emphasize that software patterns do not carry their own meanings and need not. They need only make a fruitful match with another pattern (embodied in a machine); since we create the latter, the former can be conscripted as it were to service as software. They may still be meaningless. But just as we distinguish "fee fie fiddley I oh" from "nick nack paddy whack", or red from green, without imputing to any of them an intrinsic meaning, or even a conventional one, we can distinguish other meaningless patterns from one another. And insofar as they can be distinguished, they are able to carry definite information and function as software.
The thesis that everything determinate is software does not trivialize the concept of software. It highlights the attribute of determinate existences to differ from other determinate existences (determinatio est negatio), thereby to be bearers of information. Information is necessary for software and when it is physically embodied (and under the Digital, Noiseless, and Sensible Principles) it is sufficient. But all information is capable of physical embodiment. Embodied patterns suffice for software even if there is no known set of language conventions and no known hardware presently capable of executing the pattern as software, for in principle the language and machine could be produced. If one wishes to deny the name of software to patterns still awaiting the development of suitable languages and hardware to convert them from unexecutable noise to runnable software, I will not object, provided one acknowledges that such development is always logically possible and that many non-controversial cases of software do not lose their status as software when they are not actually being read or executed.
7. Liftability For many people, the essence of software is not that one can run this piece and then that piece on a given machine, although this is a revealing property, but that one can run the same piece on this machine and then on that machine. Hardware is programmable, software is portable.
There is a deeper kind of software portability. If software is pattern first, and material embodiment of pattern second, then it can be ported from one substratum to another. It is liftable. The pattern embodied in magnetic fields on a disk can be lifted from the disk and transposed to solids and blanks on the cells of punched cards, and vice versa. Or, what is more commonly done, the pattern embodied on the magnetic substratum can be lifted and transposed to the medium of the random access memory (RAM) of the computer, where it is available for use by the processor. This is done whenever a program is run on a contemporary computer.
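A minimal sketch in Python may make liftability concrete. The three "media" below (magnetic polarities, punched-card cells, and bytes in memory) are toy encodings invented for illustration, not models of real devices. The sketch assumes only that each medium can hold two distinguishable states.

```python
# Toy illustration of liftability: one pattern, three hypothetical substrata.
# The encodings are invented for this sketch and are not real device formats.

def bits_of(data: bytes) -> str:
    """Return the pattern as a string of '0' and '1' characters."""
    return "".join(f"{byte:08b}" for byte in data)

def to_magnetic(bits: str) -> str:
    """Embody the pattern as magnetic norths (N) and souths (S)."""
    return bits.replace("1", "N").replace("0", "S")

def from_magnetic(disk: str) -> str:
    """Lift the pattern back off the toy disk."""
    return disk.replace("N", "1").replace("S", "0")

def to_punched_card(bits: str) -> str:
    """Embody the pattern as holes (X) and solid cells (.)."""
    return bits.replace("1", "X").replace("0", ".")

def from_punched_card(card: str) -> str:
    """Lift the pattern back off the toy card."""
    return card.replace("X", "1").replace(".", "0")

def to_ram(bits: str) -> bytes:
    """Embody the pattern as bytes in memory, eight bits at a time."""
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

program = b"print(2 + 2)"
pattern = bits_of(program)

# Lift the pattern from one substratum and reinstall it in the next.
disk = to_magnetic(pattern)
card = to_punched_card(from_magnetic(disk))
ram = to_ram(from_punched_card(card))

# The embodiments differ from one another; the pattern they carry does not.
print(disk != card, ram == program)
```

Nothing in the round trip depends on what the pattern means. The transposition is purely syntactic, which is why liftability follows from treating software as pattern first and embodiment second.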
It is this property of software that gives workers in the field of artificial intelligence (AI) their hope of success. If the mind is reducible to the brain, and the brain is a complicated pattern of neural switches, then the mind is a digital pattern that can be lifted and reinstalled in a silicon substratum.
Liftability may enter our definition of software, but we should note that it is derivative, not primary. It is a consequence of defining software as pattern first and embodied pattern second. But despite its derivative status, liftability serves a valuable function in our definition. It illuminates the exact sense in which software is independent of hardware and the physical media that record the software pattern. Software is not defined as a pattern of any particular material; it is defined as pattern that may be represented in many different materials. It may even be represented in many different clods of the same material, as when one program is written in magnetic oxide thousands of times.
The question whether neural patterns are software, then, may be rested on the question whether they are liftable. The answer seems to be that they surely are liftable, but simply have not yet been lifted, owing to the extraordinary complexity of the pattern. (There are about as many neurons in one human brain as stars in the Milky Way galaxy, 15 billion, and three orders of magnitude more synapses or connections between them.)
The question whether intelligence is liftable depends on the prior question, whether the mind is reducible to the brain. The latter question is much different from the question of the liftability of neural patterns, and cannot be answered by settling it.
8. The Softness of Software The word "software" is in many ways a brilliant coinage. Like many terms invented with the computer, it reaches for clear, colorful, and quotidian metaphors when desiccated latinate compounds could have been justified by their precision. "Software" appealed immediately because it reminded us that there was something soft about software, compared to hardware. Users recognize this softness, but might have difficulty articulating its nature. It seems to me that the softness of software has two sources that are rarely kept distinct: its alterability and its ideality.
(Remember that "ideality" refers to the status of ideas and pattern per se, even in a materialist metaphysic.)
Software is typically loaded into a machine through disks, tape, ROM (read-only memory) modules, or cards. It is alterable in the sense that we can pull out one disk and put in another. But that much is also true of light bulbs, rubber feet, and disk drives. To substitute one program for another goes to the programmability of the hardware; to amend a given program goes to its alterability in the present sense. Individual programs may be altered about as easily as text, which is usually much easier than altering circuits or power supplies. Alterability also varies with the material medium of the pattern; magnetic norths and souths are easier to rewrite than holes in punched cards, which is the chief reason why magnetic media have come into prominence. Clearly the softness of software is a matter of degree. Where the machine will not accept different inputs, or will not allow users to set things up their own way, we say it is "hard wired".
A lock may accept many keys, even an infinite number, but will not accept just any key. We like to make our keys of substances that permit alteration as needed: hard enough to operate causally on locks, soft enough to allow us to change them rather than the locks when a different fit is required.
Machine parts are hard physically. Software has a dimension of pure pattern or meaning, and is in that sense soft. Even when it is physically embodied, its embodiment does not exhaust it; if it did, it would not be liftable. It is in this sense softer than material things that lack this extra dimension. Its status as pattern makes any given embodiment optional and contingent; and its mutability means that decisions at the pattern or intelligible level (different "instructions") can make a difference at the "hard" level of machine parts.
But even the hard machine parts serve their functions because they are embodied pattern. A simple steel L-brace must be L-shaped to do its work well. This shape is clearly liftable and may be reinstalled in rye bread. The necessity of steel for the brace to function as intended, or the necessity of some small range of other materials, differs only in degree from the necessity of magnetism for software to be read by a given machine. The softness of software is seen partly in its liftability, which expresses an ideality shared by all other embodied pattern, and partly in its mutability even after taking a particular embodiment. Hardware is usually soft in the former sense, rarely in the latter. (There are no unexceptionable ways in which hardware is hard.)
Usage tends to blur these two types of softness. For example, in a word processor the user may hit the "return" key to terminate the current line and begin a new one below it at the left margin. Or she may let the computer do this automatically when a line of text grows to hit the right margin. In current computer argot, the former are called "hard" carriage returns, and the latter "soft". To all but novice users, this usage has an inner logic. (The expressions are remarkable only for the outdated reference to "carriage returns".) Similar expressions exist for hyphens, spacing, and other formatting features in word processing. Even the text in its entirety is soft so long as it is subject to editing by the word processor; to freeze a version by printing it is to make a "hard copy".
Yet it is not clear whether soft carriage returns, for example, are indulgently called soft because they are alterable or because they are ideal. They are more alterable than hard carriage returns because they disappear if they do not remain at line ends after one adds or subtracts material to a paragraph and then reformats it. But for virtually the same reasons that soft carriage returns are alterable, they are ideal; they exist not only as data, like the alphabetic symbols, but as instructions to disappear under specified circumstances. They have a more evanescent form of existence than the other evanescent objects in a text file or on a screen.
Now comes the hard part. One of the physical embodiments of a software pattern is a circuit. Whether circuits are macroscopically soldered to contacts, or microscopically etched on a chip, they are "hard wired". The patterns embodied in circuits are not read off some easily interchangeable insert, like a disk, though they are often etched on circuit boards that are designed to plug in and unplug for modular expansion and alteration. In most contemporary computers circuits are not programmable except with a soldering iron, though there are some integrated circuit boards that are programmable by devices that selectively burn out circuit elements. Circuits are part of the machine itself, part of the "hardware". But under our definition, all circuits deserve the name software, since they are physical embodiments of patterns, readable and executable by a machine, and liftable. But those that are written as software by programmers, and hard wired into the innards of a computer, have a special name in the computer world: "firmware". Firmware is one of the most important examples of hardware that is software.
The point of compiling a program written in a high level language is to put it into machine language, or into the language conventions "understood" by the machine. But computers do not have native languages, the way the native language of trains is track. Computers are given language conventions by firmware. A pattern in the machine meets the pattern on the disk, and together work is accomplished. (The pattern of the lock is as essential to "unlocking" as the pattern on the key.) Without firmware, or without software that has been hardened into circuits not easily alterable by users, the more alterable kinds of software could not serve their function. Firmware, or patterned substance, allows the very existence of a grammatical condition of readability.
If neural patterns are software, then they are also firmware. Their "pre-embodied" character, then, does not decide the question against their being software. The fact that they are not as directly programmable as hardware designed to be a computer, or as directly alterable as software written on magnetic media, that is, the fact that they are comparatively immune to the effects of human purposive action, does not exclude them from the universe of software. It is important to make the point in this way, even if "learning" is a kind of programming (or deprogramming). We have let pattern count as software without regard to its purpose or whether it was made by human beings. Part of the uneasiness this idea might have created is traceable to the limited usage of the word "software", and part is no doubt due to the fact that the brain is not "soft" the way most computer programs are. Once we recognize firmware as software, most of this uneasiness should disappear.
9. Software Versus Data The Noiseless Principle breaks down what at first looked like a firm distinction between software and noise. Everything that is a candidate for noise is also a candidate for software, given a suitable language and machine. A similar breakdown occurs with the distinction between software and data. There are important reasons to make this distinction permeable, or tentative, or provisional.
The main reason is that there are programs that work (are readable and executable) only because their physically embodied pattern is treated at some times as software and other times as data. In fact, this is true of most contemporary software. To compile a program written in a high level language, it must be treated as data by another program, the compiler. It must passively be worked upon, so that later it may actively do work. Without this step, the programmer might as well have spoken in English. Programs written originally in the binary code readable by the machine do not require this translation, and hence need not be treated as data. But virtually all programs written today are written in higher level languages.
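The two roles can be watched in sequence in a short Python sketch. It is only an illustration: Python's built-in `compile` translates source text into bytecode for an interpreter rather than into machine language, and the function name `double` and the label `"<sketch>"` are invented for the example.

```python
# A program's text is first worked upon as data, then does work as software.
source = "def double(n):\n    return 2 * n\n"

# As data: the text is measured and translated by other programs.
length = len(source)
code_object = compile(source, "<sketch>", "exec")

# As software: the translated pattern now directs the machine.
namespace = {}
exec(code_object, namespace)
result = namespace["double"](3)

print(length, result)
```

The string bound to `source` is not altered at any point. What changes is whether another program is operating on it or the machine is following it.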
Apart from this critical step in the very functioning of software, programs are treated as data for the purpose of copying (publishing) and transmitting them. Even if a program were originally written in machine language, chances are good that if we are using it, its code has once been treated as data by another program.
A minority of programs takes this many steps further. There are several ways in which an AI program can be designed to learn from its mistakes and its experience. One way is to alter itself, qua pattern, in accordance with current inputs on its performance and current rules. For this the program must treat itself (or a copy of itself) as data. The programming language LISP is designed to make this easier than it is in most other languages.
A very similar phenomenon occurs in the reproduction of life, when the pattern on a DNA molecule is used as software to make a copy of itself, in the process of which it must scan itself and treat its pattern as data. Without the ability of a pattern to serve alternately as software and as data, as director of process and as subject of process, life itself would be impossible. (I owe this example to Douglas Hofstadter.)
The phenomenon is more familiar in mathematics if we compare functions to software and their arguments (or input) to data. A function, d, that doubles its argument, keeps the distinction between software and data clear and clean: d(3) = 6. But when we make d's argument a second function, then we treat the second function as data and complicate the formerly simple distinction of levels. But this is so commonplace that it too causes no conceptual difficulty: d(d(3)) = 12. Note that the level-distinction has not been abolished. The inner function must be computed, hence treated as a function, before it can give us the data permitting us to compute the outer function. The inner function is treated as software and data successively and distinctly, not as a blurred amalgam of software and data.
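The successive roles of the inner function can be traced in a few lines of Python. The list `calls` is a bookkeeping device added for this sketch; it records each argument in the order in which `d` receives it.

```python
calls = []  # records every argument handed to d, in order

def d(x):
    """Double the argument, noting what was received."""
    calls.append(x)
    return 2 * x

result = d(d(3))

print(result)  # 12
print(calls)   # [3, 6]
```

The inner application runs first, as software, and yields 6. Only then is that 6 available as data for the outer application. The two levels are passed through one after the other, not merged.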
Equally familiar is the use-mention distinction. When the words "I promise" are used, they do work: the work of promising. When they are mentioned (quoted), as they were in the last sentence, they do not do that work. The use-mention distinction exactly parallels the software-data distinction, and underlines the fact that the pattern to be software or data, like the word to be used or mentioned, need not change in order to change its role or function.
The pattern of a key is used as software when the key opens a lock; it is used as data when it is used as the model for cutting a duplicate or when it is visually compared to another for identity. It can play either role without changing its pattern or its substance.
I am not asserting software-data identity, or even strict interchangeability. A single pattern may alternate in the roles it plays, in the functions it fills, or in its relations to other patterns. There is another affinity between software and data that need not detain us here, but that points in the direction of genuine software-data identity in a particular sphere. In a very high level language that interacts with a user, taking inputs and continually translating them down to the machine level for processing, it becomes difficult, and increasingly pointless, to distinguish inputs from software. The user may essentially be writing a program by answering questions, when the questions are "put" by a high level combination of an application and a language. But when input and software become indistinct, data and software have too.
10. Software Identity If software is pattern, and pattern is a definite arrangement of information, then do distinct patterns always make distinct pieces of software? Are we forbidden by this principle to identify two programs that differ at the bit level?
On one reading, this question confuses pattern per se with the particularity of specimen patterns. As we saw, particular patterns that are translations of one another will do exactly the same work even though they will differ in at least one bit. By saying that pattern per se is software, we do not focus on the pattern's pattern (at the bit level or at any higher level), but on its being patterned or having the status of pattern.
But the question might be restated. If the essence of software is to be pattern, then different patterns should make different software. If we nevertheless identify two different patterns as the same program, haven't we denied that the essence of software is pattern, and located that essence in something like "the work done" or "the program(mer)'s purpose"?
First we should note that there are many cases in which we want to say that two disks contain the same program, when a bit by bit inspection of the medium would reveal different patterns. When one program is a translation of another, then we could say that they have the same "semantic" pattern, which the two languages require to be reflected in different "syntactic" patterns. But even if this does not equivocate too far on "pattern", not all cases are like this. One pattern could be the version of the other that was rewritten for a different operating system or different computer. Both could be designed for the same machine, but one could be a later version that differs from the other only in non-substantive refinements, that is, in changes that make no difference in the output from the same input. The later version could have more elegant algorithms to do the same work, or it might run faster or make better use of computer memory. Or it might add or subtract the white space that makes the program, qua text, easy to read, or add or subtract the comments left in the program by the programmer to herself and her posterity.
If the essence of software is syntactic pattern, then even one-bit deviations would make two patterns into two different programs. Sometimes we want to insist on this, when the tiny difference has large macroscopic effects, such as the binary equivalent of a "not" in front of a predicate. But when the pattern-differences have no effect on performance, even when they are very numerous, we may want to say that the two patterns make the same program. We should be careful here. The word "software" may justify this application while the concept we have been elucidating does not; or they both might do so.
If we do identify two different patterns as the same program, we are passing from the pattern of articulation to the pattern of work. We pass from the syntax to the semantics of the program. We pass from the uninterpreted arrays of bits to the function computed or the output and operation as interpreted by human beings. We look to the uses of the program to the programmer or user, not to the structure that permits it to serve those uses. In addition to pattern in the sense elucidated, then, we add the familiar and problematic notion of purpose.
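The cases distinguished in this section can be set side by side in a Python sketch. The three program texts and the helper names `run` and `structure` are invented for illustration. Version B differs from version A only in white space and comments; version C does the same work by a different algorithm. The comparison uses Python's standard `ast` module, whose `dump` output by default omits line and column positions.

```python
import ast

version_a = "def total(xs):\n    return sum(xs)\n"

version_b = (
    "# Add up the items.\n"
    "def total(xs):\n"
    "\n"
    "    return sum( xs )   # same work, different text\n"
)

version_c = (
    "def total(xs):\n"
    "    acc = 0\n"
    "    for x in xs:\n"
    "        acc += x\n"
    "    return acc\n"
)

def run(source, argument):
    """Execute a program text and apply its `total` function."""
    namespace = {}
    exec(source, namespace)
    return namespace["total"](argument)

def structure(source):
    """Return the parsed structure, ignoring comments and white space."""
    return ast.dump(ast.parse(source))

sample = [1, 2, 3]
same_text_ab = version_a == version_b
same_structure_ab = structure(version_a) == structure(version_b)
same_structure_ac = structure(version_a) == structure(version_c)
same_output = run(version_a, sample) == run(version_b, sample) == run(version_c, sample)

print(same_text_ab, same_structure_ab, same_structure_ac, same_output)
```

Character by character there are three patterns here. Judged by parsed structure there are two. Judged by the work done on this one sample input there is one. Agreement on a sample does not prove that two programs compute the same function in general, so the third verdict is the weakest of the three.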
Here it is well to admit the obvious, that human beings do make software, and do so for purposes. While that does not foreclose the possibility of software not made by humans, and purposeless, it sheds light on the current problem. I submit that when we are most tempted to identify two different patterns as the same program, we are referring to two human products designed (through translation or engineering) to fulfill the same purpose.
Two different "natural" patterns would not normally raise this temptation, unless they had some purely syntactical similarity (like the geometric patterns on the backs of snakes of the same species) or had been subordinated to some human purpose (like certain animal profiles and 'similar' star clusters interpreted as constellations).
The mere fact that purely syntactic patterns can be the repository of purposes is significant, however. It means that syntax can serve semantics, and that software patterns can capture purposes in order to promote them just as language can capture purposes in order to describe them. That articulation should be useful is not astounding. But it does not by itself mean that syntax gives rise to semantics in general, or that purpose is a high level epiphenomenon of pattern.
Let us call the thesis that syntax does give rise to semantics in general the Pythagorean Principle, because it restates in modern terms the thesis attributed to Pythagoras that all is number.
If we deny the Pythagorean Principle, then we can only identify two different patterns as the same program by invoking extrinsic semantic concepts like purpose or function. Then the essence of software would rest equally on syntax and semantics. But if we affirm the Pythagorean Principle, then we may identify two different patterns as the same program consistently with the view defended here that the essence of software is pattern alone or syntax. To identify distinct patterns on the basis of purpose or other semantic features would not surpass what is already present in them as syntactic patterns.
I note this compatibility without, here, asserting the Pythagorean Principle. If it is true, it demands far more discussion than it can receive here.
Now we can answer the postponed question of how to distinguish crashes and bad executions from good ones. If we add purpose, either as a secundum quid or as an epiphenomenon of pattern, then it easily permits us to draw the distinctions needed. Good executions fulfill the programmer's purpose; the others do not. To bring purpose into it as an epiphenomenon of pattern is compatible with the account offered here only under the Pythagorean Principle.
But if we omit purpose from our account, there will be no non-arbitrary way to distinguish crashes and good executions. (We could arbitrarily define a crash as any execution that lasted less than five seconds or required fewer than five changes of the internal state of the machine.) So if we do not assert the Pythagorean Principle, and do not admit purpose as an equal and independent explanans of software, then we must say that the requirement that software be "executable" is trivial, since even crashes would count as executions. But this is not as costly an admission as it first seemed, since the heaviest part of the burden of "executability" has been captured by the grammatical requirement of readability. (See Section 5.)
One reason to want to affirm the Pythagorean Principle is to recognize that human-created programs are not the only ones. If there can be software without purpose, or with the same problematic purposiveness as DNA or the universe at large, then we would falsify the phenomenon or carve out only a small subset for analysis if we made purpose as essential to software as syntactic pattern.
11. Software Versus Data Again Under the Sensible, Noiseless, and Digital Principles, the domain of patterns that can serve as software opens up to include, apparently, all that can be called pattern. If all finite patterns can be physically expressed, then all can be software. If all finite patterns that are noise under existing language conventions can become meaningful and interesting software under devisable new conventions, then random and meaningless patterns can be software. If software does not cease to be software when its pattern is regarded for a time as data, then data patterns can be software. If every analog pattern can be captured by a digital pattern to an arbitrary degree of accuracy, then software patterns are not limited to those that are already digital.
If these Principles are reduced to conjectures, then the concept of software, and the boundaries of its extension, become uncertain. I call them principles because I have offered some argument for them, but I am the first to admit that they have not been proved here.
Under the weight of these three Principles, the primary objections to seeing pattern per se as software must be qualified. But it does not follow from the claim that all patterns are software in principle that all patterns are equally available for use as software. We have never denied that there are important differences among patterns that can be read and executed only under remote conditions, those that require only translation or compilation or embodiment by known methods, and those that can be read and executed without further ado.
Moreover, there is a real difference between using a given pattern as data and as software. The transition from one to the other is not through translation or embodiment, but through a kind of reorientation or mechanical gestalt shift.
This is the last loose thread to tie up. Software may essentially be pattern, but how is it to be distinguished from patterns that are used as data rather than software? How does it take the position of natura naturata and then natura naturans? This seems to be the central mystery. How can pattern be read as instructions? How can mere pattern rise from passivity to activity? Why isn't sheer syntactical pattern always inert, perpetually data and never software?
We can approach answers to these questions by saying that software is pattern in a controlling position, while the same pattern in a different position will be data (and the same pattern under different language conventions will be noise). But what is this "position"? The first thing to observe about it is that it is not part of the pattern. It is the use to which the pattern is put, or the relation between the software-pattern and other patterns that are currently functioning as data. In this, to assume the "controlling position" is similar to meeting the physical requirement of readability; it leaves the pattern unchanged and occurs independently of the syntactic and semantic content of the pattern.
It is this "position" or use of the software pattern that enables its binary code to be taken as code for instructions that are to be executed. The matter is simpler than it may appear. If we write down on one piece of paper directions for copying a page of text, and on another piece of paper directions for erasing or shredding a page of text, then we may give them to a stranger and ask that the top sheet be read and applied to the bottom sheet. It does not matter how they are shuffled; each can apply to the other as it can apply to itself. One is put in a controlling position if the "hardware" (here the stranger) reads one first and one second. Odysseus may command his men to tie him to the mast as his ship passes the island of sirens, and to ignore any commands to be released that he might issue. If his men obey this command, then it "poisons the well" for future commands and causes them to be interpreted as data. But the commands to be released are like the directions written to the stranger: fully satisfactory and "authoritative" as commands. Whether they function as commands or data is a matter of whether they are taken up earlier or later than other contenders.
Approaching the area of contemporary computing, imagine two bit-level editors on the same disk. Either program could edit, copy, or delete the other. When neither program is running, they are equally capable of assuming the software position and the data position. When one is running (on a serial processor), it has taken the field and forced the other one to sit as data. The occupants of the positions are interchangeable, but the positions are not identical.
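A toy version of the two editors can be written in Python. The dictionary `disk`, the program names `eraser` and `copier`, and the helper `run` are all hypothetical, and the "editors" are one-line stand-ins, not real bit-level editors. The sketch assumes a serial machine that runs one program at a time.

```python
# Two hypothetical one-line "editors" stored as text on the same toy disk.
DISK = {
    "eraser": "disk[target] = ''",
    "copier": "disk[target + '_copy'] = disk[target]",
}

def run(disk, program, target):
    """Put `program` in the software position and `target` in the data position."""
    exec(disk[program], {"disk": disk, "target": target})

# Case 1: the eraser takes the field first; the copier sits as data.
first = dict(DISK)
run(first, "eraser", "copier")
run(first, "copier", "eraser")   # the erased copier is now an empty program

# Case 2: the copier takes the field first; the eraser sits as data.
second = dict(DISK)
run(second, "copier", "eraser")

print(first["copier"] == "", "eraser_copy" in first)
print(second["eraser_copy"] == second["eraser"])
```

Neither text is marked in advance as the controller. Which one acts and which is acted upon is settled by the order in which the machine takes them up, not by anything in the patterns themselves.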
Patterns that could function as software but that are currently functioning as data cannot escape their "data position" and assume the "software position" by virtue of any information they contain, unless the pattern in the "software position" permits it. The bit-level editor about to be erased is helpless to use its weapon of counter-erasure. Attempts to reach a meta-level, to jump out of the data position for even one second or one effective utterance, are closed unless the rules at the meta-level permit it. Patterns of behavior, regulated by beliefs or physiology, can function like software patterns at least to illustrate the logic of this situation. It is improbable that an accused witch can persuade her inquisitors that witches do not always lie. It is improbable that an accidentally committed person can talk his way out of an insane asylum, or that anyone can make a baby stop crying by spanking it.
What, in the last analysis, are instructions to a machine? What is a command? The metaphor of instructing or commanding a computer through a program has been misleading, especially if it suggests that any machine that can be instructed must already contain a germ of intelligence. If machines are entirely physical, and operate through causation, then instructions must be causes. That machines are entirely physical is not precluded by their display of patterns with a dimension of ideality. Indeed, it is by virtue of the determinate form of the machine parts, or their embodied pattern, that they can be caused in the complex and regular ways that make computation possible.
Instead of the two bit-level editors that can somersault together, imagine two gears of different sizes with meshing teeth. When power is applied to the small gear, the big one is turned in the opposite direction. What it means to "command" the large gear to turn in a certain direction is to arrange the physical causes of its motion so that it turns that way. If the clockwise motion of the large gear had the side-effect (by turning other gears) of making a sign appear in a window saying that the time is such-and-such, then the way to "command" the machine to tell the correct time in the window is to arrange for the small gear to move counter-clockwise so much so fast at a certain moment. If the control of the smaller gear were through knobs labelled with hours and minutes, then it would appear to human users of the machine that one could "command" the machine to show a certain time by turning the knobs in a certain way. The appearance would be justified to a large extent. But the meaning of telling time would remain utterly foreign to the machine. Even the notion of commanding and being commanded would remain foreign to it. All it would know is the physical rotation of knobs and gears.
Similarly, software does not command computers in any literal sense. Contemporary computers and software do electronically what knobs do mechanically with gears. The point of labelling knobs with hours and minutes is to make the "user interface" with the machine more transparent. If one is to set the time, one should think about time, not about gear ratios. A good machine designer will take the burden of calculating such ratios off the shoulders of the user, and ask for input only in time-related terms intelligible to humans.
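A minimal sketch of such a knob interface follows. The tooth counts and the function name `small_gear_turn` are assumptions made for illustration, as is the simplification that the large gear carries a twelve-hour hand directly.

```python
LARGE_TEETH = 60   # assumed tooth count of the large (driven) gear
SMALL_TEETH = 12   # assumed tooth count of the small (driving) gear

def small_gear_turn(hours: int, minutes: int) -> float:
    """Degrees the small gear must turn counter-clockwise to show the given time.

    The user thinks in hours and minutes; the gear ratio is handled here.
    """
    minutes_past_twelve = (hours % 12) * 60 + minutes
    # The large gear sweeps 360 degrees clockwise in twelve hours.
    large_gear_degrees = minutes_past_twelve * 360 / (12 * 60)
    # Meshed gears turn in opposite directions, in inverse ratio to their teeth.
    return large_gear_degrees * LARGE_TEETH / SMALL_TEETH

print(small_gear_turn(3, 30))   # 525.0
```

The caller "commands" a time of 3:30; all that actually occurs is a rotation of 525 degrees applied to the small gear.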
Because of their greater complexity, computers can mask the unknowing of their internal parts to a much higher degree. When it appears that we are typing letters from a keyboard, in response to a request on a screen, we are actually hitting switches that call up certain patterns that have been linked by a program to the pixels (dots on the screen) that compose a given font of Roman letters, that are then displayed dot by dot on the screen. To us it is an intelligible interaction; the screen asked for our name, and we provided it. But instead of (some would say: in addition to) instructing the computer that our name was so-and-so, we have been manipulating physical causes at an electro-magnetic level. The designer of the machine has relieved us of the burden of calculating which patterns of 1's and 0's print the Roman letters of our name on the screen. This predesigned convenience allows us the luxury to think of commanding and instructing, rather than controlling minute pulsations and surges of current.
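The chain from key to pattern to pixels can be sketched as follows. The five-row glyph and the names `FONT` and `render` are hypothetical; real fonts and display hardware are far more elaborate, but the principle of looking up a stored bit pattern is the same.

```python
# A hypothetical one-letter font: each row of the glyph is a 5-bit pattern.
FONT = {
    "A": [0b01110, 0b10001, 0b11111, 0b10001, 0b10001],
}

def render(key: str) -> list[str]:
    """Turn the bit pattern linked to a key into rows of lit ('#') and dark ('.') dots."""
    return [
        format(row, "05b").replace("1", "#").replace("0", ".")
        for row in FONT[key]
    ]

for line in render("A"):
    print(line)
```

The user experiences this as typing the letter A; the machine only maps one pattern of 1's and 0's onto another.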
Although the process is extremely complex, even superficial acquaintance with it demystifies how a machine can receive instructions. The physical medium of the pattern of the instructions is designed for the machine with which it is to interact. A 19th century loom was instructed through punched cards that controlled the height of pins that read the cards. Contemporary computers are instructed through magnetic disks and keyboard strikes that control the electrical states of transistors acting as switches in a very complex circuit. The loom is designed to use the physical state of its reading pins to direct the weaving process, just as the computer is designed to use the circuit set and reset by the disk and keyboard to direct the manipulation and display of data.
In this way patterns with a physical expression can instruct machines. But how can a single pattern function alternately as software and data? How does the machine "know" which one it is to be? The answer is provided by the twin gear example. When the power is applied to the small gear, the larger gear is driven accordingly. But the power could be applied to the larger gear at a different time, driving the smaller gear accordingly. Which gear is the driver and which is driven is never a matter of metaphysics or mystery. It is a matter of which gear's "information" is "consulted" first and used as the basis for "interpreting" the information of the other, i.e., which one's physical embodiment is allowed to work causally on the machine first.
If we take this path of explanation, there is no conceptual problem in saying that the brain is hardware that processes software. This is true whether or not there is a soul superadded to the brain that explains human intelligence. The fact that the brain is not instructed by disks or tape or any other single compendious medium is plainly inessential. Further, the fact that the neurons are unintelligent is as irrelevant as the fact that gears don't know time and that transistors don't know words and numbers.
12. Church's Thesis, Instruction, and Causation There is one more aspect of instructions that raises difficult questions. When written in programming languages, instructions are "algorithms". In procedural languages, they lay out the steps to be taken to make a calculation or perform a task. In descriptive languages, programmers simply describe the desired result with some precision, and algorithms built into the implementation of the language do the work. An algorithm is an "effective method", a recipe for performing a task or solving a problem that, if followed scrupulously, will finish the task or give the right answer after a finite number of steps. The intuitive notion of effectiveness in computation is that of mechanical and dumb step-taking, requiring no ingenuity or insight, and yet sufficient to finish the job correctly. This may sound precise, but what is most remarkable about effective methods is that effectiveness remains an intuitive notion that can be paraphrased by various precise models but never demonstrably exhausted.
This is important. A task is computable if and only if it can be performed by an effective method. Hence, the theory of computation requires a precise definition of an effective method. But no precise formulation can ever be demonstrably correct or sufficient just because it must satisfy the intuitive sense of effectiveness. Many mathematical definitions of computable functions have been devised to capture the intuitive sense of effectiveness (by Gödel in 1931, Church and Kleene in 1932-35, Turing in 1936, Post in 1943, and Markov in 1951). It is surprising and suggestive that they converge and describe in different terms the very same set of functions. This is inductive support for the thesis, called Church's Thesis after Alonzo Church, that these formalisms capture the intuitive sense of an effective method. It asserts that every task that is effectively computable in the intuitive sense will be computable in these technical ways.
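The convergence can be illustrated, though of course not proved, on a single function. The sketch below computes addition twice: once in the style of Church's lambda calculus, using Church numerals, and once with a small Turing machine working on unary numerals. The rule table, the state names, and the helper functions are all illustrative choices, not drawn from the original papers.

```python
# Addition in the style of the lambda calculus, using Church numerals.
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))
add = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))

def church(k):
    """Encode the integer k as a Church numeral."""
    numeral = zero
    for _ in range(k):
        numeral = succ(numeral)
    return numeral

def to_int(numeral):
    """Decode a Church numeral by counting applications of f."""
    return numeral(lambda k: k + 1)(0)

# Addition on a Turing machine with unary numerals: "11+111" becomes "11111".
RULES = {  # (state, symbol read) -> (next state, symbol written, head move)
    ("scan", "1"): ("scan", "1", +1),
    ("scan", "+"): ("seek_end", "1", +1),   # join the two blocks of 1's
    ("seek_end", "1"): ("seek_end", "1", +1),
    ("seek_end", "_"): ("trim", "_", -1),
    ("trim", "1"): ("halt", "_", 0),        # erase the one surplus 1
}

def turing_add(m, n):
    tape = list("1" * m + "+" + "1" * n) + ["_"]
    state, head = "scan", 0
    while state != "halt":
        state, tape[head], move = RULES[(state, tape[head])]
        head += move
    return tape.count("1")

print(to_int(add(church(2))(church(3))), turing_add(2, 3))   # 5 5
```

The two definitions share no vocabulary, yet they agree on every input tried. Agreement of this kind, extended across all the formalisms and all the functions they define, is the inductive support that the Thesis rests on.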
Church's Thesis can never be proved because it makes appeal to what conforms to a rough and ready intuitive sense. (The Exhaustion Principle may be indemonstrable for the same reason.) If that is true, then is the concept of software subject to the same lack of closure?
The answer is yes and no. If a program performs work and ex hypothesi does so without an immaterial soul to assist it, then it works by effective methods. We can know this even if we do not know which ones they are, or how to write programs to execute the effective methods that we have in mind, or even what an effective method is. Anything is an effective method if it has an effect, if it is a cause. Hence, a complete definition of software may contain the assertion that software works by effective methods without running afoul of the imprecision of the intuitive sense of effective computation. The indemonstrability of Church's Thesis does not render indemonstrable or unintelligible the assertion that effective methods are used.
On the other hand, if physically embodied pattern influences physical machine parts through dumb causation, then a complete definition of software may have to include a theory of causation. Only by laying out such a theory can we express our confidence that software performs work without an immaterial soul to assist it. In the details of a theory of causation we will run into the intuitive sense of effective computation, and either make conclusory appeal to that sense or articulate it in new ways.
Short of offering a theory of causation, one may observe that the model of a machine "mechanically" performing its tasks is the paradigm for most people's intuition of effective computation. There is no violation of intuition in saying that whatever hardware can do with the assistance of software is effective. There is only a problem of proof in asserting the converse, that if a method is effective, then it can be performed by a given precise technique, e.g. recursive functions, Turing machines, or any named programming language, program, or machine.
Indeed, the only astonishing thing to intuition is how dumb switch-throwing or bit-switching at the lowest machine level can concatenate to produce non-intuitive and even mind-boggling results. This is the same remarkable thing as how complex syntax can simulate semantics, or how the commas in the first edition of the Critique of Pure Reason, together with a few dozen other intrinsically meaningless marks, simply by differing from one another and standing in a particular complex pattern, may articulate a revolutionary theory that changed history.
From the top down, as it were, software is pattern that we use to do work. When we use it this way, it is by reducing the work to a complex series of effective methods. Even from this standpoint, software qua pattern is not a series of effective methods; it is the encoded expression of them. But from the bottom up, software is a pattern that has taken the form of (say) magnetic norths and souths that cause a magnetic head to send pulses to a complex circuit. There is no instruction except causation and no effective methods except "one damn thing after another." High level languages can express effective methods; after compilation into a low level language there is nothing deserving the name any longer, unless it is simply causation.
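The contrast between the two standpoints can be glimpsed with Python's standard `dis` module, which lists the bytecode that a function compiles to. This is only a rough analogy: CPython bytecode is still several layers above magnetic norths and souths, the function `average` is a made-up example, and the exact instruction names vary between Python versions.

```python
import dis

def average(values):
    """A readable, high-level statement of an effective method."""
    return sum(values) / len(values)

# The same method after compilation: a flat sequence of small steps,
# none of which mentions averaging.
steps = [ins.opname for ins in dis.get_instructions(average)]
print(steps)
```

Read from the top down, `average` expresses a method; read from the bottom up, the list of steps is closer to "one damn thing after another."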
Hence, if we regard software from the semantic side, as instructions, then the limited precision in our intuitive notion of effective computation affects the precision of our definition of software. But when we regard software from the syntactic side as pattern, or as physically embodied pattern capable of causal interaction with a machine, then the notion of instructions has vanished into the web of interaction. And the need to define effective method, insofar as this would take us beyond causation, vanishes with it.
There is thus no mystery in how ideal pattern communicates with material machines. There is no software version of the mind-body problem, no "seam" between software and hardware that defies conception. Software patterns only affect machines when they are physically embodied, and then they affect them through dumb causation.
13. Stones Left Unturned In the beginning I noted that this kind of inquiry is best embedded in a system that answers other questions at the same time. I have answered the question what is software by introducing other notions that, in my view, are only slightly less problematic. I would not want to close without drawing attention to the questions that this essay has not answered, that must be answered to complete the inquiry in a systematic way.
Those questions are: What is pattern? Are the Exhaustion and Pythagorean Principles true? Are the purposes and semantic meanings of programmers "reducible" to syntax, or "emergent" from syntax, or both or neither? Are the patterns of the world digital in themselves or merely subject to digitization? (This affects the question whether "the true logic" must be formal or may be dialectical.) Are hardware and software best explained by idealism, materialism, or by a third theory? What is a software type as opposed to a software token? What is hardware? What is purpose? What is causation?
I would like to thank Andrew Brust, Hal Hanes, Bob Horn, Jonathan Jacobs, Cathy Kemp, Peter Marvanyi Nagy, John Newman, Ray Ontko, A.L.P. Thorpe, and Kate Wininger for helpful comments on an earlier draft of this paper.
Index of Principles These links jump to the first occurrences of the six principles described in the essay.
Peter Suber,
Department of Philosophy,
Earlham College, Richmond, Indiana, 47374, U.S.A.
peters@earlham.edu. Copyright © 1988, Peter Suber.