Random Explorations in Automata Theory
Gary J. Shannon
Created: Mar. 25, 2003
Last updated: Mar. 25, 2003


Exploring Self Organization and Autocatalysis in a Cellular Automaton

A collection of interacting entities often react in certain ways only, e.g. entity A may be able to affect B but not C. D may only affect E. For a sufficiently large collection of different entities a situation may arise where a complete network of interconnections can be established - the entities become part of one coupled system. This is called an autocatalytic set, after the ability of molecules to catalyse each other's formation in the chemical equivalent of this arrangement.

--From FAQ for USENET Newsgroup comp.theory.self-org-sys

Some Preliminary Thought

[This page contains a new project that I have just begun (Mar. 25, 2003). All of this is very preliminary and subject to much revision as the project proceeds As pointed out in the introduction these pages are my own informal and decidedly non-academic random explorations into the world of cellular automata.]

Most cellular automata use a small number of distinct states for each cell, typically less than a dozen with two and three state CAs being the most popular. In addition the typical CA uses as few dimensions as possible for the purpose at hand. In a totalistic CA these two limitations are necessary because the size of the rule table grows rapidly when either the number of states or the number of neighbors is increased.

The rule table must include one entry for every possible neighborhood state. For a 1-dimensional CA with 2 states the neighborhood totals can be represented by the number of neighbors in state 1 with the remaining neighbors assumed to be in state 0. The only three neighborhood conditions, therefore are 0, 1, and 2. The rule table will then have 6 entries, one for each neighborhood condition and initial state. Since either of two states can be applied to each entry in the rule there are a total of 26 = 64 possible rules. Expanding the same two-state CA to a space of 2 dimensions using the Moore neighborhood each cell has 8 neighboring cells and there are 9 possible neighborhood states (0 to 8) times two possible initial states allowing 218 = 262,144 possible rules.

Now adding a state to the CA gives us more possible neighborhood states, namely (0,0), (1,0), (2,0), ... (8,0), (0,1), (1,1), ... (7,1), (0,2), (1,2), ... (8,0), for a total of 45 possible neighborhood states. The rule table will have 45 X 3 = 135 entries and there will be 3135 = 2.6 X 1064 possible rules.

Expanding the CA to three dimensions (26 neighbors) and allowing, say 1000 states would require a rule table of astronomical size. But in order for any meaningful kind of self organization to occur within the space of the CA  with randomly created states and random rules a very large number of initial states would need to be supplied, after which the environment would select those states and rules which contribute to organization and survive while dropping those states which do not contribute and do not survive. In this way a very large selection of possible states is made available to the process while after self organization has occurred only a very small number of those states might survive. For example an initial population of 1000 states might achieve a final stable "ecology" in which only a few dozen states exist in a mutually supportive organized system. The remaining 970 or so states would have, for lack of fitness, vanished from the space of the CA.

A computer program to explore a massive CA such as that would have to be able to support thousands of states, each with their own specific rules. In order to make such a task manageable it is necessary to abandon the notion of the totalistic CA and go to a more primitive form of rule. One method that would satisfy this requirement is a straightforward existential rule. The rule specifies a state transition that is to take place predicated only on the existence of a given neighboring state. No neighborhood counts are performed since the number of neighbors in any given state is irrelevant to the rule. The rule would state only "If any neighbor of state A is in state X then transition to state B," or symbolically; A(X)->B. The state "X" is called the trigger state for this transition.

If no rule is triggered for a given cell in a given tick of the clock then the global default rule is that the state remains unchanged, i.e. X->X. Any state may have an optional default transition in the form X->Y associated with it. Any state may have any number of triggered transitions in the form X(Y)->Z associated with it.

Applying this method of rule definition to an example CA with 1000 states we might assign default transitions to half of those states, thus creating 500 lines in the rule. In addition we might assign an average of 4 additional triggered transition rules to each state creating an additional 4000 lines in the rule table. Now we have a rule table with 4500 entries which is minuscule compared to the astronomical rule table for a totalistic 3-dimensional rule table with 1000 states. Even a massive CA with 10,000 states in four dimensions (80 neighbors per cell) would have a mere 40,000 lines or so in the rule, placing it easily within the reach of a computer simulation.

As a practical matter for a CA with N states the rule would be partitioned into N lists each containing some small number of transition entries. In this way only a very short list of rule entries need be examined for each given initial state. If the list for a given initial state contains more than one transition entry the first applicable transition rule will be considered to have the highest priority and will be applied after which no subsequent entries will be examined. If the last entry in the list is an unconditional transition it will be applied only if no higher priority rule was applicable. If no rule is applicable and no default transition is found in the list then the global default is applied and the state is left unchanged. If each list contains an average of R rules then no more than R rule entries must be examined for each cell. Since R will typically be quite small, probably less than 10, a very small number of rule entries must be sequentially examined for each cell at each tick of the clock.

What Neighborhood?

It is important to maximize contact with other neighbors so that cells are not so spatially isolated as to make interactions with the necessary trigger states unlikely. On the other hand spatial seperation is necessary if organisms are ever to develop and maintain their individual identities and structures. The standard Moore neighborhood contains 8 neighbors in 2 dimensions and 26 neighbors in 3 dimensions.  Expanding that neighborhood to the extended Moore neighborhood gives us 24 neighbors in 2 dimensions and 124 neighbors in 3 dimensions. It seems intuitive that neighbors at a greater distance should have less likelyhood of participating in a reaction.  To embody that intuition the neighborhood adopted for this experiment is a two-stage extended Moore neighborhood. The ordinary Moore neighborhood of 8 neighbors is searched for the trigger states contained in the rules for the current state. If one of more such trigger states are found the highest priority rule that can be applied is applied.  If no trigger states are found in the immediate neighborhood then the extended Moore neighborhood is searched for trigger states, and the appropriate rule applied.  If no trigger states are found in the extneded neighborhood then the default rule is applied.

The net result of this technique is that the number of potential neighbors is maximized yet an adjacent neighbor of low priority has precedence over a more distant neighbor of higher priority.

What About the Empty State?

When potentially hundreds or even thousnad of states are possible what is it about the empty state that makes it special? How is it distinguished from any other arbitrarily choosen state? It cannot simply be the fact that we have declared state 0 to be empty since we could just as easily have declared that sate 99 is the empty state. The need for an empty state is implied by the desire to see the development of mobile organisms.  In order for mobile structures to travel through space they should not be impeded by a space that is filled with random states whose effects on that structure would be variable and random. The existence of empty space may be necessary for the development of any consistent kind of mobility. This simulation will employ the concept of empty space as distinguished from space in other states. For the empty state to qualify as somehow unique among states there must be something that distinguishes it from the other states. The distinctions I am proposing to use are:

  1. The majority of space will be initially filled with the empty state.
  2. The empty state is the target of significantly more default transitions than other states.
  3. The empty state has significantly more transistion rules than any other state.

These three conditions should result in a state which predominates at the outset, is abundantly replenished by the demise of many different "decaying" states, and is itself more amenable to being transformed into a wide variety of other states than is the typical non-empty state.

How Many Rules Per State?

As it turns out for systems with the number of states, N= 50 to 100 in the majority of cases 4 to 5 rules per state usually results in a connected graph.  For systems with as many as 2000 to 3000 states that number climbs to between 9 and 10 rules per state.  16 to 20 rules per state resulted in a connected graph in every one of 100,000 trials where random rules were generated even for very large values of N. A practical strategy would be to generate 3 to 5 rules per state and then test the graph for connectivity. Refering back to the empty state conditions it should be remembered that the empty state will receive many more transition rules than the typical state and that it will be the target of many more default transitions than any other state and its special qualitites would establish the empty state as the bridge between disjoint systems. Additinal rules could therefore be added to the empty state until connectivity was achieved.

As a result of this technique several disjoint system can share and compete for the same space even though their internal "biochemistry" might be completely different.

This approach does not address the loss of connectivity when certain states become extinct. This is a problem that will need some further thought.

What to Expect: Some Speculations

At this point the simulation has yet to be written so any expectations would be pure speculation. But there are a few expectations that are likely to be verified as facts. Initially the vast majority of the states will have no useful transition rules and will eventually die out. Other states may only be able to remain inert and unchanging. Some states might be inert yet act as the catalyst or trigger state for another useful state transition. Eventually it is hoped that a surviving set of states and their transition rules will emerge to form an autocatalytic set which is self-sustaining and (dare we hope?) possibly even self-reproducing.

To increase the chances for useful interactions to occur, and to allow for more complex structural developments in organisms the CA should ultimately take place in at least 3 dimensions, although some initial explorations may be done in 2 dimensions. The staged extended neighborhood discussed above will be used.

Initially only the high level statistics of the population could be collected. Then as patterns began to emerge more details could be harvested from the simulation to identify the members of autocatalytic sets and what their transition rules are.

Anticipated Catastrophes and How They Might be Avoided

Once the simulation is started for the first time I expect to see some small handful of states flood all of space, and setup some kind of large-scale cyclic pattern similar to the phase waves in the Zhabotinsky reaction or the ebb and flow of Brian's Brain.  Either that or space will quickly fill with a single state and reamin frozen in that state. In either case all of the potential candidates for autocatalysis will have been crowded out of space before they have had a chance to find each other and establish their symbiotic relationships. It would not be terribly surprising to find that some autocatalytic set that never got off the ground even posseses the ability to consume the very state that flooded space.  In other words, there might have been a preditor that could have kept the rampant growth in check had it been given time to evolve.

In a very large space of billions of cells or more these very organisms might have evolved in some remote backwater which was reached only later by the advancing tide. The existence of such a backwater where special conditions can obtain would require either great distance or the existence of some barrier like the sand bars around a tide pool  which isolate it from the ocean at large during low tide. In the case of a barrier space would not be homogeneous, but might have a varied topography. Open spaces might alternate with confined spaces and narrow passages.  It might be the case that the invading flood depended on the coincidence of three or four states working cooperatively to wreak their havoc. In that case the flood might not be able to propigate through a passage only 1 cell wide, leaving the nooks and cranies at the ends of these tiny tunnels unscathed.

Of course there are limits to how large and varied a space can be simulated in a reasonable amount of time on any computer no matter how fast it might be. Rather than implementing this kind of varied topography, therefore, it might be more practical to implement something which had the same effect. Distant or semi-isolated regions of space contribute to the mix by providing a site for independant evolution and by ocasionally spilling over and adding new ingredients to the ecosystem of open space. This could be simulated either by dropping random states into open space from time to time, or by maintaining a handfull of isolated spaces and periodically allowing mixing by droping some barrier between them for a few clock ticks. Dropping in random states has the advantage of simplicity, although isolated states aren't much use where whole autocatalytic organisms would be necessary to otherthrow the dominant species. Running seperate spaces would be time consuming and have the additional hazard that all of the seperate spaces might fall into the same catastrophe and thus provide no variety at all.

If they were to have variety and avoid common collapse they would have to be seeded with an entirely different mix of initial states.

Another approach would be environmental. Transitions could be predicated not only on the presence of some particular trigger state but also on some environmental condition like "temperature" or the presence or absence of "light" in a given region of space. These environmental factors could easily be varied over time and space facilitating some transitions in certain places or times while discouraging the same transitions under different environmental conditions. The whole of space could, for example, be filled with a temperature gradient; cold at the bottom and warm at the top, or with temperature or light level values that varied over time in either a random-walk fashion or in a systematic manner. This would provide regions that would be more or less hospitable to transitions of various types and create ecological niches that would prevent the catastrophic spread of a single dominant state or system of states into all of sapce.

The mechanics of implementing such an environmental system might include transition probabilities rather than priorities. These probabilities would be a function of some environmental variable. Thus rather than having a fixed priority rank of 1, for example, a given transition might have a transition weight wi=fi(e) where e is the environmental variable.  A random number between 0 and Q = sum(w), the sum of of all transition weights that are applicable at this tick would be compared to each individual weight wi to determine which transition took place. In sufficiently extreme conditions the default transition to state 0 might remain as the only viable transition and the state is destroyed by a temperature too high or too low.

An easier way to implement temperature dependence would be to have an "ideal" temperature associated with each transition and the further the actual temperature strayed from that ideal the less probable the transition would become. Even a simple linear relationship, i.e., the priority is computed as the absolute value of the difference between ideal and actual, would provide effective environmental sensitivity. In that case randomness would not have to be invoked unless two or more priorities tied for lowest value (highest priority). Each transition rule would thus include the initial state, the predicate state or trigger, the target state, and the ideal value for each environmental variable implemented in the simulation.

Another way of providing environmental variability would be to have fixed states specifically designated as trigger states only, distributed in certain ways throughout the environment. Such states would never make any transitions themselves, and would never be the target of any transition, but could be selected as the trigger for any transition.  Thus a wall of such fixed states along the bottom edge (or surface in a 3 dimensional space) would form a "ground" and organisms that required those states as catalysts would be required to remain in contact with the ground. In order to extend into the space above ground level would require a symbiotic relationship with other autocatalytic sets that could grow on the "roots" that are required to stay earthbound.

Some Categories Into Which States Might Fall

Before actually trying the experiment it might be useful to speculate about what categories different states and systems of states, or "organism" might fall into.

But all of that is getting pretty optimistic. It's highly unlikely that anything like an organism would evolve in anything less than geological times scales.


A Few Preliminary Design Notes

The Parameters of the Simulation

1. The dimensionality or connectedness of the space, D.

2. The size of the space in cells, S.

3. The number of states, N.

4. The probability that a given state will have a default rule, D.

5. The the distribution curve of number of rules per state and the computed average number of rules per state as determined by that distribution.

Initializing the Rules:

Select the number of states.

For each state based on probability D either create or do not create a default rule for that state.

...For each state select some number of rules based on the probability distribution curve.

......For each rule select a random trigger state and a random transition state and add that rule to the list of rules for that state.

Initializing the Space:

Based on the connectivity of space build the structure (array or graph or directed graph) that will be the space of the CA.

Fill that space with random states.

Running the Simulation:

For each tick

...For each cell

......Collect the identities of all neighbors.

......Based on the initial state at that cell scan the rule list for that state and apply the first applicable rule, if any. If no applicable rule, the state remains unchanged.

Collecting data:

Population Statistics:

For each state track the population of that state over time.

For each state track the age of that cell in that state.

Track the number of unique surviving states over time.

Track the cumulative average longevity for a state in a single cell.

Track the average longevity of surviving states in a cell.

And ...?

Spatial Distribution:

For each state determine the center of mass of all cells in that state.

Determine the average distance of a cell in that state from its center of mass.

Look for clustering or population centers.

Look for places where local density far exceeds average density. Partition space into sub-regions and compare the density in those sub-regions to the average density in all of space.

And ... ?

Classification:

Identify the number of inert stable states in the population.

And ... ?

Organisms:

Look for signs of mobility, i.e. do cluster centers move?

And ... ?

Visual Representations:

Produce cross sectional snapshots or 3-D projections of selected states.

And ... ?

===UNDER CONSTRUCTION. More to come soon ===