A priori it is not clear which neural signals out of a given pool should be combined into a good estimate for a specific stimulus property.
A simple weighted average, or any similar linear combination, of neural signals will not work under most circumstances, because such an estimate will be subject to noisy interference by all neurons which are processing data not related to the stimulus property in question.
For example, the signal of a neuron which codes mainly stimulus orientation, but is also sensitive to colour changes, might correctly reflect the orientation of an edge in some cases, but might show a mixture of orientation and colour readings in other cases. Clearly, one would want to include the data of this neuron in orientation estimates only if the signal is not disturbed by colour.
A simple weighted average of neural signals will also fail if some of the neurons are driven out of their working range. Any estimate based partially on neural signals which are clamped by saturation or threshold effects will inevitably be biased.
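The bias from clamped units can be illustrated numerically. The following is a hypothetical sketch (the saturating response function, the gains, and the stimulus value are assumptions made for illustration, not taken from the paper):

```python
def clamped(x, lo=0.0, hi=1.0):
    """Hypothetical neuron whose output is clamped to its working range."""
    return max(lo, min(hi, x))

stimulus = 1.4                   # true stimulus value (arbitrary units)
gains = [1.0, 1.0, 0.5, 0.5]     # hypothetical per-neuron gains
# Each neuron reports gain * stimulus, clamped to [0, 1]; decoding divides
# the gain back out, so an unclamped neuron recovers 1.4 exactly.
decoded = [clamped(g * stimulus) / g for g in gains]
naive = sum(decoded) / len(decoded)
# The two gain-1.0 neurons saturate at 1.0, so the pooled average
# (1.0 + 1.0 + 1.4 + 1.4) / 4 = 1.2 underestimates the stimulus.
```

Here the saturated neurons pull the plain average below the true value, no matter how the weights are chosen, because their clamped outputs carry no information about how far above the saturation point the stimulus lies.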
In summary, any good estimation process should be highly selective concerning the question of which neural responses should be combined and which should be dropped in an estimate.
There is a problem surfacing here: the decision of which specific subgroup of neurons should be included in a stable estimate, and which should not, varies greatly with the actual stimulus situation. It does not seem to be an easy task for a neural network to discriminate between noise and signal in spike trains coming from a pool of neurons.
However, there is a rather simple solution to this problem, which uses the fact that many neurons have receptive fields overlapping with each other, i.e. are looking at nearly the same aspect of the external world. This allows the external world to be used as reference.
Any specific stimulus situation will split a given pool of neurons into two disjoint classes: one class which codes the stimulus value more or less confidently (class $A$), and the rest of the neurons, which display either no relation or a wrong relationship between their neural signals and the stimulus value in question (class $B$). Overlapping receptive fields ensure that class $A$ will consist of more than a single neuron, and this makes it possible to detect this class by a simple ``neural voting'' process.
More formally, if $x_i$ stands for the neural response of the $i$-th neuron,
and if one searches within the whole pool of neurons for coherence clusters
$C$, defined as all sets having pairwise similar responses,
\begin{equation}
C = \{\, i \;:\; |x_i - x_j| \le \Theta \;\; \text{for all } i, j \in C \,\},
\end{equation}
then a coherence estimate can be obtained by averaging the signals within the cluster,
\begin{equation}
\hat{s} = \frac{1}{|C|} \sum_{i \in C} x_i .
\end{equation}
In this process, the coherence threshold $\Theta$ defines the amount of noise tolerated within the coherence cluster: neuronal signals deviating by more than $\Theta$ from the signals in class $A$ will be regarded as outliers coding noise or other information.
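For scalar responses, this selection can be made concrete by searching for the largest set of signals that agree pairwise within the coherence threshold. The following is a minimal sketch (the function names, the scalar-response assumption, and the sliding-window formulation are illustrative assumptions, not the implementation used here; for scalars, pairwise agreement within `theta` is equivalent to all values fitting into a window of width `theta`):

```python
def coherence_cluster(responses, theta):
    """Indices of the largest set of responses that agree pairwise within theta.

    For scalar responses, pairwise similarity |x_i - x_j| <= theta is
    equivalent to (max - min) <= theta, so the largest cluster is found by
    sliding a window of width theta over the sorted values.
    """
    order = sorted(range(len(responses)), key=lambda i: responses[i])
    best, lo = [], 0
    for hi in range(len(order)):
        while responses[order[hi]] - responses[order[lo]] > theta:
            lo += 1                      # shrink window until it fits
        if hi - lo + 1 > len(best):
            best = order[lo:hi + 1]      # largest window seen so far
    return best

def coherence_estimate(responses, theta):
    """Average over the coherence cluster, plus the cluster size."""
    cluster = coherence_cluster(responses, theta)
    return sum(responses[i] for i in cluster) / len(cluster), len(cluster)
```

For the pool `[0.9, 1.0, 1.1, 4.0, -3.0]` with `theta = 0.5`, the cluster contains the three coherent units and the estimate is their mean, 1.0, whereas a plain average over all five signals would give 0.8.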
The actual size of the coherence cluster depends on the choice of $\Theta$. For $\Theta \to 0$, the number of neurons able to participate in the coherence cluster decreases. In this case, the estimate will be based on fewer units and will become less reliable. For $\Theta \to \infty$, all neurons, including the noisy ones, can participate in the coherence cluster. In this limit, equation (2) computes simply the average of all signals.
The coherence threshold $\Theta$ might be chosen adaptively, depending on the actual stimulus and noise statistics. Normally, however, classes $A$ (signal) and $B$ (noise) are clearly separated, so the actual value of $\Theta$ is not critical (compare results in Sections 2.6, 3.5).
For a fixed coherence threshold $\Theta$, the number of neurons participating in the coherence cluster, $|C|$, compared to the total number of neurons in the pool, $N$, is an indication of the quality of the coherence estimate, and the ratio $|C|/N$ can be used as a validation measure (compare Section 2.6).
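The validation measure can be sketched as a simple ratio. In this hypothetical example (pool values and the sliding-window cluster search are assumptions for illustration), a pool with a coherent majority scores high, while a scattered pool scores low:

```python
def cluster_size(responses, theta):
    """Size of the largest set of scalar responses pairwise within theta
    (equivalently: the most values fitting into a window of width theta)."""
    xs = sorted(responses)
    best, lo = 0, 0
    for hi in range(len(xs)):
        while xs[hi] - xs[lo] > theta:
            lo += 1
        best = max(best, hi - lo + 1)
    return best

theta = 0.3
coherent_pool = [1.0, 1.1, 0.95, 1.05, 3.0]   # four units agree, one outlier
scattered_pool = [0.2, 1.4, 2.9, -1.0, 4.5]   # no two units agree
validity_hi = cluster_size(coherent_pool, theta) / len(coherent_pool)    # 4/5
validity_lo = cluster_size(scattered_pool, theta) / len(scattered_pool)  # 1/5
```

A validation ratio near 1 indicates that most of the pool supports the estimate; a ratio near $1/N$ indicates that no coherent subgroup was found and the estimate should not be trusted.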
So, the simple answer to the selection problem posed above turns out to be: let the neurons do the selection themselves via the process of coherence detection, which is itself driven by the coherence of the external world.