**Background/Question/Methods**

Estimating the distribution of a species is a major challenge and requires knowledge of the interaction of at least three important factors: (1) its environmental tolerance range, or fundamental niche, summarized as a time-dependent diagonal matrix $\mathbf{N}_F(t)$ whose entries give the probability of invading a given environment as a function of its distance to the fundamental niche of the species; (2) its dispersal capacities relative to the physical matrix of reference (e.g., rivers, mountains), summarized in an adjacency matrix $\mathbf{M}(t)$; and (3) its biotic interactions, which we disregard because theoretical arguments suggest that at coarse resolutions interactions “average out.” Estimating the area of distribution of a species therefore requires access to different types of data and complicated computational simulations, as represented by the following scheme: the area of distribution $\mathbf{G}(t)$ is a vector of zeroes and ones at time $t$. Then $\mathbf{G}(t+1)$ is obtained as $\mathbf{N}_F(t) \cdot \mathbf{M}(t) \cdot \mathbf{G}(t)$, where the dots represent matrix multiplications. This is a recipe for operating on large objects that contain very considerable amounts of data. It is a hypothesis on how to operate with climatic and topographic data to obtain areas of distribution. We illustrate this prescription using data on Pleistocene distributions of mammal species.
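The update scheme can be illustrated with a minimal Python sketch on a toy landscape. Everything specific here is an assumption made for illustration and is not taken from the study: the four-cell linear landscape, the suitability values, the time-constant matrices, and in particular the `threshold` used to turn the product back into a vector of zeroes and ones (the scheme states that the area of distribution is binary but does not say how the probabilities are binarized).

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def diag(values):
    """Build a diagonal matrix from a list of values."""
    n = len(values)
    return [[values[i] if i == j else 0.0 for j in range(n)] for i in range(n)]


def step(niche, adjacency, occupied, threshold=0.5):
    """One update: G(t+1) from N_F(t), M(t) and G(t).

    The thresholding is an illustrative assumption that keeps G binary.
    """
    reachable = matvec(adjacency, occupied)  # M(t) . G(t)
    scores = matvec(niche, reachable)        # N_F(t) . (M(t) . G(t))
    return [1 if s >= threshold else 0 for s in scores]


# Hypothetical landscape of four cells in a row.
# Cell 3 lies outside the fundamental niche (suitability 0).
suitability = [0.9, 0.8, 0.7, 0.0]
N_F = diag(suitability)

# M[i][j] = 1 if cell i can be reached from cell j in one time step
# (each cell reaches itself and its immediate neighbours).
M = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
]

G = [1, 0, 0, 0]  # the species starts in cell 0 only
history = [G]
for _ in range(3):
    G = step(N_F, M, G)
    history.append(G)

for t, state in enumerate(history):
    print(t, state)
```

In this toy run the range expands one cell per step until it meets the unsuitable cell, which stays unoccupied even though it is reachable. For real grids the matrices would be very large and sparse, so a sparse-matrix library would be a more practical choice than nested lists.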

**Results/Conclusions**

We show that it is possible to estimate fundamental niches by resorting to physiological and distributional data. It is also possible to estimate realistic adjacency matrices on the basis of the topography of an area and the dispersal capacities of organisms. Finally, it is possible to assemble the above, using sound hypotheses about the mechanisms governing distributions, to estimate ancestral ranges of species. We use this type of example to discuss the need, and the possibility, for biologists to move away from an “envy of physics” paradigm toward an “envy of computer sciences” paradigm, based more on knowledge representation and the manipulation of large sets of data than on simple equations with a few parameters. This framework is solidly theoretical, based on concepts and hypotheses, but it is used to establish relations between data objects rather than between simple parameters.