Cosmology: The Origin and Evolution of Cosmic Structure - Coles P., Lucchin F.
Statistics of Galaxy Clustering
The notation here means a product over the (N − 1) edges linking N objects, summed over all relabellings of the objects (l) and summed again over all distinct N-tree graphs with a given topology t, weighted by a coefficient $Q_{N,t}$. The four-point term must therefore include two coefficients, one for ‘snake’ connections and the other for ‘star’ graphs, as illustrated in Figure 16.3. For N = 2 and N = 3, the different graphs connecting the points are topologically equivalent, but for N = 4 there are two distinct topologies. The topological difference can be seen by considering the result of cutting one edge in the graph. The first, ‘snake’, topology is such that connections can be cut to leave either two pairs, or a single point and a triplet. The second cannot be cut in such a way as to leave two pairs; this is a ‘star’ topology. There are twelve possible relabellings of the snake and four of the star. For the N = 5 function, there are three distinct topologies, illustrated in the figure, with 5, 60 and 60 relabellings, respectively. We leave it as an exercise for the reader to show that N = 6 has six different topologies, and a total of 1296 different relabellings.
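The relabelling counts quoted above can be checked by brute force. The following Python sketch is not part of the original text: it enumerates every labelled tree on N points through Prüfer sequences (a standard one-to-one encoding, so there are N^(N−2) such trees), groups the trees by their unlabelled shape, and reports how many labellings each shape has. All function names are illustrative.

```python
from collections import Counter
from itertools import product

def tree_from_pruefer(seq, n):
    """Decode a Pruefer sequence into the edge list of a labelled tree on n vertices."""
    degree = [1] * n
    for v in seq:
        degree[v] += 1
    edges = []
    for v in seq:
        # Join the smallest remaining leaf to the next vertex of the sequence.
        leaf = min(u for u in range(n) if degree[u] == 1)
        edges.append((leaf, v))
        degree[leaf] -= 1
        degree[v] -= 1
    # Exactly two vertices of degree 1 remain; they form the last edge.
    u, w = [x for x in range(n) if degree[x] == 1]
    edges.append((u, w))
    return edges

def canonical_shape(edges, n):
    """Label-independent string describing the topology of a tree."""
    adj = {v: [] for v in range(n)}
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)

    def encode(v, parent):
        children = sorted(encode(c, v) for c in adj[v] if c != parent)
        return "(" + "".join(children) + ")"

    # Try every vertex as root and keep the smallest encoding.
    return min(encode(root, None) for root in range(n))

def relabellings_by_topology(n):
    """Sorted list with the number of labelled trees of each topology."""
    shapes = Counter(
        canonical_shape(tree_from_pruefer(seq, n), n)
        for seq in product(range(n), repeat=n - 2)
    )
    return sorted(shapes.values())

for n in range(2, 7):
    counts = relabellings_by_topology(n)
    print(n, len(counts), counts, sum(counts))
```

For N = 4 the 16 labelled trees split into 4 stars and 12 snakes, and for N = 6 the enumeration finds six shapes whose labellings add up to 6^4 = 1296.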
The Lick and Zwicky catalogues have also supplied a rather uncertain estimate of the four-point correlation function, which is given by the approximate relation
$$\eta = \xi^{(4)} \simeq R_a[\xi(r_{12})\xi(r_{23})\xi(r_{34}) + \text{11 others}] + R_b[\xi(r_{12})\xi(r_{13})\xi(r_{14}) + \text{3 others}], \tag{16.5.3}$$
where the function η depends on the six independent interparticle distances as in Equation (16.2.7); the first twelve terms correspond to ‘snake’ topologies and the second four to ‘stars’; the quantities $R_a$ and $R_b$ correspond to $Q_{N,t}$ of Equation (16.5.2) for each of the two topologies; from observations, $R_a \simeq 2.5$ and $R_b \simeq 4.3$. This again seems to confirm the hierarchical model. Indeed, as far as one can tell within the statistical errors, all the correlation functions up to $N \simeq 8$ seem to follow a roughly hierarchical pattern. The success of this model is intriguing, particularly as the analysis of galaxy counts in cells seems to confirm that it extends to larger scales than can be probed directly by the correlation functions. A sound theoretical understanding of this success now seems to be emerging: the strongly nonlinear behaviour (16.5.2) is consistent with our understanding of the statistical mechanics of self-gravitating systems through a hierarchy of equations studied first by Born, Bogoliubov, Green, Kirkwood and Yvon, which is known as the BBGKY hierarchy. The behaviour in the weakly nonlinear regime can be understood by perturbation theory.
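To make the structure of (16.5.3) concrete, the sketch below evaluates its right-hand side for four points in space. It is an illustration only: the power-law form of ξ(r) with r0 = 5 and γ = 1.8 is an assumed stand-in for the measured two-point function, and the function names are hypothetical. The twelve snake terms are generated as paths a–b–c–d, counting a path and its reverse once, and the four star terms by choosing each point in turn as the centre.

```python
from itertools import permutations
from math import dist

def xi_power_law(r, r0=5.0, gamma=1.8):
    """Assumed two-point function (r/r0)^-gamma; r0 and gamma are illustrative."""
    return (r / r0) ** (-gamma)

def four_point_hierarchical(points, Ra=2.5, Rb=4.3, xi=xi_power_law):
    """Right-hand side of (16.5.3) for four 3D positions."""
    x = {(i, j): xi(dist(points[i], points[j]))
         for i in range(4) for j in range(4) if i != j}

    # Snakes: paths a-b-c-d. A path and its reverse are the same graph,
    # so keeping a < d leaves 24 / 2 = 12 terms.
    snakes = [p for p in permutations(range(4)) if p[0] < p[3]]
    snake_sum = sum(x[a, b] * x[b, c] * x[c, d] for a, b, c, d in snakes)

    # Stars: one centre joined to the other three points, 4 terms.
    star_sum = 0.0
    for centre in range(4):
        term = 1.0
        for other in range(4):
            if other != centre:
                term *= x[centre, other]
        star_sum += term

    return Ra * snake_sum + Rb * star_sum

# Example: an arbitrary configuration with coordinates in the same units as r0.
pts = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0), (0.0, 0.0, 2.0)]
print(four_point_hierarchical(pts))
```

A useful sanity check is a regular tetrahedron with edge length r0: every ξ is then unity, and the result reduces to 12 R_a + 4 R_b.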
16.5.1 Comments
The extraction of estimates of ξ(N) from galaxy samples has involved a huge investment of computer power over the last two decades. These functions have yielded important insights into both the statistical properties and possible dynamical origin of the clustering pattern. An important aspect of this is a connection, which we have no space to explore here, between the correlation functions and a dynamical description of self-gravitating systems in terms of the set of equations that make up the BBGKY hierarchy (Davis and Peebles 1977).
[Figure 16.3: tree graphs with labelled vertices for N = 2 (1 relabelling), N = 3 (3), N = 4 (12 and 4) and N = 5 (5, 60 and 60).]
Figure 16.3 Different topologies of graphs connecting the N points for computing correlation functions in the hierarchical model; graphs for N = 3, 4 and 5 are shown.
Nevertheless, the statistical information contained in these functions is limited. In order to have a complete statistical description of the properties of a point distribution we need to know all the finite-order correlation functions. Given the computational labour required to extract even the low-order functions from a large sample, this is unlikely to be achieved in practice. This problem is exacerbated by the fact that the correlation functions, even the two-point function, are very difficult to determine from observations on large scales where the evolution of ξ is close to linear and analytical theory is consequently most reliable. For this reason, and because of the difficulty of disentangling effects of bias from dynamical evolution, it is necessary to look for other statistical descriptions; we shall describe some of these in Sections 16.6–16.10.
16.6 Cluster Correlations and Biasing
As we mentioned above, the correlation function analysis can be applied to other kinds of distributions, including quasars and radio galaxies. In this section we shall concentrate on rich clusters of galaxies; we shall also restrict ourselves to the two-point correlation properties of these objects since the sizes of these samples make it difficult to obtain accurate estimates of higher-order functions. The two-point correlation function for Abell clusters (those containing at least 65 galaxies within the ‘Abell radius’ of around $1.5\,h^{-1}$ Mpc) is found to be
$$\xi_c(r) \simeq \left(\frac{r}{r_{0c}}\right)^{-\gamma}, \tag{16.6.1}$$
where $5\,h^{-1}\ \mathrm{Mpc} \lesssim r \lesssim 75\,h^{-1}\ \mathrm{Mpc}$, $r_{0c} \simeq 12$–$25\,h^{-1}$ Mpc and $\gamma \simeq 1.8$. The similarity in shape between (16.6.1) and the galaxy version (16.4.5) is interesting. There is, however, considerable uncertainty about the correct value of the correlation length $r_{0c}$ for these objects because of the possible, indeed probable, existence of systematic errors accumulated during the compilation of the Abell catalogue. Cluster catalogues recently compiled using automated plate-measuring devices suggest values towards the lower end of the quoted range, while the richest Abell clusters (those with more than 105 galaxies inside an Abell radius) may have a correlation length as large as $50\,h^{-1}$ Mpc. There is indeed some evidence that the correlation length scales with the richness (i.e. density) of the clusters and is consequently higher for the denser, and hence rarer, clusters. It has been suggested that this correlation can be expressed by the relationship
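$$\xi_i(r) \simeq \left(\frac{r}{r_{0c,i}}\right)^{-\gamma} \simeq C_i \left(\frac{r}{l_i}\right)^{-\gamma}, \qquad C_i \simeq \left(\frac{r_{0c,i}}{l_i}\right)^{\gamma} \simeq \text{const.} \simeq 0.4, \tag{16.6.2}$$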
between the correlation length $r_{0c,i}$ and the mean separation $l_i$ of subsamples selected according to a given richness threshold. The self-similar form of (16.6.2) can be interpreted intuitively as a kind of fractal structure.
The self-similar properties that seem to be implied by both observations and the theory described above lead one naturally to a description of the mass distribution in the language of fractal sets. The prevalence of techniques based on fractal geometry in fields such as condensed matter physics has given rise to a considerable interest in applying these methods to the cosmological context.
To get a rough idea of the fractal description consider the mass contained in a small sphere of radius r around a given galaxy, denoted M(r). In the case where $\xi(r) \gg 1$ we have
$$M(r) \propto \xi(r)\,r^3 \propto r^{D_2}, \tag{16.6.3}$$
with $D_2 = 3 - \gamma$: since ξ(r) has a power-law form with a slope of around $\gamma \simeq 1.8$, we have $M(r) \propto r^{1.2}$. In the language of fractals, this corresponds to a correlation dimension of $D_2 \simeq 1.2$. One can interpret this very simply by noting that, if the mass is distributed along one-dimensional structures (filaments), then $M(r) \propto r$; two-dimensional sheets would have $M \propto r^2$ and a space-filling homogeneous distribution would have $M \propto r^3$. A fractional dimension like that observed indicates a fractal structure.
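The scaling in (16.6.3) can be recovered by integrating the mean density around a galaxy. As a sketch, assume a pure power law $\xi(r) = (r/r_0)^{-\gamma}$ with $\gamma < 3$, write $\bar{n}_V$ for the mean density, and count each object as unit mass; the symbol $r_0$ and this normalisation are assumptions made for illustration.

```latex
\begin{align}
M(r) &= \int_0^r \bar{n}_V \left[1 + \xi(r')\right] 4\pi r'^2 \,\mathrm{d}r' \\
     &= \frac{4\pi}{3}\,\bar{n}_V\, r^3
        + \frac{4\pi\, \bar{n}_V\, r_0^{\gamma}}{3-\gamma}\, r^{3-\gamma}.
\end{align}
```

When $\xi(r) \gg 1$ the second term dominates, so $M(r) \propto r^{3-\gamma}$, which is the same as $\xi(r)\,r^3$ up to a constant; for $\gamma \simeq 1.8$ the exponent is 1.2.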
The first convincing explanation of the relationship between (16.6.1) and (16.4.5) was given by Kaiser (1984). He supposed that galaxy and cluster formation proceeded hierarchically from Gaussian initial conditions in the manner outlined in Section 14.4. If this is the case, then clusters, on mass scales of order $10^{15}\,M_\odot$, must have formed relatively recently. Moreover, rich clusters are extremely rare objects, with a mean separation of order $60\,h^{-1}$ Mpc. It is natural therefore to interpret rich clusters as representing the high peaks of a density field which is still basically evolving linearly: the collapse of the highest peaks will not alter the properties of the ‘average’ density regions significantly. Applying the spherical ‘top-hat’ collapse model of Section 15.1, the collapse to a bound structure occurs when, roughly speaking, the linearly evolved value of the density perturbation, δ, on the relevant scale reaches a value $\delta_c \simeq 1.68$. If $\Omega \simeq 1$, which we assume for simplicity, then the collapse time $t_{\rm coll}$ will be given by
$$t_{\rm coll} \simeq t_0 \left(\frac{1.68}{\nu\sigma}\right)^{3/2}, \tag{16.6.4}$$
where t0 is the present epoch, σ is the RMS mass fluctuation on the scale of clusters and δ = νσ is the value of δ obtained from linear theory. The final overdensity of the collapsed structure with respect to the background universe will be, at collapse (see Section 15.1),
$$\delta_f \simeq 180 \left(\frac{t_0}{t_{\rm coll}}\right)^2, \tag{16.6.5}$$
so that structures which collapse earlier have a higher final density. For $t_0 \gtrsim t_{\rm coll} \gtrsim t_0/2$, Equations (16.6.4) and (16.6.5) give $1.7 \lesssim \nu\sigma \lesssim 2.7$ and $180 \lesssim \delta_f \lesssim 720$. A small difference in collapse time and, therefore, a small difference in ν produces objects with very different final density. For this reason it is reasonable to interpret clusters as being density ‘peaks’, i.e. as regions where δ exceeds some sharp threshold. On large scales we can use the high-peak biasing formalism described in Section 14.8; the relationship between the correlation function of the ‘peaks’ and the covariance function of the underlying matter distribution is therefore given by Equation (14.8.5).
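The sensitivity of the final density to the collapse time can be tabulated directly from (16.6.4) and (16.6.5). The short Python sketch below inverts (16.6.4) to obtain the linear overdensity νσ required for collapse at a given fraction of the present age; the function names are illustrative and the Ω = 1 scaling of the text is assumed throughout.

```python
DELTA_C = 1.68  # linear-theory collapse threshold quoted in the text

def nu_sigma_for_collapse(t_ratio):
    """Invert (16.6.4): value of nu*sigma for which t_coll / t0 = t_ratio."""
    return DELTA_C * t_ratio ** (-2.0 / 3.0)

def final_overdensity(t_ratio):
    """Equation (16.6.5): delta_f = 180 (t0 / t_coll)^2."""
    return 180.0 / t_ratio ** 2

for t_ratio in (1.0, 0.9, 0.75, 0.5):
    print(f"t_coll/t0 = {t_ratio:4.2f}  "
          f"nu*sigma = {nu_sigma_for_collapse(t_ratio):5.2f}  "
          f"delta_f = {final_overdensity(t_ratio):6.1f}")
```

Halving the collapse time changes νσ by a factor of 2^(2/3) but the final overdensity by a factor of four, which is the steepness that motivates the threshold picture.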
For simplicity we assume that galaxies trace the mass, so that equation (14.8.5) becomes
$$\xi_c(r) \simeq \left(\frac{\nu}{\sigma}\right)^2 \xi_g(r), \tag{16.6.6}$$
which, for appropriate choices of ν and σ, can reconcile (16.4.5) with (16.6.1). The model also explains how one might get an increased correlation length with richness: higher peaks have higher ν and correspond to denser systems.
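As a rough numerical illustration of what ‘appropriate choices’ means, suppose that the cluster and galaxy functions are power laws with the same slope γ, so that their ratio is $(r_{0c}/r_{0g})^{\gamma}$, and equate this ratio to $(\nu/\sigma)^2$ as in (16.6.6). The galaxy correlation length used below, 5 h⁻¹ Mpc, is an assumed illustrative value that is not quoted in this section, and the function name is hypothetical.

```python
def bias_amplitude(r0_cluster, r0_galaxy=5.0, gamma=1.8):
    """Ratio xi_c / xi_g = (r0c / r0g)^gamma for two power laws of equal slope.

    By (16.6.6) this ratio plays the role of (nu / sigma)^2.
    r0_galaxy = 5 h^-1 Mpc is an assumption made for illustration.
    """
    return (r0_cluster / r0_galaxy) ** gamma

# The two ends of the range quoted for r0c, in h^-1 Mpc.
for r0c in (12.0, 25.0):
    ratio = bias_amplitude(r0c)
    print(f"r0c = {r0c:4.1f}: xi_c/xi_g = {ratio:5.1f}, nu/sigma = {ratio ** 0.5:4.2f}")
```

Because the ratio depends on the correlation length raised to the power γ, the uncertainty in $r_{0c}$ translates into a much larger uncertainty in the required amplification.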
This elucidation of the reason why clusters should have stronger correlations than galaxies is natural because clusters are, by definition, objects with exceptionally high density on some well-defined scale. Kaiser’s calculation was, however, subsequently used as the basis for the first models of biased galaxy formation described in Section 14.8. For it to apply to galaxies, however, one has to think of a good reason why galaxies should only form at particularly dense peaks of the matter distribution: some mechanism must be invoked to suppress galaxy formation in ‘typical’ fluctuations. One should therefore take care to distinguish between the apparent biasing of clusters relative to galaxies and the biasing of galaxies relative to mass; the former is well-motivated physically, the latter, at least with our present understanding of galaxy formation, is not.
In any event, one of the advantages of the cluster distribution is that it can be used to measure correlations on scales where the galaxy–galaxy correlation function vanishes into statistical noise. The cluster–cluster correlation function seems to be positive out to at least $50\,h^{-1}$ Mpc, while the galaxy–galaxy function is very small, and perhaps negative, for $r \gtrsim 10\,h^{-1}$ Mpc.
16.7 Counts in Cells
A simple but useful way of measuring the correlations of galaxies on large scales, which does not suffer from the problems of the correlation functions, is to look at the distribution of counts of galaxies in cells, $P_n(V)$, defined as the probability of finding n objects in a randomly placed volume V, or at the low-order moments of this distribution such as the variance $\sigma^2$ and skewness γ, which we define below; do not confuse γ with the slope of the two-point correlation function in Equation (16.4.2) or with the spectral parameter in Equation (13.2.11). Indeed some of the earliest quantitative analyses of galaxy clustering by Hubble adopted the counts-in-cells approach.
Using only the moments of the cell-count distribution does result in a loss of information compared with the use of the full distribution function, but the advantage is a simple relationship between the moments and the correlation functions, e.g.
$$\sigma^2 \equiv \left\langle \left(\frac{\Delta n}{\bar{n}}\right)^2 \right\rangle = \frac{1}{\bar{n}} + \frac{1}{V^2} \iint \xi^{(2)}(r_{12})\,\mathrm{d}V_1\,\mathrm{d}V_2, \tag{16.7.1}$$
where $\bar{n}$ is the mean number of galaxies in a cell of volume V, i.e. $\bar{n} = n_V V$ ($n_V$ is the mean number-density of galaxies). The derivation of this formula for the variance is quite straightforward. Consider a set of n points (galaxies) distributed in a cell of volume V. Divide the cell into infinitesimal sub-cells $\mathrm{d}V_k$ and let each contain $n_k$ galaxies. If the $\mathrm{d}V_k$ are small enough, then $n_k$ can only be 0 or 1. Clearly $n = \sum_k n_k$. The expected number of galaxies in the cell is
$$\langle n \rangle = \bar{n} = \Big\langle \sum_k n_k \Big\rangle = \int_V n_V\,\mathrm{d}V = n_V V. \tag{16.7.2}$$
The mean squared value of n is
$$\langle n^2 \rangle = \Big\langle \sum_k n_k^2 \Big\rangle + \Big\langle \sum_{k \neq l} n_k n_l \Big\rangle. \tag{16.7.3}$$
Because $n_k$ is only either 0 or 1, the first term must be the same as $\langle \sum_k n_k \rangle$; the second term is obviously just the integral of $n_V^2\,\mathrm{d}V_1\,\mathrm{d}V_2\,(1 + \xi_{12})$ over the cell, so that
$$\langle n^2 \rangle = n_V V + (n_V V)^2 + n_V^2 \iint \xi_{12}\,\mathrm{d}V_1\,\mathrm{d}V_2. \tag{16.7.4}$$
The form (16.7.1) then follows when the result is expressed in terms of
$$\left\langle \left(\frac{n - \bar{n}}{\bar{n}}\right)^2 \right\rangle = \left\langle \left(\frac{\Delta n}{\bar{n}}\right)^2 \right\rangle. \tag{16.7.5}$$
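The discreteness term in (16.7.1) is easy to check numerically. In the sketch below, which is an illustration and not part of the original derivation, points are thrown uniformly at random into equal cells, so that ξ = 0 and the scaled variance of the counts should be close to $1/\bar{n}$. The function names, the sample sizes and the random seed are arbitrary choices.

```python
import random

def cell_count_variance(counts):
    """Return (nbar, sigma2) with sigma2 = <(n - nbar)^2> / nbar^2, as in (16.7.1)."""
    nbar = sum(counts) / len(counts)
    sigma2 = sum((n - nbar) ** 2 for n in counts) / len(counts) / nbar ** 2
    return nbar, sigma2

def poisson_cell_counts(n_points, n_cells, rng):
    """Drop n_points uniformly at random into n_cells equal cells (no clustering)."""
    counts = [0] * n_cells
    for _ in range(n_points):
        counts[rng.randrange(n_cells)] += 1
    return counts

rng = random.Random(1)
counts = poisson_cell_counts(200_000, 10_000, rng)
nbar, sigma2 = cell_count_variance(counts)

# Subtracting 1/nbar leaves an estimate of the volume-averaged xi,
# which should be consistent with zero for this unclustered sample.
xi_bar = sigma2 - 1.0 / nbar
print(nbar, sigma2, 1.0 / nbar, xi_bar)
```

For a clustered sample the same subtraction isolates the integral of the two-point function over the cell.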
|
|
The $1/\bar{n}$ term in Equation (16.7.1) is due to Poisson fluctuations: it is a discreteness effect. Apart from this, the second-order moment is simply an integral of the two-point correlation function over the volume V, and is therefore related to the mass variance defined by Equation (13.3.8) for a sharp window function. The same is true for higher-order moments, but the discreteness terms are more complicated and the integrals must be taken over the cumulants. For example, following a similar derivation to that above, the skewness γ can be written
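$$\gamma \equiv \left\langle \left(\frac{\Delta n}{\bar{n}}\right)^3 \right\rangle = -\frac{2}{\bar{n}^2} + \frac{3\sigma^2}{\bar{n}} + \frac{1}{V^3} \iiint \xi^{(3)}\,\mathrm{d}V_1\,\mathrm{d}V_2\,\mathrm{d}V_3. \tag{16.7.6}$$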
Equation (16.7.1) provides a good way of measuring the two-point correlation function on large scales. Use of the skewness and higher-order moment descriptors is now also possible. The usual formulation is to write the ratio of the Nth-order moment to the (N − 1)th power of the variance as $S_N$. For example, in terms of γ and $\sigma^2$, the hierarchical parameter $S_3$ is just $\gamma/\sigma^4$. In the hierarchical model the $S_N$ should be constant, independent of the cell volume. For the simple hierarchical distribution (16.5.1) we have $S_3 = 3Q$, which seems to be in reasonable agreement with measured skewnesses. There should be some scale dependence of clustering properties if the initial power spectrum is not completely scale free, so one would not expect $S_3$ to be accurately constant on all scales in, for example, the CDM model. It is, however, a very slowly varying quantity. Within the considerable errors, there seems to be a roughly hierarchical behaviour of the clustering data consistent with most gravitational instability models of structure formation.
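A minimal sketch of how $S_3$ might be estimated from a list of cell counts is given below. It computes the ratio $\gamma/\sigma^4$ as defined above and, as a variant, the same ratio after the discreteness terms of (16.7.1) and (16.7.6) have been subtracted, so that only the volume-averaged two- and three-point functions remain. The function names and the toy counts are illustrative assumptions, not data.

```python
def cell_moments(counts):
    """Return (nbar, sigma2, gamma): the mean count and the scaled second and
    third central moments appearing in (16.7.1) and (16.7.6)."""
    m = len(counts)
    nbar = sum(counts) / m
    sigma2 = sum((n - nbar) ** 2 for n in counts) / m / nbar ** 2
    gamma = sum((n - nbar) ** 3 for n in counts) / m / nbar ** 3
    return nbar, sigma2, gamma

def s3_raw(counts):
    """S3 = gamma / sigma^4, ignoring discreteness terms."""
    _, sigma2, gamma = cell_moments(counts)
    return gamma / sigma2 ** 2

def s3_discreteness_corrected(counts):
    """S3 built from the volume-averaged xi(2) and xi(3) only."""
    nbar, sigma2, gamma = cell_moments(counts)
    xi2_bar = sigma2 - 1.0 / nbar                              # from (16.7.1)
    xi3_bar = gamma - 3.0 * sigma2 / nbar + 2.0 / nbar ** 2    # from (16.7.6)
    return xi3_bar / xi2_bar ** 2

toy_counts = [0, 0, 0, 8]  # toy values, purely illustrative
print(s3_raw(toy_counts), s3_discreteness_corrected(toy_counts))
```

The two estimates agree when the mean count per cell is large, since the discreteness terms then become negligible.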
16.8 The Power Spectrum
There are many advantages, particularly on large scales, in not measuring the two-point correlation function directly, but through its Fourier transform. The Wiener–Khintchine theorem (13.8.5) shows that, for a statistically homogeneous random field, the two-point covariance function is the Fourier transform of the power spectrum. One might expect therefore that one can define a useful power spectrum for galaxy clustering which is the inverse Fourier transform of the two-point correlation function. For power-law primordial spectra $P(k) \propto k^n$, one can show that $\xi(r) \propto -\sin(\pi n/2)\,r^{-(3+n)}$ ($n > -3$), which can be used to deduce the power spectrum from a knowledge of ξ in regions where it can be represented as a power law. On the other hand, one would imagine that a better procedure is to estimate P(k) directly from the data without worrying about ξ(r), particularly on large scales. This is indeed the case. There are some subtleties, however, because the discreteness of the galaxy counts induces a ‘white-noise’ contamination into the power spectrum which must be removed.
For a discrete distribution of N points (galaxies) we can define the Fourier transform as
$$\delta(\mathbf{k}) = \frac{1}{N} \sum \exp(i\mathbf{k}\cdot\mathbf{x}), \tag{16.8.1}$$
where the sum is taken over all galaxy positions x. If the distribution were random, the coefficients δ(k) would be generated by a random walk in the complex plane. It is then straightforward to show that the variance of the modulus of δ(k) is given by
$$\langle |\delta(\mathbf{k})|^2 \rangle = \frac{1}{N}. \tag{16.8.2}$$
In principle, one can therefore just subtract the quantity 1/N from the quantity $|\delta(\mathbf{k})|^2$ determined by (16.8.1). In fact, the power spectrum is estimated over a region of k-space which defines an interval in the modulus of $\mathbf{k}$, denoted $k$. One therefore needs to subtract off the ‘shot-noise’ contribution for each $\mathbf{k}$ which enters this estimate, so that
$$P(k) \simeq \sum_{\mathbf{k}} |\delta(\mathbf{k})|^2 - \frac{n_k}{N}, \tag{16.8.3}$$
where nk is the number of k modes involved in the sum.
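The shot-noise subtraction can be illustrated with an unclustered sample, for which the estimate should be consistent with zero. The Python sketch below uses only the standard library; the unit periodic cube, the choice of modes, the number of points and the random seed are all assumptions made for the demonstration, and the function names are hypothetical.

```python
import cmath
import math
import random

def delta_k(points, k):
    """Fourier coefficient of (16.8.1): (1/N) * sum of exp(i k.x) over all points."""
    total = sum(cmath.exp(1j * (k[0] * x + k[1] * y + k[2] * z))
                for x, y, z in points)
    return total / len(points)

def shot_noise_subtracted_power(points, modes):
    """Estimator of (16.8.3): sum of |delta(k)|^2 over the modes, minus n_k / N."""
    raw = sum(abs(delta_k(points, k)) ** 2 for k in modes)
    return raw - len(modes) / len(points)

# Unclustered test case: uniform random points in a unit cube.
rng = random.Random(0)
points = [(rng.random(), rng.random(), rng.random()) for _ in range(500)]

# Modes of the unit cube, k = 2*pi*(i, j, l), excluding k = 0.
modes = [(2 * math.pi * i, 2 * math.pi * j, 2 * math.pi * l)
         for i in range(-2, 3) for j in range(-2, 3) for l in range(-2, 3)
         if (i, j, l) != (0, 0, 0)]

power = shot_noise_subtracted_power(points, modes)
# Per-mode residual; before subtraction each mode carries about 1/N on average.
print(len(modes), power / len(modes), 1.0 / len(points))
```

Dividing the result by the number of modes gives the average power per mode; for real data the modes would be grouped into shells of nearly constant modulus k.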
Even this does not work, however, unless we have a cubic sample volume (which is unlikely to be the case). It is necessary, in fact, to think of the observed sample as being a modulation of the real density field by some selection function f(x), which can also take account of the fact that some galaxies will be missed at larger distances from the observer in a survey limited by apparent magnitude. To account for this, one therefore has to subtract off from δ(k) the Fourier transform of f(x) before doing the subtraction in (16.8.3). One also has to correct for the effect of f in modulating the Fourier coefficients of δ. It turns out that the observed power spectrum is just a convolution of the ‘true’ power spectrum with the function $|f_k|^2$, the squared modulus of the Fourier transform of f(x). This also induces an error in $n_k$, since the number of k modes depends on the volume after modulation, rather than on the idealised cubic volume mentioned above. Correcting for all these effects requires some care.
To be precise, P(k) is actually a spectral density function, and should have units of volume. To avoid the possible dependence of P(k) upon the sample volume it is more useful to deal for comparison purposes with a dimensionless power spectrum $\Delta^2(k) \propto k^3 P(k)$ in the manner of Equation (14.2.8). The power spectrum of galaxy clustering has been analysed for a number of different samples and the results are reasonably well fitted by the functional form:
$$\Delta^2(k) = \frac{(k/k_0)^{1.6}}{1 + (k/k_c)^{-2.4}}. \tag{16.8.4}$$
The best-fitting values of the parameters are $k_c \simeq 0.015$–$0.025\,h$ Mpc$^{-1}$ and $k_0 \simeq 0.19\,h$ Mpc$^{-1}$, but $k_0$ depends quite sensitively upon the accuracy of the various selection functions. This form, on large scales, is similar to a low-density CDM spectrum or a CHDM spectrum; see Figure 16.4. The power spectrum of Abell cluster correlations has also been computed; the results are consistent with a rather large value for the correlation length, $r_0 \simeq 21\,h^{-1}$ Mpc, and indicate that the clustering strength does depend on the cluster richness, as one might expect from the discussion in Section 16.6.
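The fitting form (16.8.4) is simple to evaluate, and doing so makes its two limiting slopes explicit. In the sketch below the default $k_c$ = 0.02 is an assumed mid-range value within the interval quoted above, and the function name is illustrative; wavenumbers are in units of h Mpc⁻¹.

```python
def delta2_fit(k, k0=0.19, kc=0.02):
    """Dimensionless power spectrum of (16.8.4); k, k0 and kc in h/Mpc.

    kc = 0.02 is an assumed value inside the quoted range 0.015-0.025.
    """
    return (k / k0) ** 1.6 / (1.0 + (k / kc) ** -2.4)

for k in (0.001, 0.01, 0.02, 0.1, 0.19, 1.0):
    print(f"k = {k:6.3f}  Delta^2 = {delta2_fit(k):.4e}")

# Limiting behaviour:
#   k << kc : Delta^2 ~ k^(1.6 + 2.4) = k^4
#   k >> kc : Delta^2 ~ k^1.6
```

With $\Delta^2 \propto k^3 P(k)$, the two limits correspond to $P(k) \propto k$ on the largest scales and $P(k) \propto k^{-1.4}$ on small scales.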
16.9 Polyspectra
Since the power spectrum is the Fourier transform of the two-point correlation function, it would seem likely that similar transforms of the N-point functions for N > 2 would also prove to be useful descriptors of galaxy clustering. For example, the Fourier transform of the three-point correlation function is known as the bispectrum. The use of higher-order spectra is not yet widespread, but they will probably turn out to be a very effective way of detecting non-Gaussian fluctuation statistics on very large scales and of constraining the gravitational instability picture generally.
To see why, consider the application of the power spectrum to a continuous density contrast field as in Chapters 10–15, i.e. δ(x) defined by
$$\delta(\mathbf{x}) = [\rho(\mathbf{x}) - \rho_0]/\rho_0, \tag{16.9.1}$$
where ρ0 is the average density and ρ(x) is the local matter density. Because the initial perturbations evolve linearly, it is useful to expand δ(x) as a Fourier superposition of plane waves:
$$\delta(\mathbf{x}) = \int \mathrm{d}\mathbf{k}\;\tilde{\delta}(\mathbf{k}) \exp(-i\mathbf{k}\cdot\mathbf{x}). \tag{16.9.2}$$
[Figure 16.4: two panels showing Δ²(k) against k / h Mpc−1 for the Abell, Radio, Abell × IRAS, CfA, APM/Stromlo, Radio × IRAS, IRAS and APM (angular) samples; the lower panel also shows model curves for Γ = 0.5 and Γ = 0.2.]
Figure 16.4 Comparison of the power spectrum of galaxy clustering with various CDM models having different values of the shape parameter Γ. The y-axes show Δ² = k³P(k) as a function of k; the data points are from a compilation of redshift surveys before (upper panel) and after (lower panel) allowances are made for bias and velocity effects. Picture courtesy of John Peacock.
The Fourier transform $\tilde{\delta}(\mathbf{k})$ is complex and therefore possesses both amplitude $|\tilde{\delta}(\mathbf{k})|$ and phase $\phi_{\mathbf{k}}$, where
$$\tilde{\delta}(\mathbf{k}) = |\tilde{\delta}(\mathbf{k})| \exp(i\phi_{\mathbf{k}}). \tag{16.9.3}$$