\[ \DeclareMathOperator{\qfrE}{E} \DeclareMathOperator{\qfrtr}{tr} \DeclareMathOperator{\qfrsgn}{sgn} \DeclareMathOperator{\qfrdiag}{diag} \newcommand{\qfrGmf}[1]{\Gamma \! \left( #1 \right)} \newcommand{\qfrBtf}[2]{B \! \left( #1 , #2 \right)} \newcommand{\qfrbrc}[1]{\left[ {#1} \right]} \newcommand{\qfrC}[2][\kappa]{ C_{#1} \! \left( #2 \right) } \newcommand{\qfrCid}[5]{ C^{#1, #2}_{#3} \! \left( #4, #5 \right) } \newcommand{\qfrrf}[2][k]{\left( #2 \right)_{#1}} \newcommand{\qfrdk}[2][k]{ d_{#1} \! \left( #2 \right) } \newcommand{\qfrdij}[3][k]{ d_{#1} \! \left( #2, #3 \right) } \renewcommand{\det}[1]{\left\lvert #1 \right\rvert} \newcommand{\qfrhgmf}[4]{{}_2 F_1 \left( #1 , #2 ; #3 ; #4 \right)} \newcommand{\qfrmvnorm}[3][n]{N_{#1} \! \left( #2 , #3 \right)} \newcommand{\qfrcchisq}[1]{\chi_{#1}^2} \newcommand{\qfrnchisq}[2]{\chi^2 \! \left( #1 , #2 \right)} \newcommand{\qfrBtd}[2]{\mathrm{beta} \! \left( #1 , #2 \right)} \]
Since version 1.1.0, qfratio
(CRAN; GitHub)
has a functionality to evaluate probability density and distribution
functions of a (simple) ratio of quadratic forms in normal variables.
This document is to describe theoretical backgrounds and some
implementation details of this functionality. See the main package
vignette (vignette("qfratio")
) for the evaluation of
moments of ratios of quadratic forms.
Most symbols not listed here are largely restricted to individual sections.
Consider the (simple) ratio of quadratic forms in normal variables: \[\begin{equation} Q = \frac{\mathbf{x}^T \mathbf{A} \mathbf{x}} {\mathbf{x}^T \mathbf{B} \mathbf{x}}, \end{equation}\] where \(\mathbf{x} \sim \qfrmvnorm{\boldsymbol{\mu}}{\mathbf{I}_n}\). The denominator matrix \(\mathbf{B}\) is assumed to be nonnegative definite, whereas \(\mathbf{A}\) can be any symmetric matrix.
A more general case where \(\mathbf{x} \sim \qfrmvnorm{\boldsymbol{\mu}}{\boldsymbol{\Sigma}}\) can be transformed into the above form when \(\boldsymbol{\Sigma}\) is nonsingular: \(\mathbf{x}_{\mathrm{new}} = \mathbf{K}^{-1} \mathbf{x}\), \(\mathbf{A}_{\mathrm{new}} = \mathbf{K}^T \mathbf{A} \mathbf{K}\), etc., where \(\mathbf{K}\) is an \(n \times n\) matrix that satisfies \(\mathbf{K} \mathbf{K}^T = \boldsymbol{\Sigma}\) (Mathai and Provost, 1992, chap. 3). When \(\boldsymbol{\Sigma}\) is singular, certain conditions need to be met by the argument matrices, \(\boldsymbol{\Sigma}\), and \(\boldsymbol{\mu}\) for this transformation, and hence the following expressions, to be valid (Watanabe, 2023, appendix C).
Assuming \(\mathbf{B}\) to be nonnegative definite, the distribution function of \(Q\) is \[\begin{equation} \begin{aligned} F_Q(q) = & \Pr \left( \frac{\mathbf{x}^T \mathbf{A} \mathbf{x}} {\mathbf{x}^T \mathbf{B} \mathbf{x}} \leq q \right) \\ = & \Pr \left( \mathbf{x}^T \left( \mathbf{A} - q \mathbf{B} \right) \mathbf{x} \leq 0 \right) , \end{aligned} \end{equation}\] so that it can be expressed as the distribution function of the quadratic form \(X_q = \mathbf{x}^T \left( \mathbf{A} - q \mathbf{B} \right) \mathbf{x}\) at \(0\). We are mostly concerned with such \(q\) that makes \(\mathbf{A} - q \mathbf{B}\) indefinite, because otherwise (i.e., when it is positive or negative (semi)definite) \(F_Q(q) = 0, 1\) and \(f_Q(q) = 0\).
Consider the spectral decomposition \(\mathbf{A} - q \mathbf{B} = \mathbf{P} \boldsymbol{\Lambda} \mathbf{P}^T\), with an orthogonal matrix of eigenvectors \(\mathbf{P}\) and a diagonal matrix of eigenvalues \(\boldsymbol{\Lambda} = \qfrdiag \left( \lambda_1 , \dots , \lambda_n \right)\), and let \(\mathbf{P}^T \mathbf{x} = \mathbf{y} = \left( y_1 , \dots , y_n \right)^T \sim \qfrmvnorm{\boldsymbol{\nu}}{\mathbf{I}_n}\) with \(\boldsymbol{\nu} = \mathbf{P}^T \boldsymbol{\mu} = \left( \nu_1 , \dots , \nu_n \right)^T\). Then, \[\begin{equation} \begin{aligned} X_q = & \mathbf{x}^T \left( \mathbf{A} - q \mathbf{B} \right) \mathbf{x} \\ = & \mathbf{y}^T \boldsymbol{\Lambda} \mathbf{y} \\ = & \sum_{i=1}^{n} \lambda_i y_i^2 . \end{aligned} \end{equation}\] Obviously, \(y_i^2 \sim \qfrnchisq{1}{\nu_i^2}\), a noncentral chi-square variable with \(1\) degree of freedom and a noncentrality parameter \(\nu_i^2\), and by construction these are independent of one another. Thus, \(X_q\) is a weighted sum of independent chi-square variables, and the problem boils down to evaluation of the distribution of this quantity.
Explicit formulae for the distribution and density function of \(Q\) have been worked out by Hillier (2001) and Forchini (2002, 2005). These typically involve infinite series of the top-order zonal and invariant polynomials of matrix arguments. The zonal polynomials are certain homogeneous polynomials of eigenvalues of a symmetric matrix which extend powers of scalars into symmetric matrices (e.g., Muirhead, 1982). The invariant polynomials are further extension of the zonal polynomials to multiple matrix arguments (see Chikuse, 1980, 1987; Davis, 1980). These polynomials are used to integrate out components of rotation from a function of random matrices.
Forchini (2002, 2005) derived explicit expressions of \(F_Q\) using the top-order zonal and invariant polynomials.
Let \(\boldsymbol{\Lambda}\) from above be arranged and partitioned such that \[\begin{equation} \boldsymbol{\Lambda} = \left( \begin{matrix} \boldsymbol{\Lambda}_1(q) & \mathbf{0}_{n_{+} \times n_{-}} & \mathbf{0}_{n_{+} \times (n - n_{+} - n_{-})} \\ \mathbf{0}_{n_{-} \times n_{+}} & - \boldsymbol{\Lambda}_2(q) & \mathbf{0}_{n_{-} \times (n - n_{+} - n_{-})} \\ \mathbf{0}_{(n - n_{+} - n_{-}) \times n_{+}} & \mathbf{0}_{(n - n_{+} - n_{-}) \times n_{-}} & \mathbf{0}_{(n - n_{+} - n_{-}) \times (n - n_{+} - n_{-})} \end{matrix}\right), \end{equation}\] where \(\boldsymbol{\Lambda}_1(q)\) and \(- \boldsymbol{\Lambda}_2(q)\) are \(n_{+}\)- and \(n_{-}\)-dimensional diagonal matrices of positive and negative eigenvalues of \(\mathbf{A} - q \mathbf{B}\), respectively. By denoting \[\begin{equation} \begin{aligned} \boldsymbol{\nu} = & \left( \begin{matrix} \boldsymbol{\nu}_1 \\ \boldsymbol{\nu}_2 \\ \boldsymbol{\nu}_3 \end{matrix}\right) , \\ {\boldsymbol{\Lambda}_1^*}^{-1} = & \left( \qfrtr \left( {\boldsymbol{\Lambda}_1}^{-1} \right) + \qfrtr \left( {\boldsymbol{\Lambda}_2}^{-1} \right) \right)^{-1} {\boldsymbol{\Lambda}_1}^{-1}, \\ {\boldsymbol{\Lambda}_2^*}^{-1} = & \left( \qfrtr \left( {\boldsymbol{\Lambda}_1}^{-1} \right) + \qfrtr \left( {\boldsymbol{\Lambda}_2}^{-1} \right) \right)^{-1} {\boldsymbol{\Lambda}_2}^{-1}, \end{aligned} \end{equation}\] with the partition of \(\boldsymbol{\nu}\) corresponding to that of the rows of \(\boldsymbol{\Lambda}\) above, the expression of Forchini (2005) is, after correcting some errors, \[\begin{equation} \begin{aligned} F_Q(q) = & \frac{ \qfrGmf{ \frac{n_{+} + n_{-}}{2} } \exp \left[ - \frac{1}{2} \left( \boldsymbol{\nu}_1^T \boldsymbol{\nu}_1 + \boldsymbol{\nu}_2^T \boldsymbol{\nu}_2 \right) \right] }{ \qfrGmf{ \frac{n_{+}}{2} + 1 } \qfrGmf{ \frac{n_{-}}{2} } \det{\boldsymbol{\Lambda}_1^*}^{\frac{1}{2}} \det{\boldsymbol{\Lambda}_2^*}^{\frac{1}{2}} } \\ & \cdot \sum_{i_1 = 0}^{\infty} \sum_{j_1 = 0}^{\infty} \sum_{i_2 = 0}^{\infty} \sum_{j_2 = 0}^{\infty} \frac{ \qfrrf[i_1 + j_1 + i_2 + j_2]{\frac{n_{+} + n_{-}}{2}} \qfrrf[i_1 + j_1]{\frac{1}{2}} \qfrrf[i_2 + j_2]{\frac{1}{2}} }{ \qfrrf[i_1 + j_1]{\frac{n_{+}}{2} + 1} \qfrrf[i_2 + j_2]{\frac{n_{-}}{2}} \qfrrf[j_1]{\frac{1}{2}} \qfrrf[j_2]{\frac{1}{2}} i_1! j_1! i_2! j_2! } \\ & \quad \cdot \qfrCid{\qfrbrc{i_1}}{\qfrbrc{j_1}}{\qfrbrc{i_1 + j_1}}{ \qfrtr \left( {\boldsymbol{\Lambda}_1^*}^{-1} \right) \mathbf{I}_{n_{+}} - {\boldsymbol{\Lambda}_1^*}^{-1} }{ \frac{1}{2} {\boldsymbol{\Lambda}_1^*}^{-\frac{1}{2}} \boldsymbol{\nu}_1 \boldsymbol{\nu}_1^T {\boldsymbol{\Lambda}_1^*}^{-\frac{1}{2}} } \\ & \quad \cdot \qfrCid{\qfrbrc{i_2}}{\qfrbrc{j_2}}{\qfrbrc{i_2 + j_2}}{ \qfrtr \left( {\boldsymbol{\Lambda}_2^*}^{-1} \right) \mathbf{I}_{n_{-}} - {\boldsymbol{\Lambda}_2^*}^{-1} }{ \frac{1}{2} {\boldsymbol{\Lambda}_2^*}^{-\frac{1}{2}} \boldsymbol{\nu}_2 \boldsymbol{\nu}_2^T {\boldsymbol{\Lambda}_2^*}^{-\frac{1}{2}} } \\ & \quad \cdot {}_2 F_1 \left(\frac{n_{+} + n_{-}}{2} + i_1 + j_1 + i_2 + j_2, 1; \frac{n_{+}}{2} + i_1 + j_1 + 1; \qfrtr \left( {\boldsymbol{\Lambda}_1^*}^{-1} \right) \right) , \end{aligned} \end{equation}\] where \(\qfrCid{\qfrbrc{k_1}}{\qfrbrc{k_2}}{\qfrbrc{k_1 + k_2}}{ \cdot }{ \cdot }\) are the top-order invariant polynomials of \((k_1, k_2)\)-th degree (see above for other notations).
In the central case (\(\boldsymbol{\mu} = \mathbf{0}_n\)), the above expression simplifies into (Forchini, 2002) \[\begin{equation} \begin{aligned} F_Q(q) = & \frac{ \qfrGmf{ \frac{n_{+} + n_{-}}{2} } }{ \qfrGmf{ \frac{n_{+}}{2} + 1 } \qfrGmf{ \frac{n_{-}}{2} } \det{\boldsymbol{\Lambda}_1^*}^{\frac{1}{2}} \det{\boldsymbol{\Lambda}_2^*}^{\frac{1}{2}} } \\ & \cdot \sum_{i_1 = 0}^{\infty} \sum_{i_2 = 0}^{\infty} \frac{ \qfrrf[i_1 + i_2]{\frac{n_{+} + n_{-}}{2}} \qfrrf[i_1]{\frac{1}{2}} \qfrrf[i_2]{\frac{1}{2}} }{ \qfrrf[i_1]{\frac{n_{+}}{2} + 1} \qfrrf[i_2]{\frac{n_{-}}{2}} i_1! i_2! } \\ & \quad \cdot \qfrC[\qfrbrc{i_1}]{ \qfrtr \left( {\boldsymbol{\Lambda}_1^*}^{-1} \right) \mathbf{I}_{n_{+}} - {\boldsymbol{\Lambda}_1^*}^{-1} } \qfrC[\qfrbrc{i_2}]{ \qfrtr \left( {\boldsymbol{\Lambda}_2^*}^{-1} \right) \mathbf{I}_{n_{-}} - {\boldsymbol{\Lambda}_2^*}^{-1} } \\ & \quad \cdot {}_2 F_1 \left(\frac{n_{+} + n_{-}}{2} + i_1 + i_2, 1; \frac{n_{+}}{2} + i_1 + 1; \qfrtr \left( {\boldsymbol{\Lambda}_1^*}^{-1} \right) \right) , \end{aligned} \end{equation}\] where \(\qfrC[\qfrbrc{k}]{ \cdot }\) are the top-order zonal polynomials of \(k\)th degree.
These expressions can be numerically evaluated by truncating the
infinite series at certain higher-order terms (\(i_1 + j_1 + i_2 + j_2 = m\), say), and by
using a recursive algorithm to calculate \[\begin{equation}
\qfrdij[k_1, k_s]{ \mathbf{A}_1 }{ \mathbf{A}_2 } =
\frac{ \qfrrf[k_1 + k_2]{\frac{1}{2}} }{ k_1 ! k_2 !}
\qfrCid{\qfrbrc{k_1}}{\qfrbrc{k_2}}{\qfrbrc{k_1 + k_2}}{ \mathbf{A}_1
}{ \mathbf{A}_2 }
\end{equation}\] by Hillier et al. (2009, 2014) (see also the main vignette
qfratio
). The present package implements this algorithm in
pqfr(..., method = "forchini")
(see below).
The distribution function has points of nonanalyticity around the
eigenvalues of \(\mathbf{B}^{-1}
\mathbf{A}\) (assuming \(\mathbf{B}^{-1}\) is invertible) (Forchini,
2002). Practically speaking, around these points, the series
expression can be slow to converge and evaluation of the hypergeometric
function can fail because the argument \(\qfrtr \left( {\boldsymbol{\Lambda}_1^*}^{-1}
\right)\) becomes very close to 1
. Otherwise, the
series expression can be evaluated with high accuracy, although the
computational cost of the recursive calculations can be substantial in
large problems.
Apparently, the literature has an explicit expression for the density of \(Q\) only for the simple condition when \(\mathbf{B} = \mathbf{I}_n\) and \(\boldsymbol{\mu} = \mathbf{0}_n\). In this case, the distribution of \(Q\) does not depend on the norm of \(\mathbf{x}\), so any spherically symmetric distribution of \(\mathbf{x}\) yields the same distribution of \(Q\) (Hillier, 2001).
Let \(\eta_1 > \dots > \eta_s\) be \(s\) distinct eigenvalues of \(\mathbf{A}\), and \(n_1 , \dots , n_s\) be the corresponding degrees of multiplicity (\(\sum_{i=1}^{s} n_i = n\)). Consider the transformed variable \(V = \frac{Q - \eta_s}{\eta_1 - Q}\) and parameters \(\psi_i = \frac{\eta_i - \eta_s}{\eta_1 - \eta_i}\) for \(i = 2, \dots s - 1\), assuming \(s > 2\) (see below for the case when \(s = 2\)). The density of \(V\) has different functional forms across its domain (from Hillier, 2001, lemmas 3 and 4): \[\begin{equation} \begin{aligned} f_V (v) = & \left[ \qfrBtf{ \frac{p_r}{2} }{ \frac{n - p_r}{2} } \right]^{-1} \left[ \prod_{i=2}^{r+1} \psi_i^{- \frac{n_i}{2}} \right] \left[ \prod_{i=2}^{s-1} (1 + \psi_i)^{\frac{n_i}{2}} \right] v^{\frac{p_r}{2} - 1} (1 + v)^{- \frac{n}{2}} & \\ & \quad \cdot \sum_{j=0}^{\infty} \sum_{k=0}^{\infty} c_r (j, k) \frac{ \qfrrf[j]{\frac{1}{2}} \qfrrf[k]{\frac{1}{2}} }{ j! k! } \qfrC[\qfrbrc{j}]{v \mathbf{D}_{r}} \qfrC[\qfrbrc{k}]{\left( v \mathbf{D}_{r+1} \right)^{-1}} , & \psi_{r + 2} < v < \psi_{r + 1}, \atop r = 1, \dots , s - 2 , \\ f_V (v) = & \left[ \qfrBtf{ \frac{n_s}{2} }{ \frac{n - n_s}{2} } \right]^{-1} \left[ \prod_{i=2}^{s-1} \left( \frac{1 + \psi_i}{\psi_i} \right)^{\frac{n_i}{2}} \right] v^{\frac{n - n_s}{2} - 1} (1 + v)^{- \frac{n}{2}} & \\ & \quad \cdot \sum_{k=0}^{\infty} \frac{ \qfrrf{-\frac{n_s}{2} + 1} \qfrrf{\frac{1}{2}} }{ \qfrrf{\frac{n - n_s}{2} + 1} k! } \qfrC[\qfrbrc{k}]{v \mathbf{D}} , & 0 < v < \psi_{s - 1} , \\ f_V (v) = & \left[ \qfrBtf{ \frac{n_1}{2} }{ \frac{n - n_1}{2} } \right]^{-1} \left[ \prod_{i=2}^{s-1} \left( 1 + \psi_i \right)^{\frac{n_i}{2}} \right] v^{\frac{n_1}{2} - 1} (1 + v)^{- \frac{n}{2}} & \\ & \quad \cdot \sum_{k=0}^{\infty} \frac{ \qfrrf{-\frac{n_1}{2} + 1} \qfrrf{\frac{1}{2}} }{ \qfrrf{\frac{n - n_1}{2} + 1} k! } \qfrC[\qfrbrc{k}]{\left( v \mathbf{D} \right)^{-1}} , & \psi_{2} < v , \end{aligned} \end{equation}\] where \(p_r = \sum_{i=1}^{r+1} n_i\), \(\mathbf{D}_{r} = \qfrdiag \left( \psi_{2}^{-1} \mathbf{I}_{n_{2}} , \dots , \psi_{r+1}^{-1} \mathbf{I}_{n_{r+1}} \right)\), \(\mathbf{D}_{r+1} = \qfrdiag \left( \psi_{r+2}^{-1} \mathbf{I}_{n_{r+2}} , \dots , \psi_{s-1}^{-1} \mathbf{I}_{n_{s-1}} \right)\), \(\mathbf{D} = \qfrdiag \left( \psi_{2}^{-1} \mathbf{I}_{n_{2}} , \dots , \psi_{s-1}^{-1} \mathbf{I}_{n_{s-1}} \right)\), and \(c_r (j, k)\) are the coefficients defined as \[\begin{equation} c_r (j, k) = \begin{cases} 1 , & j = k , \\ \qfrrf[k-j]{ - \frac{p_r}{2} + 1} \large{/} \qfrrf[k-j]{ \frac{n - p_r}{2} } , & j < k , \\ \qfrrf[j-k]{ - \frac{n - p_r}{2} + 1} \large{/} \qfrrf[j-k]{ \frac{p_r}{2} } , & j > k . \\ \end{cases} \end{equation}\] \(c_r (j, k)\) can be \(0\) when \(p_r\) or \(n - p_r\) is even, so that some terms in the above series disappear. Otherwise (whenever \(c_r (j, k) \neq 0\)), it is possible to write \[\begin{equation} c_r (j, k) = (-1)^{j-k} \frac{\qfrGmf{\frac{p_r}{2}} \qfrGmf{\frac{n - p_r}{2}}}{ \qfrGmf{\frac{p_r}{2} + j - k} \qfrGmf{\frac{n - p_r}{2} + k - j} } , \end{equation}\] to simplify calculation. The density is undefined at \(v = \psi_{i}\) (\(q = \eta_i\)) for any \(i\).
From one of the above expressions, \(f_Q(q)\) can be obtained as \[\begin{equation} \begin{aligned} f_Q (q) = & f_V (v) \cdot \left| \frac{\mathrm{d} q}{\mathrm{d} v} \right| \\ = & f_V \left( \frac{q - \eta_s}{\eta_1 - q} \right) \cdot \frac{\eta_1 - \eta_s}{\left( \eta_1 - q \right)^2} . \end{aligned} \end{equation}\] It is seen that, when \(s = 2\) (i.e., there are only two distinct eigenvalues), the above becomes \[\begin{equation} \begin{aligned} f_Q (q) = & \left[ \qfrBtf{ \frac{n_1}{2} }{ \frac{n_s}{2} } \right]^{-1} \frac{ \left( q - \eta_s \right)^{\frac{n_1}{2} - 1} \left( \eta_1 - q \right)^{\frac{n_s}{2} - 1} }{ \left( \eta_1 - \eta_s \right)^{\frac{n_1 + n_s}{2} - 1} } , \end{aligned} \end{equation}\] which is the density of the (scaled) beta distribution in the interval \(\left[ \eta_s , \eta_1 \right]\) with the parameters \(n_1 / 2\) and \(n_s / 2\). This result is expected from the basic relationship between the chi-square and beta distributions, i.e., \(\qfrcchisq{n_1} / \left( \qfrcchisq{n_1} + \qfrcchisq{n_s} \right) = b \sim \qfrBtd{n_1 / 2}{n_s / 2}\), a standard beta distribution in \([0, 1]\), with \(\qfrcchisq{n_i}\) being independent chi-square variables; \(Q = \left( \eta_1 \qfrcchisq{n_1} + \eta_s \qfrcchisq{n_s} \right) / \left( \qfrcchisq{n_1} + \qfrcchisq{n_s} \right) = \eta_1 b + \eta_s \left( 1 - b \right) = \left( \eta_1 - \eta_s \right) b + \eta_s\).
As for the above series expression of \(F_Q\), these expressions can be evaluated
by taking a partial sum of the series and using the recursive algorithm
for \(d\).
dqfr(..., method = "hillier")
implements this algorithm
(see below). This is reasonably quick and accurate in small problems,
but the computational cost can be substantial in large problems.
A popular way to numerically evaluate the distribution function of \(Q\) is to use the inversion formula of the characteristic function (e.g., Stuart and Ord, 1994, chap. 4).
From the famous formula of Imhof (1961) on the distribution of \(X_q\), \[\begin{equation} F_Q(q) = \frac{1}{2} - \frac{1}{\pi} \int_{0}^{\infty} \frac{\sin \beta (u)}{u \gamma (u)} \, \mathrm{d}u , \end{equation}\] where \[\begin{equation} \begin{aligned} \beta (u) = & \frac{1}{2} \sum_{i=1}^{n} \left[ \arctan \left( u \lambda_i \right) + \frac{\nu_i^2 u \lambda_i}{1 + u^2 \lambda_i^2} \right] , \\ \gamma (u) = & \prod_{i=1}^{n} \left( 1 + u^2 \lambda_i^2 \right)^{\frac{1}{4}} \exp \left[ \frac{1}{2} \sum_{i=1}^{n} \frac{\nu_i^2 u^2 \lambda_i^2}{1 + u^2 \lambda_i^2} \right] . \end{aligned} \end{equation}\]
The above integral can be evaluated by using a regular numerical
evaluation algorithm for infinite intervals. Alternatively, it can be
evaluated by the trapezoidal integration algorithm of Davies (1973, 1980) which explicitly controls the
numerical errors involved. The package CompQuadForm
implements these methods in the function imhof()
and
davies()
, respectively (strictly speaking, these are for
\(1 - F_Q\) as per Imhof’s original
result). The present package utilizes those functions via
pqfr(..., method = "imhof", use_cpp = FALSE)
and
pqfr(..., method = "davies")
, respectively (see below). For
the former method, the present package also has its own C++
implementation used via
pqfr(..., method = "imhof", use_cpp = TRUE)
(default). The
numerical inversion can be evaluated fairly quickly on modern computers,
and the accuracy will be sufficient for most practical purposes with
slight error in numerical integration.
The density can be evaluated by numerical inversion of the characteristic function using Geary’s formula (Geary, 1944; Stuart and Ord, 1994, sec. 11.10). Broda and Paolella (2009) demonstrated that \[\begin{equation} f_Q(q) = \frac{1}{\pi} \int_{0}^{\infty} \frac{\rho (u) \cos \beta (u) - u \delta (u) \sin \beta (u)}{2 \gamma (u)} \, \mathrm{d}u , \end{equation}\] where, along with \(\beta\) and \(\gamma\) defined above, \[\begin{equation} \begin{aligned} \rho (u) = & \sum_{i=1}^{n} \frac{h_{ii}}{1 + u^2 \lambda_i^2} + \sum_{i=1}^{n} \sum_{j=1}^{n} \frac{ \nu_i \nu_j h_{ij} \left( 1 - u^2 \lambda_ i \lambda_j \right) }{ \left( 1 + u^2 \lambda_i^2 \right) \left( 1 + u^2 \lambda_j^2 \right)} \\ = & \qfrtr \left( \mathbf{H} \mathbf{F}^{-1} \right) + \boldsymbol{\nu}^T \mathbf{F}^{-1} \left( \mathbf{H} - u^2 \boldsymbol{\Lambda} \mathbf{H} \boldsymbol{\Lambda} \right) \mathbf{F}^{-1} \boldsymbol{\nu} , \\ \delta (u) = & \sum_{i=1}^{n} \frac{ h_{ii} \lambda_i }{1 + u^2 \lambda_i^2} + \sum_{i=1}^{n} \sum_{j=1}^{n} \frac{ 2 \nu_i \nu_j h_{ij} \lambda_i }{ \left( 1 + u^2 \lambda_i^2 \right) \left( 1 + u^2 \lambda_j^2 \right)} \\ = & \qfrtr \left( \mathbf{H} \boldsymbol{\Lambda} \mathbf{F}^{-1} \right) + 2 \boldsymbol{\nu}^T \mathbf{F}^{-1} \mathbf{H} \boldsymbol{\Lambda} \mathbf{F}^{-1} \boldsymbol{\nu} , \\ \end{aligned} \end{equation}\] with \(\mathbf{H} = \mathbf{P}^T \mathbf{B} \mathbf{P} = \left( h_{ij} \right)\) and \(\mathbf{F} = \mathbf{I}_n + u^2 \boldsymbol{\Lambda}^2\).
The above expression can be evaluated with a regular numerical
integration algorithm, and is implemented in
dqfr(..., method = "broda")
(see below).
Saddlepoint approximation (Butler, 2007; Paolella, 2007, chap. 5) provides an alternative way to evaluate (or approximate) \(F_Q\) and \(f_Q\).
Let \(M_{X_q}(s)\) be the moment generating function of \(X_q\), \[\begin{equation} M_{X_q}(s) = \prod_{i=1}^{n} \left( 1 - 2 s \lambda_i^2 \right)^{-\frac{1}{2}} \exp \left[ s \sum_{i=1}^{n} \frac{\nu_i^2 \lambda_i}{1 - 2 s \lambda_i^2} \right] . \end{equation}\] Also, let \(K_{X_q}(s) = \log M_{X_q}(s)\) be the corresponding cumulant generating function. These are convergent within the interval \(1 / 2 \lambda_n < s < 1 / 2 \lambda_1\), where \(\lambda_1\) and \(\lambda_n\) are the largest and smallest of the eigenvalues (which are positive and negative, respectively; see above).
For \(X_q\), the saddlepoint root \(\hat{s}\) is defined as the unique root of \[\begin{equation} 0 = K_{X_q}' \left( \hat{s} \right) = \sum_{i=1}^{n} \left[ \frac{\lambda_i}{1 - 2 \hat{s} \lambda_i^2} + \frac{\nu_i^2 \lambda_i}{\left( 1 - 2 \hat{s} \lambda_i^2 \right)^2} \right] , \end{equation}\] and is found numerically within the above interval.
A first-order saddlepoint approximation formula for the distribution function \(F_Q\) is (Butler and Paolella, 2007, 2008): \[\begin{equation} \widehat{\Pr{}}_1 (Q < q) = \begin{cases} \Phi \left( \hat{w} \right) + \phi \left( \hat{w} \right) \left[ \hat{w}^{-1} - \hat{u}^{-1} \right] , & \text{if } \qfrE \left( X_q \right) \neq 0 , \\ \frac{1}{2} + \frac{ K_{X_q}'''(0) }{ 6 \sqrt{2 \pi} K_{X_q}''(0)^{3/2} } , & \text{if } \qfrE \left( X_q \right) = 0 , \\ \end{cases} \end{equation}\] where \(\Phi \left( \cdot \right)\) and \(\phi \left( \cdot \right)\) are the distribution and density functions, respectively, of the standard normal distribution, and \[\begin{equation} \begin{aligned} \hat{w} = & \qfrsgn \left(\hat{s}\right) \sqrt{-2 K_{X_q} (\hat{s})} , \\ \hat{u} = & \hat{s} \sqrt{K_{X_q}'' (\hat{s})} . \end{aligned} \end{equation}\] The condition \(\qfrE \left( X_q \right) = 0\) is equivalent to \(\hat{s} = 0\), because of the elementary property of the cumulant generating function \(\qfrE \left( X_q \right) = K_{X_q}' \left( 0 \right)\). Higher-order derivatives of \(K_{X_q}\) are (see also Paolella, 2007, chap. 10) \[\begin{equation} \begin{aligned} K_{X_q}'' \left( s \right) = & 2 \sum_{i=1}^{n} \frac{ \lambda_i^2 }{\left( 1 - 2 s \lambda_i \right)^2} \left(1 + \frac{ 2 \nu_i^2 }{1 - 2 s \lambda_i} \right) , \\ K_{X_q}''' \left( s \right) = & 8 \sum_{i=1}^{n} \frac{ \lambda_i^3 }{\left( 1 - 2 s \lambda_i \right)^3} \left(1 + \frac{ 3 \nu_i^2 }{1 - 2 s \lambda_i} \right) , \\ K_{X_q}^{(4)} \left( s \right) = & 48 \sum_{i=1}^{n} \frac{ \lambda_i^4 }{\left( 1 - 2 s \lambda_i \right)^4} \left(1 + \frac{ 4 \nu_i^2 }{1 - 2 s \lambda_i} \right) . \end{aligned} \end{equation}\]
A more accurate second-order approximation is (Butler and Paolella, 2007) \[\begin{equation} \widehat{\Pr{}}_2 (Q < q) = \widehat{\Pr{}}_1 (Q < q) - \phi \left( \hat{w} \right) \left[ \hat{u}^{-1} \left( \frac{\hat{\kappa}_4}{8} - \frac{5}{24} \hat{\kappa}_3^2 \right) - \hat{u}^{-3} - \frac{ \hat{\kappa}_3^2 }{ 2 \hat{u}^2 } + \hat{w}^{-3} \right] , \quad \qfrE \left( X_q \right) \neq 0, \end{equation}\] where \(\hat{\kappa}_j = K_{X_q}^{(j)} \left( \hat{s} \right) / K_{X_q}'' \left( \hat{s} \right)^{j/2}\).
Evaluation of saddlepoint approximation is fairly quick, with the only potential complexity arising from the numerical root-finding. Empirically, the accuracy of this approximation seems to improve for large problems. This is expected since the distribution of \(X_q\) as a weighted sum approaches normality as \(n\) increases.
pqfr(..., method = "butler")
implements this saddlepoint
approximation. The second-order approximation is used by default
(order_spa = 2
) (see below).
A first-order saddlepoint approximation for the density function \(f_Q\) is (Butler and Paolella, 2007, 2008) \[\begin{equation} \hat{f_1}(q) = \frac{ J_q \left( \hat{s} \right) }{ \sqrt{ 2 \pi K_{X_q}'' \left( \hat{s} \right) } } M_{X_q} \left( \hat{s} \right) , \end{equation}\] where \(\hat{s}\) is the same saddlepoint root used above, and \[\begin{equation} J_q \left( s \right) = \qfrtr \left( \boldsymbol{\Xi}^{-1} \mathbf{H} \right) + \boldsymbol{\nu}^T \boldsymbol{\Xi}^{-1} \mathbf{H} \boldsymbol{\Xi}^{-1} \boldsymbol{\nu} , \end{equation}\] using notations defined above and \(\boldsymbol{\Xi} = \mathbf{I}_n - 2 s \boldsymbol{\Lambda}\).
A second-order approximation is (Butler and Paolella, 2007) \[\begin{equation} \hat{f_2}(q) = \hat{f_1}(q) (1 + O) , \end{equation}\] where \[\begin{equation} O = \left( \frac{\hat{\kappa}_4}{8} - \frac{5}{24} \hat{\kappa}_3^2 \right) + \frac{ J_q' \left( \hat{s} \right) \hat{\kappa}_3 }{ 2 J_q \left( \hat{s} \right) \sqrt{K_q'' \left( \hat{s} \right)} } - \frac{ J_q'' \left( \hat{s} \right) }{ 2 J_q \left( \hat{s} \right) K_q'' \left( \hat{s} \right) } , \end{equation}\] with \[\begin{equation} \begin{aligned} J_q' \left( s \right) = & 2 \qfrtr \left( \boldsymbol{\Xi}^{-1} \boldsymbol{\Lambda} \boldsymbol{\Xi}^{-1} \mathbf{H} \right) + 4 \boldsymbol{\nu}^T \boldsymbol{\Xi}^{-1} \boldsymbol{\Lambda} \boldsymbol{\Xi}^{-1} \mathbf{H} \boldsymbol{\Xi}^{-1} \boldsymbol{\nu} \\ = & 2 \qfrtr \left( \boldsymbol{\Xi}^{-2} \boldsymbol{\Lambda} \mathbf{H} \right) + 4 \boldsymbol{\nu}^T \boldsymbol{\Xi}^{-2} \boldsymbol{\Lambda} \mathbf{H} \boldsymbol{\Xi}^{-1} \boldsymbol{\nu} , \\ J_q'' \left( s \right) = & 8 \qfrtr \left( \boldsymbol{\Xi}^{-3} \boldsymbol{\Lambda}^{2} \mathbf{H} \right) + 16 \boldsymbol{\nu}^T \boldsymbol{\Xi}^{-3} \boldsymbol{\Lambda}^{2} \mathbf{H} \boldsymbol{\Xi}^{-1} \boldsymbol{\nu} + 8 \boldsymbol{\nu}^T \boldsymbol{\Xi}^{-2} \boldsymbol{\Lambda} \mathbf{H} \boldsymbol{\Lambda} \boldsymbol{\Xi}^{-2} \boldsymbol{\nu} , \end{aligned} \end{equation}\] (\(\boldsymbol{\Xi}^{-1}\) and \(\boldsymbol{\Lambda}\) commute since these are diagonal matrices).
dqfr(..., method = "butler")
implements this saddlepoint
approximation in a very similar way to
pqfr(..., method = "butler")
(see below).
The above expressions for the distribution and density functions are
implemented in pqfr()
and dqfr()
, which are
defined as:
pqfr <- function(quantile, A, B, p = 1, mu = rep.int(0, n), Sigma = diag(n),
lower.tail = TRUE, log.p = FALSE,
method = c("imhof", "davies", "forchini", "butler"),
trim_values = TRUE, return_abserr_attr = FALSE, m = 100L,
tol_zero = .Machine$double.eps * 100,
tol_sing = tol_zero, ...) { ... }
dqfr <- function(quantile, A, B, p = 1, mu = rep.int(0, n), Sigma = diag(n),
log = FALSE, method = c("broda", "hillier", "butler"),
trim_values = TRUE, normalize_spa = FALSE,
return_abserr_attr = FALSE, m = 100L,
tol_zero = .Machine$double.eps * 100,
tol_sing = tol_zero, ...) { ... }
The basic usage is similar to that of regular probability
distribution functions like stats::pnorm()
, just with many
optional arguments to specify evaluation methods and behaviors at edge
cases. These functions are (pseudo-)vectorized with respect to
quantile
(a vector of \(q\)), using sapply()
.
Log-transformed \(p\)-values or
densities can be obtained by turning log.p = TRUE
or
log = TRUE
, but these are just ad-hoc transformations of
the results so are not supposed to provide as much numerical accuracy as
in regular probability distribution functions.
There is also qqfr()
for the corresponding quantile
function, which is defined as:
qqfr <- function(probability, A, B, p = 1, mu = rep.int(0, n), Sigma = diag(n),
lower.tail = TRUE, log.p = FALSE, trim_values = FALSE,
return_abserr_attr = FALSE, stop_on_error = FALSE, m = 100L,
tol_zero = .Machine$double.eps * 100,
tol_sing = tol_zero, epsabs_q = .Machine$double.eps ^ (1/2),
maxiter_q = 5000, ...) { ... }
This function is not based on any explicit inverse function, but does
numerical root-finding using stats::uniroot()
, internally
calling pqfr()
; i.e., by searching the root q
of pqfr(q, ...) - probability = 0
.
Internally, these functions first check basic argument structures,
and, if Sigma
is specified, transform the arguments
A
, B
, and mu
and call themselves
recursively with new arguments.
The evaluation method is specified by the argument
method
in these functions (by default, both functions
choose a numerical inversion method). According to the choice, the
actual calculations are done in one of the internal functions
described below. Direct use of the internal functions are not
recommended. These internal functions only accept a length-one
quantile
. To reduce computational time, they do not check
argument structures or accommodate Sigma
.
## Choice from alternative methods
A <- diag(1:3)
pqfr(1.5, A, method = "imhof") # default
#> [1] 0.1978686
pqfr(1.5, A, method = "davies") # similar
#> [1] 0.1979048
pqfr(1.5, A, method = "forchini") # series
#> [1] 0.1978686
pqfr(1.5, A, method = "butler") # spa
#> [1] 0.1897189
dqfr(1.5, A, method = "broda") # default
#> [1] 0.4506431
dqfr(1.5, A, method = "hillier") # series
#> [1] 0.4506431
dqfr(1.5, A, method = "butler") # spa
#> [1] 0.4523631
## Not recommended; for diagnostic use only
qfratio:::pqfr_imhof(1.5, A)
#> $p
#> [1] 0.1978686
#>
#> $abserr
#> [1] 2.031334e-09
qfratio:::pqfr_A1B1(1.5, A, m = 9, check_convergence = FALSE)
#> $p
#> [1] 0.1978641
#>
#> $terms
#> [1] 1.495393e-01 3.427348e-02 9.516357e-03 2.974620e-03 1.002253e-03
#> [6] 3.544878e-04 1.295434e-04 4.844575e-05 1.842967e-05 7.103869e-06
## This is okay
x <- c(1.5, 2.5, 3.5)
pqfr(x, A)
#> [1] 0.1978686 0.8021314 1.0000000
## This is not
qfratio:::pqfr_imhof(x, A)
#> Error in qfratio:::pqfr_imhof(x, A): In pqfr_imhof, quantile must be length-one
ks.test()
In principle, pqfr()
is compatible with
stats:::ks.test()
, but care must be exercised as evaluation
result may involve non-trivial error. It is recommended to inspect error
bounds beforehand, using
pqfr(..., method = "imhof", return_abserr_attr = TRUE)
. In
addition, the argument B
is pre-occupied by the same-named
argument in ks.test()
, so it cannot be passed via
...
; this means that a typical syntax with non-default
B
should be something like
ks.test(x, function(q) pqfr(q, A, B, ...))
rather than
ks.test(x, pqfr, A = A, B = B, ...))
. For example,
## Small Monte Carlo sample
A <- diag(1:3)
B <- diag(sqrt(1:3))
x <- rqfr(10, A, B)
## Calculate p-values
pseq <- pqfr(x, A, B, return_abserr_attr = TRUE)
## Maximum error when evaluated at x;
## looks small enough
max(attr(pseq, "abserr"))
#> [1] 8.318564e-07
## Correct syntax, expected outcome
## \(q) syntax could also be used in recent versions of R
ks.test(x, function(q) pqfr(q, A, B))
#>
#> Exact one-sample Kolmogorov-Smirnov test
#>
#> data: x
#> D = 0.29324, p-value = 0.2946
#> alternative hypothesis: two-sided
rather than
## Used in pqfr(..., method = "forchini")
pqfr_A1B1 <- function(quantile, A, B, m = 100L, mu = rep.int(0, n),
check_convergence = c("relative", "strict_relative",
"absolute", "none"),
stop_on_error = FALSE, use_cpp = TRUE,
cpp_method = c("double", "long_double", "coef_wise"),
nthreads = 1,
tol_conv = .Machine$double.eps ^ (1/4),
tol_zero = .Machine$double.eps * 100,
tol_sing = tol_zero,
thr_margin = 100) { ... }
## Used in dqfr(..., method = "hillier")
dqfr_A1I1 <- function(quantile, LA, m = 100L,
check_convergence = c("relative", "strict_relative",
"absolute", "none"),
use_cpp = TRUE,
tol_conv = .Machine$double.eps ^ (1/4),
thr_margin = 100) { ... }
These functions evaluate the above series expressions as partial sums
of the infinite series, using the recursive algorithm to calculate \(d\) (d1_i()
or
d2_ij_m()
), as in the moment-related functions of this
package (see the vignette for moments vignette("qfratio")
).
Most of the arguments are common with those functions.
A <- diag(1:3)
pqfr(1.5, A, method = "forchini")
#> [1] 0.1978686
dqfr(1.5, A, method = "hillier")
#> [1] 0.4506431
B <- diag(sqrt(1:3))
pqfr(1.5, A, B, method = "forchini")
#> [1] 0.6376791
## dqfr method does not accommodate B, mu, or Sigma
dqfr(1.5, A, B, method = "hillier")
#> Error in dqfr(1.5, A, B, method = "hillier"): dqfr() does not accommodate B, mu, or Sigma with method = "hillier"
As stated above, the density is undefined, and the distribution function has points of nonanalyticity, at the eigenvalues of \(\mathbf{B}^{-1} \mathbf{A}\) (assuming nonsingular \(\mathbf{B}\); Hillier 2001; Forchini 2002). Around these points, convergence of the series expressions tends to be very slow, and the evaluation of hypergeometric function involved in the distribution function may fail. In this case, avoid using the series expression methods.
A <- diag(1:3)
## p-value just below 2, an eigenvalue of A
## Typically throws two warnings:
## Maximum iteration in hypergeometric function
## and non-convergence of series
pqfr(1.9999, A, method = "forchini")
#> Warning in p_A1B1_Ed(quantile, A, B, mu, m, stop_on_error, thr_margin, nthreads, : problem in gsl_sf_hyperg_2F1_e():
#> max iteration reached
#> Warning in pqfr_A1B1(q, A, B, m = m, mu = mu, tol_zero = tol_zero, ...): Last term is >1.2e-04 times as large as the series,
#> suggesting non-convergence. Consider using larger m
#> [1] 0.1052446
## More realistic value; expected from symmetry
pqfr(1.9999, A, method = "imhof")
#> [1] 0.4998044
## Used in pqfr(..., method = "imhof") (default)
pqfr_imhof <- function(quantile, A, B, mu = rep.int(0, n),
autoscale_args = 1, stop_on_error = TRUE, use_cpp = TRUE,
tol_zero = .Machine$double.eps * 100,
epsabs = epsrel, epsrel = 1e-6, limit = 1e4) { ... }
## Used in pqfr(..., method = "davies")
pqfr_davies <- function(quantile, A, B, mu = rep.int(0, n),
autoscale_args = 1,
tol_zero = .Machine$double.eps * 100, ...) { ... }
## Used in dqfr(..., method = "broda") (default)
dqfr_broda <- function(quantile, A, B, mu = rep.int(0, n),
autoscale_args = 1, stop_on_error = TRUE,
use_cpp = TRUE, tol_zero = .Machine$double.eps * 100,
epsabs = epsrel, epsrel = 1e-6, limit = 1e4) { ... }
pqfr_imhof(..., use_cpp = TRUE)
and
dqfr_broda(..., use_cpp = TRUE)
conduct numerical
integration by the C
function
gsl_integration_qagi(..., epsabs, epsrel, limit)
from
GSL
. The arguments epsabs
,
epsrel
, and limit
determine the permissible
bounds of absolute and relative errors, and the maximum number of
integration intervals, respectively.
dqfr_broda(..., use_cpp = FALSE)
uses the R
function
stats::integrate(..., rel.tol = epsrel, abs.tol = epsabs, stop.on.error = stop_on_error)
,
instead, and limit
is ignored.
pqfr_imhof(..., use_cpp = FALSE)
and
pqfr_davies()
calculate appropriate parameters from the
arguments and pass them to imhof()
and
davies()
from the CompQuadForm
package.
The above integration functions try to find an absolute error bound \(e_I\) that is bounded by the user-specified tolerance for absolute \(\epsilon_{\mathrm{abs}}\) and relative \(\epsilon_{\mathrm{rel}}\) errors: \(e_I \leq \epsilon_{\mathrm{abs}} + \lvert I \rvert \epsilon_{\mathrm{rel}}\), where \(I\) is the result of integration.
Internally, \(\epsilon_{\mathrm{abs}}\) is calculated
from the user-specified arguments epsabs
and
epsrel
to appropriately constrain the density or
distribution function (whereas \(\epsilon_{\mathrm{rel}}\) is always
specified by epsrel
). In dqfr_broda()
,
pi * epsabs
is used as \(\epsilon_{\mathrm{abs}}\), and the
resultant error bound abserr
is subsequently divided by
pi
, so is the integration result itself to yield the
density: \(f_Q = I / \pi\) (see
above).
Situation is more complicated for pqfr_imhof()
, because
the relative error in \(I\) cannot in
general be directly transformed to that of the distribution function,
which is \(F_Q = 1/2 - I / \pi\) (see
above). In this function, pi * (epsabs * epsrel / 2)
is
passed as \(\epsilon_{\mathrm{abs}}\),
and the resultant error bound abserr
is divided by
pi
. This procedure ensures an equivalent of the above
inequality to hold for \(F_Q\),
provided \(I \leq 0\) (\(F_Q \geq 1/2\)) or \(\epsilon_{\mathrm{rel}} = 0\). Otherwise,
an error bound calculated in the same way can only be conservative;
pqfr_imhof()
returns this value, but it can violate the
user-specified relative tolerance epsrel
.
A <- diag(1:4)
## This error bound satisfies "abserr < value * epsrel"
pqfr(3.9, A, method = "imhof", return_abserr_attr = TRUE,
epsabs = 0, epsrel = 1e-6)
#> [1] 0.9944167
#> attr(,"abserr")
#> [1] 3.325725e-08
## This one violates "abserr < value * epsrel",
## although abserr is a valid error bound
pqfr(1.2, A, method = "imhof", return_abserr_attr = TRUE,
epsabs = 0, epsrel = 1e-6)
#> [1] 0.01611023
#> attr(,"abserr")
#> [1] 4.165996e-07
autoscale_args
Numerical integration involved in these functions typically fail when
the magnitude of eigenvalues is too small or too large, whence the
integrand functions can decrease too slowly (i.e., divergent-looking) or
too quickly (i.e., looks constant 0
) with respect to the
integration parameter (\(u\) above). To
avoid such failures, these functions internally scale the eigenvalues by
default, so that \(\max \lambda_i - \min
\lambda_i\) is equal to the argument autoscale_args
(default 1
); remember that \(\min
\lambda_i\) is negative, so this quantity is sum of the absolute
values.
A <- diag(1:3)
B <- diag(sqrt(1:3))
## Without autoscale_args
## We know these are equal
pqfr(1.5, A, B, autoscale_args = FALSE)
#> [1] 0.6376791
pqfr(1.5, A * 1e-10, B * 1e-10, autoscale_args = FALSE)
#> [1] 0.5
## The latter failed because of numerically small eigenvalues
## With autoscale_args = 1 (default)
pqfr(1.5, A * 1e-10, B * 1e-10)
#> [1] 0.6376791
trim_values
Numerical integration can yield spurious results that are outside the
mathematically permissible supports; \([ 0,
\infty )\) and \([0, 1]\) for
the density and distribution functions, respectively. By default
(trim_values = TRUE
), the external functions
dqfr()
and pqfr()
trim those values into the
permissible range by using tol_zero
as a margin; e.g.,
negative p-values are replaced by ~2.2e-14
(default
tol_zero
). A warning is thrown if this happens, because it
usually means that numerically accurate evaluation was impossible, at
least with the given parameters. Turn trim_values = FALSE
to skip these trimming and warning, although pqfr_imhof()
and pqfr_davies()
can still throw a warning from
CompQuadForm
functions. Note that, on the other hand, all
these functions try to return exact 0
or 1
when \(q\) is outside the possible
range of \(Q\) (as numerically
determined).
## Result without trimming;
## (typically) negative density, which is absurd
## In this case, error interval typically spans across 0
dqfr(1.2, diag(1:30), return_abserr_attr = TRUE,
trim_values = FALSE)
#> [1] -4.638309e-17
#> attr(,"abserr")
#> [1] 8.043729e-08
## Result with trimming (default)
dqfr(1.2, diag(1:30), return_abserr_attr = TRUE)
#> Warning in dqfr(1.2, diag(1:30), return_abserr_attr = TRUE): values < 0 trimmed
#> up to tol_zero
#> [1] 2.220446e-14
#> attr(,"abserr")
#> [1] 8.043726e-08
## Note that the actual value is only bounded by
## 0 and abserr
## Used in pqfr(..., method = "butler")
pqfr_butler <- function(quantile, A, B, mu = rep.int(0, n),
order_spa = 2, stop_on_error = FALSE, use_cpp = TRUE,
tol_zero = .Machine$double.eps * 100,
epsabs = .Machine$double.eps ^ (1/2), epsrel = 0,
maxiter = 5000) { ... }
## Used in dqfr(..., method = "butler")
dqfr_butler <- function(quantile, A, B, mu = rep.int(0, n),
order_spa = 2, stop_on_error = FALSE, use_cpp = TRUE,
tol_zero = .Machine$double.eps * 100,
epsabs = .Machine$double.eps ^ (1/2), epsrel = 0,
maxiter = 5000) { ... }
These functions evaluate the saddlepoint approximations described
above. They conduct numerical root-finding for the saddlepoint by the
Brent method (C
function
gsl_root_fsolver_brent
from GSL
), with the
stopping rule specified by
gsl_root_test_delta(..., epsabs, epsrel)
and the maximum
number of iteration by maxiter
. When
use_cpp = FALSE
, the R
function
stats::uniroot(..., check.conv = stop_on_error, tol = epsabs, maxiter = maxiter)
is used instead, and epsrel
is ignored. The Newton–Raphson
method was also explored in the development stage, but that method
sometimes failed because the derivative can be numerically close to
0
.
The saddlepoint approximation density does not integrate to unity,
but can be normalized by setting normalize_spa = TRUE
in
dqfr()
(note that this is done in the external function).
The normalized density can be more accurate (although it is usually a
matter of empiricism). However, this is usually slower than the
numerical inversion method for a small number of quantiles.
The second-order approximation is used by default
(order_spa = 2
) (internally, any value > 1
calls this option). The first-order approximation can be used by setting
order_spa = 1
, but this is usually less accurate and only
slightly faster than the second-order approximation.
A <- diag(1:3)
## Default for spa distribution function
pqfr(1.2, A, method = "butler", order_spa = 2)
#> [1] 0.07183068
## First-order spa
pqfr(1.2, A, method = "butler", order_spa = 1)
#> [1] 0.0790331
## More accurate numerical inversion
pqfr(1.2, A)
#> [1] 0.07359703
## Default for density
dqfr(1.2, A, method = "butler",
order_spa = 2, normalize_spa = FALSE)
#> [1] 0.3716931
## First-order
dqfr(1.2, A, method = "butler",
order_spa = 1, normalize_spa = FALSE)
#> [1] 0.4577787
## Normalized density, second-order
dqfr(1.2, A, method = "butler",
order_spa = 2, normalize_spa = TRUE)
#> [1] 0.3913688
## Normalized density, first-order
dqfr(1.2, A, method = "butler",
order_spa = 1, normalize_spa = TRUE)
#> [1] 0.4412349
## More accurate numerical inversion
dqfr(1.2, A)
#> [1] 0.3837318
Paolella (2007, program listing 10.4) noted that the second-order approximation for the distribution function can be “problematic”, which presumably means that the evaluation result can be unstable. In development of this package, some instability in the second-order approximation was encountered, but experiments suggest that this was due to sensitivity of the result to the numerically found root \(\hat{s}\). This instability is rarely encountered with the present default setting, but the user may want to adjust root-finding-related parameters when any doubt exists.
pqfr()
and dqfr()
Return values from pqfr_imhof()
and
dqfr_broda()
have an error bound abserr
for
numerical integration, along with the evaluation result itself.
Technically, the error bound from the integration algorithm is divided
by pi
before returned, as the evaluation result itself is.
This can be passed to the external functions and returned as an
attribute by setting return_abserr_attr = TRUE
(as already
used in above examples):
A <- diag(1:4)
pqfr(1.5, A, return_abserr_attr = TRUE)
#> [1] 0.06819534
#> attr(,"abserr")
#> [1] 9.576418e-08
dqfr(1.5, A, return_abserr_attr = TRUE)
#> [1] 0.22202
#> attr(,"abserr")
#> [1] 5.738788e-08
This error bound tries to accommodate the effect of
trim_values
. If the integration result is outside the
permissible support (e.g., negative density), the possible error bound
is only on the direction toward the support (assuming things are
calculated accurately). The returned abserr
is truncated
accordingly, unless trimming is beyond the original abserr
(in which case it is replaced by tol_zero
). See this in
examples:
## Without trimming, result is (typically) negative
## But note that value + abserr is positive
dqfr(1.2, diag(1:35), return_abserr_attr = TRUE,
epsabs = 1e-10, trim_values = FALSE)
#> [1] -2.208719e-18
#> attr(,"abserr")
#> [1] 2.378926e-12
## With trimming, value is replaced by tol_zero
## Note slightly shortened abserr
dqfr(1.2, diag(1:35), return_abserr_attr = TRUE,
epsabs = 1e-10)
#> Warning in dqfr(1.2, diag(1:35), return_abserr_attr = TRUE, epsabs = 1e-10):
#> values < 0 trimmed up to tol_zero
#> [1] 2.220446e-14
#> attr(,"abserr")
#> [1] 2.356719e-12
## When untrimmed value + abserr < tol_zero
dqfr(1.1, diag(1:35), return_abserr_attr = TRUE,
epsabs = 1e-15, trim_values = FALSE)
#> [1] -3.865257e-18
#> attr(,"abserr")
#> [1] 7.796129e-16
## True value is somewhere between 0 and value + abserr
## (assuming these are reliable)
## When trimmed, abserr reflects tol_zero
## because the true value is between 0 and tol_zero
dqfr(1.1, diag(1:35), return_abserr_attr = TRUE,
epsabs = 1e-15)
#> Warning in dqfr(1.1, diag(1:35), return_abserr_attr = TRUE, epsabs = 1e-15):
#> values < 0 trimmed up to tol_zero
#> [1] 2.220446e-14
#> attr(,"abserr")
#> [1] 2.220446e-14
When log
/log.p = TRUE
, abserr
is transformed so that it is a conservative absolute error bound on the
log scale. That is, if the original value and its error bound is denoted
by \(\hat{x}\) and \(\delta \hat{x}\), respectively, and the
log-transformed value and its error bound is by \(\log \hat{x}\) and \(\delta (\log \hat{x})\), the latter error
bound is set so that \(\log \hat{x} - \delta
(\log \hat{x}) = \log (\hat{x} - \delta \hat{x})\), i.e., \(\delta (\log \hat{x}) = - \log \left( 1 -
\frac{\delta \hat{x}}{\hat{x}} \right)\). Note that the upper
error bound \(\log \left( 1 + \frac{\delta
\hat{x}}{\hat{x}} \right)\) is narrower than this unless \(\delta \hat{x} > \hat{x}\) (i.e., \(\hat{x} - \delta \hat{x} < 0\)), in
which case it should be taken as \(\delta
(\log \hat{x}) = \infty\). In summary, the new error bound is
calculated as
ifelse(abserr > ans, Inf, -log1p(-abserr/ans))
.
qqfr()
The option return_abserr_attr = TRUE
is available in
qqfr()
as well:
A <- diag(1:4)
qqfr(0.95, A, return_abserr_attr = TRUE)
#> [1] 3.587557
#> attr(,"abserr")
#> [1] 6.648746e-06
In qqfr()
, numerical errors arise from the root-finding
with stats::uniroot()
as well as in propagation from
pqfr()
. When return_abserr_attr = TRUE
, it
tries to evaluate a conservative error bound as follows:
uniroot()$estim.prec
as \(\delta
q_{\mathrm{rf}}\)pqfr()
at the root as \(\delta F\)dqfr(..., method = "broda")
, so that the conservative slope
of the tangent there is \(b = \max ( f -
\delta f , 0 )\). The error \(\delta
q_{\mathrm{p}}\) in the root arising from pqfr()
is
at most \(b^{-1} \delta F\). If \(\delta F = 0\), \(\delta q_{\mathrm{p}} = 0\) regardless of
\(b\).If log.p = TRUE
, the root-finding is done on \(\log F\), so the slope used in 3 is
replaced by \(b = \max ( f - \delta f , 0 ) /
F\).
For probability = 0
or 1
, the quantile
corresponds to the minimum or maximum of the ratio, and the above error
bound does not apply. At present, an arbitrary value of
.Machine$double.eps * 100
(~2.2e-14
) is
returned as an error bound for a finite minimum/maximum, although the
actual error in calculation can be larger.
For completeness, pqfr()
and dqfr()
can be
used to evaluate powers of ratios of quadratic forms, \(Q^p\), with the exponent specified by the
argument p
(default 1
). Note that, unlike
moment-related functions of this package, the numerator and denominator
must have the same exponent. When p != 1
, these functions
return appropriate results typically by transforming those from
p == 1
with recursive calling.
For the rest of this section, consider the distribution and density functions of \(R = Q^p\) at \(r = q^p\). The Jacobian for the density is \(\left| \frac{\mathrm{d} q}{\mathrm{d} r} \right| = \frac{1}{p} \left| r \right| ^ {\frac{1}{p} - 1}\).
In this case, the relationship between \(Q\) and \(R\) is one-to-one, so that \[\begin{equation}
\begin{aligned}
F_{R}(r)
= &
\Pr \left( Q^p \leq r \right) \\
= &
\Pr \left( Q \leq \qfrsgn{(r)} \cdot \left| r \right| ^{\frac{1}{p}}
\right) \\
= &
F_{Q} \left( \qfrsgn{(r)} \cdot \left| r \right| ^{\frac{1}{p}}
\right) , \\
f_{R}(r)
= &
f_{Q}(q) \cdot \left| \frac{\mathrm{d} q}{\mathrm{d} r} \right| \\
= &
f_{Q} \left( \qfrsgn{(r)} \cdot \left| r \right| ^{\frac{1}{p}}
\right) \cdot \frac{1}{p} \left| r \right| ^ {\frac{1}{p} - 1} .
\end{aligned}
\end{equation}\] Thus, the result can be obtained by a single
recursive call of pqfr(..., p = 1)
or
dqfr(..., p = 1)
with transformed
quantile
.
In this case, \(R\) is an even
function of \(Q\), so that \[\begin{equation}
\begin{aligned}
F_{R}(r)
= &
\Pr \left( Q^p \leq r \right) \\
= &
\begin{cases}
\Pr \left( Q \leq r^{\frac{1}{p}} \right) -
\Pr \left( Q < - r^{\frac{1}{p}} \right)
= F_{Q} \left( r^{\frac{1}{p}} \right) - F_{Q} \left( -
r^{\frac{1}{p}} \right) , & r > 0 , \\
0 , & r \leq 0 ,
\end{cases} \\
f_{R}(r)
= &
\begin{cases}
\left[ f_{Q} \left( r^{\frac{1}{p}} \right) +
f_{Q} \left( - r^{\frac{1}{p}} \right) \right]
\cdot \left| \frac{\mathrm{d} q}{\mathrm{d} r} \right| , & r
> 0 , \\
f_{Q}(0) \cdot \left| \frac{\mathrm{d} q}{\mathrm{d} r} \right|
= \infty , & r = 0 , \\
0 , & r < 0 .
\end{cases}
\end{aligned}
\end{equation}\] Thus, for \(r >
0\), the result is obtained from two recursive calls of
pqfr(..., p = 1)
or dqfr(..., p = 1)
with
transformed quantile
.
In this case, \(R\) can be
undefined, so pqfr()
and dqfr()
return an
error,
"A must be nonnegative definite when p is non-integer"
,
regardless of the value of quantile
.
First we compare evaluation methods for the distribution function:
A <- diag(1:4)
qseq <- seq.int(0.8, 4.2, length.out = 100)
## Generate p-value sequences
## Warning is expected
pseq_inv <- pqfr(qseq, A, method = "imhof",
return_abserr_attr = TRUE)
pseq_ser <- pqfr(qseq, A, method = "forchini",
check_convergence = FALSE)
#> Warning in p_A1B1_Ed(quantile, A, B, mu, m, stop_on_error, thr_margin, nthreads, : problem in gsl_sf_hyperg_2F1_e():
#> evaluation failed due to singularity
#> max iteration reached
pseq_spa <- pqfr(qseq, A, method = "butler")
## Maximum error in numerical inversion;
## looks small enough
max(attr(pseq_inv, "abserr"))
#> [1] 1.269026e-06
## Graphical comparison
par(mar = c(4, 4, 0.1, 0.1))
plot(qseq, type = "n", xlim = c(1, 4), ylim = c(0, 1),
xlab = "q", ylab = "F(q)")
lines(qseq, pseq_inv, col = "gray", lty = 1)
lines(qseq, pseq_ser, col = "tomato", lty = 2)
lines(qseq, pseq_spa, col = "slateblue", lty = 3)
legend("topleft", legend = c("inversion", "series", "saddlepoint"),
col = c("gray", "tomato", "slateblue"), lty = 1:3, cex = 0.8)
## Logical vector to exclude q around eigenvalues of A
avoid_evals <- ((qseq %% 1) > 0.05) & ((qseq %% 1) < 0.95)
## Numerical comparison
all.equal(pseq_inv[avoid_evals], pseq_ser[avoid_evals],
check.attributes = FALSE)
#> [1] "Mean relative difference: 0.0007499845"
all.equal(pseq_inv[avoid_evals], pseq_spa[avoid_evals],
check.attributes = FALSE)
#> [1] "Mean relative difference: 0.009893572"
Around the eigenvalues of A
, the series expression is
slow to converge; this could partly be mitigated by using larger
m
(default 100
), but that will usually be
time-consuming, and evaluation of hypergeometric function may fail
regardless (for which a warning is already thrown above). Apart from
these points, the series and numerical inversion methods yield very
similar values. The saddlepoint approximation yields slightly inaccurate
result, but is usually the fastest among these methods.
Next, we compare methods for the probability density:
## Generate p-value sequences
dseq_inv <- dqfr(qseq, A, method = "broda",
return_abserr_attr = TRUE)
dseq_ser <- dqfr(qseq, A, method = "hillier",
check_convergence = FALSE)
dseq_spa <- dqfr(qseq, A, method = "butler")
## Maximum error in numerical inversion;
## looks small enough
max(attr(dseq_inv, "abserr"))
#> [1] 8.574178e-07
## Graphical comparison
par(mar = c(4, 4, 0.1, 0.1))
plot(qseq, type = "n", xlim = c(1, 4), ylim = c(0, 0.8),
xlab = "q", ylab = "f(q)")
lines(qseq, dseq_inv, col = "gray", lty = 1)
lines(qseq, dseq_ser, col = "tomato", lty = 2)
lines(qseq, dseq_spa, col = "slateblue", lty = 3)
legend("topleft", legend = c("inversion", "series", "saddlepoint"),
col = c("gray", "tomato", "slateblue"), lty = 1:3, cex = 0.8)
## Numerical comparison
all.equal(dseq_inv, dseq_ser, check.attributes = FALSE)
#> [1] "Mean relative difference: 8.806696e-07"
all.equal(dseq_inv, dseq_spa, check.attributes = FALSE)
#> [1] "Mean relative difference: 0.05382156"
## Do densities sum up to 1?
sum(dseq_inv * diff(qseq)[1])
#> [1] 1.001194
sum(dseq_ser * diff(qseq)[1])
#> [1] 1.001193
sum(dseq_spa * diff(qseq)[1])
#> [1] 0.9613508
The series expression looks successful across the range. The saddlepoint approximation usually fails to capture a fancy profile as seen in the above plot. That will be less of a concern as the dimensionality increases, in which case the distribution approaches normality.
The last three lines conduct a rough check on whether the densities
integrate/sum up to unity. The results for the inversion and series
methods are expected to approach 1
as we use a finer
sequence. The saddlepoint approximation density could be normalized at
the cost of slight computational time, although the normalization may or
may not yield more accurate results at a particular quantile: