Additional technical details
ssdtools Team
2024-05-02
Source: vignettes/E_additional-technical-details.Rmd
Small sample bias
The ssdtools package uses the method of Maximum Likelihood (ML) to estimate parameters for each distribution that is fit to the data. Statistical theory says that maximum likelihood estimators are asymptotically unbiased, but does not guarantee performance in small samples. A detailed account of the issue of small sample bias in estimates can be found in the following pdf.
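As a generic illustration of small sample bias (not an ssdtools computation), the sketch below simulates the ML estimator of a normal variance, which divides by \(n\) and has expectation \((n-1)\sigma^2/n\). The function names, sample sizes and seed are hypothetical choices made for this example only.

```python
import random

def ml_variance(sample):
    """ML estimate of a normal variance: divides by n, not n - 1."""
    n = len(sample)
    mean = sum(sample) / n
    return sum((x - mean) ** 2 for x in sample) / n

def mean_ml_variance(n, true_sd=1.0, reps=20000, seed=1):
    """Average the ML variance estimate over many simulated samples of size n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        total += ml_variance([rng.gauss(0.0, true_sd) for _ in range(n)])
    return total / reps

# True variance is 1; theory gives (n - 1) / n for the mean of the ML estimate.
small = mean_ml_variance(n=5)               # theory: 0.8
large = mean_ml_variance(n=50, reps=5000)   # theory: 0.98
print(f"n = 5 : mean ML variance estimate = {small:.3f}")
print(f"n = 50: mean ML variance estimate = {large:.3f}")
```

The bias shrinks as \(n\) grows, which is the asymptotic unbiasedness referred to above; with only a handful of observations it is not negligible.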
The inverse Pareto and inverse Weibull as limiting distributions of the Burr Type-III distribution
Burr III distribution
The probability density function, \({f_X}(x;b,c,k)\), and cumulative distribution function, \({F_X}(x;b,c,k)\), for the Burr III distribution (also known as the Dagum distribution) as used in ssdtools are:
\[\begin{array}{*{20}{c}} {{f_X}(x;b,c,k) = \frac{c\,k\left(\frac{b}{x}\right)^c}{x\left[1 + \left(\frac{b}{x}\right)^c\right]^{k + 1}}}&{b,c,k,x > 0} \end{array}\]
\[\begin{array}{*{20}{c}} {{F_X}(x;b,c,k) = \frac{1}{\left[1 + \left(\frac{b}{x}\right)^c\right]^k}}&{b,c,k,x > 0} \end{array}\]
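The following minimal sketch codes the Burr III cdf exactly as written above, together with the density obtained by differentiating that cdf with respect to \(x\), and checks the two against each other by a central finite difference. The function names and parameter values are hypothetical; this is not the ssdtools implementation.

```python
def burr3_cdf(x, b, c, k):
    """Burr III cdf: F(x) = [1 + (b/x)^c]^(-k), for x > 0."""
    return (1.0 + (b / x) ** c) ** (-k)

def burr3_pdf(x, b, c, k):
    """Burr III pdf, i.e. the derivative of burr3_cdf with respect to x."""
    t = (b / x) ** c
    return c * k * t / (x * (1.0 + t) ** (k + 1.0))

# Illustrative parameter values only.
b, c, k = 2.0, 3.0, 1.5
x, h = 1.7, 1e-6

# Central finite difference of the cdf should agree with the pdf.
numeric = (burr3_cdf(x + h, b, c, k) - burr3_cdf(x - h, b, c, k)) / (2.0 * h)
analytic = burr3_pdf(x, b, c, k)
print(numeric, analytic)
```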
Inverse Pareto distribution
Let \(X \sim Burr(b,c,k)\) have the Burr III distribution given above. It is well known that the distribution of \(Y = \frac{1}{X}\) is the inverse Burr distribution (also known as the Singh-Maddala distribution). Writing \(b\) for the scale parameter of \(Y\) (the reciprocal of the scale parameter of \(X\)), its pdf and cdf are:
\[\begin{array}{*{20}{c}} {{f_Y}(y;b,c,k) = \frac{c\,k\left(\frac{y}{b}\right)^c}{y\left[1 + \left(\frac{y}{b}\right)^c\right]^{k + 1}}}&{b,c,k,y > 0} \end{array}\]
\[\begin{array}{*{20}{c}} {{F_Y}(y;b,c,k) = 1 - \frac{1}{{{{\left[ {1 + {{\left( {\frac{y}{b}} \right)}^c}} \right]}^k}}}}&{b,c,k,y > 0} \end{array}\]
We now consider the limiting distribution when \(c \to \infty\) and \(k \to 0\) in such a way that the product \(ck\) remains constant, i.e. \(ck = \lambda\).
Now, \[\begin{array}{l} \mathop {\mathop {\lim }\limits_{(c,k) \to (\infty ,0)} }\limits_{ck = \lambda } \left\{ {F_Y}(y;b,c,k) \right\} = 1 - \mathop {\mathop {\lim }\limits_{(c,k) \to (\infty ,0)} }\limits_{ck = \lambda } \frac{1}{\left[ 1 + \left( \frac{y}{b} \right)^c \right]^k}\\ \\ \text{and, for } y \ge b,\\ \\ \mathop {\mathop {\lim }\limits_{(c,k) \to (\infty ,0)} }\limits_{ck = \lambda } \left[ 1 + \left( \frac{y}{b} \right)^c \right]^k = \mathop {\mathop {\lim }\limits_{(c,k) \to (\infty ,0)} }\limits_{ck = \lambda } \left\{ \left( \frac{y}{b} \right)^{ck} \left[ 1 + \left( \frac{b}{y} \right)^c \right]^k \right\}\\ = \mathop {\mathop {\lim }\limits_{(c,k) \to (\infty ,0)} }\limits_{ck = \lambda } \left\{ \left( \frac{y}{b} \right)^{ck} \right\} \mathop {\mathop {\lim }\limits_{(c,k) \to (\infty ,0)} }\limits_{ck = \lambda } \left\{ \left[ 1 + \left( \frac{b}{y} \right)^c \right]^k \right\}\\ = \mathop {\mathop {\lim }\limits_{(c,k) \to (\infty ,0)} }\limits_{ck = \lambda } \left\{ \left( \frac{y}{b} \right)^{ck} \right\} \cdot 1\\ = \left( \frac{y}{b} \right)^\lambda \end{array}\]
The second factor tends to 1 because, for \(y \ge b\), \(1 \le 1 + {\left( {\frac{b}{y}} \right)^c} \le 2\) while \(k \to 0\).
Therefore, \[\begin{array}{*{20}{c}} {\mathop {\mathop {\lim }\limits_{(c,k) \to (\infty ,0)} }\limits_{ck = \lambda } \left\{ {{F_Y}(y;b,c,k)} \right\} = 1 - {{\left( {\frac{b}{y}} \right)}^\lambda }}&{y \ge b} \end{array}\]
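The convergence can also be checked numerically. The sketch below evaluates \({F_Y}(y;b,c,k)\) with \(k = \lambda /c\) for increasing \(c\) and compares it with the Pareto cdf \(1 - (b/y)^\lambda\) at a few points \(y \ge b\). Function names, parameter values and evaluation points are hypothetical choices for illustration.

```python
def burr_y_cdf(y, b, c, k):
    """cdf of Y as given above: 1 - [1 + (y/b)^c]^(-k)."""
    return 1.0 - (1.0 + (y / b) ** c) ** (-k)

def pareto_cdf(y, b, lam):
    """(American) Pareto cdf: 1 - (b/y)^lam for y >= b, and 0 below b."""
    return 1.0 - (b / y) ** lam if y >= b else 0.0

# Illustrative values; lam plays the role of the fixed product c * k.
b, lam = 2.0, 3.0
ys = [2.5, 3.0, 5.0]

gaps = {}
for c in (5.0, 50.0, 200.0):
    k = lam / c  # keep c * k = lam while c grows and k shrinks
    gaps[c] = max(abs(burr_y_cdf(y, b, c, k) - pareto_cdf(y, b, lam)) for y in ys)
    print(f"c = {c:5.0f}, k = {k:.4f}, max |difference| = {gaps[c]:.2e}")
```

The largest difference over the chosen points shrinks rapidly as \(c\) increases, in line with the limit derived above.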
which we recognise as the (American) Pareto distribution. So, if the limiting distribution of \(Y = \frac{1}{X}\) is a Pareto distribution, then the limiting distribution of \(X = \frac{1}{Y}\) is the (American) inverse Pareto distribution:
\[\begin{array}{l} {f_X}\left( {x;\lambda ,b} \right) = \lambda {b^\lambda }{x^{\lambda - 1}};\quad 0 \le x \le {\textstyle{1 \over b}};\quad \lambda ,b > 0\\ {F_X}\left( {x;\lambda ,b} \right) = {\left( {xb} \right)^\lambda };\quad 0 \le x \le {\textstyle{1 \over b}};\quad \lambda ,b > 0 \end{array}\]
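For completeness, the MLEs of this distribution have closed-form expressions and are given by: \[\begin{array}{l} \hat \lambda = {\left[ {\ln \left( {\frac{1}{{\hat b{\kern 1pt} {g_X}}}} \right)} \right]^{ - 1}}\\ \hat b = \frac{1}{{\max \left\{ {{X_i}} \right\}}} \end{array}\]
where \({g_X}\) is the geometric mean of the data. The expression for \(\hat \lambda\) follows from setting the derivative of the log-likelihood with respect to \(\lambda\) to zero, which gives \(1/\hat \lambda = - \ln ( {\hat b{\kern 1pt} {g_X}} )\).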
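A minimal simulation sketch of these estimators follows. It assumes the inverse Pareto density \(\lambda {b^\lambda }{x^{\lambda - 1}}\) on \(0 \le x \le 1/b\) given above, and computes \(\hat \lambda\) directly from the score equation \(n/\lambda + n\ln b + \sum \ln {x_i} = 0\) evaluated at \(\hat b\). The function names, true parameter values, sample size and seed are hypothetical; nothing here calls ssdtools.

```python
import math
import random

def sample_invpareto(n, lam, b, rng):
    """Draw by inverting F(x) = (x * b)^lam, i.e. x = u^(1/lam) / b."""
    # 1 - rng.random() lies in (0, 1], so log(x) is always defined later.
    return [(1.0 - rng.random()) ** (1.0 / lam) / b for _ in range(n)]

def invpareto_mle(xs):
    """Closed-form ML estimates (lam_hat, b_hat) for the density above."""
    b_hat = 1.0 / max(xs)
    mean_log = sum(math.log(x) for x in xs) / len(xs)  # log of the geometric mean
    # Score equation in lam: 1 / lam = -(log(b_hat) + log(geometric mean)).
    lam_hat = -1.0 / (math.log(b_hat) + mean_log)
    return lam_hat, b_hat

rng = random.Random(42)
xs = sample_invpareto(5000, lam=3.0, b=2.0, rng=rng)
lam_hat, b_hat = invpareto_mle(xs)
print(f"lam_hat = {lam_hat:.3f}, b_hat = {b_hat:.4f}")
```

Because the largest observation can never exceed \(1/b\), \(\hat b\) is never smaller than the true \(b\); this is one concrete example of an ML estimator that is biased in finite samples.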
Inverse Weibull distribution
Let \(X \sim Burr(b,c,k)\) have the Burr III distribution given above. We make the transformation \[Y = \frac{{b{\kern 1pt} {k^{\tfrac{1}{c}}}{\kern 1pt} \theta }}{X}\] where \(\theta\) is a parameter (constant). The distribution of \(Y\) is also a Burr distribution and has cdf \[{G_Y}\left( y \right) = 1 - \frac{1}{{{{\left[ {1 + {{\left( {\frac{y}{{{k^{\tfrac{1}{c}}}{\kern 1pt} \theta }}} \right)}^c}} \right]}^k}}}.\]
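The step from the transformation to this cdf is not spelled out in the text. One way to fill it in, using only the Burr III cdf \({F_X}(x;b,c,k) = {\left[ {1 + {{\left( {b/x} \right)}^c}} \right]^{ - k}}\), is sketched here:
\[\begin{array}{l} {G_Y}\left( y \right) = P\left( {\frac{{b{\kern 1pt} {k^{\tfrac{1}{c}}}{\kern 1pt} \theta }}{X} \le y} \right) = P\left( {X \ge \frac{{b{\kern 1pt} {k^{\tfrac{1}{c}}}{\kern 1pt} \theta }}{y}} \right) = 1 - {F_X}\left( {\frac{{b{\kern 1pt} {k^{\tfrac{1}{c}}}{\kern 1pt} \theta }}{y};b,c,k} \right)\\ \\ = 1 - {\left[ {1 + {{\left( {\frac{y}{{{k^{\tfrac{1}{c}}}{\kern 1pt} \theta }}} \right)}^c}} \right]^{ - k}} \end{array}\]
The scale parameter \(b\) cancels in the ratio \(b/x\), which is why it does not appear in \({G_Y}\).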
We are interested in the limiting behaviour of this Burr distribution as \(k \to \infty\). Now,
\[\begin{array}{l} \mathop {\lim }\limits_{k \to \infty } {G_Y}\left( y \right) = 1 - \mathop {\lim }\limits_{k \to \infty } {\left[ {1 + {{\left( {\frac{y}{{{k^{\tfrac{1}{c}}}{\kern 1pt} \theta }}} \right)}^c}} \right]^{ - k}}\\ \\ = 1 - \mathop {\lim }\limits_{k \to \infty } {\left[ {1 + \frac{{{{\left( {\frac{y}{\theta }} \right)}^c}}}{k}} \right]^{ - k}}\\ \\ = 1 - \exp \left[ { - {{\left( {\frac{y}{\theta }} \right)}^c}} \right] \end{array}\]
using the fact that \(\mathop {\lim }\limits_{n \to \infty } {\left( {1 + \frac{z}{n}} \right)^{ - n}} = {e^{ - z}}\).
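We recognise the last expression as the cdf of a Weibull distribution with shape parameter \(c\) and scale parameter \(\theta\). Since \(Y\) is a scaled reciprocal of \(X\), the corresponding limiting distribution of \(X\) (suitably scaled) is the inverse Weibull distribution.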
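As a numerical illustration of this limit, the sketch below evaluates \({G_Y}(y)\) for increasing \(k\) and compares it with the Weibull cdf \(1 - \exp [ - {(y/\theta )^c}]\). The function names, parameter values and evaluation points are hypothetical and chosen only for this example.

```python
import math

def scaled_burr_cdf(y, c, k, theta):
    """G_Y(y) = 1 - [1 + (y / (k^(1/c) * theta))^c]^(-k)."""
    z_over_k = (y / (k ** (1.0 / c) * theta)) ** c
    # exp(-k * log1p(.)) is a numerically stable form of (1 + .)^(-k).
    return 1.0 - math.exp(-k * math.log1p(z_over_k))

def weibull_cdf(y, c, theta):
    """Weibull cdf with shape c and scale theta."""
    return 1.0 - math.exp(-((y / theta) ** c))

# Illustrative values only.
c, theta = 3.0, 2.0
ys = [1.0, 2.0, 3.0]

weibull_gaps = {}
for k in (10.0, 1e4, 1e6):
    weibull_gaps[k] = max(
        abs(scaled_burr_cdf(y, c, k, theta) - weibull_cdf(y, c, theta)) for y in ys
    )
    print(f"k = {k:9.0f}, max |difference| = {weibull_gaps[k]:.2e}")
```

The difference decreases roughly in proportion to \(1/k\), consistent with the limit used in the derivation.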
Licensing
Copyright 2024 Province of British Columbia, Environment and Climate Change Canada, and Australian Government Department of Climate Change, Energy, the Environment and Water
The documentation is released under the CC BY 4.0 License
The code is released under the Apache License 2.0