Type: | Package |
Title: | Estimation of the Extremal Index |
Version: | 1.2.3 |
Date: | 2023-12-02 |
Description: | Performs frequentist inference for the extremal index of a stationary time series. Two types of methodology are used. One type is based on a model that relates the distribution of block maxima to the marginal distribution of series and leads to the semiparametric maxima estimators described in Northrop (2015) <doi:10.1007/s10687-015-0221-5> and Berghaus and Bucher (2018) <doi:10.1214/17-AOS1621>. Sliding block maxima are used to increase precision of estimation. A graphical block size diagnostic is provided. The other type of methodology uses a model for the distribution of threshold inter-exceedance times (Ferro and Segers (2003) <doi:10.1111/1467-9868.00401>). Three versions of this type of approach are provided: the iterated weight least squares approach of Suveges (2007) <doi:10.1007/s10687-007-0034-2>, the K-gaps model of Suveges and Davison (2010) <doi:10.1214/09-AOAS292> and a similar approach of Holesovsky and Fusek (2020) <doi:10.1007/s10687-020-00374-3> that we refer to as D-gaps. For the K-gaps and D-gaps models this package allows missing values in the data, can accommodate independent subsets of data, such as monthly or seasonal time series from different years, and can incorporate information from right-censored inter-exceedance times. Graphical diagnostics for the threshold level and the respective tuning parameters K and D are provided. |
Imports: | chandwich, graphics, methods, Rcpp, RcppRoll, stats |
License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] |
Depends: | R (≥ 3.3.0) |
Suggests: | knitr, revdbayes, rmarkdown, testthat, zoo (≥ 1.6.4) |
LazyData: | true |
Encoding: | UTF-8 |
RoxygenNote: | 7.2.3 |
VignetteBuilder: | knitr |
URL: | https://github.com/paulnorthrop/exdex, https://paulnorthrop.github.io/exdex/ |
BugReports: | https://github.com/paulnorthrop/exdex/issues |
LinkingTo: | Rcpp, RcppArmadillo |
Config/testthat/edition: | 3 |
NeedsCompilation: | yes |
Packaged: | 2023-12-02 16:17:24 UTC; Paul |
Author: | Paul J. Northrop [aut, cre, cph], Constantinos Christodoulides [aut, cph] |
Maintainer: | Paul J. Northrop <p.northrop@ucl.ac.uk> |
Repository: | CRAN |
Date/Publication: | 2023-12-02 16:50:02 UTC |
exdex: Estimation of the Extremal Index
Description
The extremal index \theta
is a measure of the degree of local
dependence in the extremes of a stationary process. The exdex
package performs frequentist inference about \theta
using the
methodologies proposed in Northrop (2015), Berghaus and Bucher (2018),
Suveges (2007), Suveges and Davison (2010) and Holesovsky and Fusek (2020).
Details
Functions to implement four estimators of the extremal index are provided, namely
-
spm
: semiparametric maxima estimator, using block maxima: (Northrop, 2015; Berghaus and Bucher, 2018) -
kgaps
:K
-gaps estimator, using threshold inter-exceedance times (Suveges and Davison, 2010) -
dgaps
:D
-gaps estimator, using threshold inter-exceedance times (Holesovsky and Fusek, 2020)) -
iwls
: iterated weighted least squares estimator, using threshold inter-exceedance times: (Suveges, 2007)
The functions choose_b
, choose_uk
and
choose_ud
provide graphical diagnostics for choosing the
respective tuning parameters of the semiparametric maxima, K
-gaps and
D
-gaps estimators.
For the K
-gaps and D
-gaps models the 'exdex' package allows
missing values in the data, can accommodate independent subsets of data,
such as monthly or seasonal time series from different years, and can
incorporate information from censored inter-exceedance times.
See vignette("exdex-vignette", package = "exdex")
for an
overview of the package.
Author(s)
Maintainer: Paul J. Northrop p.northrop@ucl.ac.uk [copyright holder]
Authors:
Constantinos Christodoulides [copyright holder]
References
Berghaus, B., Bucher, A. (2018) Weak convergence of a pseudo maximum likelihood estimator for the extremal index. Ann. Statist. 46(5), 2307-2335. doi:10.1214/17-AOS1621
Holesovsky, J. and Fusek, M. Estimation of the extremal index using censored distributions. Extremes 23, 197-213 (2020). doi:10.1007/s10687-020-00374-3
Northrop, P. J. (2015) An efficient semiparametric maxima estimator of the extremal index. Extremes 18(4), 585-603. doi:10.1007/s10687-015-0221-5
Suveges, M. (2007) Likelihood estimation of the extremal index. Extremes, 10, 41-55. doi:10.1007/s10687-007-0034-2
Suveges, M. and Davison, A. C. (2010) Model misspecification in peaks over threshold analysis, Annals of Applied Statistics, 4(1), 203-221. doi:10.1214/09-AOAS292
See Also
spm
: semiparametric maxima estimator.
kgaps
: K
-gaps estimator.
dgaps
: D
-gaps estimator.
iwls
: iterated weighted least squares estimator.
choose_b
, choose_ud
and
choose_ud
for choosing tuning parameters.
newlyn
, sp500
and
cheeseboro
for example datasets.
Sliding and disjoint block maxima
Description
Calculates the (sliding) maxima of all blocks of b
contiguous values
and all sets of the maxima of disjoint blocks of b
contiguous values
in the vector x
. This provides the first step of computations in
spm
.
Usage
all_max_rcpp(x, b = 1, which_dj = c("all", "first", "last"), ...)
Arguments
x |
A numeric vector of raw observations. |
b |
A numeric scalar. The block size. |
which_dj |
A character scalar. Determines Which sets of disjoint
maxima are calculated: |
... |
Further arguments to be passed to
|
Details
Sliding maxima. The function
roll_max
in the RcppRoll
package is used.
Disjoint maxima. If n = length(x)
is an integer
multiple of b
, or if which_dj = "first"
or
which_dj = "last"
then only one set of n / b
disjoint
block maxima are returned.
Otherwise, n - floor(n / b) * b + 1
sets of floor(n / b)
disjoint block maxima are returned. Set i
are the disjoint maxima
of x[i:(i + floor(n / b) * b - 1)]
. That is, all possible sets
of contiguous disjoint maxima achieving the maxima length of
floor(n / b)
are calculated.
In both instances na.rm = TRUE
is passed to max
so
that blocks containing missing values produce a non-missing result.
Also returned are the values in x
that contribute to each set
of block maxima.
Value
A list containing
ys |
a numeric vector containing one set of sliding block maxima. |
xs |
a numeric vector containing the values that
contribute to |
yd |
if |
xd |
if |
See Also
spm
for semiparametric estimation of the
extremal index based on block maxima.
Examples
x <- 1:11
all_max_rcpp(x, 3)
all_max_rcpp(x, 3, which_dj = "first")
all_max_rcpp(x, 3, which_dj = "last")
Cheeseboro hourly maximum wind gusts
Description
The matrix cheeseboro
contains hourly maximum wind gusts (in miles
per hour) recorded at the Cheeseboro weather station near Thousand Oaks,
Southern California, USA during the month of January over the period
2000-2009. These data are analysed in Reich and Shaby (2016).
Usage
cheeseboro
Format
A 744 by 10 numeric matrix. Column i
contains the hourly
maximum wind gusts (in miles per hour) from Cheeseboro in the year
2000 + i
- 1. The columns are named 2000, 2001, ..., 2009 and the
rows are named dayj
hourk
, where j
is the day of the
month and k
the hour of the day.
Note
There are 42 missing values, located in 6 of the 10 years, namely 2000-2003 and 2005-2006.
Source
The Remote Automated Weather Stations USA Climate Archive at https://raws.dri.edu/, more specifically the Daily Summaries of the Cheeseboro page.
References
Reich, B. J. and Shaby, B. A. (2016). 'Time series of Extremes', in Dey, D. K. and Yan, J. (eds.) Extreme Value Modeling and Risk Analysis. New York: Chapman and Hall/CRC, pp. 131-151.
Block length diagnostic for the semiparametric maxima estimator
Description
Creates data for a plot to aid the choice of the block length b
to
supply to spm
. The general idea is to select the smallest
value of b
above which estimates of the extremal index \theta
appear to be constant with respect to b
, taking into account sampling
variability. plot.choose_b
creates the plot.
Usage
choose_b(
data,
b,
bias_adjust = c("BB3", "BB1", "N", "none"),
constrain = TRUE,
varN = TRUE,
level = 0.95,
interval_type = c("norm", "lik"),
conf_scale = c("theta", "log"),
type = c("vertical", "cholesky", "spectral", "none")
)
Arguments
data |
A numeric vector of raw data. No missing values are allowed. |
b |
A numeric scalar. The block size. |
bias_adjust |
A character scalar. Is bias-adjustment of the
raw estimate of |
constrain |
A logical scalar. If |
varN |
A logical scalar. If |
level |
A numeric scalar in (0, 1). The confidence level required. |
interval_type |
A character scalar: |
conf_scale |
A character scalar. If If Any bias-adjustment requested in the original call to |
type |
A character scalar. The argument |
Details
For each block size in b
the extremal index \theta
is estimated using spm
. The estimates of \theta
approximate conf
% confidence intervals for \theta
are
stored for plotting (by plot.choose_b
)
to produce a simple graphical diagnostic to inform the choice of
block size. This plot is used to choose a block size above which the
underlying value of \theta
may be approximately constant.
This is akin to a threshold stability plot: see Chapter 4 of Coles (2001),
for example.
The nature of the calculation of the sampling variances of the estimates
of \theta
(see spm
for details) means that
choose_b
may be a little slow to run if b
contains many
values, particularly if some of them are small.
For very small block sizes it may not be possible to estimate the
confidence intervals. See Details in spm
.
For any such block sizes the intervals will be missing from the plot.
Value
An object of class c("choose_b", "exdex")
containing
theta_sl , theta_dj |
numeric |
lower_sl , lower_dj |
Similarly for the lower limits of the confidence intervals. |
upper_sl , upper_dj |
Similarly for the upper limits of the confidence intervals. |
b |
the input |
call |
the call to |
References
Coles, S. G. (2001) An Introduction to Statistical Modeling of Extreme Values, Springer-Verlag, London. doi:10.1007/978-1-4471-3675-0_3
Northrop, P. J. (2015) An efficient semiparametric maxima estimator of the extremal index. Extremes 18(4), 585-603. doi:10.1007/s10687-015-0221-5
Berghaus, B., Bucher, A. (2018) Weak convergence of a pseudo maximum likelihood estimator for the extremal index. Ann. Statist. 46(5), 2307-2335. doi:10.1214/17-AOS1621
See Also
plot.choose_b
to produce the block length diagnostic
plot.
Examples
# Newlyn seas surges
# Plot like the top left of Northrop (2015)
# Remove the last 14 values because 2880 has lots of factors
b_vals <- c(2,3,4,5,6,8,9,10,12,15,16,18,20,24,30,32,36,40,45,48,54,60)
res <- choose_b(newlyn[1:2880], b_vals)
# Some b are too small for the sampling variance of the sliding blocks
# estimator to be estimated
plot(res)
plot(res, estimator = "BB2018")
plot(res, maxima = "disjoint")
# S&P 500 index: similar to Berghaus and Bucher (2018), Fig 4 top left
b_vals <- c(10, seq(from = 25, to = 350, by = 25), 357)
res500 <- choose_b(sp500, b_vals)
plot(res500, ylim = c(0, 1))
plot(res500, estimator = "BB2018", ylim = c(0, 1))
Threshold u
and runs parameter D
diagnostic for the D
-gaps
estimator
Description
Creates data for a plot to aid the choice of the threshold and
run parameter D
for the D
-gaps estimator (see
dgaps
). plot.choose_ud
creates the plot.
Usage
choose_ud(data, u, D = 1, inc_cens = TRUE)
Arguments
data |
A numeric vector or numeric matrix of raw data. If If |
u , D |
Numeric vectors. Any values in |
inc_cens |
A logical scalar indicating whether or not to include
contributions from right-censored inter-exceedance times, relating to the
first and last observations. It is known that these times are greater
than or equal to the time observed.
If |
Details
For each combination of threshold in u
and D
in D
the functions dgaps
and dgaps_imt
are called in order to estimate \theta
and to perform the
information matrix test of Holesovsky and Fusek (2020).
Value
An object (a list) of class c("choose_ud", "exdex")
containing
imt |
an object of class |
theta |
a |
References
Holesovsky, J. and Fusek, M. Estimation of the extremal index using censored distributions. Extremes 23, 197-213 (2020). doi:10.1007/s10687-020-00374-3
See Also
dgaps
for maximum likelihood estimation of the
extremal index \theta
using the D
-gaps model.
dgaps_imt
for the information matrix test under the
D
-gaps model
plot.choose_ud
to produce the diagnostic plot.
Examples
### S&P 500 index
# Multiple thresholds and left-censoring parameters
u <- quantile(sp500, probs = seq(0.2, 0.9, by = 0.1))
imt_theta <- choose_ud(sp500, u = u, D = 1:5)
plot(imt_theta)
plot(imt_theta, uprob = TRUE)
plot(imt_theta, y = "theta")
# One left-censoring parameter D, many thresholds u
u <- quantile(sp500, probs = seq(0.2, 0.9, by = 0.1))
imt_theta <- choose_ud(sp500, u = u, D = 1)
plot(imt_theta)
plot(imt_theta, y = "theta")
# One threshold u, many left-censoring parameters D
u <- quantile(sp500, probs = 0.9)
imt_theta <- choose_ud(sp500, u = u, D = 1:5)
plot(imt_theta)
plot(imt_theta, y = "theta")
### Newlyn sea surges
u <- quantile(newlyn, probs = seq(0.1, 0.9, by = 0.1))
imt_theta <- choose_ud(newlyn, u = u, D = 1:5)
plot(imt_theta, uprob = TRUE)
### Cheeseboro wind gusts (a matrix containing some NAs)
probs <- c(seq(0.5, 0.95, by = 0.05), 0.99)
u <- quantile(cheeseboro, probs = probs, na.rm = TRUE)
imt_theta <- choose_ud(cheeseboro, u, D = 1:6)
plot(imt_theta, uprob = FALSE, lwd = 2)
### Uccle July temperatures
probs <- c(seq(0.7, 0.95, by = 0.05), 0.99)
u <- quantile(uccle720m, probs = probs, na.rm = TRUE)
imt_theta <- choose_ud(uccle720m, u, D = 1:5)
plot(imt_theta, uprob = TRUE, lwd = 2)
Threshold u
and runs parameter K
diagnostic for the K
-gaps
estimator
Description
Creates data for a plot to aid the choice of the threshold and
run parameter K
for the K
-gaps estimator (see
kgaps
). plot.choose_uk
creates the plot.
Usage
choose_uk(data, u, k = 1, inc_cens = TRUE)
Arguments
data |
A numeric vector or numeric matrix of raw data. If If |
u , k |
Numeric vectors. Any values in |
inc_cens |
A logical scalar indicating whether or not to include contributions from censored inter-exceedance times, relating to the first and last observations. See Attalides (2015) for details. |
Details
For each combination of threshold in u
and K
in k
the functions kgaps
and kgaps_imt
are called in order to estimate \theta
and to perform the
information matrix test of Suveges and Davison (2010).
Value
An object (a list) of class c("choose_uk", "exdex")
containing
imt |
an object of class |
theta |
a |
References
Suveges, M. and Davison, A. C. (2010) Model misspecification in peaks over threshold analysis, Annals of Applied Statistics, 4(1), 203-221. doi:10.1214/09-AOAS292
Attalides, N. (2015) Threshold-based extreme value modelling, PhD thesis, University College London. https://discovery.ucl.ac.uk/1471121/1/Nicolas_Attalides_Thesis.pdf
See Also
kgaps
for maximum likelihood estimation of the
extremal index \theta
using the K
-gaps model.
kgaps_imt
for the information matrix test under the
K
-gaps model
plot.choose_uk
to produce the diagnostic plot.
Examples
### S&P 500 index
# Multiple thresholds and run parameters
u <- quantile(sp500, probs = seq(0.1, 0.9, by = 0.1))
imt_theta <- choose_uk(sp500, u = u, k = 1:5)
plot(imt_theta)
plot(imt_theta, uprob = TRUE)
plot(imt_theta, y = "theta")
# One run parameter K, many thresholds u
u <- quantile(sp500, probs = seq(0.1, 0.9, by = 0.1))
imt_theta <- choose_uk(sp500, u = u, k = 1)
plot(imt_theta)
plot(imt_theta, y = "theta")
# One threshold u, many run parameters K
u <- quantile(sp500, probs = 0.9)
imt_theta <- choose_uk(sp500, u = u, k = 1:5)
plot(imt_theta)
plot(imt_theta, y = "theta")
### Newlyn sea surges
u <- quantile(newlyn, probs = seq(0.1, 0.9, by = 0.1))
imt_theta <- choose_uk(newlyn, u = u, k = 1:5)
plot(imt_theta, uprob = TRUE)
### Cheeseboro wind gusts (a matrix containing some NAs)
probs <- c(seq(0.5, 0.95, by = 0.05), 0.99)
u <- quantile(cheeseboro, probs = probs, na.rm = TRUE)
imt_theta <- choose_uk(cheeseboro, u, k = 1:6)
plot(imt_theta, uprob = FALSE, lwd = 2)
### Uccle July temperatures
probs <- c(seq(0.7, 0.95, by = 0.05), 0.99)
u <- quantile(uccle720m, probs = probs, na.rm = TRUE)
imt_theta <- choose_uk(uccle720m, u, k = 1:5)
plot(imt_theta, uprob = TRUE, lwd = 2)
Maximum likelihood estimation using left-censored inter-exceedances times
Description
Calculates maximum likelihood estimates of the extremal index \theta
based on a model for threshold inter-exceedances times of
Holesovsky and Fusek (2020). We refer to this as the D
-gaps model,
because it uses a tuning parameter D
, whereas the related K
-gaps
model of Suveges and Davison (2010) has a tuning parameter K
.
Usage
dgaps(data, u, D = 1, inc_cens = TRUE)
Arguments
data |
A numeric vector or numeric matrix of raw data. If If |
u |
A numeric scalar. Extreme value threshold applied to data. |
D |
A numeric scalar. The censoring parameter |
inc_cens |
A logical scalar indicating whether or not to include
contributions from right-censored inter-exceedance times, relating to the
first and last observations. It is known that these times are greater
than or equal to the time observed.
If |
Details
If inc_cens = FALSE
then the maximum likelihood estimate of
the extremal index \theta
under the D
-gaps model of
Holesovsky and Fusek (2020) is calculated. Under this model
inter-exceedance times that are less than or equal to D
are
left-censored, as a strategy to mitigate model mis-specification resulting
from the fact that inter-exceedance times that are equal to 0 are expected
under the model but only positive inter-exceedance times can be observed
in practice.
If inc_cens = TRUE
then information from the right-censored
first and last inter-exceedance times are also included in the likelihood
to be maximized.
For an explanation of the idea see Attalides (2015). The form of the
log-likelihood is given in the Details section of
dgaps_stat
.
It is possible that the estimate of \theta
is equal to 1, and also
possible that it is equal to 0. dgaps_stat
explains the
respective properties of the data that cause these events to occur.
Value
An object (a list) of class c("dgaps", "exdex")
containing
theta |
The maximum likelihood estimate (MLE) of
|
se |
The estimated standard error of the MLE, calculated
using an algebraic expression for the observed information. If the
estimate of |
se_exp |
The estimated standard error of the MLE,
calculated using an algebraic expression for the expected information.
If the estimate of |
ss |
The list of summary statistics returned from
|
D , u , inc_cens |
The input values of |
max_loglik |
The value of the log-likelihood at the MLE. |
call |
The call to |
References
Holesovsky, J. and Fusek, M. Estimation of the extremal index using censored distributions. Extremes 23, 197-213 (2020). doi:10.1007/s10687-020-00374-3
Suveges, M. and Davison, A. C. (2010) Model misspecification in peaks over threshold analysis, Annals of Applied Statistics, 4(1), 203-221. doi:10.1214/09-AOAS292
See Also
dgaps_confint
to estimate confidence intervals
for \theta
.
dgaps_methods
for S3 methods for "dgaps"
objects.
dgaps_imt
for the information matrix test, which
may be used to inform the choice of the pair (u, D
).
choose_ud
for a diagnostic plot based on
dgaps_imt
.
dgaps_stat
for the calculation of sufficient
statistics for the D
-gaps model.
Examples
### S&P 500 index
u <- quantile(sp500, probs = 0.60)
theta <- dgaps(sp500, u = u, D = 1)
theta
summary(theta)
coef(theta)
nobs(theta)
vcov(theta)
logLik(theta)
### Newlyn sea surges
u <- quantile(newlyn, probs = 0.60)
theta <- dgaps(newlyn, u = u, D = 2)
theta
summary(theta)
### Uccle July temperatures
# Using vector input, which merges data from different years
u <- quantile(uccle720$temp, probs = 0.9, na.rm = TRUE)
theta <- dgaps(uccle720$temp, u = u, D = 2)
theta
# Using matrix input to separate data from different years
u <- quantile(uccle720m, probs = 0.9, na.rm = TRUE)
theta <- dgaps(uccle720m, u = u, D = 2)
theta
Confidence intervals for the extremal index \theta
for "dgaps"
objects
Description
confint
method for objects of class c("dgaps", "exdex")
.
Computes confidence intervals for \theta
based on an object returned
from dgaps
. Two types of interval may be returned:
(a) intervals based on approximate large-sample normality of the estimator
of \theta
, which are symmetric about the point estimate,
and (b) likelihood-based intervals. The plot
method plots the
log-likelihood for \theta
, with the required confidence interval
indicated on the plot.
Usage
## S3 method for class 'dgaps'
confint(
object,
parm = "theta",
level = 0.95,
interval_type = c("both", "norm", "lik"),
conf_scale = c("theta", "log"),
constrain = TRUE,
se_type = c("observed", "expected"),
...
)
## S3 method for class 'confint_dgaps'
plot(x, ...)
## S3 method for class 'confint_dgaps'
print(x, ...)
Arguments
object |
An object of class |
parm |
Specifies which parameter is to be given a confidence interval.
Here there is only one option: the extremal index |
level |
The confidence level required. A numeric scalar in (0, 1). |
interval_type |
A character scalar: |
conf_scale |
A character scalar. If If |
constrain |
A logical scalar. If |
se_type |
A character scalar. Should the confidence intervals for the
|
... |
|
x |
an object of class |
Details
Two type of interval are calculated: (a) an interval based on the
approximate large sample normality of the estimator of \theta
(if conf_scale = "theta"
) or of \log\theta
(if conf_scale = "log"
) and (b) a likelihood-based interval,
based on the approximate large sample chi-squared, with 1 degree of
freedom, distribution of the log-likelihood ratio statistic.
print.confint_dgaps
prints the matrix of confidence
intervals for \theta
.
Value
A list of class c("confint_dgaps", "exdex") containing the following components.
cis |
A matrix with columns giving the lower and upper confidence
limits. These are labelled as (1 - level)/2 and 1 - (1 - level)/2 in
% (by default 2.5% and 97.5%).
The row names indicate the type of interval:
|
call |
The call to |
object |
The input object |
level |
The input |
plot.confint_dgaps
: nothing is returned.
print.confint_dgaps
: the argument x
, invisibly.
References
Holesovsky, J. and Fusek, M. Estimation of the extremal index using censored distributions. Extremes 23, 197-213 (2020). doi:10.1007/s10687-020-00374-3
See Also
dgaps
for estimation of the extremal index
\theta
using a semiparametric maxima method.
Examples
u <- quantile(newlyn, probs = 0.90)
theta <- dgaps(newlyn, u)
cis <- confint(theta)
cis
plot(cis)
Information matrix test under the D
-gaps model
Description
Performs an information matrix test (IMT) to diagnose misspecification of
the D
-gaps model of Holesovsky and Fusek (2020).
Usage
dgaps_imt(data, u, D = 1, inc_cens = TRUE)
Arguments
data |
A numeric vector or numeric matrix of raw data. If If |
u , D |
Numeric vectors. Any values in |
inc_cens |
A logical scalar indicating whether or not to include
contributions from right-censored inter-exceedance times, relating to the
first and last observations. See |
Details
The general approach follows Suveges and Davison (2010).
The D
-gaps IMT is performed a over grid of all
combinations of threshold and D
in the vectors u
and D
. If the estimate of \theta
is 0 then the
IMT statistic, and its associated p
-value is NA
.
Value
An object (a list) of class c("dgaps_imt", "exdex")
containing
imt |
A |
p |
A |
theta |
A |
u , D |
The input |
References
Holesovsky, J. and Fusek, M. Estimation of the extremal index using censored distributions. Extremes 23, 197-213 (2020). doi:10.1007/s10687-020-00374-3
Suveges, M. and Davison, A. C. (2010) Model misspecification in peaks over threshold analysis, Annals of Applied Statistics, 4(1), 203-221. doi:10.1214/09-AOAS292
See Also
dgaps
for maximum likelihood estimation of the
extremal index \theta
using the D
-gaps model.
Examples
### Newlyn sea surges
u <- quantile(newlyn, probs = seq(0.1, 0.9, by = 0.1))
imt <- dgaps_imt(newlyn, u = u, D = 1:5)
### S&P 500 index
u <- quantile(sp500, probs = seq(0.1, 0.9, by = 0.1))
imt <- dgaps_imt(sp500, u = u, D = 1:5)
### Cheeseboro wind gusts (a matrix containing some NAs)
probs <- c(seq(0.5, 0.98, by = 0.025), 0.99)
u <- quantile(cheeseboro, probs = probs, na.rm = TRUE)
imt <- dgaps_imt(cheeseboro, u = u, D = 1:5)
### Uccle July temperatures
probs <- c(seq(0.7, 0.98, by = 0.025), 0.99)
u <- quantile(uccle720m, probs = probs, na.rm = TRUE)
imt <- dgaps_imt(uccle720m, u = u, D = 1:5)
Statistics for the D
-gaps information matrix test
Description
Calculates the components required to calculate the value of the information
matrix test under the D
-gaps model, using vector data input.
Called by dgaps_imt
.
Usage
dgaps_imt_stat(data, theta, u, D = 1, inc_cens = TRUE)
Arguments
data |
A numeric vector of raw data. Missing values are allowed, but
they should not appear between non-missing values, that is, they only be
located at the start and end of the vector. Missing values are omitted
using |
theta |
A numeric scalar. An estimate of the extremal index
|
u |
A numeric scalar. Extreme value threshold applied to data. |
D |
A numeric scalar. The censoring parameter |
inc_cens |
A logical scalar indicating whether or not to include
contributions from right-censored inter-exceedance times, relating to the
first and last observations. See |
Value
A list
relating the quantities given on pages 18-19 of
Suveges and Davison (2010). All but the last component are vectors giving
the contribution to the quantity from each D
-gap, evaluated at the
input value theta
of \theta
.
ldj |
the derivative of the log-likelihood with respect to
|
Ij |
the observed information |
Jj |
the square of the score |
dj |
|
Ddj |
the derivative of |
n_dgaps |
the number of |
References
Holesovsky, J. and Fusek, M. Estimation of the extremal index using censored distributions. Extremes 23, 197-213 (2020). doi:10.1007/s10687-020-00374-3
Methods for objects of class "dgaps"
Description
Methods for objects of class c("dgaps", "exdex")
returned from
dgaps
.
Usage
## S3 method for class 'dgaps'
coef(object, ...)
## S3 method for class 'dgaps'
vcov(object, type = c("observed", "expected"), ...)
## S3 method for class 'dgaps'
nobs(object, ...)
## S3 method for class 'dgaps'
logLik(object, ...)
## S3 method for class 'dgaps'
print(x, digits = max(3L, getOption("digits") - 3L), ...)
## S3 method for class 'dgaps'
summary(
object,
se_type = c("observed", "expected"),
digits = max(3, getOption("digits") - 3L),
...
)
## S3 method for class 'summary.dgaps'
print(x, ...)
Arguments
object |
and object of class |
... |
For |
type |
A character scalar. Should the estimate of the variance be based on the observed information or the expected information? |
x |
|
digits |
|
se_type |
A character scalar. Should the estimate of the standard error be based on the observed information or the expected information? |
Value
coef.dgaps
. A numeric scalar: the estimate of the extremal index
\theta
.
vcov.dgaps
. A 1 \times 1
numeric matrix containing the
estimated variance of the estimator.
nobs.dgaps
. A numeric scalar: the number of inter-exceedance times
used in the fit. If x$inc_cens = TRUE
then this includes up to 2
censored observations.
logLik.dgaps
. An object of class "logLik"
: a numeric scalar
with value equal to the maximised log-likelihood. The returned object
also has attributes nobs
, the numbers of K
-gaps that
contribute to the log-likelihood and "df"
, which is equal to the
number of total number of parameters estimated (1).
print.dgaps
. The argument x
, invisibly.
summary.dgaps
. Returns a list containing the list element
object$call
and a numeric matrix summary
giving the estimate
of the extremal index \theta
and the estimated standard error
(Std. Error).
print.summary.dgaps
. The argument x
, invisibly.
Examples
See the examples in dgaps
.
See Also
dgaps
for maximum likelihood estimation of the
extremal index \theta
using the K
-gaps model.
confint.dgaps
for confidence intervals for
\theta
.
Sufficient statistics for the left-censored inter-exceedances time model
Description
Calculates sufficient statistics for the the left-censored inter-exceedances
time D
-gaps model for the extremal index \theta
.
Usage
dgaps_stat(data, u, q_u, D = 1, inc_cens = TRUE)
Arguments
data |
A numeric vector of raw data. No missing values are allowed. |
u |
A numeric scalar. Extreme value threshold applied to data. |
q_u |
A numeric scalar. An estimate of the probability with which
the threshold |
D |
A numeric scalar. Run parameter |
inc_cens |
A logical scalar indicating whether or not to include contributions from right-censored inter-exceedance times relating to the first and last observation. It is known that these times are greater than or equal to the time observed. See Attalides (2015) for details. |
Details
The sample inter-exceedance times are
T_0, T_1, ..., T_{N-1}, T_N
,
where T_1, ..., T_{N-1}
are uncensored and
T_0
and T_N
are right-censored. Under the assumption that the
inter-exceedance times are independent, the log-likelihood of the
D
-gaps model is given by
l(\theta; T_0, \ldots, T_N) = N_0 \log(1 - \theta e^{-\theta d}) +
2 N_1 \log \theta - \theta q (I_0 T_0 + \cdots + I_N T_N),
where
-
q
is the threshold exceedance probability, estimated by the proportion of threshold exceedances, -
d = q D
, -
I_j = 1
ifT_j > D
andI_j = 0
otherwise, -
N_0
is the number of sample inter-exceedance times that are left-censored, that is, are less than or equal toD
, (apart from an adjustment for the contributions of
T_0
andT_N
)N_1
is the number of inter-exceedance times that are uncensored, that is, are greater thanD
,specifically, if
inc_cens = TRUE
thenN_1
is equal to the number ofT_1, ..., T_{N-1}
that are uncensored plus(I_0 + I_N) / 2
.
The differing treatment of uncensored and censored K
-gaps reflects
differing contributions to the likelihood. Right-censored
inter-exceedance times whose observed values are less than or equal to
D
add no information to the likelihood because we do not know to
which part of the likelihood they should contribute.
If N_1 = 0
then we are in the degenerate case where there is one
cluster (all inter-exceedance times are left-censored) and the likelihood
is maximized at \theta = 0
.
If N_0 = 0
then all exceedances occur singly (no inter-exceedance
times are left-censored) and the likelihood is maximized at
\theta = 1
.
Value
A list containing the sufficient statistics, with components
N0 |
the number of left-censored inter-exceedance times. |
N1 |
contribution from inter-exceedance times that are not left-censored (see Details). |
sum_qtd |
the sum of the (scaled) inter-exceedance times
that are not left-censored, that is,
|
n_dgaps |
the number of inter-exceedances that contribute to the log-likelihood. |
q_u |
the sample proportion of values that exceed the threshold. |
D |
the input value of |
References
Holesovsky, J. and Fusek, M. Estimation of the extremal index using censored distributions. Extremes 23, 197-213 (2020). doi:10.1007/s10687-020-00374-3
Attalides, N. (2015) Threshold-based extreme value modelling, PhD thesis, University College London. https://discovery.ucl.ac.uk/1471121/1/Nicolas_Attalides_Thesis.pdf
See Also
dgaps
for maximum likelihood estimation of the
extremal index \theta
using the D
-gaps model.
Examples
u <- quantile(newlyn, probs = 0.90)
dgaps_stat(newlyn, u = u, D = 1)
Internal exdex functions
Description
Internal exdex functions
Usage
spm_check(
data,
b,
sliding = TRUE,
bias_adjust = c("BB3", "BB1", "N", "none"),
constrain = TRUE,
varN = TRUE,
which_dj = c("last", "first")
)
spm_sigmahat_dj(data, b, check = FALSE)
spm_R_quick(
data,
b,
bias_adjust = c("BB3", "BB1", "N", "none"),
constrain = TRUE,
varN = TRUE,
which_dj = c("last", "first")
)
ests_sigmahat_dj(all_max, b, which_dj, bias_adjust)
log0const_slow(x, const)
log0const(x, const)
ecdf3(x, y)
ecdf2(x, y)
ecdf1(x, y, lenx)
sliding_maxima(x, b = 1)
disjoint_maxima(x, b = 1, which_dj = c("first", "last"))
all_disjoint_maxima(x, b = 1, which_dj = c("all", "first", "last"))
all_maxima(x, b = 1, which_dj = c("all", "first", "last"))
kgaps_loglik(theta, N0, N1, sum_qs, n_kgaps)
kgaps_conf_int(theta_mle, ss, conf = 95)
kgaps_quad_solve(N0, N1, sum_qs)
kgaps_imt_old(data, u, k = 1)
kgaps_exp_info(theta, ss, inc_cens)
dgaps_loglik(theta, N0, N1, sum_qtd, n_dgaps, q_u, D)
g_theta(theta, q_u, D)
gd_theta(theta, q_u, D)
gdd_theta(theta, q_u, D)
gddd_theta(theta, q_u, D)
dgaps_exp_info(theta, ss, inc_cens)
dgaps_conf_int(theta_mle, ss, conf = 95)
iwls_fun(n_wls, N, S_1_sort, exp_qs, ws, nx)
Details
These functions are not intended to be called by the user.
Iterated weighted least squares estimation of the extremal index
Description
Estimates the extremal index \theta
using the iterated weighted least
squares method of Suveges (2007). At the moment no estimates of
uncertainty are provided.
Usage
iwls(data, u, maxit = 100)
Arguments
data |
A numeric vector of raw data. No missing values are allowed. |
u |
A numeric scalar. Extreme value threshold applied to data. |
maxit |
A numeric scalar. The maximum number of iterations. |
Details
The iterated weighted least squares algorithm on page 46 of
Suveges (2007) is used to estimate the value of the extremal index.
This approach uses the time gaps between successive exceedances
in the data data
of the threshold u
. The i
th
gap is defined as T_i - 1
, where T_i
is the difference in
the occurrence time of exceedance i
and exceedance i + 1
.
Therefore, threshold exceedances at adjacent time points produce a gap
of zero.
The model underlying this approach is an exponential-point mas mixture
for scaled gaps, that is, gaps multiplied by the proportion of
values in data
that exceed u
. Under this model
scaled gaps are zero (‘within-cluster’ inter-exceedance times) with
probability 1 - \theta
and otherwise (‘between-cluster’
inter-exceedance times) follow an exponential distribution with mean
1 / \theta
.
The estimation method is based on fitting the ‘broken stick’ model of
Ferro (2003) to an exponential quantile-quantile plot of all of the
scaled gaps. Specifically, the broken stick is a horizontal line
and a line with gradient 1 / \theta
which intersect at
(-\log\theta, 0)
. The algorithm on page 46 of
Suveges (2007) uses a weighted least squares minimization applied to
the exponential
part of this model to seek a compromise between the role of \theta
as the proportion of inter-exceedance times that are between-cluster
and the reciprocal of the mean of an exponential distribution for these
inter-exceedance times. The weights (see Ferro (2003)) are based on the
variances of order statistics of a standard exponential sample: larger
order statistics have larger sampling variabilities and therefore
receive smaller weight than smaller order statistics.
Note that in step (1) of the algorithm on page 46 of Suveges there is a
typo: N_c + 1
should be N
, where N
is the number of
threshold exceedances. Also, the gaps are scaled as detailed above,
not by their mean.
Value
An object (a list) of class "iwls", "exdex"
containing
theta |
The estimate of |
conv |
A convergence indicator: 0 indicates successful
convergence; 1 indicates that |
niter |
The number of iterations performed. |
n_gaps |
The number of time gaps between successive exceedances. |
call |
The call to |
References
Suveges, M. (2007) Likelihood estimation of the extremal index. Extremes, 10, 41-55. doi:10.1007/s10687-007-0034-2
Ferro, C.A.T. (2003) Statistical methods for clusters of extreme values. Ph.D. thesis, Lancaster University.
See Also
iwls_methods
for S3 methods for "iwls"
objects.
Examples
### S&P 500 index
u <- quantile(sp500, probs = 0.60)
theta <- iwls(sp500, u)
theta
coef(theta)
nobs(theta)
### Newlyn sea surges
u <- quantile(newlyn, probs = 0.90)
theta <- iwls(newlyn, u)
theta
Methods for objects of class "iwls"
Description
Methods for objects of class c("iwls", "exdex")
returned from
iwls
.
Usage
## S3 method for class 'iwls'
coef(object, ...)
## S3 method for class 'iwls'
nobs(object, ...)
## S3 method for class 'iwls'
print(x, digits = max(3L, getOption("digits") - 3L), ...)
Arguments
object |
and object of class |
... |
Further arguments. None are used. |
x |
an object of class |
digits |
The argument |
Details
print.iwls
prints the original call to iwls
and the estimate of the extremal index \theta
.
Value
coef.iwls
. A numeric scalar: the estimate of the extremal index
\theta
.
nobs.iwls
. A numeric scalar: the number of inter-exceedance times
used in the fit.
print.iwls
. The argument x
, invisibly.
Examples
See the examples in iwls
.
See Also
iwls
for maximum likelihood estimation of the
extremal index \theta
using the K
-gaps model.
Maximum likelihood estimation for the K
-gaps model
Description
Calculates maximum likelihood estimates of the extremal index \theta
based on the K
-gaps model for threshold inter-exceedances times of
Suveges and Davison (2010).
Usage
kgaps(data, u, k = 1, inc_cens = TRUE)
Arguments
data |
A numeric vector or numeric matrix of raw data. If If |
u |
A numeric scalar. Extreme value threshold applied to data. |
k |
A non-negative numeric scalar. Run parameter |
inc_cens |
A logical scalar indicating whether or not to include
contributions from right-censored inter-exceedance times, relating to the
first and last observations. It is known that these times are greater
than or equal to the time observed. See Attalides (2015) for details.
If |
Details
If inc_cens = FALSE
then the maximum likelihood estimate of
the extremal index \theta
under the K
-gaps model of
Suveges and Davison (2010) is calculated.
If inc_cens = TRUE
then information from right-censored
first and last inter-exceedance times is also included in the likelihood
to be maximized, following Attalides (2015). The form of the
log-likelihood is given in the Details section of
kgaps_stat
.
It is possible that the estimate of \theta
is equal to 1, and also
possible that it is equal to 0. kgaps_stat
explains the
respective properties of the data that cause these events to occur.
Value
An object (a list) of class c("kgaps", "exdex")
containing
theta |
The maximum likelihood estimate (MLE) of
|
se |
The estimated standard error of the MLE, calculated
using an algebraic expression for the observed information.
If |
se_exp |
The estimated standard error of the MLE,
calculated using an algebraic expression for the expected information.
If the estimate of |
ss |
The list of summary statistics returned from
|
k , u , inc_cens |
The input values of |
max_loglik |
The value of the log-likelihood at the MLE. |
call |
The call to |
References
Suveges, M. and Davison, A. C. (2010) Model misspecification in peaks over threshold analysis, Annals of Applied Statistics, 4(1), 203-221. doi:10.1214/09-AOAS292
Attalides, N. (2015) Threshold-based extreme value modelling, PhD thesis, University College London. https://discovery.ucl.ac.uk/1471121/1/Nicolas_Attalides_Thesis.pdf
See Also
kgaps_confint
to estimate confidence intervals
for \theta
.
kgaps_methods
for S3 methods for "kgaps"
objects.
kgaps_imt
for the information matrix test, which
may be used to inform the choice of the pair (u, k
).
choose_uk
for a diagnostic plot based on
kgaps_imt
.
kgaps_stat
for the calculation of sufficient
statistics for the K
-gaps model.
kgaps_post
in the
revdbayes
package for Bayesian inference
about \theta
using the K
-gaps model.
Examples
### S&P 500 index
u <- quantile(sp500, probs = 0.60)
theta <- kgaps(sp500, u)
theta
summary(theta)
coef(theta)
nobs(theta)
vcov(theta)
logLik(theta)
### Newlyn sea surges
u <- quantile(newlyn, probs = 0.60)
theta <- kgaps(newlyn, u, k = 2)
theta
summary(theta)
### Cheeseboro wind gusts
theta <- kgaps(cheeseboro, 45, k = 3)
theta
summary(theta)
Confidence intervals for the extremal index \theta
for "kgaps"
objects
Description
confint
method for objects of class c("kgaps", "exdex")
.
Computes confidence intervals for \theta
based on an object returned
from kgaps
. Two types of interval may be returned:
(a) intervals based on approximate large-sample normality of the estimator
of \theta
, which are symmetric about the point estimate,
and (b) likelihood-based intervals. The plot
method plots the
log-likelihood for \theta
, with the required confidence interval
indicated on the plot.
Usage
## S3 method for class 'kgaps'
confint(
object,
parm = "theta",
level = 0.95,
interval_type = c("both", "norm", "lik"),
conf_scale = c("theta", "log"),
constrain = TRUE,
se_type = c("observed", "expected"),
...
)
## S3 method for class 'confint_kgaps'
plot(x, ...)
## S3 method for class 'confint_kgaps'
print(x, ...)
Arguments
object |
An object of class |
parm |
Specifies which parameter is to be given a confidence interval.
Here there is only one option: the extremal index |
level |
The confidence level required. A numeric scalar in (0, 1). |
interval_type |
A character scalar: |
conf_scale |
A character scalar. If If |
constrain |
A logical scalar. If |
se_type |
A character scalar. Should the confidence intervals for the
|
... |
|
x |
an object of class |
Details
Two type of interval are calculated: (a) an interval based on the
approximate large sample normality of the estimator of \theta
(if conf_scale = "theta"
) or of \log\theta
(if conf_scale = "log"
) and (b) a likelihood-based interval,
based on the approximate large sample chi-squared, with 1 degree of
freedom, distribution of the log-likelihood ratio statistic.
print.confint_kgaps
prints the matrix of confidence
intervals for \theta
.
Value
A list of class c("confint_kgaps", "exdex") containing the following components.
cis |
A matrix with columns giving the lower and upper confidence
limits. These are labelled as (1 - level)/2 and 1 - (1 - level)/2 in
% (by default 2.5% and 97.5%).
The row names indicate the type of interval:
|
call |
The call to |
object |
The input object |
level |
The input |
plot.confint_kgaps
: nothing is returned. If
x$object$k = 0
then no plot is produced.
print.confint_kgaps
: the argument x
, invisibly.
References
Suveges, M. and Davison, A. C. (2010) Model misspecification in peaks over threshold analysis, Annals of Applied Statistics, 4(1), 203-221. doi:10.1214/09-AOAS292
See Also
kgaps
for estimation of the extremal index
\theta
using a semiparametric maxima method.
Examples
u <- quantile(newlyn, probs = 0.90)
theta <- kgaps(newlyn, u)
cis <- confint(theta)
cis
plot(cis)
Information matrix test under the K
-gaps model
Description
Performs the information matrix test (IMT) of Suveges and Davison (2010) to
diagnose misspecification of the K
-gaps model.
Usage
kgaps_imt(data, u, k = 1, inc_cens = TRUE)
Arguments
data |
A numeric vector or numeric matrix of raw data. If If |
u , k |
Numeric vectors. Any values in |
inc_cens |
A logical scalar indicating whether or not to include contributions from censored inter-exceedance times, relating to the first and last observations. See Attalides (2015) for details. |
Details
The K
-gaps IMT is performed a over grid of all
combinations of threshold and K
in the vectors u
and k
. If the estimate of \theta
is 0 then the
IMT statistic, and its associated p
-value is NA
.
For details of the IMT see Suveges and Davison
(2010). There are some typing errors on pages 18-19 that have been
corrected in producing the code: the penultimate term inside {...}
in the middle equation on page 18 should be (c_j(K))^2
, as should
the penultimate term in the first equation on page 19; the {...}
bracket should be squared in the 4th equation on page 19; the factor
n
should be N-1
in the final equation on page 19.
Value
An object (a list) of class c("kgaps_imt", "exdex")
containing
imt |
A |
p |
A |
theta |
A |
u , k |
The input |
References
Suveges, M. and Davison, A. C. (2010) Model misspecification in peaks over threshold analysis, Annals of Applied Statistics, 4(1), 203-221. doi:10.1214/09-AOAS292
Attalides, N. (2015) Threshold-based extreme value modelling, PhD thesis, University College London. https://discovery.ucl.ac.uk/1471121/1/Nicolas_Attalides_Thesis.pdf
See Also
kgaps
for maximum likelihood estimation of the
extremal index \theta
using the K
-gaps model.
choose_uk
for graphical diagnostic to aid the choice
of the threshold u
and the run parameter K
.
Examples
### Newlyn sea surges
u <- quantile(newlyn, probs = seq(0.1, 0.9, by = 0.1))
imt <- kgaps_imt(newlyn, u = u, k = 1:5)
### S&P 500 index
u <- quantile(sp500, probs = seq(0.1, 0.9, by = 0.1))
imt <- kgaps_imt(sp500, u = u, k = 1:5)
### Cheeseboro wind gusts (a matrix containing some NAs)
probs <- c(seq(0.5, 0.98, by = 0.025), 0.99)
u <- quantile(cheeseboro, probs = probs, na.rm = TRUE)
imt <- kgaps_imt(cheeseboro, u = u, k = 1:5)
Statistics for the information matrix test
Description
Calculates the components required to calculate the value of the information
matrix test under the K
-gaps model, using vector data input.
Called by kgaps_imt
.
Usage
kgaps_imt_stat(data, theta, u, k = 1, inc_cens = TRUE)
Arguments
data |
A numeric vector of raw data. Missing values are allowed, but
they should not appear between non-missing values, that is, they only be
located at the start and end of the vector. Missing values are omitted
using |
theta |
A numeric scalar. An estimate of the extremal index
|
u |
A numeric scalar. Extreme value threshold applied to data. |
k |
A numeric scalar. Run parameter |
inc_cens |
A logical scalar indicating whether or not to include contributions from censored inter-exceedance times relating to the first and last observation. See Attalides (2015) for details. |
Value
A list relating the quantities given on pages 18-19 of
Suveges and Davison (2010). All but the last component are vectors giving
the contribution to the quantity from each K
-gap, evaluated at the
input value theta
of \theta
.
ldj |
the derivative of the log-likelihood with respect to
|
Ij |
the observed information |
Jj |
the square of the score |
dj |
|
Ddj |
the derivative of |
n_kgaps |
the number of |
References
Suveges, M. and Davison, A. C. (2010) Model misspecification in peaks over threshold analysis, Annals of Applied Statistics, 4(1), 203-221. doi:10.1214/09-AOAS292
Attalides, N. (2015) Threshold-based extreme value modelling, PhD thesis, University College London. https://discovery.ucl.ac.uk/1471121/1/Nicolas_Attalides_Thesis.pdf
Methods for objects of class "kgaps"
Description
Methods for objects of class c("kgaps", "exdex")
returned from
kgaps
.
Usage
## S3 method for class 'kgaps'
coef(object, ...)
## S3 method for class 'kgaps'
vcov(object, type = c("observed", "expected"), ...)
## S3 method for class 'kgaps'
nobs(object, ...)
## S3 method for class 'kgaps'
logLik(object, ...)
## S3 method for class 'kgaps'
print(x, digits = max(3L, getOption("digits") - 3L), ...)
## S3 method for class 'kgaps'
summary(
object,
se_type = c("observed", "expected"),
digits = max(3, getOption("digits") - 3L),
...
)
## S3 method for class 'summary.kgaps'
print(x, ...)
Arguments
object |
and object of class |
... |
For |
type |
A character scalar. Should the estimate of the variance be based on the observed information or the expected information? |
x |
|
digits |
|
se_type |
A character scalar. Should the estimate of the standard error be based on the observed information or the expected information? |
Value
coef.kgaps
. A numeric scalar: the estimate of the extremal index
\theta
.
vcov.kgaps
. A 1 \times 1
numeric matrix containing the
estimated variance of the estimator.
nobs.kgaps
. A numeric scalar: the number of inter-exceedance times
used in the fit. If x$inc_cens = TRUE
then this includes up to 2
censored observations.
logLik.kgaps
. An object of class "logLik"
: a numeric scalar
with value equal to the maximised log-likelihood. The returned object
also has attributes nobs
, the numbers of K
-gaps that
contribute to the log-likelihood and "df"
, which is equal to the
number of total number of parameters estimated (1).
print.kgaps
. The argument x
, invisibly.
summary.kgaps
. Returns a list containing the list element
object$call
and a numeric matrix summary
giving the estimate
of the extremal index \theta
and the estimated standard error
(Std. Error).
print.summary.kgaps
. The argument x
, invisibly.
Examples
See the examples in kgaps
.
See Also
kgaps
for maximum likelihood estimation of the
extremal index \theta
using the K
-gaps model.
confint.kgaps
for confidence intervals for
\theta
.
Sufficient statistics for the K
-gaps model
Description
Calculates sufficient statistics for the K
-gaps model for the extremal
index \theta
. Called by kgaps
.
Usage
kgaps_stat(data, u, q_u, k = 1, inc_cens = TRUE)
Arguments
data |
A numeric vector of raw data. |
u |
A numeric scalar. Extreme value threshold applied to data. |
q_u |
A numeric scalar. An estimate of the probability with which
the threshold |
k |
A numeric scalar. Run parameter |
inc_cens |
A logical scalar indicating whether or not to include contributions from right-censored inter-exceedance times relating to the first and last observation. It is known that these times are greater than or equal to the time observed. See Attalides (2015) for details. |
Details
The sample K
-gaps are
S_0, S_1, ..., S_{N-1}, S_N
,
where S_1, ..., S_{N-1}
are uncensored and
S_0
and S_N
are right-censored. Under the assumption that the
K
-gaps are independent, the log-likelihood of the K
-gaps
model is given by
l(\theta; S_0, \ldots, S_N) = N_0 \log(1 - \theta) +
2 N_1 \log \theta - \theta q (S_0 + \cdots + S_N),
where
-
q
is the threshold exceedance probability, estimated by the proportion of threshold exceedances, -
N_0
is the number of uncensored sampleK
-gaps that are equal to zero, (apart from an adjustment for the contributions of
S_0
andS_N
)N_1
is the number of positive sampleK
-gaps,specifically, if
inc_cens = TRUE
thenN_1
is equal to the number ofS_1, ..., S_{N-1}
that are positive plus(I_0 + I_N) / 2
, whereI_0 = 1
ifS_0
is greater than zero andI_0 = 0
otherwise, and similarly forI_N
.
The differing treatment of uncensored and right-censored K
-gaps
reflects differing contributions to the likelihood. Right-censored
K
-gaps that are equal to zero add no information to the likelihood.
For full details see Suveges and Davison (2010) and Attalides (2015).
If N_1 = 0
then we are in the degenerate case where there is one
cluster (all K
-gaps are zero) and the likelihood is maximized at
\theta = 0
.
If N_0 = 0
then all exceedances occur singly (all K
-gaps are
positive) and the likelihood is maximized at \theta = 1
.
Value
A list containing the sufficient statistics, with components
N0 |
the number of zero |
N1 |
contribution from non-zero |
sum_qs |
the sum of the (scaled) |
n_kgaps |
the number of |
References
Suveges, M. and Davison, A. C. (2010) Model misspecification in peaks over threshold analysis, Annals of Applied Statistics, 4(1), 203-221. doi:10.1214/09-AOAS292
Attalides, N. (2015) Threshold-based extreme value modelling, PhD thesis, University College London. https://discovery.ucl.ac.uk/1471121/1/Nicolas_Attalides_Thesis.pdf
See Also
kgaps
for maximum likelihood estimation of the
extremal index \theta
using the K
-gaps model.
Examples
u <- quantile(newlyn, probs = 0.90)
kgaps_stat(newlyn, u)
Newlyn sea surges
Description
The vector newlyn
contains 2894 maximum sea-surges measured at
Newlyn, Cornwall, UK over the period 1971-1976. The observations are
the maximum hourly sea-surge heights over contiguous 15-hour time
periods.
Usage
newlyn
Format
A vector of length 2894.
Source
Coles, S.G. (1991) Modelling extreme multivariate events. PhD thesis, University of Sheffield, U.K.
References
Fawcett, L. and Walshaw, D. (2012) Estimating return levels from serially dependent extremes. Environmetrics, 23(3), 272-283. doi:10.1002/env.2133
Northrop, P. J. (2015) An efficient semiparametric maxima estimator of the extremal index. Extremes, 18, 585-603. doi:10.1007/s10687-015-0221-5
Plot block length diagnostic for the semiparametric maxima estimator
Description
plot
method for objects inheriting from class "choose_b"
,
returned from choose_b
Usage
## S3 method for class 'choose_b'
plot(
x,
y,
...,
estimator = c("N2015", "BB2018"),
maxima = c("sliding", "disjoint")
)
Arguments
x |
an object of class |
y |
Not used. |
... |
|
estimator |
Choice of estimator: |
maxima |
Should the estimator be based on sliding or disjoint maxima? |
Details
Produces a simple diagnostic plot to aid the choice of block
length b
based on the object returned from choose_b
.
Estimates of b
and approximate conf
% confidence intervals
are plotted against the value of b
used to produce each estimate.
The type of confidence interval is determined by the arguments
interval_type
, conf_scale
and type
provided in the
call to choose_b
.
Value
Nothing is returned.
Examples
See the examples in choose_b
.
References
Northrop, P. J. (2015) An efficient semiparametric maxima estimator of the extremal index. Extremes 18(4), 585-603. doi:10.1007/s10687-015-0221-5
Berghaus, B., Bucher, A. (2018) Weak convergence of a pseudo maximum likelihood estimator for the extremal index. Ann. Statist. 46(5), 2307-2335. doi:10.1214/17-AOS1621
See Also
Plot threshold u
and runs parameter D
diagnostic for the
D
-gaps estimator
Description
plot
method for objects inheriting from class "choose_ud"
,
returned from choose_ud
Usage
## S3 method for class 'choose_ud'
plot(
x,
y = c("imts", "theta"),
level = 0.95,
interval_type = c("norm", "lik"),
conf_scale = c("theta", "log"),
alpha = 0.05,
constrain = TRUE,
for_abline = list(lty = 2, lwd = 1, col = 1),
digits = 3,
uprob = FALSE,
leg_pos = if (y == "imts") "topright" else "topleft",
...
)
Arguments
x |
an object of class |
y |
A character scalar indicating what should be plotted on the
vertical axes of the plot: information matrix test statistics (IMTS)
if |
level |
A numeric scalar in (0, 1). The confidence level used in
calculating confidence intervals for |
interval_type |
A character scalar. The type of confidence interval
to be plotted, if |
conf_scale |
A character scalar. If |
alpha |
A numeric vector with entries in (0, 1). The size of the test to be performed. |
constrain |
A logical scalar. The argument |
for_abline |
Only relevant when |
digits |
An integer. Used for formatting the value of the threshold
with |
uprob |
A logical scalar. Should we plot |
leg_pos |
A character scalar. The position of any legend added to
a plot. Only relevant when both the arguments |
... |
Additional arguments passed to |
Details
The type of plot produced depends mainly on y
.
If y = "imts"
then the values of IMTS are plotted against the
thresholds in x$u
(or their corresponding approximate sample
quantile levels if uprob = TRUE
) for each value of D
in x$D
. Horizontal lines are added to indicate the critical
values of the IMT for the significance levels in alpha
.
We would not reject at the 100alpha
% level combinations of
threshold and D
corresponding to values of the IMTS that fall
below the line.
If y = "theta"
then estimates of \theta
are plotted on the
vertical axis. If both x$u
and x$D$
have length greater
than one then only these estimates are plotted. If either x$u
or x$D
have length one then approximate 100level
%
confidence intervals are added to the plot and the variable,
x$u
or x$D
that has length greater than one is plotted on
the horizontal axis.
Value
Nothing is returned.
Examples
See the examples in choose_ud
.
See Also
Plot threshold u
and runs parameter K
diagnostic for the
K
-gaps estimator
Description
plot
method for objects inheriting from class "choose_uk"
,
returned from choose_uk
Usage
## S3 method for class 'choose_uk'
plot(
x,
y = c("imts", "theta"),
level = 0.95,
interval_type = c("norm", "lik"),
conf_scale = c("theta", "log"),
alpha = 0.05,
constrain = TRUE,
for_abline = list(lty = 2, lwd = 1, col = 1),
digits = 3,
uprob = FALSE,
leg_pos = if (y == "imts") "topright" else "topleft",
...
)
Arguments
x |
an object of class |
y |
A character scalar indicating what should be plotted on the
vertical axes of the plot: information matrix test statistics (IMTS)
if |
level |
A numeric scalar in (0, 1). The confidence level used in
calculating confidence intervals for |
interval_type |
A character scalar. The type of confidence interval
to be plotted, if |
conf_scale |
A character scalar. If |
alpha |
A numeric vector with entries in (0, 1). The size of the test to be performed. |
constrain |
A logical scalar. The argument |
for_abline |
Only relevant when |
digits |
An integer. Used for formatting the value of the threshold
with |
uprob |
A logical scalar. Should we plot |
leg_pos |
A character scalar. The position of any legend added to
a plot. Only relevant when both the arguments |
... |
Additional arguments passed to |
Details
The type of plot produced depends mainly on y
.
If y = "imts"
then the values of IMTS are plotted against the
thresholds in x$u
(or their corresponding approximate sample
quantile levels if uprob = TRUE
) for each value of K
in x$k
. Horizontal lines are added to indicate the critical
values of the IMT for the significance levels in alpha
.
We would not reject at the 100alpha
% level combinations of
threshold and K
corresponding to values of the IMTS that fall
below the line.
If y = "theta"
then estimates of \theta
are plotted on the
vertical axis. If both x$u
and x$k$
have length greater
than one then only these estimates are plotted. If either x$u
or x$k
have length one then approximate 100level
%
confidence intervals are added to the plot and the variable,
x$u
or x$k
that has length greater than one is plotted on
the horizontal axis.
Value
Nothing is returned.
Examples
See the examples in choose_uk
.
See Also
Daily log returns of the Standard and Poor (S&P) 500 index
Description
Daily log returns of the S&P 500 index, that is, the log of the ratio of successive daily closing prices, from 3rd January 1990 to 9th October 2018.
Usage
sp500
Format
A vector of length 7250, created using zoo
with an "index" attribute giving the date of the corresponding negated
log return.
Source
Yahoo finance: https://finance.yahoo.com/quote/^SPX/history/
Divides data into parts that contain no missing values
Description
Splits the values in a numeric matrix column-wise into sequences of non-missing values.
Usage
split_by_NAs(x)
Arguments
x |
A vector or matrix. |
Details
For each column in x
, split_by_NAs
finds runs of
values that contain no missing values and assigns them to a column in the
matrix that is returned. Different columns are treated separately.
If there are no missing values in a column then that column appears
unmodified in the output matrix. Please see the Examples
for illustrations.
Value
A matrix containing a column for each run of non-missing values in
x
. The number of rows is equal to the longest run of non-missing
values in x
and will therefore be at most nrow{x}
. The
matrix is padded with NA
values at the end of each column, where
necessary.
The returned object has an attribute called split_by_NAs_done
whose value is TRUE
, so that in programming one can avoid calling
split_by_NAs
more than once.
Examples
# Create a simple numeric matrix and insert some NAs
x <- matrix(1:50, 10, 5)
x[c(3, 8), 1] <- NA
x[c(1:2, 5, 10), 3] <- NA
x[1:3, 4] <- NA
x[7:10, 5] <- NA
x
res <- split_by_NAs(x)
res
# An example of a character matrix
x <- matrix(c(letters, letters[1:18]), 11, 4)
x[c(1:2, 5:11), 2] <- NA
x[c(2:4, 6:11), 3] <- NA
x[1:10, 4] <- NA
res <- split_by_NAs(x)
res
Semiparametric maxima estimator of the extremal index
Description
Estimates the extremal index \theta
using a semiparametric block
maxima estimator of Northrop (2015) ("N2015"
) and a variant of this
estimator studied by Berghaus and Bucher (2018) ("BB2018"
), using
both sliding (overlapping) block maxima and disjoint (non-overlapping) block
maxima. A simple modification (subtraction of 1 / b
, where b
is
the block size) of the Berghaus and Bucher (2018) estimator
("BB2018b"
) is also calculated. Estimates of uncertainty are
calculated using the asymptotic theory developed by Berghaus and Bucher
(2018).
Usage
spm(
data,
b,
bias_adjust = c("BB3", "BB1", "N", "none"),
constrain = TRUE,
varN = TRUE,
which_dj = c("last", "first")
)
Arguments
data |
A numeric vector of raw data. No missing values are allowed. |
b |
A numeric scalar. The block size. |
bias_adjust |
A character scalar. Is bias-adjustment of the
raw estimate of |
constrain |
A logical scalar. If |
varN |
A logical scalar. If |
which_dj |
A character scalar. Determines which set of disjoint
maxima are used to calculate an estimate of |
Details
The extremal index \theta
is estimated using the
semiparametric maxima estimator of Northrop (2015) and variant
of this studied by Berghaus and Bucher (2018). In each case a sample
of 'data' is derived from the input data data
, based on the
empirical distribution function of these data evaluated at the
maximum values of of blocks of b
contiguous values in data
.
The estimators are based on an assumption that these 'data' are sampled
approximately from an exponential distribution with mean 1/\theta
.
For details see page 2309 of Berghaus and Bucher (2018), where the
'data' for the Northrop (2015) estimator are denoted Y
and
those for the Berghaus and Bucher (2018) are denoted Z
.
For convenience, we will refer to these values as the
Y
-data and the Z
-data.
The approximate nature of the model for the Y
-data arises from the
estimation of the distribution function F
. A further
approximation motivates the use of the Z
-data. If F
is known
then the variable Z / b
has a beta(1, b\theta
) distribution,
so that that is, Z
has mean 1 / (\theta + 1/b)
.
Therefore, an exponential distribution with mean 1 / (\theta + 1/b)
may provide a better approximate model, which provides the motivation for
subtracting 1 / b
from the Berghaus and Bucher (2018) estimator.
Indeed, the resulting estimates are typically close to those
of the Northrop (2015) estimator.
If sliding = TRUE
then the function uses sliding block maxima,
that is, the largest value observed in all
(length(data) - b + 1
) blocks of b
observations.
If sliding = FALSE
then disjoint block maxima, that is, the
largest values in (floor(n / b)
) disjoint blocks of b
observations, are used.
Estimation of the sampling variances of the estimators is based
on Proposition 4.1 on page 2319 of Berghaus and Bucher (2018).
For the Northrop (2015) variant the user has the choice either to
use the sampling variance based on the Berghaus and Bucher (2018)
estimator, i.e. the Z
-data (varN = FALSE
) or an analogous
version tailored to the Northrop (2015) estimator that uses the
Y
-data (varN = TRUE
).
The estimates of the sampling variances of the sliding blocks estimators
are inferred from those of the disjoint blocks estimators (see page 2319
of Berhaus and Bucher (2018)). The calculation of the latter uses a set
of disjoint block maxima. If length(data)
is not an integer
multiple of b
then there will be more than one set of these, and
all are equally valid. In this event we perform the calculation for all
such sets and use the mean of the resulting estimates. This reduces the
sampling variability of the estimates at the expense of slowing down
the calculation somewhat, particularly if b
is small. This may
become apparent when calling spm
repeatedly in
choose_b
.
This estimator of the sampling variance of the sliding blocks estimator is
not constrained to be positive: a negative estimate may result if the
block size is small. In this event
no warning will be given until the returned object is printed and,
for the affected estimator ("N2015"
or "BB2018/BB2018b"
),
the corresponding estimated standard errors using sliding blocks will be missing in
se_sl
in the returned object,if
bias_adjust == "BB3"
then bias-adjustment based onbias_adjust == "BB1"
will instead be performed when using sliding blocks, because the former relies on the estimated variances of the estimators.
Similarly, bias adjustment under adjust = "BB3"
and/or subtraction
of 1 / b
in the "BB2018b"
case may, rare cases, produce a
negative estimate of \theta
. In these instances an estimate of
zero is returned, but the values returned in bias_dj
and
bias_sl
are not changed.
Value
A list of class c("spm", "exdex")
containing the
components listed below. The components that are vectors are
labelled to indicate the estimator to which the constituent values
relate: "N2015"
for Northrop (2015), "BB2018"
for
Berghaus and Bucher (2018) and "BB2018b"
for the modified version.
theta_sl , theta_dj |
Vectors containing the estimates of
|
se_sl , se_dj |
The estimated standard errors associated
with the estimates in |
bias_sl , bias_dj |
The respective values of the
bias-adjustment applied to the raw estimates, that is, the values
subtracted from the raw estimates. For estimator
|
raw_theta_sl , raw_theta_dj |
Vectors containing the raw estimates
of |
uncon_theta_sl , uncon_theta_dj |
The (bias_adjusted) estimates of
|
data_sl , data_dj |
Matrices containing the |
sigma2dj , sigma2dj_for_sl |
Estimates of the variance
|
sigma2sl |
Estimates of the variance
|
b |
The input value of |
bias_adjust |
The input value of |
call |
The call to |
References
Northrop, P. J. (2015) An efficient semiparametric maxima estimator of the extremal index. Extremes 18(4), 585-603. doi:10.1007/s10687-015-0221-5
Berghaus, B., Bucher, A. (2018) Weak convergence of a pseudo maximum likelihood estimator for the extremal index. Ann. Statist. 46(5), 2307-2335. doi:10.1214/17-AOS1621
See Also
spm_confint
to estimate confidence intervals
for \theta
.
spm_methods
for S3 methods for "spm"
objects.
Examples
### Newlyn sea surges
theta <- spm(newlyn, 20)
theta
summary(theta)
coef(theta)
nobs(theta)
vcov(theta)
### S&P 500 index
theta <- spm(sp500, 100)
theta
summary(theta)
Confidence intervals for the extremal index \theta
for "spm"
objects
Description
confint
method for objects of class c("spm", "exdex")
.
Computes confidence intervals for \theta
based on an object returned
from spm
. Two types of interval may be returned:
(a) intervals that are based on approximate large-sample normality of the
estimators of \theta
(or of \log\theta
if
conf_scale = "log"
), and which are symmetric about the respective
point estimates, and (b) likelihood-based intervals based on an adjustment
of a naive (pseudo-) loglikelihood, using the
adjust_loglik
function in the
chandwich
package. The plot
method plots the
log-likelihood for \theta
, with the required confidence interval(s)
indicated on the plot.
Usage
## S3 method for class 'spm'
confint(
object,
parm = "theta",
level = 0.95,
maxima = c("sliding", "disjoint"),
interval_type = c("norm", "lik", "both"),
conf_scale = c("theta", "log"),
constrain = TRUE,
bias_adjust = TRUE,
type = c("vertical", "cholesky", "spectral", "none"),
...
)
## S3 method for class 'confint_spm'
plot(x, estimator = "all", ndec = 2, ...)
## S3 method for class 'confint_spm'
print(x, ...)
Arguments
object |
An object of class |
parm |
Specifies which parameter is to be given a confidence interval.
Here there is only one option: the extremal index |
level |
A numeric scalar in (0, 1). The confidence level required. |
maxima |
A character scalar specifying whether to estimate confidence intervals based on sliding maxima or disjoint maxima. |
interval_type |
A character scalar: |
conf_scale |
A character scalar. If If Any bias-adjustment requested in the original call to |
constrain |
A logical scalar. If |
bias_adjust |
A logical scalar. If If |
type |
A character scalar. The argument |
... |
Additional optional arguments to be passed to
|
x |
an object of class |
estimator |
A character vector specifying which of the three variants
of the semiparametric maxima estimator to include in the plot:
|
ndec |
An integer scalar. The legend (if included on the plot)
contains the confidence limits rounded to |
Details
The likelihood-based intervals are estimated using the
adjust_loglik
function in the
chandwich
package, followed by a call to
conf_intervals
.
This adjusts the naive (pseudo-)loglikelihood so that the curvature of
the adjust loglikelihood agrees with the estimated standard errors of
the estimators. The option type = "none"
should not be used in
practice because the resulting confidence intervals will be wrong.
In particular, in the intervals based on sliding maxima will provide
vast underestimates of uncertainty.
If object$se
contains NA
s, because the block size b
was too small or too large in the call to spm
then
confidence intervals cannot be estimated. A matrix of NA
s
will be returned. See the Details section of the
spm
documentation for more information.
print.confint_spm
prints the matrix of confidence
intervals for \theta
.
Value
A list of class c("confint_spm", "exdex") containing the following components.
cis |
A matrix with columns giving the lower and upper confidence
limits. These are labelled as (1 - level)/2 and 1 - (1 - level)/2 in
% (by default 2.5% and 97.5%).
The row names are a concatenation of the variant of the estimator
( |
ciN |
The object returned from
|
ciBB |
The object returned from
|
ciBBb |
The object returned from
|
call |
The call to |
object |
The input |
maxima |
The input |
theta |
The relevant estimates of |
level |
The input |
plot.confint_spm
: nothing is returned.
print.confint_spm
: the argument x
, invisibly.
References
Northrop, P. J. (2015) An efficient semiparametric maxima estimator of the extremal index. Extremes 18(4), 585-603. doi:10.1007/s10687-015-0221-5
Berghaus, B., Bucher, A. (2018) Weak convergence of a pseudo maximum likelihood estimator for the extremal index. Ann. Statist. 46(5), 2307-2335. doi:10.1214/17-AOS1621
See Also
spm
for estimation of the extremal index
\theta
using a semiparametric maxima method.
Examples
# Newlyn sea surges
theta <- spm(newlyn, 20)
confint(theta)
cis <- confint(theta, interval_type = "lik")
cis
plot(cis)
# S&P 500 index
theta <- spm(sp500, 100)
confint(theta)
cis <- confint(theta, interval_type = "lik")
cis
plot(cis)
Methods for objects of class "spm"
Description
Methods for objects of class c("spm", "exdex")
returned from
spm
.
Usage
## S3 method for class 'spm'
coef(
object,
maxima = c("sliding", "disjoint"),
estimator = "all",
constrain = FALSE,
...
)
## S3 method for class 'spm'
vcov(object, maxima = c("sliding", "disjoint"), estimator = "all", ...)
## S3 method for class 'spm'
nobs(object, maxima = c("sliding", "disjoint"), ...)
## S3 method for class 'spm'
print(x, digits = max(3L, getOption("digits") - 3L), ...)
## S3 method for class 'spm'
summary(object, digits = max(3, getOption("digits") - 3L), ...)
## S3 method for class 'summary.spm'
print(x, ...)
Arguments
object |
an object of class |
maxima |
A character scalar specifying whether to return the number of observed sliding maxima or disjoint maxima. |
estimator |
A character vector specifying which of the three variants
of the semiparametric maxima estimator to use: |
constrain |
A logical scalar. If |
... |
For |
x |
|
digits |
An integer. Used for number formatting with
|
Details
print.spm
prints the original call to spm
and the estimates of the extremal index \theta
, based on all three
variants of the semiparametric maxima estimator and both sliding
and disjoint blocks.
Value
coef.spm
. A numeric scalar (or a vector of length 3 if
estimator = "all"
): the required estimate(s) of the extremal index
\theta
.
vcov.spm
. A 1 \times 1
numeric matrix if
estimator = "N2015"
or "BB2018"
and a vector of length 3 if
estimator = "all"
, containing the estimated variance(s) of the
estimator(s).
nobs.spm
. A numeric scalar: the number of observations used in the
fit.
print.spm
. The argument x
, invisibly.
summary.spm
. Returns an object (a list) of class
"summary.spm"
containing the list element object$call
and a
numeric matrix matrix
giving, for all three variants of the
semiparametric estimator and both sliding and disjoint blocks,
the (bias-adjusted) Estimate of the extremal index \theta
,
the estimated standard error (Std. Error),
and the bias adjustment (Bias adj.) applied to obtain the estimate, i.e.
the value subtracted from the raw estimate. If any of the
(bias-adjusted) estimates are greater than 1 then a column
containing the unconstrained estimates (Uncon. estimate) is added.
print.summary.spm
. The argument x
, invisibly.
Examples
See the examples in spm
.
See Also
spm
for semiparametric estimation of the
extremal index based on block maxima.
Uccle maximum daily temperatures
Description
The dataframe uccle
contains daily maximum temperatures in degrees C
recorded at the Uccle, Belgium from 1/1/1833 to 23/1/2011. The Station
identifier in the source file is 17 and the Source identifier is 117882.
Usage
uccle
Format
A data frame with 65036 observations on the following and 5 variables.
-
temp:
daily maximum temperature in degrees C. -
year:
the year. -
month:
the month of the year. -
day:
day of the month. -
date:
date with theDate
class, in the format YYYY-MM-DD.
Note
There are 5336 missing values.
Source
Klein Tank, A.M.G. and Coauthors, 2002. Daily dataset of 20th-century surface air temperature and precipitation series for the European Climate Assessment. Int. J. of Climatol., 22, 1441-1453 doi:10.1002/joc.773. Data and metadata available at https://www.ecad.eu. The data were downloaded on 27/3/2022 using a Custom query (ASCII), selecting "non-blend" for type of series.
20th century Uccle maximum daily temperatures in July - data frame
Description
The dataframe uccle720
contains daily maximum temperatures in degrees C
recorded at the Uccle, Belgium during July for the years 1901 to 1999.
The Station identifier in the source file is 17 and the Source identifier is
117882. These data are analysed in Holesovsky and Fusek (2020).
Usage
uccle720
Format
A data frame with 3100 observations on the following and 5 variables.
-
temp:
daily maximum temperature in degrees C. -
year:
the year. -
month:
the month of the year. -
day:
day of the month. -
date:
date with theDate
class, in the format YYYY-MM-DD.
Note
There are 6 missing values, one located in each of the years 1925, 1926, 1956, 1963, 1969 and 1976.
Source
Klein Tank, A.M.G. and Coauthors, 2002. Daily dataset of 20th-century surface air temperature and precipitation series for the European Climate Assessment. Int. J. of Climatol., 22, 1441-1453 doi:10.1002/joc.773. Data and metadata available at https://www.ecad.eu. The data were downloaded on 27/3/2022 using a Custom query (ASCII), selecting "non-blend" for type of series.
References
Holesovsky, J. and Fusek, M. Estimation of the extremal index using censored distributions. Extremes, 23, 197-213 (2020). doi:10.1007/s10687-020-00374-3
Examples
uccle720_ts <- ts(uccle720$temp, start = c(1901, 1), frequency = 31)
plot(uccle720_ts, ylab = "daily maximum temperature in July / degrees C",
xlab = "year")
20th century Uccle maximum daily temperatures in July - matrix
Description
The matrix uccle720m
contains daily maximum temperatures in degrees C
recorded at the Uccle, Belgium during July for the years 1901 to 1999.
The Station identifier in the source file is 17 and the Source identifier is
117882. These data are analysed in Holesovsky and Fusek (2020).
Usage
uccle720m
Format
A 31 by 100 numeric matrix. Column i
contains the maximum
daily temperature in degrees C at Uccle in the year 1900 + i
- 1.
The columns are named 1900, 1901, ..., 1999 and the rows are named
after the day of the month: 1, 2, .., 31.
Note
There are 6 missing values, one located in each of the years 1925, 1926, 1956, 1963, 1969 and 1976.
Source
Klein Tank, A.M.G. and Coauthors, 2002. Daily dataset of 20th-century surface air temperature and precipitation series for the European Climate Assessment. Int. J. of Climatol., 22, 1441-1453 doi:10.1002/joc.773. Data and metadata available at https://www.ecad.eu. The data were downloaded on 27/3/2022 using a Custom query (ASCII), selecting "non-blend" for type of series.
References
Holesovsky, J. and Fusek, M. Estimation of the extremal index using censored distributions. Extremes, 23, 197-213 (2020). doi:10.1007/s10687-020-00374-3
Examples
uccle720_ts <- ts(uccle720$temp, start = c(1901, 1), frequency = 31)
plot(uccle720_ts, ylab = "daily maximum temperature in July / degrees C",
xlab = "year")