Title: | Tukey Region and Median |
---|---|
Description: | Tukey regions are polytopes in the Euclidean space, viz. upper-level sets of the Tukey depth function on given data. The bordering hyperplanes of a Tukey region are computed as well as its vertices, facets, centroid, and volume. In addition, the Tukey median set, which is the non-empty Tukey region having highest depth level, and its barycenter (= Tukey median) are calculated. Tukey regions are visualized in dimension two and three. For details see Liu, Mosler, and Mozharovskyi (2019, <doi:10.1080/10618600.2018.1546595>). See file LICENSE.note for additional license information. |
Authors: | C.B. Barber [aut, cph] (Qhull library), The Geometry Center University of Minnesota [cph] (Qhull library), Pavlo Mozharovskyi [aut, cre] |
Maintainer: | Pavlo Mozharovskyi <[email protected]> |
License: | GPL (>= 3) |
Version: | 0.1.6.3 |
Built: | 2025-03-05 05:33:24 UTC |
Source: | https://github.com/cran/TukeyRegion |
Proposed by John W. Tukey in 1975 (see also Donoho and Gasko, 1992), Tukey depth measures the centrality of an arbitrary point in the Euclidean space with respect to a data cloud. For a point, the Tukey (also halfspace or location) depth is defined as the smallest proportion of observations that can be cut off by a closed halfspace containing this point. For a given depth level, the Tukey (trimmed) region is defined as the upper-level set of the Tukey depth function; it constitutes a closed polytope. The Tukey region of the highest level is referred to as the Tukey median set, and its barycenter as the Tukey median. Owing to its affine invariance, quasiconcavity, vanishing at infinity, and the high breakdown point of the median set, the Tukey depth has attracted the attention of statisticians and undergone substantial theoretical development. It is used in numerous applications, including multivariate data analysis, outlier detection, tests for location (also scale and symmetry), classification, statistical quality control, and imputation of missing data.
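The definitions above are easiest to see in one dimension, where a closed halfspace is a closed half-line. The following base R sketch is illustrative only and is not part of the package: the helper names tukey_depth_1d, tukey_region_1d, and tukey_median_1d are hypothetical, and the depth level is given as an integer k, meaning depth at least k/n.

```r
# Univariate Tukey depth of z: the smaller share of observations lying
# on either closed side of z.
tukey_depth_1d <- function(z, x) {
  min(sum(x <= z), sum(x >= z)) / length(x)
}

# Univariate Tukey region at integer level k (depth >= k / n): the interval
# between the k-th smallest and the k-th largest observation, or NULL if
# that interval is empty.
tukey_region_1d <- function(x, k) {
  xs <- sort(x)
  n <- length(xs)
  if (k < 1 || k > n) return(NULL)
  lower <- xs[k]
  upper <- xs[n - k + 1]
  if (lower > upper) return(NULL)
  c(lower = lower, upper = upper)
}

# Tukey median: barycenter (here the midpoint) of the non-empty region
# with the highest level.
tukey_median_1d <- function(x) {
  k <- 1
  while (!is.null(tukey_region_1d(x, k + 1))) k <- k + 1
  mean(tukey_region_1d(x, k))
}

x <- c(4, 1, 3, 2)
tukey_region_1d(x, 2)  # the interval [2, 3]
tukey_median_1d(x)     # its midpoint, 2.5
```

In higher dimensions the interval becomes a polytope and the two half-lines become all closed halfspaces through the point, which is what makes the computation hard.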
Package TukeyRegion provides routines for computation (TukeyRegion) and visualization (plot) of the Tukey depth trimmed region, the Tukey median set and Tukey median (TukeyMedian), and the Tukey depth weighted and/or trimmed mean (depth.wm).
For computation of the Tukey depth see function depth.halfspace of package ddalpha.
Package: | TukeyRegion |
Type: | Package |
Version: | 0.1.6.3 |
Date: | 2023-04-17 |
License: | GPL (>= 3) |
Authors: C.B. Barber [aut, cph] (Qhull library), The Geometry Center of University of Minnesota [cph] (Qhull library), Pavlo Mozharovskyi [aut, cre]
Maintainer: Pavlo Mozharovskyi, <[email protected]>
Donoho, D.L. and Gasko, M (1992). Breakdown properties of location estimates based on halfspace depth and projected outlyingness. The Annals of Statistics, 20(4), 1803-1827.
Dyckerhoff, R. and Mozharovskyi, P. (2016). Exact computation of the halfspace depth. Computational Statistics and Data Analysis, 98, 19-30.
Hallin, M., Paindaveine, D., and Siman, M. (2010). Multivariate quantiles and multiple-output regression quantiles: from L1-optimization to halfspace depth. The Annals of Statistics, 38, 635-669.
Kong, L. and Mizera, I. (2012). Quantile tomography: using quantiles with multivariate data. Statistica Sinica, 22, 1589-1610. Published online as arXiv:0805.0056 [stat.ME] (2008).
Liu, X., Luo, S., and Zuo, Y. (2020). Some results on the computing of Tukey's halfspace median. Statistical Papers, 61, 303-316.
Liu, X., Mosler, K., and Mozharovskyi, P. (2019). Fast computation of Tukey trimmed regions and median in dimension p > 2. Journal of Computational and Graphical Statistics, 28, 682-697.
Tukey, J.W. (1975). Mathematics and the picturing of data. In: James, R.D. (Ed.), Proceeding of the International Congress of Mathematicians (Volume 2), Canadian Mathematical Congress, Vancouver, 523-531.
TukeyRegion, TukeyMedian, depth.wm
# Load required packages
require(TukeyRegion)
require(MASS)
# Generate data
set.seed(1)
X <- mvrnorm(100, rep(0, 3), diag(3))
# Compute and visualize two Tukey regions
Tr1 <- TukeyRegion(X, 5, "bfs", retFacets = TRUE, retVolume = TRUE,
                   retBarycenter = TRUE)
Tr2 <- TukeyRegion(X, 25, "bfs", retFacets = TRUE, retVolume = TRUE,
                   retBarycenter = TRUE)
plot(Tr1, colorFacets = "red", colorRidges = "red",
     colorPoints = "blue", alpha = 0.35)
plot(Tr2, newPlot = FALSE, drawPoints = FALSE, colorFacets = "green",
     colorRidges = "green", alpha = 0.65)
(Tr1$barycenter)
(Tr2$barycenter)
# Compute arithmetic mean
T.mean <- colMeans(X)
(T.mean)
# Compute Tukey depth trimmed weighted mean (approximate depth)
T.approx1 <- depth.wm(X, 0.25)
(T.approx1)
T.approx2 <- depth.wm(X, 75)
(T.approx2)
# Compute Tukey depth trimmed weighted mean (exact depth)
T.exact1 <- depth.wm(X, 0.25, exact = TRUE)
(T.exact1)
T.exact2 <- depth.wm(X, 75, exact = TRUE)
(T.exact2)
# Compute and visualize Tukey median
Tm <- TukeyMedian(X)
(Tm$barycenter)
plot(Tm, newPlot = FALSE, drawPoints = FALSE)
Computes the Tukey depth weighted and/or trimmed mean for a given depth level or for a given number of deepest points.
depth.wm(data, depth.level = 1/nrow(data), weighted = TRUE, break.ties = "atRandom", ...)
data |
data set for which the weighted mean should be computed, a matrix having > 2 columns and more rows than columns. |
depth.level |
either Tukey depth level for trimming (a numeric between 1/(number of rows in |
weighted |
whether the trimmed mean should be weighted by depth, logical, |
break.ties |
the way to break ties if the number of deepest points is given, character. If |
... |
further arguments passed to function |
After having computed the Tukey depth of each point in data, the function operates in one of two modes. If depth.level lies between 0 and 1, the function computes the trimmed mean (weighted if specified by flag weighted) of all points having at least the given depth level. If depth.level specifies a number of points (an integer between 1 and the number of rows in data), the trimmed (weighted) mean of the depth.level deepest points is calculated, breaking ties according to argument break.ties (ties can occur due to the discrete nature of the Tukey depth). This follows the idea of Donoho and Gasko (1992); see that article also for the breakdown point.
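The two modes can be sketched in base R once the depths of the points are available. This is an illustration under stated assumptions, not the package's implementation: the helper depth_trimmed_mean is hypothetical, it takes precomputed depths as input, it assumes that weighting means weights proportional to depth, and it resolves ties by the order of the rows rather than via break.ties.

```r
# X: numeric matrix (one point per row); depths: depth of each row of X.
# depth.level < 1 is read as a depth threshold, otherwise as the number
# of deepest points to keep (assumption of this sketch).
depth_trimmed_mean <- function(X, depths, depth.level, weighted = TRUE) {
  if (depth.level < 1) {
    # Mode 1: keep all points having at least the given depth.
    keep <- which(depths >= depth.level)
  } else {
    # Mode 2: keep the depth.level deepest points.
    keep <- order(depths, decreasing = TRUE)[seq_len(depth.level)]
  }
  # Weights proportional to depth, or equal weights.
  w <- if (weighted) depths[keep] else rep(1, length(keep))
  # Row i of the kept points is multiplied by w[i] before averaging.
  colSums(X[keep, , drop = FALSE] * w) / sum(w)
}

X <- cbind(c(0, 1, 3, 10), c(0, 1, 3, 10))
depths <- c(0.1, 0.5, 0.5, 0.2)
depth_trimmed_mean(X, depths, 0.5)  # mean of the points with depth >= 0.5
depth_trimmed_mean(X, depths, 3)    # depth-weighted mean of the 3 deepest
```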
The depth of points is calculated by means of the external function depth.halfspace from package ddalpha, whose arguments can be specified as well. In particular, argument exact specifies whether the Tukey depth is computed exactly (TRUE) or approximated (FALSE) by random projections; in the latter case argument num.directions specifies the number of random directions to use. For further details about the algorithm see Dyckerhoff and Mozharovskyi (2016).
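The idea behind the random-projection approximation can be sketched in base R as follows. This is a simplified illustration and not the algorithm used by depth.halfspace: the helper approx_halfspace_depth is hypothetical, and only its argument name num.directions mirrors the one mentioned above. Since the minimum is taken over finitely many directions only, the result can never fall below the exact depth.

```r
# Approximate Tukey depth of point z w.r.t. the rows of matrix X by
# minimising over randomly drawn directions.
approx_halfspace_depth <- function(z, X, num.directions = 1000) {
  d <- ncol(X)
  # Random directions, uniform on the unit sphere (normalised Gaussians).
  U <- matrix(rnorm(num.directions * d), ncol = d)
  U <- U / sqrt(rowSums(U^2))
  # Projections of the data, centred at z, onto every direction.
  P <- sweep(X, 2, z) %*% t(U)
  # Per direction u: number of points in the closed halfspace
  # {x : u'(x - z) >= 0}; the depth is the smallest such share.
  min(colSums(P >= 0)) / nrow(X)
}

set.seed(1)
X <- rbind(c(1, 1), c(1, -1), c(-1, 1), c(-1, -1))
approx_halfspace_depth(c(0, 0), X)  # centre of the square
approx_halfspace_depth(c(5, 5), X)  # point far outside the data
```

More directions tighten the approximation at a proportional cost, which is the trade-off controlled by num.directions.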
The function returns the weighted and/or trimmed mean, a point in the d-variate Euclidean space (d is the number of columns in data), a numeric vector.
Pavlo Mozharovskyi <[email protected]>
Donoho, D.L. and Gasko, M (1992). Breakdown properties of location estimates based on halfspace depth and projected outlyingness. The Annals of Statistics, 20(4), 1803-1827.
Dyckerhoff, R. and Mozharovskyi, P. (2016). Exact computation of the halfspace depth. Computational Statistics and Data Analysis, 98, 19-30.
# Load required packages
require(TukeyRegion)
require(bfp)
# Generate data
set.seed(1)
X <- bfp:::rmvt(150, diag(3), rep(0, 3), 1)
# Compute arithmetic mean
T.mean <- colMeans(X)
(T.mean)
# Compute Tukey depth trimmed weighted mean (approximate depth)
T.approx1 <- depth.wm(X, 0.25)
(T.approx1)
T.approx2 <- depth.wm(X, 25)
(T.approx2)
# Compute Tukey depth trimmed weighted mean (exact depth)
T.exact1 <- depth.wm(X, 0.25, exact = TRUE)
(T.exact1)
T.exact2 <- depth.wm(X, 25, exact = TRUE)
(T.exact2)
# Compute Tukey median
Tm <- TukeyMedian(X)
(Tm$barycenter)
Plots the two-dimensional and the three-dimensional Tukey region.
## S3 method for class 'TukeyRegion'
plot(x, newPlot = TRUE, drawPoints = TRUE, drawRidges = TRUE,
     colorBackground = "white", colorPoints = "red", colorFacets = "blue",
     colorRidges = "green", lwd = 1, lty = 1, alpha = 1, ...)
x |
object of class |
newPlot |
whether to create a new plot(2D)/scene(3D). |
drawPoints |
whether to show the data points. |
drawRidges |
whether to show the ridges; works for non-triangulated facets only. |
colorBackground |
background color of the plot(2D)/scene(3D). |
colorPoints |
color of the points in case they are shown. |
colorFacets |
color of the facets. |
colorRidges |
color of the facets' ridges in case they are shown. |
lwd |
line width of the facets in 2D. |
lty |
line type of the facets in 2D. |
alpha |
transparency of the facets (and ridges if shown). |
... |
included for compatibility and should not be used. |
If the dimension is equal to two, the traditional plot is produced. If the dimension is equal to three, the 3D-scene is produced using the package rgl.
Pavlo Mozharovskyi <[email protected]>
# See examples in TukeyRegion or TukeyMedian
Prints basic information about the Tukey region.
## S3 method for class 'TukeyRegion'
summary(object, ...)
object |
object of class |
... |
included for compatibility and should not be used. |
Prints in the console basic information about the computed Tukey region.
Pavlo Mozharovskyi <[email protected]>
# See examples in TukeyRegion or TukeyMedian
Computes the Tukey median set and its barycenter, the Tukey median, starting from the region with depth 1 and iteratively increasing the depth, according to the algorithm by Fojtik et al. (2022).
TukeyKMedian(data, algMedian = "upwards", method = "bfs",
             trgFacets = TRUE, retHalfspaces = FALSE,
             retHalfspacesNR = FALSE, retInnerPoint = FALSE,
             retVertices = TRUE, retFacets = TRUE, retVolume = FALSE,
             retBarycenter = TRUE, verbosity = 0)
data |
data set for which the Tukey median shall be computed, a matrix having > 2 columns and more rows than columns. |
algMedian |
the algorithm used to compute the Tukey median, a string containing |
method |
the method to use to compute the Tukey region, a string containing |
trgFacets |
whether to triangulate facets, logical, |
retHalfspaces |
whether to return all found halfspaces, logical, |
retHalfspacesNR |
whether to return non-redundant halfspaces, logical, |
retInnerPoint |
whether to return inner point, logical, |
retVertices |
whether to return vertices, logical, |
retFacets |
whether to return facets, logical, |
retVolume |
whether to return volume, logical, |
retBarycenter |
whether to return the region's barycenter, logical, |
verbosity |
level of details to print during execution, integer, from |
The function computes the Tukey median set, i.e. the region with the highest depth value, for n points in the Euclidean d-variate space contained in data.
It also computes this set's barycenter, which is the Tukey median. Four search algorithms are implemented. Algorithm "bsbarydepth" is the most efficient: it is a bisection algorithm that starts with the lower bound set to the maximum of the theoretical minimum and the depth of the componentwise median, and updates the lower bound with the depth of the barycenter of the last found region. Algorithm "cutintwo" sequentially cuts the range of remaining depths into two parts, starting with the range from 1 to the upper bound obtained by Liu, Luo, and Zuo (2020). Algorithm "downwards" checks each depth value decrementally with step 1, starting with the upper bound obtained by Liu, Luo, and Zuo (2020), until the first existing region is found. Algorithm "upwards" checks each depth value incrementally until the first non-existing region is found.
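The "upwards" strategy amounts to a linear scan over depth levels. The base R sketch below shows only the control flow and makes no claim about the package internals: search_upwards is a hypothetical helper, and region_exists stands for any user-supplied predicate reporting whether the Tukey region of a given depth is non-empty.

```r
# Increase the depth one step at a time until the first empty region.
# region_exists: function(depth) returning TRUE or FALSE (hypothetical).
# max.depth: largest depth level worth trying.
search_upwards <- function(region_exists, max.depth) {
  depth <- 0
  num.regions <- 0  # how many regions had to be computed
  while (depth < max.depth) {
    num.regions <- num.regions + 1
    if (!region_exists(depth + 1)) break
    depth <- depth + 1
  }
  list(depth = depth, numRegions = num.regions)
}

# Toy predicate: regions exist up to depth 7.
res <- search_upwards(function(k) k <= 7, max.depth = 50)
res$depth       # deepest level with a non-empty region
res$numRegions  # number of predicate evaluations
```

With the package at hand, such a predicate could plausibly be built from the innerPointFound field returned by TukeyRegion, although that wiring is an assumption of this note.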
The main goal of the function is to provide the polytope (Tukey median set) and its barycenter (Tukey median); the settings can be adjusted, though. After the median depth is found, the TukeyRegion function is called.
See function TukeyRegion for details regarding the output.
The function returns an object of class TukeyRegion with fields specified by ret...-flags in the arguments:
data |
the input data set. |
depth |
chosen depth level. |
numRegions |
number of times the depth region has been computed. |
halfspacesFound |
whether at least one of the determining Tukey region halfspaces has been found. |
halfspaces |
if requested, halfspaces defining the Tukey region by their intersection, a matrix with number of columns equal to space dimension and where each row corresponds to a halfspace defined by three point numbers in |
innerPointFound |
a logical indicating whether an inner point of the region has been found. If |
innerPoint |
coordinates of a point inside of the Tukey region. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments. |
halfspacesNR |
non-redundant halfspaces (i.e. those containing Tukey region's facets), a matrix with number of columns equal to space dimension and where each row corresponds to a halfspace defined by three point numbers in |
vertices |
vertices of the Tukey region, a matrix with number of columns equal to space dimension and where each row represents vertex coordinates. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments. If field |
triangulated |
a logical repeating the |
facets |
facets of the Tukey region. If input argument |
volume |
volume of the Tukey region. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments. |
barycenter |
the barycenter of the Tukey region. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments. |
numRidges |
number of used ridges (for computing the last region). |
Pavlo Mozharovskyi <[email protected]>
Liu, X., Luo, S., and Zuo, Y. (2020). Some results on the computing of Tukey's halfspace median. Statistical Papers, 61, 303-316.
Liu, X., Mosler, K., and Mozharovskyi, P. (2019). Fast computation of Tukey trimmed regions and median in dimension p > 2. Journal of Computational and Graphical Statistics, 28, 682-697.
Vit Fojtik, Petra Laketa, Pavlo Mozharovskyi, and Stanislav Nagy (2022). On exact computation of Tukey depth central regions. arXiv:2208.04587.
# Load required packages
require(TukeyRegion)
require(MASS)
# Generate data
set.seed(1)
X <- mvrnorm(100, rep(0, 3),
             matrix(c(1, 1, 1, 1, 2, 2, 1, 2, 4), nrow = 3))
# Compute the Tukey median
Tm <- TukeyKMedian(X)
summary(Tm)
# Visualize the Tukey median
plot(Tm)
Computes the Tukey depth trimmed regions for all depth levels from 1 to K.
TukeyKRegions(data, maxDepth, method = "bfs", trgFacets = FALSE,
              checkInnerPoint = TRUE, retHalfspaces = TRUE,
              retHalfspacesNR = FALSE, retInnerPoint = FALSE,
              retVertices = FALSE, retFacets = FALSE, retVolume = FALSE,
              retBarycenter = FALSE, verbosity = 0L)
data |
data set for which the Tukey region shall be computed, a matrix having > 2 columns and more rows than columns. |
maxDepth |
depth level until which Tukey regions to compute, an integer between |
method |
the method to use to compute the Tukey region, a string containing |
trgFacets |
whether to triangulate facets, logical, |
checkInnerPoint |
whether to check correctness of the inner point in case it is provided, logical, |
retHalfspaces |
whether to return all found halfspaces, logical, |
retHalfspacesNR |
whether to return non-redundant halfspaces, logical, |
retInnerPoint |
whether to return inner point, logical, |
retVertices |
whether to return vertices, logical, |
retFacets |
whether to return facets, logical, |
retVolume |
whether to return volume, logical, |
retBarycenter |
whether to return the region's barycenter, logical, |
verbosity |
level of details to print during execution, integer, from |
The function computes the Tukey regions (upper-level sets of the Tukey depth function) for n points in the Euclidean d-variate space contained in data, for all depth levels up to the value of the argument maxDepth. The function iteratively calls TukeyRegion for depth levels from 1 to maxDepth, where each time the initial set of ridges coincides with all the ridges found at the previous step (see Fojtik et al., 2022).
Due to the nature of the function, arguments halfspaces and/or innerPoint cannot be provided here.
The function returns a list of objects of class TukeyRegion with fields specified by ret...-flags in the arguments:
data |
the input data set. |
depth |
chosen depth level. |
halfspacesFound |
whether at least one of the determining Tukey region halfspaces has been found. |
halfspaces |
if requested, halfspaces defining the Tukey region by their intersection, a matrix with number of columns equal to space dimension and where each row corresponds to a halfspace defined by three point numbers in |
innerPointFound |
a logical indicating whether an inner point of the region has been found. If |
innerPoint |
coordinates of a point inside of the Tukey region. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments. |
halfspacesNR |
non-redundant halfspaces (i.e. those containing Tukey region's facets), a matrix with number of columns equal to space dimension and where each row corresponds to a halfspace defined by three point numbers in |
vertices |
vertices of the Tukey region, a matrix with number of columns equal to space dimension and where each row represents vertex coordinates. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments. If field |
triangulated |
a logical repeating the |
facets |
facets of the Tukey region. If input argument |
volume |
volume of the Tukey region. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments. |
barycenter |
the barycenter of the Tukey region. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments. |
numRidges |
number of used ridges. |
Pavlo Mozharovskyi <[email protected]>
Liu, X., Mosler, K., and Mozharovskyi, P. (2019). Fast computation of Tukey trimmed regions and median in dimension p > 2. Journal of Computational and Graphical Statistics, 28, 682-697.
Vit Fojtik, Petra Laketa, Pavlo Mozharovskyi, and Stanislav Nagy (2022). On exact computation of Tukey depth central regions. arXiv:2208.04587.
# Load required packages
require(TukeyRegion)
require(MASS)
# Generate data
set.seed(1)
X <- mvrnorm(500, rep(0, 3),
             matrix(c(1, 0.25, 0.25, 0.25, 1, 0.25, 0.25, 0.25, 1), nrow = 3))
# Compute the Tukey regions
Trs <- TukeyKRegions(X, 5, "bfs", retFacets = TRUE, retVolume = TRUE,
                     retBarycenter = TRUE)
for (i in 1:5) {
  summary(Trs[[i]])
  cat("\n")
}
Computes the Tukey median set and its barycenter, the Tukey median.
TukeyMedian(data, algMedian = "bsbarydepth", method = "bfs",
            trgFacets = TRUE, retHalfspaces = FALSE,
            retHalfspacesNR = FALSE, retInnerPoint = FALSE,
            retVertices = TRUE, retFacets = TRUE, retVolume = FALSE,
            retBarycenter = TRUE, verbosity = 0)
data |
data set for which the Tukey median shall be computed, a matrix having > 2 columns and more rows than columns. |
algMedian |
the algorithm used to compute the Tukey median, a string containing |
method |
the method to use to compute the Tukey region, a string containing |
trgFacets |
whether to triangulate facets, logical, |
retHalfspaces |
whether to return all found halfspaces, logical, |
retHalfspacesNR |
whether to return non-redundant halfspaces, logical, |
retInnerPoint |
whether to return inner point, logical, |
retVertices |
whether to return vertices, logical, |
retFacets |
whether to return facets, logical, |
retVolume |
whether to return volume, logical, |
retBarycenter |
whether to return the region's barycenter, logical, |
verbosity |
level of details to print during execution, integer, from |
The function computes the Tukey median set, i.e. the region with the highest depth value, for n points in the Euclidean d-variate space contained in data.
It also computes this set's barycenter, which is the Tukey median. Four search algorithms are implemented. Algorithm "bsbarydepth" is the most efficient: it is a bisection algorithm that starts with the lower bound set to the maximum of the theoretical minimum and the depth of the componentwise median, and updates the lower bound with the depth of the barycenter of the last found region. Algorithm "cutintwo" sequentially cuts the range of remaining depths into two parts, starting with the range from 1 to the upper bound obtained by Liu, Luo, and Zuo (2020). Algorithm "downwards" checks each depth value decrementally with step 1, starting with the upper bound obtained by Liu, Luo, and Zuo (2020), until the first existing region is found. Algorithm "upwards" checks each depth value incrementally until the first non-existing region is found.
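Because Tukey regions are nested, existence of a region is monotone in the depth level, which is what makes halving the range, as in "cutintwo", valid. The base R sketch below illustrates that bisection only; it is not the package code. The helper search_bisection and the predicate region_exists are hypothetical, and the upper bound is passed in by the caller rather than derived from Liu, Luo, and Zuo (2020).

```r
# Bisection over integer depth levels.
# region_exists: function(depth) returning TRUE or FALSE (hypothetical).
# Assumes a region exists at `lower` and none exists above `upper`.
search_bisection <- function(region_exists, lower = 1, upper) {
  while (lower < upper) {
    # Round up so that the range shrinks in both branches.
    mid <- ceiling((lower + upper) / 2)
    if (region_exists(mid)) {
      lower <- mid       # the median depth is at least mid
    } else {
      upper <- mid - 1   # the median depth is below mid
    }
  }
  lower
}

# Toy predicate: regions exist up to depth 7.
search_bisection(function(k) k <= 7, lower = 1, upper = 50)
```

The number of regions to be computed grows with the logarithm of the range rather than linearly, and a tighter starting lower bound, as used by "bsbarydepth", shortens the search further.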
The main goal of the function is to provide the polytope (Tukey median set) and its barycenter (Tukey median); the settings can be adjusted, though. After the median depth is found, the TukeyRegion function is called.
See function TukeyRegion for details regarding the output.
The function returns an object of class TukeyRegion with fields specified by ret...-flags in the arguments:
data |
the input data set. |
depth |
chosen depth level. |
numRegions |
number of times the depth region has been computed. |
halfspacesFound |
whether at least one of the determining Tukey region halfspaces has been found. |
halfspaces |
if requested, halfspaces defining the Tukey region by their intersection, a matrix with number of columns equal to space dimension and where each row corresponds to a halfspace defined by three point numbers in |
innerPointFound |
a logical indicating whether an inner point of the region has been found. If |
innerPoint |
coordinates of a point inside of the Tukey region. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments. |
halfspacesNR |
non-redundant halfspaces (i.e. those containing Tukey region's facets), a matrix with number of columns equal to space dimension and where each row corresponds to a halfspace defined by three point numbers in |
vertices |
vertices of the Tukey region, a matrix with number of columns equal to space dimension and where each row represents vertex coordinates. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments. If field |
triangulated |
a logical repeating the |
facets |
facets of the Tukey region. If input argument |
volume |
volume of the Tukey region. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments. |
barycenter |
the barycenter of the Tukey region. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments. |
numRidges |
number of used ridges (for computing the last region). |
Pavlo Mozharovskyi <[email protected]>
Liu, X., Luo, S., and Zuo, Y. (2020). Some results on the computing of Tukey's halfspace median. Statistical Papers, 61, 303-316.
Liu, X., Mosler, K., and Mozharovskyi, P. (2019). Fast computation of Tukey trimmed regions and median in dimension p > 2. Journal of Computational and Graphical Statistics, 28, 682-697.
# Load required packages
require(TukeyRegion)
require(MASS)
# Generate data
set.seed(1)
X <- mvrnorm(100, rep(0, 3),
             matrix(c(1, 1, 1, 1, 2, 2, 1, 2, 4), nrow = 3))
# Compute the Tukey median
Tm <- TukeyMedian(X)
summary(Tm)
# Visualize the Tukey median
plot(Tm)
Computes the Tukey depth trimmed region for a given depth level.
TukeyRegion(data, depth, method = "bfs", trgFacets = FALSE,
            checkInnerPoint = TRUE, retHalfspaces = TRUE,
            retHalfspacesNR = FALSE, retInnerPoint = FALSE,
            retVertices = FALSE, retFacets = FALSE, retVolume = FALSE,
            retBarycenter = FALSE, halfspaces = matrix(0),
            innerPoint = numeric(1), verbosity = 0L)
data |
data set for which the Tukey region shall be computed, a matrix having |
depth |
depth of the Tukey region, an integer between |
method |
the method to use to compute the Tukey region, a string containing |
trgFacets |
whether to triangulate facets, logical, |
checkInnerPoint |
whether to check correctness of the inner point in case it is provided, logical, |
retHalfspaces |
whether to return all found halfspaces, logical, |
retHalfspacesNR |
whether to return non-redundant halfspaces, logical, |
retInnerPoint |
whether to return inner point, logical, |
retVertices |
whether to return vertices, logical, |
retFacets |
whether to return facets, logical, |
retVolume |
whether to return volume, logical, |
retBarycenter |
whether to return the region's barycenter, logical, |
halfspaces |
halfspaces defining the Tukey region by their intersection, a matrix with number of columns equal to space dimension and where each row corresponds to a halfspace defined by three point numbers in |
innerPoint |
inner point, a vector of length equal to dimension. |
verbosity |
level of details to print during execution, integer, from |
The function computes the Tukey region (upper-level set of the Tukey depth function) for n points in the Euclidean d-variate space contained in data at the depth value depth.
Three methods are implemented. Method "bfs" is the most efficient: it starts with an initial set of ridges and traverses all facets using the breadth-first search algorithm. Method "cmb" considers all subspaces spanned by combinations of d - 1 points, projects data onto their orthogonal complements (planes), and searches for bivariate quantiles in these planes. Method "bf" employs the brute-force strategy by checking all halfspaces defined by hyperplanes containing d points from data. If d = 2, method "bf" is used. See Liu, Mosler, and Mozharovskyi (2019) for details on the algorithms.
The function proceeds in three main steps. Step 1: Calculate all the halfspaces defining the Tukey region by their intersection. Many of them are usually redundant. Step 2: Find an inner point of the Tukey region, i.e. a point which lies simultaneously in all the previously calculated halfspaces. If such a point does not exist, neither does the Tukey region for this depth level; the algorithm stops and returns FALSE in the field innerPointFound. If the inner point has been found, the algorithm proceeds to Step 3: Filter the halfspaces, leaving only those containing the facets of the Tukey region. Step 3 provides the information needed to compute the vertices, facets, volume, and barycenter of the Tukey region.
halfspaces and/or innerPoint can be provided as function arguments.
The function tries to fulfill all the requirements indicated by the input flags. Step 1 is always performed (even if retHalfspaces is unset, which only means that the halfspaces are not output), unless the halfspaces are provided by the argument halfspaces. If any further ret...-flag is set, Step 2 is performed, unless retHalfspacesNR is unset and the argument innerPoint is provided. If any of retVertices, retFacets, retVolume, retBarycenter is set, Step 3 is performed.
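The following sketch shows how the flags might be combined to stop after each step. It is a usage illustration under assumptions: it only runs the package calls if TukeyRegion and MASS are installed, the depth level 5 and the simulated data are arbitrary choices, and field_present is a hypothetical helper, not a package function.

```r
# Hypothetical helper: TRUE if a field is present in the returned object.
field_present <- function(region, name) !is.null(region[[name]])

if (requireNamespace("TukeyRegion", quietly = TRUE) &&
    requireNamespace("MASS", quietly = TRUE)) {
  set.seed(1)
  X <- MASS::mvrnorm(100, rep(0, 3), diag(3))

  # Step 1 only: halfspaces are returned by default (retHalfspaces = TRUE).
  Tr.step1 <- TukeyRegion::TukeyRegion(X, 5)

  # Steps 1 and 2: additionally request the inner point.
  Tr.step2 <- TukeyRegion::TukeyRegion(X, 5, retInnerPoint = TRUE)

  # Steps 1 to 3: request the volume and the barycenter.
  Tr.step3 <- TukeyRegion::TukeyRegion(X, 5, retVolume = TRUE,
                                       retBarycenter = TRUE)

  # Access results only if the region exists at this depth level.
  if (isTRUE(Tr.step3$innerPointFound) &&
      field_present(Tr.step3, "barycenter")) {
    print(Tr.step3$volume)
    print(Tr.step3$barycenter)
  }
}
```

Checking innerPointFound before reading volume or barycenter guards against depth levels at which no region exists, in which case those fields are absent.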
The region can be visualized in 2- and 3-dimensional space by plot(...), general information can be printed by print(...), and statistics can be summarized by summary(...).
The function returns an object of class TukeyRegion with fields specified by ret...-flags in the arguments:
data |
the input data set. |
depth |
chosen depth level. |
halfspacesFound |
whether at least one of the determining Tukey region halfspaces has been found. |
halfspaces |
if requested, halfspaces defining the Tukey region by their intersection, a matrix with number of columns equal to space dimension and where each row corresponds to a halfspace defined by three point numbers in |
innerPointFound |
a logical indicating whether an inner point of the region has been found. If |
innerPoint |
coordinates of a point inside of the Tukey region. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments. |
halfspacesNR |
non-redundant halfspaces (i.e. those containing Tukey region's facets), a matrix with number of columns equal to space dimension and where each row corresponds to a halfspace defined by three point numbers in |
vertices |
vertices of the Tukey region, a matrix with number of columns equal to space dimension and where each row represents vertex coordinates. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments. If field |
triangulated |
a logical repeating the |
facets |
facets of the Tukey region. If input argument |
volume |
volume of the Tukey region. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments. |
barycenter |
the barycenter of the Tukey region. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments. |
numRidges |
number of used ridges. |
Pavlo Mozharovskyi <[email protected]>
Liu, X., Mosler, K., and Mozharovskyi, P. (2019). Fast computation of Tukey trimmed regions and median in dimension p > 2. Journal of Computational and Graphical Statistics, 28, 682-697.
# Load required packages
require(TukeyRegion)
require(MASS)
# Generate data
set.seed(1)
X <- mvrnorm(500, rep(0, 3),
  matrix(c(1, 1, 1, 1, 2, 2, 1, 2, 4), nrow = 3))
# Compute the Tukey region
Tr <- TukeyRegion(X, 10, "bfs",
  retFacets = TRUE, retVolume = TRUE, retBarycenter = TRUE)
summary(Tr)
# Visualize the Tukey region
plot(Tr)
Computes the Tukey depth trimmed regions for given depth levels.
TukeyRegions(data, depths, method = "bfs", trgFacets = FALSE, checkInnerPoint = TRUE, retHalfspaces = TRUE, retHalfspacesNR = FALSE, retInnerPoint = FALSE, retVertices = FALSE, retFacets = FALSE, retVolume = FALSE, retBarycenter = FALSE, verbosity = 0L)
data |
data set for which the Tukey region shall be computed, a matrix having |
depths |
depths of the Tukey regions to compute, a vector of integers between |
method |
the method to use to compute the Tukey region, a string containing |
trgFacets |
whether to triangulate facets, logical, |
checkInnerPoint |
whether to check correctness of the inner point in case it is provided, logical, |
retHalfspaces |
whether to return all found halfspaces, logical, |
retHalfspacesNR |
whether to return non-redundant halfspaces, logical, |
retInnerPoint |
whether to return inner point, logical, |
retVertices |
whether to return vertices, logical, |
retFacets |
whether to return facets, logical, |
retVolume |
whether to return volume, logical, |
retBarycenter |
whether to return the region's barycenter, logical, |
verbosity |
level of details to print during execution, integer, from |
The function computes the Tukey regions (upper-level sets of the Tukey depth function) for n points in the Euclidean d-variate space contained in data at the depth values specified in the argument depths. This function calls the function TukeyRegion multiple times and provides computational convenience by moving the loop to the C++ level.
The main difference from TukeyRegion is that the arguments halfspaces and/or innerPoint cannot be provided here.
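The relationship to TukeyRegion described above can be pictured as a plain R loop over the depth levels. The helper regions_by_loop is hypothetical and only illustrates the idea; it takes the region function as an argument so that it can be tried without the package. The package call is skipped when TukeyRegion is not installed, and no claim is made that the loop reproduces the output of TukeyRegions field by field.

```r
# Hypothetical helper: call a single-depth region function once per depth.
regions_by_loop <- function(data, depths, region_fun, ...) {
  lapply(depths, function(k) region_fun(data, k, ...))
}

if (requireNamespace("TukeyRegion", quietly = TRUE)) {
  set.seed(1)
  X <- matrix(rnorm(900), ncol = 3)        # 300 points in dimension 3
  # R-level loop over TukeyRegion, one call per depth level
  Trs <- regions_by_loop(X, c(5, 25), TukeyRegion::TukeyRegion,
                         method = "bfs", retVolume = TRUE)
  print(sapply(Trs, function(tr) tr$depth))
}
```

Since the arguments halfspaces and innerPoint are not available in TukeyRegions, a loop of this kind over TukeyRegion is one way to keep supplying them per depth level, at the cost of the R-level overhead that TukeyRegions avoids.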
The function returns a list of objects of class TukeyRegion with fields specified by ret...-flags in the arguments:
data |
the input data set. |
depth |
chosen depth level. |
halfspacesFound |
whether at least one of the determining Tukey region halfspaces has been found. |
halfspaces |
if requested, halfspaces defining the Tukey region by their intersection, a matrix with number of columns equal to space dimension and where each row corresponds to a halfspace defined by three point numbers in |
innerPointFound |
a logical indicating whether an inner point of the region has been found. If |
innerPoint |
coordinates of a point inside of the Tukey region. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments. |
halfspacesNR |
non-redundant halfspaces (i.e. those containing Tukey region's facets), a matrix with number of columns equal to space dimension and where each row corresponds to a halfspace defined by three point numbers in |
vertices |
vertices of the Tukey region, a matrix with number of columns equal to space dimension and where each row represents vertex coordinates. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments. If field |
triangulated |
a logical repeating the |
facets |
facets of the Tukey region. If input argument |
volume |
volume of the Tukey region. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments. |
barycenter |
the barycenter of the Tukey region. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments. |
numRidges |
number of used ridges. |
Pavlo Mozharovskyi <[email protected]>
Liu, X., Mosler, K., and Mozharovskyi, P. (2019). Fast computation of Tukey trimmed regions and median in dimension p > 2. Journal of Computational and Graphical Statistics, 28, 682-697.
# Load required packages
require(TukeyRegion)
require(MASS)
# Generate data
set.seed(1)
X <- mvrnorm(500, rep(0, 3),
  matrix(c(1, 0.25, 0.25, 0.25, 1, 0.25, 0.25, 0.25, 1), nrow = 3))
# Compute the Tukey regions
Trs <- TukeyRegions(X, c(5, 25), "bfs",
  retFacets = TRUE, retVolume = TRUE, retBarycenter = TRUE)
summary(Trs[[1]])
summary(Trs[[2]])
# Visualize the Tukey regions
plot(Trs[[2]], drawRidges = FALSE, colorFacets = "green", alpha = 1)
plot(Trs[[1]], drawRidges = FALSE, newPlot = FALSE, colorFacets = "blue", alpha = 0.5)