Package 'TukeyRegion'

Title: Tukey Region and Median
Description: Tukey regions are polytopes in the Euclidean space, viz. upper-level sets of the Tukey depth function on given data. The bordering hyperplanes of a Tukey region are computed as well as its vertices, facets, centroid, and volume. In addition, the Tukey median set, which is the non-empty Tukey region having highest depth level, and its barycenter (= Tukey median) are calculated. Tukey regions are visualized in dimension two and three. For details see Liu, Mosler, and Mozharovskyi (2019, <doi:10.1080/10618600.2018.1546595>). See file LICENSE.note for additional license information.
Authors: C.B. Barber [aut, cph] (Qhull library), The Geometry Center University of Minnesota [cph] (Qhull library), Pavlo Mozharovskyi [aut, cre]
Maintainer: Pavlo Mozharovskyi <[email protected]>
License: GPL (>= 3)
Version: 0.1.6.3
Built: 2025-03-05 05:33:24 UTC
Source: https://github.com/cran/TukeyRegion

Computation of the Tukey Region and the Tukey Median

Description

Tukey regions are polytopes in the Euclidean space, viz. upper-level sets of the Tukey depth function on given data. The bordering hyperplanes of a Tukey region are computed as well as its vertices, facets, centroid, and volume. In addition, the Tukey median set, which is the non-empty Tukey region having highest depth level, and its barycenter (= Tukey median) are calculated. Tukey regions are visualized in dimension two and three. For details see Liu, Mosler, and Mozharovskyi (2019).

Details

Proposed initially by John W. Tukey in 1975 (see also Donoho and Gasko, 1992), Tukey depth measures the centrality of an arbitrary point in the Euclidean space w.r.t. a data cloud. For a point, the Tukey (also halfspace or location) depth is defined as the smallest portion of observations that can be cut off by a closed halfspace containing this point. For a given depth level, the Tukey (trimmed) region is defined as the upper-level set of the Tukey depth function; it constitutes a closed polytope. The Tukey region of the highest level is referred to as the Tukey median set, while its barycenter is referred to as the Tukey median. Due to its affine invariance, quasiconcavity, vanishing at infinity, and the high breakdown point of the median set, the Tukey depth has attracted the attention of statisticians and experienced substantial theoretical development. It is used in numerous applications including multivariate data analysis, outlier detection, tests for location (also scale and symmetry), classification, statistical quality control, imputation of missing data, etc.
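The definition above can be illustrated with a minimal base R sketch that approximates the Tukey depth of a point by minimizing over random directions. The function name approx_tukey_depth and its argument n_dir are hypothetical and are not part of this package or of ddalpha; the result is only an upper bound on the exact depth, since not all directions are checked.

```r
# Approximate Tukey depth of point z w.r.t. the rows of X (illustrative sketch).
approx_tukey_depth <- function(z, X, n_dir = 1000) {
  d <- ncol(X)
  # Draw random directions and normalize them to unit length
  U <- matrix(rnorm(n_dir * d), ncol = d)
  U <- U / sqrt(rowSums(U^2))
  # Project the data (n x n_dir) and the point (length n_dir) on each direction
  P <- X %*% t(U)
  pz <- as.vector(U %*% z)
  # Per direction: number of observations in the closed halfspace {x : u'x >= u'z}
  counts <- colSums(sweep(P, 2, pz, ">="))
  # Smallest portion of observations cut off over all checked directions
  min(counts) / nrow(X)
}

set.seed(1)
X <- matrix(rnorm(200), ncol = 2)
approx_tukey_depth(c(0, 0), X)      # a central point: comparatively high depth
approx_tukey_depth(c(100, 100), X)  # a point far outside the cloud
```

For exact computation, or for a tuned approximation, function depth.halfspace of package ddalpha should be preferred.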

Package TukeyRegion provides routines for computation (TukeyRegion) and visualization (plot) of the Tukey depth trimmed region, the Tukey median set and Tukey median (TukeyMedian), and Tukey depth weighted and/or trimmed mean (depth.wm).

For computation of Tukey depth see function depth.halfspace of package ddalpha.

Package: TukeyRegion
Type: Package
Version: 0.1.6.3
Date: 2023-04-17
License: GPL (>= 3)

Author(s)

Authors: C.B. Barber [aut, cph] (Qhull library), The Geometry Center of University of Minnesota [cph] (Qhull library), Pavlo Mozharovskyi [aut, cre]

Maintainer: Pavlo Mozharovskyi, <[email protected]>

References

Donoho, D.L. and Gasko, M. (1992). Breakdown properties of location estimates based on halfspace depth and projected outlyingness. The Annals of Statistics, 20(4), 1803-1827.

Dyckerhoff, R. and Mozharovskyi, P. (2016). Exact computation of the halfspace depth. Computational Statistics and Data Analysis, 98, 19-30.

Hallin, M., Paindaveine, D., and Siman, M. (2010). Multivariate quantiles and multiple-output regression quantiles: from L1-optimization to halfspace depth. The Annals of Statistics, 38, 635-669.

Kong, L. and Mizera, I. (2012). Quantile tomography: using quantiles with multivariate data. Statistica Sinica, 22, 1589-1610. Published online as arXiv:0805.0056 [stat.ME] (2008).

Liu, X., Luo, S., and Zuo, Y. (2020). Some results on the computing of Tukey's halfspace median. Statistical Papers, 61, 303-316.

Liu, X., Mosler, K., and Mozharovskyi, P. (2019). Fast computation of Tukey trimmed regions and median in dimension p > 2. Journal of Computational and Graphical Statistics, 28, 682-697.

Tukey, J.W. (1975). Mathematics and the picturing of data. In: James, R.D. (Ed.), Proceedings of the International Congress of Mathematicians (Volume 2), Canadian Mathematical Congress, Vancouver, 523-531.

See Also

TukeyRegion, TukeyMedian, depth.wm, depth.halfspace, ddalpha.

Examples

# Load required packages
require(TukeyRegion)
require(MASS)
# Generate data
set.seed(1)
X <- mvrnorm(100, rep(0, 3), diag(3))
# Compute and visualize two Tukey regions
Tr1 <- TukeyRegion(X, 5, "bfs",
  retFacets = TRUE, retVolume = TRUE, retBarycenter = TRUE)
Tr2 <- TukeyRegion(X, 25, "bfs",
  retFacets = TRUE, retVolume = TRUE, retBarycenter = TRUE)
plot(Tr1, colorFacets = "red", colorRidges = "red",
  colorPoints = "blue", alpha = 0.35)
plot(Tr2, newPlot = FALSE, drawPoints = FALSE, colorFacets = "green",
  colorRidges = "green", alpha = 0.65)
(Tr1$barycenter)
(Tr2$barycenter)
# Compute arithmetic mean
T.mean <- colMeans(X)
(T.mean)
# Compute Tukey depth trimmed weighted mean (approximate depth)
T.approx1 <- depth.wm(X, 0.25)
(T.approx1)
T.approx2 <- depth.wm(X, 75)
(T.approx2)
# Compute Tukey depth trimmed weighted mean (exact depth)
T.exact1 <- depth.wm(X, 0.25, exact = TRUE)
(T.exact1)
T.exact2 <- depth.wm(X, 75, exact = TRUE)
(T.exact2)
# Compute and visualize Tukey median
Tm <- TukeyMedian(X)
(Tm$barycenter)
plot(Tm, newPlot = FALSE, drawPoints = FALSE)

Computation of the Tukey depth weighted and/or trimmed mean

Description

Computes the Tukey depth weighted and/or trimmed mean for a given depth level or for a given number of deepest points.

Usage

depth.wm(data, depth.level = 1/nrow(data), weighted = TRUE, 
  break.ties = "atRandom", ...)

Arguments

data

data set for which the weighted mean should be computed, a matrix having > 2 columns and more rows than columns.

depth.level

either Tukey depth level for trimming (a numeric between 1/(number of rows in data) and 1) or the number of deepest points to take into account (an integer between one and the number of rows in data).

weighted

whether the trimmed mean should be weighted by depth, logical, TRUE by default.

break.ties

the way to break ties if the number of deepest points is given, character. If "atRandom" (by default) ties are broken at random, for any other value input points' order is used.

...

further arguments passed to function depth.halfspace of package ddalpha. See ‘Details’ for additional information.

Details

After having computed the Tukey depth of each point in data, the function operates in one of two modes. If depth.level lies between 0 and 1, the function computes the trimmed (weighted if specified by flag weighted) mean of all points having at least the given depth level. If depth.level specifies the number of points (an integer between 1 and the number of rows in data), the trimmed (weighted) mean of the depth.level deepest points is calculated, breaking ties according to argument break.ties (ties can occur due to the discrete nature of the Tukey depth). This follows the idea of Donoho and Gasko (1992); see also this article for the breakdown point.

Depth of points is calculated by means of external function depth.halfspace from package ddalpha, whose arguments can be specified as well. In particular, argument exact specifies whether Tukey depth is computed exactly (TRUE) or approximated (FALSE) by random projections; for the latter case argument num.directions specifies the number of random directions to use. For further details about the algorithm see Dyckerhoff and Mozharovskyi (2016).
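The two modes described above can be sketched in base R as follows. This is only an illustration under simplifying assumptions, not the code of depth.wm: the helper depth_trimmed_mean is hypothetical, the depths are passed in precomputed, a level below 1 is read as a depth level and any other value as a number of points, and ties are broken by input order only.

```r
# Depth-trimmed (optionally depth-weighted) mean from precomputed depths.
depth_trimmed_mean <- function(X, depths, level, weighted = TRUE) {
  if (level < 1) {
    # Mode 1: keep all points having at least the given depth level
    keep <- which(depths >= level)
  } else {
    # Mode 2: keep the `level` deepest points; ties follow input order
    keep <- order(-depths)[seq_len(level)]
  }
  # Weights are the depths themselves, or all ones for the plain trimmed mean
  w <- if (weighted) depths[keep] else rep(1, length(keep))
  # Row i of the kept data is multiplied by w[i] (column-wise recycling)
  colSums(X[keep, , drop = FALSE] * w) / sum(w)
}

X <- rbind(c(0, 0), c(2, 0), c(0, 2), c(10, 10))
depths <- c(0.5, 0.25, 0.25, 0.1)
depth_trimmed_mean(X, depths, 0.25)                   # trims the outlying 4th point
depth_trimmed_mean(X, depths, 0.25, weighted = FALSE)
depth_trimmed_mean(X, depths, 2)                      # two deepest points only
```

If no point reaches the requested depth level, this sketch returns NaN; how depth.wm treats that case is not covered here.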

Value

The function returns the weighted and/or trimmed mean, a point in the d-variate Euclidean space (d is the number of columns in data), a numeric vector.

Author(s)

Pavlo Mozharovskyi <[email protected]>

References

Donoho, D.L. and Gasko, M. (1992). Breakdown properties of location estimates based on halfspace depth and projected outlyingness. The Annals of Statistics, 20(4), 1803-1827.

Dyckerhoff, R. and Mozharovskyi, P. (2016). Exact computation of the halfspace depth. Computational Statistics and Data Analysis, 98, 19-30.

See Also

TukeyMedian

Examples

# Load required packages
require(TukeyRegion)
require(bfp)
# Generate data
set.seed(1)
X <- bfp:::rmvt(150, diag(3), rep(0, 3), 1)
# Compute arithmetic mean
T.mean <- colMeans(X)
(T.mean)
# Compute Tukey depth trimmed weighted mean (approximate depth)
T.approx1 <- depth.wm(X, 0.25)
(T.approx1)
T.approx2 <- depth.wm(X, 25)
(T.approx2)
# Compute Tukey depth trimmed weighted mean (exact depth)
T.exact1 <- depth.wm(X, 0.25, exact = TRUE)
(T.exact1)
T.exact2 <- depth.wm(X, 25, exact = TRUE)
(T.exact2)
# Compute Tukey median
Tm <- TukeyMedian(X)
(Tm$barycenter)

Plot the Tukey Region

Description

Plots the two-dimensional and the three-dimensional Tukey region.

Usage

## S3 method for class 'TukeyRegion'
plot(x, newPlot = TRUE, drawPoints = TRUE, 
  drawRidges = TRUE, colorBackground = "white", 
  colorPoints = "red", colorFacets = "blue", 
  colorRidges = "green", lwd = 1, lty = 1, alpha = 1, ...)

Arguments

x

object of class TukeyRegion to be plotted.

newPlot

whether to create a new plot(2D)/scene(3D).

drawPoints

whether to show the data points.

drawRidges

whether to show the ridges; works for non-triangulated facets only.

colorBackground

background color of the plot(2D)/scene(3D).

colorPoints

color of the points in case they are shown.

colorFacets

color of the facets.

colorRidges

color of the facets' ridges in case they are shown.

lwd

line width of the facets in 2D.

lty

line type of the facets in 2D.

alpha

transparency of the facets (and ridges if shown).

...

included for compatibility and should not be used.

Details

If dimension is equal to two, the traditional plot is produced. If dimension is equal to three, the 3D-scene is produced using the package rgl.

Author(s)

Pavlo Mozharovskyi <[email protected]>

See Also

TukeyRegion, TukeyMedian

Examples

# See examples in TukeyRegion or TukeyMedian

Prints Summary of the Tukey Region

Description

Prints basic information about the Tukey region.

Usage

## S3 method for class 'TukeyRegion'
summary(object, ...)

Arguments

object

object of class TukeyRegion for which the summary should be printed.

...

included for compatibility and should not be used.

Value

Prints in the console basic information about the computed Tukey region.

Author(s)

Pavlo Mozharovskyi <[email protected]>

See Also

TukeyRegion, TukeyMedian

Examples

# See examples in TukeyRegion or TukeyMedian

Computation of the Tukey median set and Tukey median

Description

Computes the Tukey median set and its barycenter, the Tukey median, starting from the region with depth 1 and iteratively increasing the depth, according to the algorithm by Fojtik et al. (2022).

Usage

TukeyKMedian(data, algMedian = "upwards", method = "bfs",
  trgFacets = TRUE, retHalfspaces = FALSE, retHalfspacesNR = FALSE,
  retInnerPoint = FALSE, retVertices = TRUE, retFacets = TRUE,
  retVolume = FALSE, retBarycenter = TRUE, verbosity = 0)

Arguments

data

data set for which the Tukey median shall be computed, a matrix having > 2 columns and more rows than columns.

algMedian

the algorithm used to compute the Tukey median, a string containing "bsbarydepth", or "cutintwo", or "downwards", or "upwards", see ‘Details’, "upwards" by default.

method

the method to use to compute the Tukey region, a string containing "bfs", or "cmb", or "bf", see TukeyRegion, "bfs" by default.

trgFacets

whether to triangulate facets, logical, TRUE by default. If facets are triangulated, no facet ridges are plotted, see ‘Value’.

retHalfspaces

whether to return all found halfspaces, logical, FALSE by default.

retHalfspacesNR

whether to return non-redundant halfspaces, logical, FALSE by default.

retInnerPoint

whether to return inner point, logical, FALSE by default.

retVertices

whether to return vertices, logical, TRUE by default.

retFacets

whether to return facets, logical, TRUE by default.

retVolume

whether to return volume, logical, FALSE by default.

retBarycenter

whether to return the region's barycenter, logical, TRUE by default.

verbosity

level of details to print during execution, integer, from 0 (= print nothing) to 2 (= print all details).

Details

The function computes the Tukey median set, i.e. the region with the highest depth value, for n points in the Euclidean d-variate space contained in data.

It also computes this set's barycenter, which is the Tukey median. Four search algorithms are implemented: Algorithm "bsbarydepth" is the most efficient; it is the bisection algorithm starting with the lower bound as the maximum of the theoretical minimum and the depth of the componentwise median, and updating the lower bound with the depth of the barycenter of the last found region. Algorithm "cutintwo" sequentially cuts the range of remaining depths into two parts, starting with the range from 1 to the upper bound obtained by Liu, Luo, and Zuo (2020). Algorithm "downwards" checks each depth value decrementally with step 1, starting with the upper bound obtained by Liu, Luo, and Zuo (2020), until the first existing region is found. Algorithm "upwards" checks each depth value incrementally until the first non-existing region is found.

The main goal of the function is to provide the polytope (Tukey median set) and its barycenter (Tukey median); the settings can be adjusted though. After the median depth is found, the TukeyRegion function is called.

See function TukeyRegion for details regarding the output.
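The "upwards" search described above can be sketched in base R as follows. This is an illustration only, not the package's implementation: search_upwards and the predicate region_exists are hypothetical names, and the region of depth 1 is assumed to exist.

```r
# Incremental ("upwards") search for the largest depth whose region exists.
# region_exists(k) is a hypothetical predicate: TRUE if the region of depth k exists.
search_upwards <- function(region_exists, n) {
  k <- 1                                  # assumption: the depth-1 region exists
  # Stop at the first depth whose region does not exist
  while (k + 1 <= n && region_exists(k + 1)) {
    k <- k + 1
  }
  k
}

# Toy predicate standing in for an actual region computation.
# With the package one might instead use (an assumption based on the
# description of field innerPointFound, not verified here):
#   region_exists <- function(k)
#     isTRUE(TukeyRegion(X, k, retInnerPoint = TRUE)$innerPointFound)
toy_exists <- function(k) k <= 37
search_upwards(toy_exists, n = 100)
```

The search is valid because Tukey regions are nested: once the region of some depth does not exist, neither does any region of higher depth.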

Value

The function returns an object of class TukeyRegion with fields specified by ret...-flags in the arguments:

data

the input data set.

depth

chosen depth level.

numRegions

number of times the depth region has been computed.

halfspacesFound

whether at least one of the determining Tukey region halfspaces has been found.

halfspaces

if requested, halfspaces defining the Tukey region by their intersection, a matrix with number of columns equal to space dimension and where each row corresponds to a halfspace defined by three point numbers in data.

innerPointFound

a logical indicating whether an inner point of the region has been found. If FALSE then the region of the given depth does not exist. If the field is absent then the inner point has not been requested by the input arguments.

innerPoint

coordinates of a point inside of the Tukey region. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments.

halfspacesNR

non-redundant halfspaces (i.e. those containing Tukey region's facets), a matrix with number of columns equal to space dimension and where each row corresponds to a halfspace defined by three point numbers in data. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments.

vertices

vertices of the Tukey region, a matrix with number of columns equal to space dimension and where each row represents vertex coordinates. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments. If field facets is returned, this field is returned as well.

triangulated

a logical repeating the trgFacets input argument. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments.

facets

facets of the Tukey region. If input argument trgFacets is set, then this is a list where each element is an array enumerating numbers of the rows in field vertices, their number for each facet can vary. If input argument trgFacets is unset, then this is a matrix with each row corresponding to a triangulated facet, and no facets' ridges reconstruction is performed, so it cannot be visualized. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments.

volume

volume of the Tukey region. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments.

barycenter

the barycenter of the Tukey region. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments.

numRidges

number of used ridges (for computing the last region).

Author(s)

Pavlo Mozharovskyi <[email protected]>

References

Liu, X., Luo, S., and Zuo, Y. (2020). Some results on the computing of Tukey's halfspace median. Statistical Papers, 61, 303-316.

Liu, X., Mosler, K., and Mozharovskyi, P. (2019). Fast computation of Tukey trimmed regions and median in dimension p > 2. Journal of Computational and Graphical Statistics, 28, 682-697.

Fojtik, V., Laketa, P., Mozharovskyi, P., and Nagy, S. (2022). On exact computation of Tukey depth central regions. arXiv:2208.04587.

See Also

TukeyRegion, depth.wm

Examples

# Load required packages
require(TukeyRegion)
require(MASS)
# Generate data
set.seed(1)
X <- mvrnorm(100, rep(0, 3),
  matrix(c(1, 1, 1, 1, 2, 2, 1, 2, 4), nrow = 3))
# Compute the Tukey median
Tm <- TukeyKMedian(X)
summary(Tm)
# Visualize the Tukey median
plot(Tm)

Computation of the Tukey Region

Description

Computes the Tukey depth trimmed regions for all depth levels from 1 to maxDepth.

Usage

TukeyKRegions(data, maxDepth, method = "bfs",
  trgFacets = FALSE, checkInnerPoint = TRUE,
  retHalfspaces = TRUE, retHalfspacesNR = FALSE,
  retInnerPoint = FALSE, retVertices = FALSE,
  retFacets = FALSE, retVolume = FALSE, retBarycenter = FALSE,
  verbosity = 0L)

Arguments

data

data set for which the Tukey region shall be computed, a matrix having > 2 columns and more rows than columns.

maxDepth

depth level up to which Tukey regions are computed, an integer between 1 and the half of the number of rows in data.

method

the method to use to compute the Tukey region, a string containing "bfs", or "cmb", or "bf", see ‘Details’, "bfs" by default.

trgFacets

whether to triangulate facets, logical, FALSE by default. If facets are triangulated, no facet ridges are plotted, see ‘Value’.

checkInnerPoint

whether to check correctness of the inner point in case it is provided, logical, TRUE by default.

retHalfspaces

whether to return all found halfspaces, logical, TRUE by default.

retHalfspacesNR

whether to return non-redundant halfspaces, logical, FALSE by default.

retInnerPoint

whether to return inner point, logical, FALSE by default.

retVertices

whether to return vertices, logical, FALSE by default.

retFacets

whether to return facets, logical, FALSE by default.

retVolume

whether to return volume, logical, FALSE by default.

retBarycenter

whether to return the region's barycenter, logical, FALSE by default.

verbosity

level of details to print during execution, integer, from 0 (= print nothing) to 2 (= print all details).

Details

The function computes the Tukey regions (upper-level sets of the Tukey depth function) for n points in the Euclidean d-variate space contained in data, for all depth values from 1 to maxDepth. This function iteratively calls function TukeyRegion for depth levels from 1 to maxDepth, where each time the initial set of ridges coincides with all the ridges found at the previous step (see Fojtik et al., 2022).

Due to the nature of the function, arguments halfspaces and/or innerPoint cannot be provided here.

Value

The function returns a list of objects of class TukeyRegion with fields specified by ret...-flags in the arguments:

data

the input data set.

depth

chosen depth level.

halfspacesFound

whether at least one of the determining Tukey region halfspaces has been found.

halfspaces

if requested, halfspaces defining the Tukey region by their intersection, a matrix with number of columns equal to space dimension and where each row corresponds to a halfspace defined by three point numbers in data.

innerPointFound

a logical indicating whether an inner point of the region has been found. If FALSE then the region of the given depth does not exist. If the field is absent then the inner point has not been requested by the input arguments.

innerPoint

coordinates of a point inside of the Tukey region. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments.

halfspacesNR

non-redundant halfspaces (i.e. those containing Tukey region's facets), a matrix with number of columns equal to space dimension and where each row corresponds to a halfspace defined by three point numbers in data. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments.

vertices

vertices of the Tukey region, a matrix with number of columns equal to space dimension and where each row represents vertex coordinates. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments. If field facets is returned, this field is returned as well.

triangulated

a logical repeating the trgFacets input argument. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments.

facets

facets of the Tukey region. If input argument trgFacets is set, then this is a list where each element is an array enumerating numbers of the rows in field vertices; their number for each facet can vary. If input argument trgFacets is unset, then this is a matrix with each row corresponding to a triangulated facet, and no facets' ridges reconstruction is performed. So it cannot be visualized. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments.

volume

volume of the Tukey region. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments.

barycenter

the barycenter of the Tukey region. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments.

numRidges

number of used ridges.

Author(s)

Pavlo Mozharovskyi <[email protected]>

References

Liu, X., Mosler, K., and Mozharovskyi, P. (2019). Fast computation of Tukey trimmed regions and median in dimension p > 2. Journal of Computational and Graphical Statistics, 28, 682-697.

Fojtik, V., Laketa, P., Mozharovskyi, P., and Nagy, S. (2022). On exact computation of Tukey depth central regions. arXiv:2208.04587.

See Also

TukeyRegion, TukeyMedian

Examples

# Load required packages
require(TukeyRegion)
require(MASS)
# Generate data
set.seed(1)
X <- mvrnorm(500, rep(0, 3),
  matrix(c(1, 0.25, 0.25, 0.25, 1, 0.25, 0.25, 0.25, 1), nrow = 3))
# Compute the Tukey region
Trs <- TukeyKRegions(X, 5, "bfs",
  retFacets = TRUE, retVolume = TRUE, retBarycenter = TRUE)
for (i in 1:5){
  summary(Trs[[i]])
  cat("\n")
}

Computation of the Tukey median set and Tukey median

Description

Computes the Tukey median set and its barycenter, the Tukey median.

Usage

TukeyMedian(data, algMedian = "bsbarydepth", method = "bfs",
  trgFacets = TRUE, retHalfspaces = FALSE, retHalfspacesNR = FALSE,
  retInnerPoint = FALSE, retVertices = TRUE, retFacets = TRUE,
  retVolume = FALSE, retBarycenter = TRUE, verbosity = 0)

Arguments

data

data set for which the Tukey median shall be computed, a matrix having > 2 columns and more rows than columns.

algMedian

the algorithm used to compute the Tukey median, a string containing "bsbarydepth", or "cutintwo", or "downwards", or "upwards", see ‘Details’, "bsbarydepth" by default.

method

the method to use to compute the Tukey region, a string containing "bfs", or "cmb", or "bf", see TukeyRegion, "bfs" by default.

trgFacets

whether to triangulate facets, logical, TRUE by default. If facets are triangulated, no facet ridges are plotted, see ‘Value’.

retHalfspaces

whether to return all found halfspaces, logical, FALSE by default.

retHalfspacesNR

whether to return non-redundant halfspaces, logical, FALSE by default.

retInnerPoint

whether to return inner point, logical, FALSE by default.

retVertices

whether to return vertices, logical, TRUE by default.

retFacets

whether to return facets, logical, TRUE by default.

retVolume

whether to return volume, logical, FALSE by default.

retBarycenter

whether to return the region's barycenter, logical, TRUE by default.

verbosity

level of details to print during execution, integer, from 0 (= print nothing) to 2 (= print all details).

Details

The function computes the Tukey median set, i.e. the region with the highest depth value, for n points in the Euclidean d-variate space contained in data.

It also computes this set's barycenter, which is the Tukey median. Four search algorithms are implemented: Algorithm "bsbarydepth" is the most efficient; it is the bisection algorithm starting with the lower bound as the maximum of the theoretical minimum and the depth of the componentwise median, and updating the lower bound with the depth of the barycenter of the last found region. Algorithm "cutintwo" sequentially cuts the range of remaining depths into two parts, starting with the range from 1 to the upper bound obtained by Liu, Luo, and Zuo (2020). Algorithm "downwards" checks each depth value decrementally with step 1, starting with the upper bound obtained by Liu, Luo, and Zuo (2020), until the first existing region is found. Algorithm "upwards" checks each depth value incrementally until the first non-existing region is found.

The main goal of the function is to provide the polytope (Tukey median set) and its barycenter (Tukey median); the settings can be adjusted though. After the median depth is found, the TukeyRegion function is called.

See function TukeyRegion for details regarding the output.
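The idea of cutting the range of remaining depths into two parts can be sketched as a plain bisection in base R. This is a simplified illustration, not the package's "cutintwo" or "bsbarydepth" code: search_bisect and region_exists are hypothetical names, the region at depth lower is assumed to exist, and no region is assumed to exist above upper.

```r
# Bisection over integer depths for the largest depth whose region exists.
# region_exists(k) is a hypothetical predicate: TRUE if the region of depth k exists.
search_bisect <- function(region_exists, lower, upper) {
  # Invariant: the region at `lower` exists, no region exists above `upper`
  while (lower < upper) {
    mid <- ceiling((lower + upper) / 2)
    if (region_exists(mid)) {
      lower <- mid        # median depth is at least mid
    } else {
      upper <- mid - 1    # median depth is below mid
    }
  }
  lower
}

# Toy predicate standing in for an actual region computation
toy_exists <- function(k) k <= 37
search_bisect(toy_exists, lower = 1, upper = 50)
```

Compared with checking depths one by one, bisection needs a number of region computations that grows only logarithmically with the width of the initial range, which is why good initial bounds matter.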

Value

The function returns an object of class TukeyRegion with fields specified by ret...-flags in the arguments:

data

the input data set.

depth

chosen depth level.

numRegions

number of times the depth region has been computed.

halfspacesFound

whether at least one of the determining Tukey region halfspaces has been found.

halfspaces

if requested, halfspaces defining the Tukey region by their intersection, a matrix with number of columns equal to space dimension and where each row corresponds to a halfspace defined by three point numbers in data.

innerPointFound

a logical indicating whether an inner point of the region has been found. If FALSE then the region of the given depth does not exist. If the field is absent then the inner point has not been requested by the input arguments.

innerPoint

coordinates of a point inside of the Tukey region. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments.

halfspacesNR

non-redundant halfspaces (i.e. those containing Tukey region's facets), a matrix with number of columns equal to space dimension and where each row corresponds to a halfspace defined by three point numbers in data. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments.

vertices

vertices of the Tukey region, a matrix with number of columns equal to space dimension and where each row represents vertex coordinates. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments. If field facets is returned, this field is returned as well.

triangulated

a logical repeating the trgFacets input argument. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments.

facets

facets of the Tukey region. If input argument trgFacets is set, then this is a list where each element is an array enumerating numbers of the rows in field vertices, their number for each facet can vary. If input argument trgFacets is unset, then this is a matrix with each row corresponding to a triangulated facet, and no facets' ridges reconstruction is performed, so it cannot be visualized. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments.

volume

volume of the Tukey region. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments.

barycenter

the barycenter of the Tukey region. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments.

numRidges

number of used ridges (for computing the last region).

Author(s)

Pavlo Mozharovskyi <[email protected]>

References

Liu, X., Luo, S., and Zuo, Y. (2020). Some results on the computing of Tukey's halfspace median. Statistical Papers, 61, 303-316.

Liu, X., Mosler, K., and Mozharovskyi, P. (2019). Fast computation of Tukey trimmed regions and median in dimension p > 2. Journal of Computational and Graphical Statistics, 28, 682-697.

See Also

TukeyRegion, depth.wm

Examples

# Load required packages
require(TukeyRegion)
require(MASS)
# Generate data
set.seed(1)
X <- mvrnorm(100, rep(0, 3),
  matrix(c(1, 1, 1, 1, 2, 2, 1, 2, 4), nrow = 3))
# Compute the Tukey median
Tm <- TukeyMedian(X)
summary(Tm)
# Visualize the Tukey median
plot(Tm)

Computation of the Tukey Region

Description

Computes the Tukey depth trimmed region for a given depth level.

Usage

TukeyRegion(data, depth, method = "bfs",
  trgFacets = FALSE, checkInnerPoint = TRUE,
  retHalfspaces = TRUE, retHalfspacesNR = FALSE,
  retInnerPoint = FALSE, retVertices = FALSE,
  retFacets = FALSE, retVolume = FALSE, retBarycenter = FALSE,
  halfspaces = matrix(0), innerPoint = numeric(1),
  verbosity = 0L)

Arguments

data

data set for which the Tukey region shall be computed, a matrix having > 2 columns and more rows than columns.

depth

depth of the Tukey region, an integer between 1 and the half of the number of rows in data.

method

the method to use to compute the Tukey region, a string containing "bfs", or "cmb", or "bf", see ‘Details’, "bfs" by default.

trgFacets

whether to triangulate facets, logical, FALSE by default. If facets are triangulated, no facet ridges are plotted, see ‘Value’.

checkInnerPoint

whether to check correctness of the inner point in case it is provided, logical, TRUE by default.

retHalfspaces

whether to return all found halfspaces, logical, TRUE by default.

retHalfspacesNR

whether to return non-redundant halfspaces, logical, FALSE by default.

retInnerPoint

whether to return inner point, logical, FALSE by default.

retVertices

whether to return vertices, logical, FALSE by default.

retFacets

whether to return facets, logical, FALSE by default.

retVolume

whether to return volume, logical, FALSE by default.

retBarycenter

whether to return the region's barycenter, logical, FALSE by default.

halfspaces

halfspaces defining the Tukey region by their intersection, a matrix with number of columns equal to space dimension and where each row corresponds to a halfspace defined by three point numbers in data.

innerPoint

inner point, a vector of length equal to dimension.

verbosity

level of details to print during execution, integer, from 0 (= print nothing) to 2 (= print all details).

Details

The function computes the Tukey region (upper-level set of the Tukey depth function) for n points in the Euclidean d-variate space contained in data at the depth value depth.
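
To make the definition concrete, the following base-R sketch approximates the Tukey depth of a point, expressed as a count of observations, by minimizing over randomly drawn directions. It is not the algorithm used by the package: the function name approx_tukey_depth and the parameter n_dir are hypothetical, and a finite sample of directions only yields an upper bound on the exact depth. Under this count convention, a point belongs to the Tukey region of a given depth level when its depth is at least that level.

```r
# Hypothetical helper (not part of TukeyRegion): approximate Tukey depth of
# point z w.r.t. the rows of X, i.e. the smallest number of observations in
# a closed halfspace whose boundary passes through z.
approx_tukey_depth <- function(z, X, n_dir = 2000) {
  d <- ncol(X)
  # Random unit directions; a finite sample gives an upper bound only.
  U <- matrix(rnorm(n_dir * d), ncol = d)
  U <- U / sqrt(rowSums(U^2))
  # Signed projections of the data, centered at z, onto every direction
  # (an n x n_dir matrix).
  S <- sweep(X, 2, z) %*% t(U)
  # Count the points in each closed halfspace and keep the minimum.
  min(colSums(S >= 0))
}

set.seed(1)
X <- matrix(rnorm(300), ncol = 3)
approx_tukey_depth(colMeans(X), X)  # a central point has a large depth
approx_tukey_depth(rep(100, 3), X)  # a point far outside the cloud has depth 0
```

The exact depth requires examining the finitely many critical directions determined by the data, which is what the methods described below do far more efficiently.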

Three methods are implemented. Method "bfs" is the most efficient: it starts with an initial set of ridges and traverses all facets using the breadth-first search algorithm. Method "cmb" considers all subspaces spanned by combinations of d - 1 points, projects data onto their orthogonal complements (planes), and searches for bivariate quantiles in these planes. Method "bf" employs the brute-force strategy by checking all halfspaces defined by hyperplanes containing d points from data. If d = 2, method "bf" is used. See Liu, Mosler, and Mozharovskyi (2019) for details on the algorithms.
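
The cost of the exhaustive strategies can be gauged by counting the candidates they enumerate. The base-R sketch below does this for n = 500 and d = 3, and then illustrates the elementary operation behind the brute-force idea: taking the hyperplane through d data points and counting the observations on either side. The helper side_counts is hypothetical and is not the package's implementation; how these counts are turned into a decision about the depth level is described in Liu, Mosler, and Mozharovskyi (2019).

```r
# Number of candidates for n = 500 points in dimension d = 3.
n <- 500; d <- 3
choose(n, d)      # "bf": hyperplanes through d data points
choose(n, d - 1)  # "cmb": subspaces spanned by d - 1 data points

# Hypothetical helper (not part of TukeyRegion): for the hyperplane through
# the data points X[idx, ], count the observations strictly on either side
# and those lying on it (up to the tolerance tol).
side_counts <- function(X, idx, tol = 1e-9) {
  P <- X[idx, , drop = FALSE]
  # Directions spanning the hyperplane: differences to its first point.
  D <- sweep(P[-1, , drop = FALSE], 2, P[1, ])
  # Normal vector: the right singular vector orthogonal to all rows of D.
  a <- svd(D, nv = ncol(X))$v[, ncol(X)]
  s <- as.vector(sweep(X, 2, P[1, ]) %*% a)
  c(below = sum(s < -tol), on = sum(abs(s) <= tol), above = sum(s > tol))
}

X <- rbind(c(0, 0, 0), c(1, 0, 0), c(0, 1, 0), c(0, 0, 1), c(0, 0, -2))
side_counts(X, 1:3)  # the plane x3 = 0 has one point on each side
```

The sign of the normal vector is arbitrary, so the labels below and above may swap between runs on different data; only the pair of counts is meaningful.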

The function proceeds in three main steps. Step 1: Calculate all the halfspaces whose intersection defines the Tukey region. Many of them are usually redundant. Step 2: Find an inner point of the Tukey region, i.e. a point which lies simultaneously in all the previously calculated halfspaces. If no such point exists, the Tukey region does not exist for this depth level either; the algorithm then stops and returns FALSE in the field innerPointFound. If an inner point has been found, the algorithm proceeds to Step 3: Filter the halfspaces, leaving only those containing the facets of the Tukey region. Step 3 provides the information needed to compute vertices, facets, volume, and barycenter of the Tukey region.
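
The condition tested in Step 2 can be sketched in base R as follows. The representation is an assumption made for illustration only: each halfspace is written as a normal vector (a row of A) and an offset (an entry of b), whereas the package stores halfspaces through point numbers in data. The helper name is_inner_point is hypothetical.

```r
# Hypothetical representation: halfspace i is {x : sum(A[i, ] * x) <= b[i]}.
# A point is an inner point if it satisfies all inequalities simultaneously.
is_inner_point <- function(p, A, b, tol = 1e-12) {
  all(as.vector(A %*% p) <= b + tol)
}

# The unit cube in dimension three as an intersection of six halfspaces.
A <- rbind(diag(3), -diag(3))
b <- c(rep(1, 3), rep(0, 3))
is_inner_point(c(0.5, 0.5, 0.5), A, b)  # TRUE: the point lies in the cube
is_inner_point(c(1.5, 0.5, 0.5), A, b)  # FALSE: violates x1 <= 1
```

If no point passes this test for the halfspaces of a given depth level, their intersection is empty, which corresponds to innerPointFound being FALSE.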

halfspaces and/or innerPoint can be provided as function arguments.

The function tries to fulfill all the requirements indicated by the input flags. Step 1 is always performed unless the halfspaces are supplied through the argument halfspaces; leaving retHalfspaces unset only means that the halfspaces are not returned. If any further ret...-flag is set, Step 2 is performed, unless retHalfspacesNR is unset and the argument innerPoint is provided. If any of retVertices, retFacets, retVolume, retBarycenter is set, Step 3 is performed.
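
A minimal usage sketch of the flags, assuming the TukeyRegion package is installed: only the volume is requested, so the halfspaces are computed but not returned, and the field innerPointFound is checked before the result is used. The wrapper name region_volume is hypothetical; it returns NA when the package is unavailable or the region of the requested depth does not exist.

```r
# Hypothetical wrapper around TukeyRegion(); uses documented arguments only.
region_volume <- function(X, depth) {
  if (!requireNamespace("TukeyRegion", quietly = TRUE)) return(NA_real_)
  # retVolume requires Steps 2 and 3; the halfspaces themselves are not
  # needed in the output, so retHalfspaces is switched off.
  Tr <- TukeyRegion::TukeyRegion(X, depth, "bfs",
    retHalfspaces = FALSE, retVolume = TRUE)
  # Fields may be absent if no inner point exists, so check first.
  if (!isTRUE(Tr$innerPointFound) || is.null(Tr$volume)) return(NA_real_)
  as.numeric(Tr$volume)[1]
}

set.seed(1)
X <- matrix(rnorm(300), ncol = 3)
region_volume(X, 5)
```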

The region can be visualized in 2- and 3-dimensional space by plot(...); general information can be printed by print(...), and statistics can be summarized by summary(...).

Value

The function returns an object of class TukeyRegion with fields specified by ret...-flags in the arguments:

data

the input data set.

depth

chosen depth level.

halfspacesFound

whether at least one of the halfspaces determining the Tukey region has been found.

halfspaces

if requested, halfspaces defining the Tukey region by their intersection, a matrix with number of columns equal to space dimension and where each row corresponds to a halfspace defined by three point numbers in data.

innerPointFound

a logical indicating whether an inner point of the region has been found. If FALSE then the region of the given depth does not exist. If the field is absent then the inner point has not been requested by the input arguments.

innerPoint

coordinates of a point inside of the Tukey region. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments.

halfspacesNR

non-redundant halfspaces (i.e. those containing Tukey region's facets), a matrix with number of columns equal to space dimension and where each row corresponds to a halfspace defined by three point numbers in data. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments.

vertices

vertices of the Tukey region, a matrix with number of columns equal to space dimension and where each row represents vertex coordinates. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments. If field facets is returned, this field is returned as well.

triangulated

a logical repeating the trgFacets input argument. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments.

facets

facets of the Tukey region. If input argument trgFacets is unset, then this is a list where each element is an array enumerating numbers of the rows in field vertices; their number for each facet can vary. If input argument trgFacets is set, then this is a matrix with each row corresponding to a triangulated facet, and no reconstruction of the facets' ridges is performed, so the ridges cannot be visualized. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments.

volume

volume of the Tukey region. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments.

barycenter

the barycenter of the Tukey region. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments.

numRidges

number of used ridges.

Author(s)

Pavlo Mozharovskyi <[email protected]>

References

Liu, X., Mosler, K., and Mozharovskyi, P. (2019). Fast computation of Tukey trimmed regions and median in dimension p > 2. Journal of Computational and Graphical Statistics, 28, 682-697.

See Also

TukeyMedian

Examples

# Load required packages
require(TukeyRegion)
require(MASS)
# Generate data
set.seed(1)
X <- mvrnorm(500, rep(0, 3),
  matrix(c(1, 1, 1, 1, 2, 2, 1, 2, 4), nrow = 3))
# Compute the Tukey region
Tr <- TukeyRegion(X, 10, "bfs",
  retFacets = TRUE, retVolume = TRUE, retBarycenter = TRUE)
summary(Tr)
# Visualize the Tukey region
plot(Tr)

Computation of the Tukey Region

Description

Computes the Tukey depth trimmed regions for given depth levels.

Usage

TukeyRegions(data, depths, method = "bfs",
  trgFacets = FALSE, checkInnerPoint = TRUE,
  retHalfspaces = TRUE, retHalfspacesNR = FALSE,
  retInnerPoint = FALSE, retVertices = FALSE,
  retFacets = FALSE, retVolume = FALSE, retBarycenter = FALSE,
  verbosity = 0L)

Arguments

data

data set for which the Tukey region shall be computed, a matrix having > 2 columns and more rows than columns.

depths

depths of the Tukey regions to compute, a vector of integers between 1 and half the number of rows in data.

method

the method to use to compute the Tukey region, a string containing "bfs", or "cmb", or "bf", see ‘Details’, "bfs" by default.

trgFacets

whether to triangulate facets, logical, FALSE by default. In this case no facet ridges are plotted, see ‘Value’.

checkInnerPoint

whether to check correctness of the inner point in case it is provided, logical, TRUE by default.

retHalfspaces

whether to return all found halfspaces, logical, TRUE by default.

retHalfspacesNR

whether to return non-redundant halfspaces, logical, FALSE by default.

retInnerPoint

whether to return inner point, logical, FALSE by default.

retVertices

whether to return vertices, logical, FALSE by default.

retFacets

whether to return facets, logical, FALSE by default.

retVolume

whether to return volume, logical, FALSE by default.

retBarycenter

whether to return the region's barycenter, logical, FALSE by default.

verbosity

level of details to print during execution, integer, from 0 (= print nothing) to 2 (= print all details).

Details

The function computes the Tukey regions (upper-level sets of the Tukey depth function) for n points in the Euclidean d-variate space contained in data at the depth values specified in the argument depths. It calls the function TukeyRegion once for each depth value and provides computational convenience by moving the loop to the C++ level.
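
As a hedged usage sketch, assuming the TukeyRegion package is installed, the wrapper below collects the volumes of the regions returned by a single call to TukeyRegions. The name region_volumes is hypothetical; it returns NA values when the package is unavailable or a region does not exist. Because Tukey regions are upper-level sets of the depth function, they are nested, so the volumes should not increase with the depth level.

```r
# Hypothetical wrapper around TukeyRegions(); uses documented arguments only.
region_volumes <- function(X, depths) {
  if (!requireNamespace("TukeyRegion", quietly = TRUE)) {
    return(rep(NA_real_, length(depths)))
  }
  # One call computes all regions; the result is a list with one element
  # per requested depth, in the order given.
  Trs <- TukeyRegion::TukeyRegions(X, depths, "bfs", retVolume = TRUE)
  vapply(Trs, function(tr) {
    if (is.null(tr$volume)) NA_real_ else as.numeric(tr$volume)[1]
  }, numeric(1))
}

set.seed(1)
X <- matrix(rnorm(600), ncol = 3)
region_volumes(X, c(5, 25))
```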

The main difference from TukeyRegion is that the arguments halfspaces and innerPoint cannot be provided here.

Value

The function returns a list of objects of class TukeyRegion with fields specified by ret...-flags in the arguments:

data

the input data set.

depth

chosen depth level.

halfspacesFound

whether at least one of the halfspaces determining the Tukey region has been found.

halfspaces

if requested, halfspaces defining the Tukey region by their intersection, a matrix with number of columns equal to space dimension and where each row corresponds to a halfspace defined by three point numbers in data.

innerPointFound

a logical indicating whether an inner point of the region has been found. If FALSE then the region of the given depth does not exist. If the field is absent then the inner point has not been requested by the input arguments.

innerPoint

coordinates of a point inside of the Tukey region. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments.

halfspacesNR

non-redundant halfspaces (i.e. those containing Tukey region's facets), a matrix with number of columns equal to space dimension and where each row corresponds to a halfspace defined by three point numbers in data. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments.

vertices

vertices of the Tukey region, a matrix with number of columns equal to space dimension and where each row represents vertex coordinates. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments. If field facets is returned, this field is returned as well.

triangulated

a logical repeating the trgFacets input argument. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments.

facets

facets of the Tukey region. If input argument trgFacets is unset, then this is a list where each element is an array enumerating numbers of the rows in field vertices; their number for each facet can vary. If input argument trgFacets is set, then this is a matrix with each row corresponding to a triangulated facet, and no reconstruction of the facets' ridges is performed, so the ridges cannot be visualized. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments.

volume

volume of the Tukey region. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments.

barycenter

the barycenter of the Tukey region. If the field is absent then either no halfspaces or no inner point have been found or facet computation has not been requested by the input arguments.

numRidges

number of used ridges.

Author(s)

Pavlo Mozharovskyi <[email protected]>

References

Liu, X., Mosler, K., and Mozharovskyi, P. (2019). Fast computation of Tukey trimmed regions and median in dimension p > 2. Journal of Computational and Graphical Statistics, 28, 682-697.

See Also

TukeyRegion TukeyMedian

Examples

# Load required packages
require(TukeyRegion)
require(MASS)
# Generate data
set.seed(1)
X <- mvrnorm(500, rep(0, 3),
  matrix(c(1, 0.25, 0.25, 0.25, 1, 0.25, 0.25, 0.25, 1), nrow = 3))
# Compute the Tukey regions for depth levels 5 and 25
Trs <- TukeyRegions(X, c(5, 25), "bfs",
  retFacets = TRUE, retVolume = TRUE, retBarycenter = TRUE)
summary(Trs[[1]])
summary(Trs[[2]])
# Visualize both Tukey regions in one plot (inner region first)
plot(Trs[[2]], drawRidges = FALSE, colorFacets = "green", alpha = 1)
plot(Trs[[1]], drawRidges = FALSE, newPlot = FALSE, colorFacets = "blue",
  alpha = 0.5)