Scaled Torus Principal Component Analysis

publication date

  • October 2022

start page

  • 1

end page

  • 12

International Standard Serial Number (ISSN)

  • 1061-8600

Electronic International Standard Serial Number (EISSN)

  • 1537-2715

abstract

  • A particularly challenging context for dimensionality reduction is multivariate circular data, that is, data supported on a torus. This kind of data appears, for example, in the analysis of various phenomena in environmental sciences and astronomy, as well as in molecular structures. This article introduces Scaled Torus Principal Component Analysis (ST-PCA), a novel approach to perform dimensionality reduction with toroidal data. ST-PCA finds a data-driven map from a torus to a sphere of the same dimension and a certain radius. The map is constructed with multidimensional scaling to minimize the discrepancy between pairwise geodesic distances in both spaces. ST-PCA then resorts to principal nested spheres to obtain a nested sequence of subspheres that best fits the data, which can afterwards be inverted back to the torus. Numerical experiments illustrate how ST-PCA can be used to achieve meaningful dimensionality reduction on low-dimensional tori, particularly for the purpose of cluster separation, while two data applications in astronomy (on a three-dimensional torus) and molecular biology (on a seven-dimensional torus) show that ST-PCA outperforms existing methods for the investigated datasets. Supplementary materials for this article are available online.
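The distance-matching idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`torus_distance`, `sphere_distance`, `raw_stress`), the choice of the flat product metric on the torus, and the use of an unweighted sum of squared differences as the "discrepancy" are all assumptions made here for illustration only.

```python
import math


def torus_distance(a, b):
    """Geodesic distance between two points on a flat torus.

    Each point is a sequence of angles in radians. Assumption: the torus
    carries the product metric, so the distance is the Euclidean norm of
    the per-angle shortest arc lengths.
    """
    total = 0.0
    for x, y in zip(a, b):
        diff = abs(x - y) % (2 * math.pi)
        arc = min(diff, 2 * math.pi - diff)  # shortest arc, in [0, pi]
        total += arc ** 2
    return math.sqrt(total)


def sphere_distance(u, v, radius=1.0):
    """Great-circle distance between two nonzero vectors projected onto a
    sphere of the given radius."""
    dot = sum(p * q for p, q in zip(u, v))
    norm = math.sqrt(sum(p * p for p in u)) * math.sqrt(sum(q * q for q in v))
    cosine = max(-1.0, min(1.0, dot / norm))  # guard against rounding
    return radius * math.acos(cosine)


def raw_stress(torus_points, sphere_points, radius=1.0):
    """Hypothetical discrepancy: sum of squared differences between the
    pairwise torus and sphere geodesic distances over distinct pairs."""
    n = len(torus_points)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            gap = torus_distance(torus_points[i], torus_points[j]) - sphere_distance(
                sphere_points[i], sphere_points[j], radius
            )
            total += gap ** 2
    return total
```

A scaling procedure in this spirit would search over the sphere coordinates (and possibly the radius) to make `raw_stress` small; the subsequent principal nested spheres step is not sketched here.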

subjects

  • Statistics

keywords

  • dimension reduction; directional statistics; multidimensional scaling; principal component analysis; statistics on manifolds