On Friday, October 26, at 10:00, there will be a lecture on "Compositional Data Analysis and representation of omics-abundances"


Vera Pawlowsky-Glahn - Emeritus Professor, University of Girona, Spain
Juan José Egozcue - Emeritus Professor, Technical University of Catalonia, Barcelona, Spain

Venue: Marble Hall.

Everyone is welcome!

What the lecture will cover:

Raw analysis of D-part Compositional Data (CoDa) is based on the
assumption that the sample space for such data is the D-dimensional real
space endowed with the standard Euclidean geometry. This assumption can
lead to spurious correlations and other nonsensical results when using
multivariate statistical methods. The present state of the art in CoDa
assumes the simplex to be a representation of the sample space and the
Aitchison geometry on the simplex as a way to overcome the difficulties.
These assumptions are based on the principles of scale invariance and
subcompositional coherence, whose justification and implications are
discussed, together with the basic operations (perturbation, powering,
Aitchison inner product) underlying the Aitchison geometry.
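
As an illustration of the operations named above, here is a minimal
Python sketch using NumPy. The function names (closure, perturb, power,
clr, aitchison_inner) are chosen for this example and are not taken from
the lecture. The sketch assumes strictly positive parts, so zero counts
would have to be treated before using it.

```python
import numpy as np

def closure(x):
    """Rescale a vector of positive parts so that it sums to 1."""
    x = np.asarray(x, dtype=float)
    return x / x.sum()

def perturb(x, y):
    """Perturbation: component-wise product followed by closure."""
    return closure(np.asarray(x, dtype=float) * np.asarray(y, dtype=float))

def power(x, alpha):
    """Powering: component-wise power followed by closure."""
    return closure(np.asarray(x, dtype=float) ** alpha)

def clr(x):
    """Centred log-ratio: log of the parts minus the mean log."""
    logx = np.log(np.asarray(x, dtype=float))
    return logx - logx.mean()

def aitchison_inner(x, y):
    """Aitchison inner product, computed as the dot product of clr vectors."""
    return float(np.dot(clr(x), clr(y)))

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 1.0, 1.0])

# Scale invariance: multiplying x by a constant does not change the result.
same = np.isclose(aitchison_inner(x, y), aitchison_inner(10.0 * x, y))
```

Because clr subtracts the mean log, any common scale factor cancels,
which is how this sketch reflects the scale-invariance principle.
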
This theoretical framework is applied to omics-abundances. Omics studies
generate data sets characterized by a large number of relative
abundances of species, OTUs, metabolites and the like. Most of
these data sets are affected by a large number of zero counts.
Nevertheless, there is broad consensus that these data should be treated
as compositional. The isometric log-ratio representation (ilr) of
compositions provides a way to analyse abundance data while avoiding the
misleading results obtained when abundances are considered as real
variables. The main available compositional tools are the compositional
singular value decomposition (csvd) and sequential binary partitions
(sbp). The csvd provides data-driven ilr-coordinates or simple-sparse
approximations by principal balances. The use of sbps makes it
possible to design ilr-coordinates (balances) adapted to the problem.
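
To make the idea of balances built from a sequential binary partition
more concrete, here is a minimal Python sketch using NumPy. It uses the
standard balance formula, in which a group of r parts is contrasted with
a group of s parts through sqrt(r*s/(r+s)) times the log-ratio of their
geometric means. The names sbp_contrast_matrix and ilr, the 4-part
partition, and the sample composition are assumptions made for this
example. Parts are assumed strictly positive.

```python
import numpy as np

def sbp_contrast_matrix(signs):
    """Build the contrast matrix from an SBP sign matrix.

    Each row of `signs` is one partition step: +1 marks parts in the
    numerator group, -1 parts in the denominator group, 0 parts not involved.
    """
    signs = np.asarray(signs, dtype=float)
    psi = np.zeros_like(signs)
    for i, row in enumerate(signs):
        r = np.sum(row > 0)  # number of parts in the +1 group
        s = np.sum(row < 0)  # number of parts in the -1 group
        norm = np.sqrt(r * s / (r + s))
        psi[i, row > 0] = norm / r
        psi[i, row < 0] = -norm / s
    return psi

def ilr(x, psi):
    """ilr coordinates (balances) of a composition x for contrast matrix psi."""
    return psi @ np.log(np.asarray(x, dtype=float))

# Hypothetical partition of 4 parts: {1,2} vs {3,4}, then 1 vs 2, then 3 vs 4.
SBP = [[+1, +1, -1, -1],
       [+1, -1,  0,  0],
       [ 0,  0, +1, -1]]
PSI = sbp_contrast_matrix(SBP)

x = np.array([0.1, 0.2, 0.3, 0.4])
balances = ilr(x, PSI)
```

Each row of the contrast matrix sums to zero, so the balances do not
change when x is rescaled, and the rows are orthonormal, so the balances
preserve Aitchison distances.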