In general, conventional analysis for high-dimensional chemical data may be divided into two distinct families: clustering and classification. Clustering strategies are seeking to describe a data set in groups of related objectives. Clusters are defined via determining how individual measurements relate to each other using some kind of optimized distance criterion (e. g. , nearest neighbor, minimizing cluster variance). classification methods, alternatively, try to predict data about an unknown sample based on previously obtained data. each method is typically optimized for one task and not for the other. Analytical methods can be supervised, in which case the evaluation algorithm consists of information about known individual samples and is forced to treat them in the same way, or unsupervised, wherein all cases are evaluated equally, with out external information.
In general
, conventional analysis for high-dimensional chemical
data
may
be divided
into two distinct families: clustering and classification. Clustering strategies are seeking to
describe
a
data
set in groups of related objectives. Clusters
are defined
via determining how individual measurements relate to each other using
some
kind of optimized distance criterion (
e. g.
,
nearest
neighbor, minimizing cluster variance).
classification
methods,
alternatively
, try to predict
data
about an unknown sample based on previously obtained
data
.
each
method is
typically
optimized for one task and not for the other. Analytical methods can
be supervised
, in which case the evaluation algorithm consists of information about known individual samples and
is forced
to treat them
in the same way
, or unsupervised, wherein all cases
are evaluated
equally
,
with out
external information.