Feature Descriptive Statistics#
FeatureDescriptiveStatistics #
approx_num_distinct_values property #
approx_num_distinct_values: int | None
Approximate number of distinct values.
distinctness property #
distinctness: float | None
Ratio of the number of distinct values of a feature to the total number of its values. A value counts as distinct if it occurs at least once.
Example
[a, a, b] contains two distinct values a and b, so distinctness is 2/3.
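The definition above can be sketched in a few lines of Python. This is only an illustration of the formula, not the library's implementation; the helper name `distinctness` and the choice to return `None` for empty input are assumptions made for this example.

```python
def distinctness(values):
    """Illustrative helper (hypothetical, not part of the library API).

    Returns the number of distinct values divided by the total number
    of values, or None when there are no values at all.
    """
    if not values:
        return None
    return len(set(values)) / len(values)


# [a, a, b] has two distinct values out of three.
print(distinctness(["a", "a", "b"]))  # approximately 0.667
```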
entropy property #
entropy: float | None
Entropy is a measure of the level of information contained in an event (feature value) when considering all possible events (all feature values).
Entropy is estimated using observed value counts as the negative sum of (value_count/total_count) * log(value_count/total_count).
Example
[a, b, b, c, c] has three distinct values with counts [1, 2, 2].
Entropy is then -(1/5)*log(1/5) - (2/5)*log(2/5) - (2/5)*log(2/5) ≈ 1.055, where log is the natural logarithm.
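The estimate can be reproduced with a short Python sketch. It is meant only to illustrate the formula, not to mirror how the library computes the statistic; the helper name `entropy` is hypothetical, and the natural logarithm is assumed because it matches the value 1.055 in the example.

```python
import math
from collections import Counter


def entropy(values):
    """Illustrative helper (hypothetical, not part of the library API).

    Estimates entropy from observed value counts as
    -sum(p * log(p)) with p = value_count / total_count,
    using the natural logarithm. Returns None for empty input.
    """
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return None
    return -sum((c / total) * math.log(c / total) for c in counts.values())


# [a, b, b, c, c] has counts [1, 2, 2].
print(round(entropy(["a", "b", "b", "c", "c"]), 3))  # approximately 1.055
```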
exact_num_distinct_values property #
exact_num_distinct_values: int | None
Exact number of distinct values.
extended_statistics property #
extended_statistics: dict | None
Additional statistics computed on the feature values such as histograms and correlations.