This topic introduces the CorrelationCalculator, which is part of the Infragistics Math Calculators™ library, and explains, with code examples, how to use it to calculate the correlation between two variables in a data set.
The topic is organized as follows:
Assembly Requirements
Data Requirements
In statistics, data correlation is often referred to as Pearson’s product-moment correlation coefficient (PPMCC). Correlation measures the linear association between two variables in a data set and indicates the degree of that linear relationship. The correlation coefficient is calculated using the CorrelationCalculator class.
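The correlation coefficient is derived by dividing the covariance of the two variables by the product of their standard deviations:
$$ r_{XY} = \frac{\operatorname{cov}(X, Y)}{\sigma_X \, \sigma_Y} $$
where $\operatorname{cov}(X, Y)$ is the covariance of the two variables and $\sigma_X$ and $\sigma_Y$ are their standard deviations.
Figure 1 – Formula for Data Correlation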
The correlation coefficient ranges from −1 to +1. The closer the coefficient is to either −1 or +1, the stronger the linear relationship between the variables in a data set. In a perfect increasing (positive) linear relationship, the correlation is +1. In a perfect decreasing (negative) linear relationship, it is −1. In all other cases it is a value between −1 and +1. As the correlation approaches zero, the linear relationship between the variables weakens.
Table 1 – Types of Data Correlation
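To make the definition and the −1 to +1 range concrete, the following is a minimal, library-independent sketch in C# of the covariance-over-standard-deviations calculation described above. It is not the CorrelationCalculator implementation; the class name PearsonSketch and the method name Correlation are hypothetical and used only for illustration.

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration only; not part of the Infragistics library.
public static class PearsonSketch
{
    // Returns cov(x, y) / (stdDev(x) * stdDev(y)) for two equally sized samples.
    public static double Correlation(double[] x, double[] y)
    {
        if (x == null || y == null)
            throw new ArgumentNullException(x == null ? nameof(x) : nameof(y));
        if (x.Length != y.Length || x.Length < 2)
            throw new ArgumentException("Both arrays need the same length and at least two values.");

        double meanX = x.Average();
        double meanY = y.Average();

        double sumXY = 0.0; // sum of products of deviations (covariance numerator)
        double sumXX = 0.0; // sum of squared deviations of x
        double sumYY = 0.0; // sum of squared deviations of y

        for (int i = 0; i < x.Length; i++)
        {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sumXY += dx * dy;
            sumXX += dx * dx;
            sumYY += dy * dy;
        }

        // The 1/n factors of the covariance and of the two standard deviations cancel out.
        // If either variable is constant, the denominator is zero and the result is NaN.
        return sumXY / Math.Sqrt(sumXX * sumYY);
    }
}
```

With this sketch, two series that rise together in a straight line, such as {1, 2, 3, 4} and {2, 4, 6, 8}, should give a value of approximately +1, and reversing the second series should give approximately −1.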
This section provides a list of properties of the CorrelationCalculator object.
To use the CorrelationCalculator, the following NuGet package reference must be added to a WPF project:
Infragistics.WPF.Math.Calculators
For more information on setting up the NuGet feed and adding NuGet packages, you can take a look at the following documentation: NuGet Feeds.
The CorrelationCalculator uses the ItemsSource property for data binding and the XMemberPath and YMemberPath properties for data mapping. Any object that meets the following requirements can be bound to the ItemsSource property:
The data model must implement the IEnumerable interface (e.g. List, Collection, Queue, Stack).
The data model must contain items that have at least two numeric data columns for calculating the correlation between them.
An example of an object that meets the above criteria is the CorrelationDataSample, which you can download from the Correlation Data Sample resource and use in your project.
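As a rough illustration of the two requirements above, the following C# sketch shows one possible shape for a compatible data model. It is not the actual CorrelationDataSample; the type names SamplePoint and SamplePointCollection, the property names X and Y, and the values are all made up for this example.

```csharp
using System.Collections.Generic;

// Hypothetical item type: it exposes two numeric properties that could be
// mapped through XMemberPath ("X") and YMemberPath ("Y").
public class SamplePoint
{
    public double X { get; set; }
    public double Y { get; set; }
}

// Hypothetical collection type: List<T> implements IEnumerable, which covers
// the first requirement. The parameterless constructor allows the type to be
// declared as a XAML resource.
public class SamplePointCollection : List<SamplePoint>
{
    public SamplePointCollection()
    {
        // Made-up values, for illustration only.
        Add(new SamplePoint { X = 1.0, Y = 2.1 });
        Add(new SamplePoint { X = 2.0, Y = 3.9 });
        Add(new SamplePoint { X = 3.0, Y = 6.2 });
        Add(new SamplePoint { X = 4.0, Y = 7.8 });
    }
}
```

Under these assumptions, an instance of SamplePointCollection would be assigned to ItemsSource, with XMemberPath set to "X" and YMemberPath set to "Y".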
This example demonstrates how to calculate the correlation between two variables in a set of data using the CorrelationCalculator. The CorrelationCalculator is a non-visual element and should be defined in the resources section at the application, page, or control level, the same way you would define a data source. Refer also to the Series Data Correlation topic for an example of how to integrate the CorrelationCalculator with the xamDataChart™ control.
In XAML:
```xml
xmlns:ig="http://schemas.infragistics.com/xaml"
xmlns:local="clr-namespace:Infragistics.Samples.Data.Models.Series"
```
In XAML:
```xml
<local:CorrelationDataSample x:Key="Data"/>
<ig:CorrelationCalculator x:Key="CorrelationCalc"
                          XMemberPath="X"
                          YMemberPath="Y"
                          ItemsSource="{StaticResource Data}">
</ig:CorrelationCalculator>
```
In Visual Basic:
```vb
Imports Infragistics.Samples.Data.Models.Series
Imports Infragistics.Math.Calculators
'...
' Create the sample data and bind it to the calculator
Dim data As New CorrelationDataSample()
Dim correlationCalc As New CorrelationCalculator()
correlationCalc.ItemsSource = data
' Map the two numeric data columns
correlationCalc.XMemberPath = "X"
correlationCalc.YMemberPath = "Y"
' Read the calculated correlation coefficient
Dim correlation As Double = correlationCalc.Value
```
In C#:
```csharp
using Infragistics.Samples.Data.Models.Series;
using Infragistics.Math.Calculators;
//...
// Create the sample data and bind it to the calculator
CorrelationDataSample data = new CorrelationDataSample();
CorrelationCalculator correlationCalc = new CorrelationCalculator();
correlationCalc.ItemsSource = data;
// Map the two numeric data columns
correlationCalc.XMemberPath = "X";
correlationCalc.YMemberPath = "Y";
// Read the calculated correlation coefficient
double correlation = correlationCalc.Value;
```