Categorical Data

karthik

Let’s build on the data from the previous lesson. This time, we add one more column: ‘Brand name’. If I survey the same sample and tabulate the results, this is how it might look.

Name      No. of crayons   Brand name
Michael   20               Camlin
Pam       40               Camlin
Dwight    0                None
Jim       8                Faber Castell
You       12               Camlin
Stanley   10               Faber Castell
Kevin     20               Camlin
Angela    1                Faber Castell
Oscar     2                Camlin
Kelly     30               Faber Castell

Here, the ‘Brand name’ column gives us categorical data about brand usage. By headcount, 50% of the sample use Camlin, 40% use Faber Castell, and the remaining 10% (Dwight) use no brand. By crayon count, 94 crayons are Camlin and 49 are Faber Castell. Beyond these insights, the new column also lets us redo the summarization from the previous lesson. In other words, we can now group the sample by brand, combining the non-numerical data with the numerical data.
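To make the grouping concrete, here is a minimal Python sketch that reproduces the figures above. The variable names (`survey`, `people_per_brand`, `crayons_per_brand`) are just illustrative choices, and the rows are copied by hand from the table; this is one way to do the tally, not the only one.

```python
from collections import Counter, defaultdict

# Survey rows copied from the table: (name, number of crayons, brand name).
survey = [
    ("Michael", 20, "Camlin"),
    ("Pam", 40, "Camlin"),
    ("Dwight", 0, "None"),
    ("Jim", 8, "Faber Castell"),
    ("You", 12, "Camlin"),
    ("Stanley", 10, "Faber Castell"),
    ("Kevin", 20, "Camlin"),
    ("Angela", 1, "Faber Castell"),
    ("Oscar", 2, "Camlin"),
    ("Kelly", 30, "Faber Castell"),
]

# Count how many people fall into each brand category.
people_per_brand = Counter(brand for _, _, brand in survey)

# Add up the crayon counts within each brand category.
crayons_per_brand = defaultdict(int)
for _, crayons, brand in survey:
    crayons_per_brand[brand] += crayons

# Report each brand's share of the sample and its crayon total.
for brand, people in people_per_brand.items():
    share = 100 * people / len(survey)
    print(f"{brand}: {share:.0f}% of sample, {crayons_per_brand[brand]} crayons")
```

Running this should report Camlin at 50% of the sample with 94 crayons, Faber Castell at 40% with 49 crayons, and ‘None’ at 10% with 0 crayons, matching the numbers worked out by hand.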

In the upcoming Mindspace crash courses, we’ll look at how we can find relationships between these data types and test different hypotheses with practical examples.