Facing data with Chernoff faces

Introduction

This blog post introduces the Python package “ChernoffFace” and demonstrates its function chernoff_face, which generates Chernoff diagrams.

The design, implementation strategy, and unit tests closely follow those of the Wolfram Function Repository (WFR) function ChernoffFace, [AAf1], and of the original Mathematica package “ChernoffFaces.m”, [AAp1].


Installation

To install from GitHub use the shell command:

python -m pip install git+https://github.com/antononcube/Python-packages.git#egg=ChernoffFace\&subdirectory=ChernoffFace

To install from PyPI:

python -m pip install ChernoffFace


Usage examples

Setup

from ChernoffFace import *
import numpy
import matplotlib.cm

Random data

# Generate data
numpy.random.seed(32)
data = numpy.random.rand(16, 12)
# Make Chernoff faces
fig = chernoff_face(data=data,
                    titles=[str(x) for x in range(len(data))],
                    color_mapper=matplotlib.cm.Pastel1)
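
The idea behind a Chernoff diagram is that each row of the data becomes one face and each column controls one facial feature. The following is a toy sketch of that mapping, not the implementation used in ChernoffFace. The function toy_face, its three features (face width, eye size, mouth curvature), and the scaling constants are all assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.patches import Ellipse


def toy_face(ax, values):
    """Draw a very simplified face from three values in [0, 1].

    values[0] -> face width, values[1] -> eye size, values[2] -> mouth curvature.
    (Hypothetical mapping; the real package uses more features.)
    """
    face_width = 1.2 + 0.8 * values[0]
    eye_size = 0.1 + 0.2 * values[1]
    curvature = 2.0 * values[2] - 1.0  # -1 gives a frown, +1 gives a smile

    # Face outline and two eyes
    ax.add_patch(Ellipse((0, 0), face_width, 2.0, fill=False))
    for x in (-0.3, 0.3):
        ax.add_patch(Ellipse((x, 0.3), eye_size, eye_size, color="black"))

    # Mouth: a parabola whose bend is set by the curvature value
    xs = np.linspace(-0.35, 0.35, 50)
    ax.plot(xs, -0.5 + 0.8 * curvature * xs**2, color="black")

    ax.set_xlim(-1.1, 1.1)
    ax.set_ylim(-1.1, 1.1)
    ax.set_aspect("equal")
    ax.axis("off")


# Four records with three variables each, all in [0, 1]
rng = np.random.default_rng(32)
toy_data = rng.random((4, 3))

toy_fig, axes = plt.subplots(1, 4, figsize=(8, 2))
for i, (ax, row) in enumerate(zip(axes, toy_data)):
    toy_face(ax, row)
    ax.set_title(str(i))
```

Because every feature is driven by a value in the unit interval, data on other scales should be rescaled before plotting.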

Employee attitude data

Get the employee attitude data:

dfData = load_employee_attitude_data_frame()
dfData.head()
   Rating  Complaints  Privileges  Learning  Raises  Critical  Advancement
0      43          51          30        39      61        92           45
1      63          64          51        54      63        73           47
2      71          70          68        69      76        86           48
3      61          63          45        47      54        84           35
4      81          78          56        66      71        83           47

Rescale the variables:

dfData2 = variables_rescale(dfData)
dfData2.head()
     Rating  Complaints  Privileges  Learning    Raises  Critical  Advancement
0  0.066667    0.264151    0.000000  0.121951  0.400000  1.000000     0.425532
1  0.511111    0.509434    0.396226  0.487805  0.444444  0.558140     0.468085
2  0.688889    0.622642    0.716981  0.853659  0.733333  0.860465     0.489362
3  0.466667    0.490566    0.283019  0.317073  0.244444  0.813953     0.212766
4  0.911111    0.773585    0.490566  0.780488  0.622222  0.790698     0.468085
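
The rescaled values appear consistent with column-wise min-max rescaling, in which each value x in a column becomes (x - min) / (max - min). The sketch below shows that technique with pandas. It is an assumption about what variables_rescale computes, not its actual source. The helper name min_max_rescale and the toy data frame are hypothetical.

```python
import pandas as pd


def min_max_rescale(df: pd.DataFrame) -> pd.DataFrame:
    """Rescale each numeric column to [0, 1]; leave other columns untouched.

    Hypothetical helper illustrating min-max rescaling; it is not part of
    the ChernoffFace package.
    """
    result = df.copy()
    numeric_cols = result.select_dtypes(include="number").columns
    for col in numeric_cols:
        lo, hi = result[col].min(), result[col].max()
        # A constant column has no spread; map it to 0.0 to avoid dividing by zero.
        result[col] = 0.0 if hi == lo else (result[col] - lo) / (hi - lo)
    return result


# Toy data with one label column and two numeric columns
toy = pd.DataFrame({"Name": ["a", "b", "c"],
                    "x": [10, 20, 30],
                    "y": [5.0, 5.0, 5.0]})
rescaled = min_max_rescale(toy)
```

Non-numeric columns, such as a column of names, pass through unchanged, so they remain available as labels.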

Make the corresponding Chernoff faces:

fig = chernoff_face(data=dfData2,
                    n_columns=5,
                    long_face=False,
                    color_mapper=matplotlib.cm.tab20b,
                    figsize=(8, 8), dpi=200)

USA arrests data

Get USA arrests data:

dfData = load_usa_arrests_data_frame()
dfData.head()
    StateName  Murder  Assault  UrbanPopulation  Rape
0     Alabama    13.2      236               58  21.2
1      Alaska    10.0      263               48  44.5
2     Arizona     8.1      294               80  31.0
3    Arkansas     8.8      190               50  19.5
4  California     9.0      276               91  40.6

Rescale the variables:

dfData2 = variables_rescale(dfData)
dfData2.head()
    StateName    Murder   Assault  UrbanPopulation      Rape
0     Alabama  0.746988  0.654110         0.440678  0.359173
1      Alaska  0.554217  0.746575         0.271186  0.961240
2     Arizona  0.439759  0.852740         0.813559  0.612403
3    Arkansas  0.481928  0.496575         0.305085  0.315245
4  California  0.493976  0.791096         1.000000  0.860465

Make the corresponding Chernoff faces using USA state names as titles:

fig = chernoff_face(data=dfData2,
                    n_columns=5,
                    long_face=False,
                    color_mapper=matplotlib.cm.tab20c_r,
                    figsize=(12, 12), dpi=200)

References

Articles

[AA1] Anton Antonov, “Making Chernoff faces for data visualization”, (2016), MathematicaForPrediction at WordPress.

Functions and packages

[AAf1] Anton Antonov, ChernoffFace, (2019), Wolfram Function Repository.

[AAp1] Anton Antonov, Chernoff faces implementation in Mathematica, (2016), MathematicaForPrediction at GitHub.
