
Gotta Graph 'Em All!

Figure 1: Scatter plot of Health vs. Pokemon Rank. The Pokemon’s evolution stage is shown by the marker style and color. The Pokemon’s speed determines the size of the marker: faster Pokemon have larger markers. Hover your mouse over a point to see the corresponding Pokemon.

 

Things I do on a Friday night

Friday 5:00PM: I have nothing better to do and my new blog needs content, so I guess I will revisit a Pokemon dataset I got hold of last year. Actually, it was with this dataset that I wrote my first blog post about data visualization. I spent like 3 days collecting the data, processing it, making graphs, and writing the whole damn thing. It’s been over a year and the Pokemon post I wrote on Medium has barely gained any views. However, I’ve learned so many tricks since then that I figured, why not create a cool graph with this data?

 

Two hours later: Well, I could not find the original Pokemon dataset I used before, so I had to find three different datasets and compile them into one. One dataset contained all the Pokemon’s stats but was missing the rank and evolutionary stage, so I had to dig around the web to find the missing information.

 

Friday 9:00PM: After I compiled all the information, I took a break. Now I am realizing that seaborn and mpld3 don’t get along very well. The whole point of this article is to create an interactive scatter plot that shows a Pokemon image as you hover over the points. I want to avoid using matplotlib on its own because it would take too much work to make a nice-looking plot where I can control the color, marker size, and marker type based on a Pokemon’s stats. There has to be a way to do it.

 

Friday 11:00PM: I have been digging deeper and deeper into the net to figure out how to make this work. I have found multiple examples showing that mpld3 and matplotlib are compatible and offer the feature I want when used together. As a matter of fact, you can use matplotlib alone to create a figure that shows an image as you hover over a point. That’s great, but since I want to embed the scatter plot on my webpage, I need to save the matplotlib figure as an HTML file. It turns out that when you save a matplotlib figure as HTML, it loses some of its functionality. Because of that, I looked into plotly, the obvious choice for making interactive figures; however, plotly does not natively allow you to have images pop up as you hover over a point. This is a feature that has been requested since 2016 but remains to be added.

 

Friday 11:55PM: Well, that took much longer than I expected, but here we go. Go hover your mouse over a point in the figure above to see a cool trick. To make this graph, I had to do a bunch of maneuvers. The main packages that I used are pandas, seaborn, matplotlib, and mpld3. Getting seaborn and mpld3 to work together correctly was a bit of a challenge. This is the line that did the trick for me:

`plugins.connect(fig, mpld3.plugins.PointHTMLTooltip(ax.get_children()[0], labels))`
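
The `labels` argument in that line is nothing more than a list of HTML strings, one per point, in the same order as the plotted rows. Here is a minimal sketch of how such labels could be built. It assumes a hypothetical helper, `make_tooltip_label`, and a placeholder base URL (`https://example.com/sprites`); neither is part of mpld3 or of the script below.

```python
from html import escape

# Placeholder URL, used only for illustration (an assumption, not a real host).
SPRITE_BASE_URL = "https://example.com/sprites"


def make_tooltip_label(name, sprite_id, base_url=SPRITE_BASE_URL):
    """Return an HTML snippet with a sprite image and a caption.

    Hypothetical helper: the name is escaped so that quotes or
    apostrophes cannot break the surrounding HTML attribute.
    """
    safe_name = escape(name, quote=True)
    return (
        f'<figure style="margin:0;text-align:center">'
        f'<img src="{base_url}/{sprite_id}.png" alt="{safe_name}">'
        f"<figcaption>{safe_name}</figcaption>"
        f"</figure>"
    )


# One label per plotted point, in the same order as the rows being plotted.
names = ["Bulbasaur", "Farfetch'd"]
sprite_ids = [1, 83]
labels = [make_tooltip_label(n, i) for n, i in zip(names, sprite_ids)]
print(labels[0])
```

Escaping the name keeps characters like the apostrophe in Farfetch'd from leaking into the markup, and using the name as the `alt` text is a bit more descriptive than a fixed string.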

 

I will just paste the Python source code here and you can figure out the rest.

###############################################################################
#                          1. Importing Libraries                             #
###############################################################################
import pandas as pd
import numpy as np
import seaborn as sns
from matplotlib import pyplot as plt
from mpld3 import fig_to_html, plugins
import mpld3

import io
import requests


###############################################################################
#                             2. Helper Functions                             #
###############################################################################
def reorder_data(data, rank, stage):
    """Sorts the data df based on the rank df"""
    ordered_pokemon = list(rank['Pokemon'].values)
    unordered_pokemon = list(data['name'].values)

    # Find order
    ordered_indices = [unordered_pokemon.index(pokemon) for pokemon in ordered_pokemon]

    # Reorder data
    df = data.reindex(ordered_indices)
    df['rank'] = rank['Rank'].values


    # Get stages
    name = [name.strip() for name in stage['name'].values]
    stage['name'] = name
    keep = [line in ordered_pokemon for line in stage['name']]
    stage = stage[keep]
    stage = stage.drop_duplicates()
    stage = stage.reset_index(drop = True)
    unordered_pokemon_stage = list(stage['name'].values)

    # Get stages sorted
    # Find order
    ordered_indices = [unordered_pokemon_stage.index(pokemon) for pokemon in ordered_pokemon]

    # Reset index
    stage = stage.reindex(ordered_indices)

    # Add stage
    df['stage'] = stage['stage'].values
    return df

def get_labels(pokemons, data):
    "connects labels to images"
    temp_data = pokemons[['name', 'rank']]
    ordered_pokemon = list(temp_data['name'].values)
    unordered_pokemon = list(data['name'].values)

    # Find order
    ordered_indices = [unordered_pokemon.index(pokemon)+1 for pokemon in ordered_pokemon]

    # Create empty list
    labels = []
    # Create tags
    for ii in ordered_indices:
        raw_path = f"https://raw.githubusercontent.com/frank-ceballos/frank-blog/master/content/posts/04PokemonGraph/images/main_sprites/{ii}.png"
        temp_tag = f'<img src="{raw_path}" alt="image name">'
        labels.append(temp_tag)

    return labels

###############################################################################
#                             3. Create Dataset                               #
###############################################################################
# Define urls
url_data = "https://raw.githubusercontent.com/frank-ceballos/frank-blog/master/content/posts/04PokemonGraph/data/pokemon_data.csv"
url_rank = "https://raw.githubusercontent.com/frank-ceballos/frank-blog/master/content/posts/04PokemonGraph/data/pokemon_rank.csv"
url_stage = "https://raw.githubusercontent.com/frank-ceballos/frank-blog/master/content/posts/04PokemonGraph/data/stages.csv"

# Get pokemon data
s=requests.get(url_data).content
data = pd.read_csv(io.StringIO(s.decode('utf-8')))

# Get rank data
s=requests.get(url_rank).content
rank = pd.read_csv(io.StringIO(s.decode('utf-8')))

# Get stage data
s=requests.get(url_stage).content
stage = pd.read_csv(io.StringIO(s.decode('utf-8')))

# Reorder data based on rank
pokemons = reorder_data(data, rank, stage)

# Process data
pokemons = pokemons.loc[pokemons['generation'] == 1]

# Drop features
features_to_remove = ['japanese_name', 'percentage_male', 'height_m', 'weight_kg', 'classfication', 'abilities', 'type2']
pokemons = pokemons.drop(features_to_remove, axis = 1)


###############################################################################
#                               4. Create Graph                               #
###############################################################################
# Set seaborn environment and font size
sns.set(font_scale = 1.5)
sns.set_style({"axes.facecolor": "0.95", "axes.edgecolor": "1", "grid.color": "1",
               "grid.linestyle": "-", 'axes.labelcolor': '0', "xtick.color": "1",
               'ytick.color': '1', 'axes.spines.left': True,
 'axes.spines.bottom': True,
 'axes.spines.right': True,
 'axes.spines.top': True})

# Define color palette
color_palette = ['#e53d00', '#00cc66', '#ffb400']

# Create figure
fig, ax = plt.subplots(figsize=(12,9))

# Create scatterplot
sns.scatterplot(x = 'rank', y = 'hp' , hue = 'stage', style = 'stage',
                     label = None, size = 'speed', sizes=(50, 400),
                     palette = color_palette, legend = False, data = pokemons,
                     ax = ax)

# Change axis labels
plt.xlabel('Pokemon Rank')
plt.ylabel('Health (HP)')

# Change the x-axis range
plt.xlim(-1, 155)

# Manually create a legend since mpld3 can't render the sns.scatterplot legend
ax.plot([], [], "o", color=color_palette[0] , label="Basic")
ax.plot([], [], "x", color=color_palette[2] , label="Stage 1")
ax.plot([], [], "o", color=color_palette[1] , label="Stage 2")
ax.legend(title="Evolutionary Stage", loc="best", framealpha=1, fontsize = 'medium',
          markerscale = 2, facecolor = 'white')

# Tight layout
plt.tight_layout()

# Create labels for points
labels = get_labels(pokemons, data)

# Connect sns and mpld3
plugins.connect(fig, mpld3.plugins.PointHTMLTooltip(ax.get_children()[0], labels))

# Save figure
file_name = 'pokemon_graph.html'
mpld3.save_html(fig, file_name)
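
A side note on `reorder_data`: the `list.index` lookups do the job, but the same join can likely be written with `DataFrame.merge`. The sketch below assumes toy frames that reuse the column names from the three CSV files (`name`, `Pokemon`, `Rank`, `stage`). The rank and stage values are placeholders for illustration, not the real data, and the sketch is untested against the full dataset.

```python
import pandas as pd

# Toy frames that mimic the column names used above. The rank and stage
# values are placeholders for illustration, not the real dataset.
data = pd.DataFrame({"name": ["Bulbasaur", "Charmander", "Squirtle"],
                     "hp": [45, 39, 44]})
rank = pd.DataFrame({"Pokemon": ["Squirtle", "Bulbasaur", "Charmander"],
                     "Rank": [1, 2, 3]})
stage = pd.DataFrame({"name": [" Bulbasaur", "Charmander ", "Squirtle", "Squirtle"],
                      "stage": ["Basic", "Basic", "Basic", "Basic"]})

# Strip stray whitespace from the names (as reorder_data does), then
# drop the rows that become exact duplicates.
stage = stage.assign(name=stage["name"].str.strip()).drop_duplicates()

# Left joins keep the row order of the rank table, so the result is
# already sorted by rank. validate raises if a name is not unique.
merged = (
    rank.rename(columns={"Pokemon": "name", "Rank": "rank"})
        .merge(data, on="name", how="left", validate="one_to_one")
        .merge(stage, on="name", how="left", validate="one_to_one")
)
print(merged)
```

Unlike `reindex`, a merge resets the row index, and `validate="one_to_one"` raises an error if a name appears twice, so treat this as a sketch of the idea rather than a drop-in replacement.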

Until next time, take care, and code every day!
