Collin K. Berke, Ph.D.

Exploring the lifespans of historical figures born on a Leap Day

data wrangling
data visualization
tidytuesday
plotly
Tableau
A contribution to the 2024-02-27 #tidytuesday social data project
Author

Collin K. Berke, Ph.D.

Published

March 5, 2024

Photo by Nick Hillier
library(tidyverse)
library(skimr)
library(plotly)
library(here)

Background

Happy belated Leap Day! This week’s #tidytuesday focuses on significant historical events and on people who were born or died on a Leap Day. The aim of this post is to contribute a couple of data visualizations to this social data project. Specifically, I used plotly and Tableau to create my contributions.

data_births <- read_csv(
  here(
    "blog",
    "posts",
    "2024-02-27-tidytuesday-2024-02-27-leap-day",
    "births.csv"
  )
)

Let’s do a quick glimpse() and skim() of our data, just so we get an idea of what we’re working with here.

glimpse(data_births)
Rows: 121
Columns: 4
$ year_birth  <dbl> 1468, 1528, 1528, 1572, 1576, 1640, 1692, 1724, 1736, 1792, 1812, 1828, 1836, …
$ person      <chr> "Pope Paul III", "Albert V", "Domingo Báñez", "Edward Cecil", "Antonio Neri", …
$ description <chr> NA, "Duke of Bavaria", "Spanish theologian", "1st Viscount Wimbledon", "Floren…
$ year_death  <dbl> 1549, 1579, 1604, 1638, 1614, 1704, 1763, 1822, 1784, 1868, 1880, 1921, 1908, …
skim(data_births)
Data summary
Name                  data_births
Number of rows        121
Number of columns     4
_______________________
Column type frequency:
  character           2
  numeric             2
________________________
Group variables       None

Variable type: character

skim_variable  n_missing  complete_rate  min  max  empty  n_unique  whitespace
person                 0           1.00    6   29      0       121           0
description            1           0.99   12   95      0       107           0

Variable type: numeric

skim_variable  n_missing  complete_rate     mean      sd    p0   p25     p50   p75  p100  hist
year_birth             0           1.00  1919.90  101.01  1468  1920  1944.0  1976  2004  ▁▁▁▁▇
year_death            65           0.46  1933.61  126.53  1549  1920  1989.5  2013  2023  ▁▁▁▁▇

Data description

This week’s data comes from the February 29th Wikipedia page. Three data sets are available: one on significant events, one on births, and one on deaths of historical figures that occurred on a Leap Day. Given what’s available, I was interested in exploring the age and lifespan of the historical figures born on a Leap Day. Here’s the wrangling code I created to explore the data.

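data_age <- data_births |>
  mutate(
    # Flag people with no recorded death year as still alive
    is_alive = ifelse(is.na(year_death), 1, 0),
    # Use 2024 as the end year for people who are still alive
    year_death = ifelse(is.na(year_death), 2024, year_death),
    # Age at death, or approximate age in 2024 for the living
    age = year_death - year_birth
  ) |>
  arrange(desc(age)) |> 
  relocate(person, description, year_birth, year_death, age)

# Order the factor levels by birth year so the y-axis is sorted chronologically
data_age$person <- factor(data_age$person, levels = data_age$person[order(data_age$year_birth)])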
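To make the wrangling step concrete, here is a minimal sketch on a made-up three-row tibble. The names and years in `toy_births` are hypothetical and are not taken from the Leap Day data. The logic is the same as above, except that `coalesce()` stands in for the second `ifelse()` call as a more compact way to fill missing death years with 2024.

```r
library(dplyr)

# Hypothetical toy data, not taken from the Leap Day data set
toy_births <- tibble(
  person = c("Person A", "Person B", "Person C"),
  year_birth = c(1960, 1900, 1988),
  year_death = c(NA, 1972, NA)
)

toy_age <- toy_births |>
  mutate(
    # 1 = no recorded death year, 0 = death year recorded
    is_alive = if_else(is.na(year_death), 1, 0),
    # Living people get 2024 as a stand-in end year
    year_death = coalesce(year_death, 2024),
    age = year_death - year_birth
  ) |>
  arrange(desc(age))

# Order factor levels by birth year (earliest first)
toy_age$person <- factor(
  toy_age$person,
  levels = toy_age$person[order(toy_age$year_birth)]
)

toy_age
levels(toy_age$person)
```

The factor levels, not the row order produced by `arrange()`, decide how plotly lays out a discrete y-axis, which is why the levels are set explicitly after sorting.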

What are the lifespans of historical figures born on a Leap Day?

To explore this question, I created a dumbbell chart. Blue dots mark each person’s birth year, and black dots mark the year a person died. The grey line between them represents the person’s lifespan. A missing black dot indicates the person is still alive; for these people, the grey line runs to 2024, the stand-in end year set in the wrangling code above. If you hover over a dot, a tooltip shows the person’s name, age, and description.

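# People with a recorded death year; only they get a black marker
not_alive <- data_age |> filter(is_alive == 0)

plot_ly(
  data_age, 
  color = I("gray80"),
  # Tooltip text: name, age, and description
  text = ~paste(
    person, "<br>",
    "Age: ", age, "<br>",
    description 
  ),
  hoverinfo = "text"
) |>
  # Grey line from birth year to death year (or 2024 if still alive)
  add_segments(x = ~year_birth, xend = ~year_death, y = ~person, yend = ~person, showlegend = FALSE) |>
  # Blue marker at the birth year
  add_markers(x = ~year_birth, y = ~person, color = I("#0000FF"), name = "Birth year") |>
  # Black marker at the death year, drawn only for people who have died
  add_markers(data = not_alive, x = ~year_death, y = ~person, color = I("black"), name = "Year passed") |>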
  layout(
    title = list(
      text = "<b>Lifespans of historical figures born on a Leap Day</b>",
      xanchor = "center",
      yanchor = "top",
      font = list(family = "arial", size = 24)
    ),
    xaxis = list(
      title = "Year born | Year died"
    ),
    yaxis = list(
      title = ""
    )
  )
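The layering in the chart is easier to see on a tiny example. The sketch below uses a hypothetical two-person data frame (`toy`, not from the Leap Day data) and strips the plot down to its three layers: one segment per person, one birth marker per person, and a death marker only for people who have died.

```r
library(plotly)

# Hypothetical toy data; 2024 stands in as the end year for the living person
toy <- data.frame(
  person = c("Person A", "Person B"),
  year_birth = c(1900, 1960),
  year_death = c(1972, 2024),
  is_alive = c(0, 1)
)

# Only people who have died get a death marker
toy_deceased <- subset(toy, is_alive == 0)

p <- plot_ly(toy, color = I("gray80")) |>
  # Layer 1: the lifespan line for everyone
  add_segments(
    x = ~year_birth, xend = ~year_death,
    y = ~person, yend = ~person,
    showlegend = FALSE
  ) |>
  # Layer 2: a birth marker for everyone
  add_markers(
    x = ~year_birth, y = ~person,
    color = I("blue"), name = "Birth year"
  ) |>
  # Layer 3: a death marker for the deceased subset only
  add_markers(
    data = toy_deceased,
    x = ~year_death, y = ~person,
    color = I("black"), name = "Year died"
  )

p
```

Because the third layer is given its own `data` argument, the living person keeps a line and a birth marker but no black dot, which is how the full chart signals that someone is still alive.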

An attempt using Tableau

I also created a version of this visualization using Tableau. You can view my attempt here. I had to make a few concessions with this attempt, as I was unable to control the plot elements as finely as I would have liked. However, I’m happy with how it turned out.

Reuse

CC BY 4.0

Citation

BibTeX citation:
@misc{berke2024,
  author = {Berke, Collin K},
  title = {Exploring the Lifespans of Historical Figures Born on a
    {Leap} {Day}},
  date = {2024-03-05},
  langid = {en}
}
For attribution, please cite this work as:
Berke, Collin K. 2024. “Exploring the Lifespans of Historical Figures Born on a Leap Day.” March 5, 2024.