How to make a sunburst plot in R or Python?

3 min read 08-10-2024
How to make a sunburst plot in R or Python?


Sunburst plots are a powerful visualization tool that helps represent hierarchical data through an interactive circular layout. This technique allows viewers to see multiple levels of the hierarchy simultaneously, making it easier to understand complex datasets. In this article, we will walk you through the steps to create a sunburst plot using both R and Python.

Understanding the Problem

Creating a sunburst plot may seem daunting at first, especially if you're unfamiliar with the required libraries and functions. However, with a clear understanding of the process, you can generate stunning visualizations in no time. This article aims to simplify the creation of sunburst plots and provide you with practical examples using R and Python.

Scenario Overview

Let's say you have a dataset that details a company's sales data categorized by region, product type, and sales channel. Your goal is to visualize this hierarchy using a sunburst plot. We'll provide example code in both R and Python to help you achieve this.

Example Data

To illustrate the process, let's create a sample dataset structured as follows:

Region Product Type Sales Channel Sales Amount
North Electronics Online 2000
North Furniture In-store 1500
South Electronics Online 3000
South Furniture In-store 3500
East Electronics Online 2500
East Furniture In-store 2000
West Electronics Online 4000
West Furniture In-store 3000

Original Code

Creating a Sunburst Plot in R

For R, we will use the plotly and dplyr libraries to create our sunburst plot.

# Load required libraries
library(plotly)
library(dplyr)

# Sample dataset
data <- data.frame(
  Region = c("North", "North", "South", "South", "East", "East", "West", "West"),
  Product = c("Electronics", "Furniture", "Electronics", "Furniture", "Electronics", "Furniture", "Electronics", "Furniture"),
  Channel = c("Online", "In-store", "Online", "In-store", "Online", "In-store", "Online", "In-store"),
  Sales = c(2000, 1500, 3000, 3500, 2500, 2000, 4000, 3000)
)

# Create a sunburst plot
plot_ly(data, 
        ids = ~paste(Region, Product, Channel, sep = "/"), 
        labels = ~paste(Region, Product, Channel, sep = ": "), 
        parents = ~paste(Region, Product, sep = "/"), 
        values = ~Sales, 
        type = 'sunburst') %>%
  layout(title = "Sales Data Sunburst Plot")

Creating a Sunburst Plot in Python

For Python, we will use the plotly library to visualize the data.

import pandas as pd
import plotly.express as px

# Sample dataset
data = {
    "Region": ["North", "North", "South", "South", "East", "East", "West", "West"],
    "Product": ["Electronics", "Furniture", "Electronics", "Furniture", "Electronics", "Furniture", "Electronics", "Furniture"],
    "Channel": ["Online", "In-store", "Online", "In-store", "Online", "In-store", "Online", "In-store"],
    "Sales": [2000, 1500, 3000, 3500, 2500, 2000, 4000, 3000]
}

df = pd.DataFrame(data)

# Create a sunburst plot
fig = px.sunburst(df, 
                   path=['Region', 'Product', 'Channel'], 
                   values='Sales', 
                   title='Sales Data Sunburst Plot')

# Show plot
fig.show()

Analysis and Clarification

Creating sunburst plots is a straightforward process, provided you have structured your data in a hierarchical manner. Here are some additional insights to optimize your sunburst plots:

  1. Hierarchical Structure: Make sure your data reflects a clear hierarchy for the sunburst plot to convey meaningful information. Each level should represent a parent-child relationship.

  2. Interactivity: One of the advantages of using libraries like Plotly is the ability to create interactive plots that allow users to hover over and click on segments for additional details.

  3. Visual Aesthetics: Customize your plots with colors and layout adjustments to make them visually appealing. Adding titles and labels can also help in understanding the plots better.

  4. Comparative Analysis: Sunburst plots can be especially useful for comparing categories across multiple levels. If your dataset grows in complexity, consider using additional visualizations to complement the sunburst plot.

Additional Resources

Conclusion

Creating a sunburst plot is a great way to visualize hierarchical data in a visually engaging manner. Whether you're using R or Python, the process is relatively simple with the right libraries. By following this guide and utilizing the example code, you'll be well on your way to representing your data effectively. Happy plotting!


Feel free to reach out if you have any further questions or need additional assistance with data visualization!