How to visualize data in Python using the Bokeh library

In the previous article we looked at data visualization and at matplotlib, one of the Python libraries used for it. Now let us cover another library, Bokeh, with examples.

Bokeh is a Python library for creating interactive, web-based plots. It renders its plots using HTML and JavaScript, which gives them a high level of interactivity.

The fundamental idea behind Bokeh is that graphs are built up one layer at a time. It offers interfaces at different levels, so users can choose anything from basic plots with very little customization to advanced plots with full control over the visualization.
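As a minimal sketch of the layering idea, the snippet below adds a line glyph and then a scatter glyph to the same figure. The data values and the output file name are made up for illustration, and save is used instead of show so that the script runs without opening a browser.

```python
import os
import tempfile

from bokeh.plotting import figure, output_file, save

# Illustrative data (not from the article)
x = [1, 2, 3, 4, 5]
y = [6, 7, 2, 4, 5]

# Hypothetical output location in the system temp directory
output_file(os.path.join(tempfile.gettempdir(), "layers_demo.html"))

p = figure(width=400, height=300, title="Two layers on one figure")

# Layer 1: a line through the points
line_layer = p.line(x, y, line_width=2)

# Layer 2: markers drawn on top of the line
marker_layer = p.scatter(x, y, size=10, alpha=0.6)

# Each glyph call returns a renderer that is now part of the figure
print(line_layer in p.renderers, marker_layer in p.renderers)

# Write the HTML file; replace save(p) with show(p) to open it in a browser
saved_path = save(p)
```

Each call such as p.line or p.scatter adds one more layer, so a plot can be extended step by step without rebuilding it.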

Typically, the interfaces are divided into two levels:

  • bokeh.plotting: The intermediate-level interface, comparable to matplotlib. It provides functions to create figures and glyphs for a plot.
  • bokeh.models: The low-level interface that gives application developers the most flexibility. It provides complete control over how Bokeh plots are assembled, configured, and displayed.
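The two levels can be mixed in one script. The sketch below is only an illustration with made-up data: it builds the figure with bokeh.plotting and then configures a data source and a hover tool explicitly through classes from bokeh.models (the module name used by the Bokeh package).

```python
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure

# Illustrative data held in a low-level data source model
source = ColumnDataSource(data={"x": [1, 2, 3, 4], "y": [2, 3, 1, 2.5]})

# bokeh.plotting creates the figure and the glyph
p = figure(width=400, height=400, tools="pan,reset,save")
p.scatter("x", "y", source=source, size=12, alpha=0.5)

# bokeh.models supplies a tool object that is configured explicitly
hover = HoverTool(tooltips=[("x", "@x"), ("y", "@y")])
p.add_tools(hover)
```

Starting from bokeh.plotting and reaching into bokeh.models only where more control is needed keeps simple plots short.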

Installing the Bokeh Library:

Install the Bokeh library with either of the following commands.

pip install bokeh

conda install bokeh

With the package installed, let us look at two examples. Bokeh offers several ways to produce interactive output. The most common ones are:

  • output_file( ): Generates a simple standalone HTML document for the visualization.
  • output_notebook( ): Displays the output inline in Jupyter/Zeppelin notebooks.
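To illustrate the standalone HTML route, here is a small sketch that calls output_file and then writes the document with save, which creates the file without opening a browser. The file name, title, and data are assumptions chosen for the example.

```python
import os
import tempfile

from bokeh.plotting import figure, output_file, save

# Hypothetical output path in the system temp directory
html_path = os.path.join(tempfile.gettempdir(), "output_file_demo.html")

# Tell Bokeh where the standalone HTML document should go
output_file(html_path, title="output_file demo")

p = figure(width=400, height=300)
p.line([1, 2, 3, 4], [4, 1, 3, 2])

# save() writes the file and returns its path
saved_path = save(p)

# The result is an ordinary HTML file that any browser can open
with open(saved_path, encoding="utf-8") as f:
    html = f.read()
print(len(html), "characters written to", saved_path)
```

In a notebook, calling output_notebook() once and then show(p) would display the same plot inline instead.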

Let us see examples of using the Bokeh library in Python. First, we will draw a basic graph that plots some points as circle glyphs.

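from bokeh.plotting import figure, output_file, show

# create a figure with the pan, reset and save tools
p = figure(width=400, height=400, tools="pan,reset,save")
# the x and y lists must have the same number of values
p.circle([1, 3, 5, 2], [2, 3, 1, 2.5], radius=0.2, alpha=0.5)
# write the plot to a standalone HTML file and open it
output_file("sample2.html")
show(p)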

Plotting a Graph from a CSV File

Now we will read the data to plot from a CSV file. The example assumes that sample1.csv contains columns named x and y.

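from bokeh.plotting import figure
from bokeh.io import output_file, show
import pandas

# read the data and pick the columns to plot
df = pandas.read_csv("sample1.csv")
x = df["x"]
y = df["y"]

# write the plot to a standalone HTML file
output_file("sample1.html")

# draw the data as a line and open the result
fig1 = figure(width=300, height=300, tools="pan,reset,save")
fig1.line(x, y)
show(fig1)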
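The example above needs an existing sample1.csv. For trying it out, the following sketch first writes a small CSV file with invented values into a temporary directory, checks that the expected columns are present, and then plots it. The data, the paths, and the column check are assumptions added for illustration, and save replaces show so that nothing opens in a browser.

```python
import os
import tempfile

import pandas
from bokeh.io import output_file, save
from bokeh.plotting import figure

# Create a small CSV file with invented values (hypothetical data)
workdir = tempfile.mkdtemp()
csv_path = os.path.join(workdir, "sample1.csv")
with open(csv_path, "w", encoding="utf-8") as f:
    f.write("x,y\n1,2\n2,4\n3,3\n4,6\n5,5\n")

df = pandas.read_csv(csv_path)

# Fail early with a clear message if a required column is missing
missing = {"x", "y"} - set(df.columns)
if missing:
    raise ValueError(f"CSV is missing columns: {sorted(missing)}")

# Plot the two columns as a line and write the HTML file
output_file(os.path.join(workdir, "sample1.html"))
fig1 = figure(width=300, height=300, tools="pan,reset,save")
fig1.line(df["x"], df["y"])
saved_path = save(fig1)
```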

In this article we looked at the Bokeh library through two examples: a basic glyph plot, and a line plot of data read from a CSV file.

How to visualize data using the matplotlib library

Data Visualization

Data visualization is a fundamental step in the work of a data scientist. It is the process of presenting complex information in a visual form to give better understanding and insight.

It communicates the relationships in the data through images. This is important because it makes trends and patterns easier to see.

It helps turn complex numbers into a story that people can easily understand.

Primary uses include:

  • Explore data
  • Communicate data

In this article, let's explore the matplotlib library in Python.

matplotlib:

A wide variety of tools exist for visualizing data, such as matplotlib, pandas visualization, seaborn, ggplot, and plotly.

Today we will explore matplotlib.

It is one of the most widely used packages for 2D graphics. It is low level and provides a lot of freedom.

As matplotlib is not part of the Python standard library, we first need to install it using the command:

python -m pip install matplotlib

We will be using the matplotlib.pyplot module.

How to plot a line chart using matplotlib?

A line chart is useful for tracking changes over short and long periods of time. When the changes are small, a line chart makes them easier to see. It is also useful for comparing changes over the same period of time for more than one group.

Syntax:

import matplotlib.pyplot as plt

plt.plot(x_values,y_values)

Here x_values means values to plot on the x-axis and y_values means values to plot on the y-axis.

Example:

Now let us plot the number of employees in Company A for the years 2016 to 2021.

import matplotlib.pyplot as plt

# number of employees
emp_count = [325, 400, 530, 605, 710, 600]
year = [2016, 2017, 2018, 2019, 2020, 2021]

# plot a line chart
plt.plot(year, emp_count, 'o-g')

# set axis titles
plt.xlabel("Year")
plt.ylabel("Employees")

# set chart title
plt.title("Employee Growth")

plt.show()

Output: [figure: line chart of Company A employees per year, drawn as a green line with circle markers]

How to chart multiple lines in a single chart?

Now let us consider two companies, A and B, and their employee growth over the same time period.

import matplotlib.pyplot as plt

# number of employees
emp_count_A = [325, 400, 530, 605, 710, 600]
emp_count_B = [225, 310, 360, 300, 450, 560]
year = [2016, 2017, 2018, 2019, 2020, 2021]

# plot a line chart
plt.plot(year, emp_count_A, 'o-b')
plt.plot(year, emp_count_B, 'o-r')

# set axis titles
plt.xlabel("Year")
plt.ylabel("Employees")

# set chart title
plt.title("Employee Growth")

# legend
plt.legend(['A', 'B'])

plt.show()

Output: [figure: line chart of employees per year, Company A in blue and Company B in red, with a legend]

In the above example both lines share the same axes, so we have used a legend to distinguish them.
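An alternative way to build the legend, sketched below, is to pass a label to each plt.plot call and then call plt.legend() without arguments, which keeps each name next to the data it describes. The Agg backend and the savefig call with a temporary file name are assumptions made so that the sketch also runs without a display; with a display, plt.show() works as in the examples above.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen, no display needed
import matplotlib.pyplot as plt

# same data as in the example above
emp_count_A = [325, 400, 530, 605, 710, 600]
emp_count_B = [225, 310, 360, 300, 450, 560]
year = [2016, 2017, 2018, 2019, 2020, 2021]

plt.figure()

# attach the legend text to each line as it is plotted
plt.plot(year, emp_count_A, 'o-b', label='A')
plt.plot(year, emp_count_B, 'o-r', label='B')

plt.xlabel("Year")
plt.ylabel("Employees")
plt.title("Employee Growth")

# legend() picks up the labels given above
legend = plt.legend()
legend_labels = [text.get_text() for text in legend.get_texts()]
print(legend_labels)

# hypothetical output file in the system temp directory
png_path = os.path.join(tempfile.gettempdir(), "employee_growth_labels.png")
plt.savefig(png_path)
```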

In the two examples above, in addition to the x and y values, plt.plot receives a third argument such as 'o-g', which is called a format string. It is an abbreviation for quickly setting basic line properties.

A format string consists of three parts, '[marker][line][color]', each of which is optional. For example, 'o-g' means circle markers, a solid line, and the color green.
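The sketch below plots three lines with different format strings and reads the resulting properties back from the line objects, to show how each part of the string is interpreted. The data values are invented, and the Agg backend is an assumption so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen, no display needed
import matplotlib.pyplot as plt

x = [1, 2, 3, 4]

plt.figure()

# plt.plot returns a list of line objects, one per plotted line
(green_line,) = plt.plot(x, [1, 2, 3, 4], 'o-g')    # circle, solid line, green
(red_dashed,) = plt.plot(x, [2, 3, 4, 5], 's--r')   # square, dashed line, red
(markers_only,) = plt.plot(x, [3, 4, 5, 6], '^')    # triangle markers, parts omitted

# inspect what each format string was translated into
for line in (green_line, red_dashed, markers_only):
    print(line.get_marker(), line.get_linestyle(), line.get_color())
```

The same properties can also be set with the keyword arguments marker, linestyle, and color, which is more readable when many properties are involved.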
