How to Visualize Data in Python Using the Bokeh Library

The previous article introduced data visualization and matplotlib, one of the Python libraries used for it. This article covers another library, Bokeh, with examples.

Bokeh is a native Python library for creating interactive, web-based plots. It renders its plots with HTML and JavaScript, which gives them a high level of interactivity.

The fundamental idea behind Bokeh is that graphs are built up one layer at a time. It offers interfaces at different levels, so users can choose anything from basic plots with very little customization to advanced plots with full control over the visualization.

Typically, the interfaces are divided into two levels.

  • bokeh.plotting: The intermediate-level interface, comparable to matplotlib. It provides functions to create figures and glyphs for a plot or graphic.
  • bokeh.models: The low-level interface, which gives application developers the most flexibility. It provides complete control over how Bokeh plots are assembled, configured, and displayed.
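To make the difference between the two levels concrete, here is a minimal sketch that assembles a plot by hand with the low-level bokeh.models interface. The data values are assumptions chosen only for illustration; the classes used (Plot, ColumnDataSource, Circle, LinearAxis, DataRange1d) are part of bokeh.models.

```python
from bokeh.models import Circle, ColumnDataSource, DataRange1d, LinearAxis, Plot

# Illustrative data; these values are assumptions, not taken from the article.
source = ColumnDataSource(data=dict(x=[1, 2, 3, 4], y=[2, 3, 1, 2.5]))

# At this level, every part of the plot is created and attached explicitly.
plot = Plot(x_range=DataRange1d(), y_range=DataRange1d())

# A glyph describes how to draw the data; add_glyph ties it to a data source.
renderer = plot.add_glyph(source, Circle(x="x", y="y", radius=0.1, fill_alpha=0.5))

# Axes are not added automatically here, so attach them by hand.
plot.add_layout(LinearAxis(), "below")
plot.add_layout(LinearAxis(), "left")
```

With bokeh.plotting, a call to figure() followed by a glyph method such as circle() sets up the ranges, axes, and data source for you, which is why most examples start there.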

Installing the Bokeh Library:

Install the Bokeh library with one of the following commands.

pip install bokeh

conda install bokeh

With the package installed, let us look at two examples. Bokeh offers several ways to produce interactive output. The most common ones are:

  • output_file()

Generates a simple standalone HTML document for the visualization.

  • output_notebook()

Displays the output inline in Jupyter/Zeppelin notebooks.
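As a small sketch of how the output mode is selected, the snippet below routes a plot to a standalone HTML file. The file name and the data are assumptions for illustration. It uses save() rather than show() so that the file is written without opening a browser.

```python
from bokeh.plotting import figure, output_file, save

# Route output to a standalone HTML file (the file name is an assumption).
output_file("output_demo.html", title="Bokeh output_file demo")

p = figure(width=300, height=300, title="output_file demo")
p.line([1, 2, 3], [4, 6, 5])

# save() writes the HTML file and returns its path; show(p) would also open it.
saved_path = save(p)
```

In a notebook, calling output_notebook() in place of output_file() and then show(p) displays the same plot inline instead.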

Let us look at examples of using the Bokeh library in Python. First, a basic graph that draws some points with glyphs.

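from bokeh.plotting import figure, output_file, show
# Create a 400x400 figure with the pan, reset, and save tools
p = figure(width=400, height=400, tools="pan,reset,save")
# x and y must contain the same number of values
p.circle([1, 3, 5, 2], [2, 3, 1, 2.5], radius=0.2, alpha=0.5)
# Write the plot to a standalone HTML file and open it in the browser
output_file("sample2.html")
show(p)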

Plotting a Graph from a CSV File

Next, we read the data from a CSV file and plot it.

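from bokeh.plotting import figure
from bokeh.io import output_file, show
import pandas

# Load the data; the CSV file is expected to have "x" and "y" columns
df = pandas.read_csv("sample1.csv")
x = df["x"]
y = df["y"]

# Write the plot to a standalone HTML file
output_file("sample1.html")
fig1 = figure(width=300, height=300, tools="pan,reset,save")
# Draw a line through the (x, y) points
fig1.line(x, y)
show(fig1)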
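The example above needs a sample1.csv file with x and y columns. The article does not show the file contents, so the following is only a sketch that creates one with made-up values, using Python's standard csv module.

```python
import csv

# Hypothetical sample data; the article does not show the CSV contents.
rows = [(1, 2), (2, 4), (3, 3), (4, 6), (5, 5)]

with open("sample1.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["x", "y"])  # header names match df["x"] and df["y"]
    writer.writerows(rows)
```

Run this once in the same directory before running the plotting script, or replace it with your own data file that uses the same column names.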

This article introduced the Bokeh library with two examples: a basic graph, and a plot of data read from a CSV file.

Deep Insight into Big Data: Understanding Big Data Basics

What is Big Data?

Big data refers to large collections of structured, semi-structured, and unstructured data, mostly collected from Internet-connected devices. It represents the massive amount of data an organization is exposed to daily, which traditional database management systems cannot manage. This has driven a shift from a model-driven paradigm to a data-driven paradigm. What matters is how an organization uses this data to yield insights that result in better-informed decisions: the importance of big data lies not in the amount of data but in how it is used.

Characteristics of Big Data

The term big data refers to large data sets (Volume) of structured, semi-structured, and unstructured data (Variety) that arrive faster than before (Velocity). These are the 3Vs.

3V:

Volume:

The volume of data stored today is growing exponentially. Data volumes have grown from terabytes to zettabytes.
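To give a sense of the scale involved, this short calculation compares the two units, assuming the decimal (SI) definitions of a terabyte (10^12 bytes) and a zettabyte (10^21 bytes).

```python
# Decimal (SI) storage units expressed in bytes.
TERABYTE = 10**12
ZETTABYTE = 10**21

# How many terabytes fit in one zettabyte.
ratio = ZETTABYTE // TERABYTE
print(f"1 zettabyte = {ratio:,} terabytes")  # prints "1 zettabyte = 1,000,000,000 terabytes"
```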

Velocity:

Velocity represents both the rate at which data is generated and the speed at which it needs to be handled.

Variety:

Because data is collected from many sources rather than one, its variety also differs by source, such as email, web, text, or sensors, and it may be structured or unstructured.

As big data evolved over time, its characteristics also grew from the 3Vs to the 6Vs.

6V:

As data grows tremendously in today’s internet world, today’s big data is tomorrow’s small data.