Tag : EDA


Relationship between Binomial and Poisson distributions

In this post, we discuss the relationship between the Binomial and Poisson distributions. The Poisson distribution is the limit of the Binomial distribution as n (the number of trials) grows large and p (the probability of success on each trial) becomes small, with the product np held fixed. A large number of trials n, each with a very small success probability p, describes a rare event in a Binomial experiment. With this in mind, we will simulate both distributions and then plot their CDFs (cumulative distribution functions). This will help us see the similarity between a Poisson experiment and a rare-event Binomial experiment.

We will not go into the mathematical details of the Binomial and Poisson distributions here. Instead, we will use the random module of NumPy in Python to simulate these distributions using a technique called bootstrapping.
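As a rough illustration of the comparison described above, here is a minimal sketch that draws samples from both distributions and measures how far apart their empirical CDFs are. It samples directly from NumPy's random generators rather than using bootstrap resampling, and the values of n, p, the sample size, the seed, and the helper name `ecdf` are assumptions chosen for the example, not taken from the post.

```python
import numpy as np

def ecdf(samples):
    """Return the sorted samples and the empirical CDF value at each one."""
    x = np.sort(samples)
    y = np.arange(1, len(x) + 1) / len(x)
    return x, y

rng = np.random.default_rng(42)  # arbitrary seed, only for reproducibility

n, p = 1000, 0.005   # assumed values: many trials, small success probability
size = 100_000       # assumed number of simulated experiments

binom_samples = rng.binomial(n, p, size=size)
poisson_samples = rng.poisson(n * p, size=size)  # Poisson with the same mean n*p

# Evaluate both empirical CDFs on a common grid of counts and compare them.
grid = np.arange(0, 21)
binom_cdf = np.array([(binom_samples <= k).mean() for k in grid])
poisson_cdf = np.array([(poisson_samples <= k).mean() for k in grid])
max_gap = np.abs(binom_cdf - poisson_cdf).max()

print(f"sample means: {binom_samples.mean():.3f} vs {poisson_samples.mean():.3f}")
print(f"largest gap between the two CDFs: {max_gap:.4f}")
```

To draw the CDF plot, the arrays returned by `ecdf(binom_samples)` and `ecdf(poisson_samples)` could be passed to a plotting library such as Matplotlib. With a small p and a large n, the two curves should lie almost on top of each other.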


Let’s start by understanding the …


Exploratory Data Analysis (EDA) using Python – Second step in Data Science and Machine Learning

In the previous post, “Tidy Data in Python – First Step in Data Science and Machine Learning”, we discussed the importance of tidy data and its principles. In a Machine Learning project, once a tidy dataset is in place, it is always recommended to perform EDA (Exploratory Data Analysis) on the underlying data before fitting a Machine Learning model to it. Let’s look at why EDA matters and at some basic EDA techniques that are very useful.

What is Exploratory Data Analysis (EDA)

Exploratory Data Analysis, or EDA, is the process of organizing, plotting, and summarizing data to find trends, patterns, and outliers using statistical and visual methods. It takes input data in a tabular format and represents it graphically, which makes it easier for humans to interpret. It is an important step in a Machine Learning/Data Science project which should be performed before …