Tag : python use case


Install Spark on Windows (Local machine) with PySpark – Step by Step

Apache Spark is a general-purpose big data processing engine. It is a very powerful cluster computing framework which can run from a single cluster to thousands of clusters. It can run on clusters managed by Hadoop YARN, Apache Mesos, or by Spark’s standalone cluster manager itself. To read more on Spark Big data processing framework, visit this post “Big Data processing using Apache Spark – Introduction“. Here, in this post, we will learn how we can install Apache Spark on a local Windows Machine in a pseudo-distributed mode (managed by Spark’s standalone cluster manager) and run it using PySpark (Spark’s Python API).

Install Spark on Local Windows Machine

To install Apache Spark on a local Windows machine, we need to follow below steps:

Step 1 – Download and install Java JDK 8

Java JDK 8 is required as a prerequisite for the Apache Spark installation. We … More


Python use case – Save each worksheet as a separate excel workbook 1

In this post “Python use case – Save each worksheet as a separate excel workbook“, we are going to learn that how we can create a separate workbook for each worksheet of a given excel file. We will be copying data, values, formatting and all other settings of the sheet in the newly created workbook. Assume that we have an excel workbook having five sheets named “Sheet1“, “Sheet2“, “Sheet3“, “Sheet4“, and “Sheet5” which is as this.

Now, we want to create 5 different excel workbooks (one for each worksheet).

We are going to use Excel application to copy data, values, formatting and other settings of each sheet as a new workbook in a python script. Below is the python code.

To execute the above python script, we can call this .py file using command prompt window as this.… More