Using StandardScaler() Function to Standardize Python Data

Published on August 3, 2022
By Safa Mulani

Hello, readers! In this article, we will focus on one of the most important pre-processing techniques in Python: standardization using scikit-learn's StandardScaler.

So, let us begin!!


Need for Standardization

Before getting into Standardization, let us first understand the concept of Scaling.

Feature scaling is an essential step when preparing a dataset for modeling. The data used for modeling usually comes from a variety of sources, such as:

  • Questionnaires
  • Surveys
  • Research
  • Scraping, etc.

As a result, the data contains features measured in different units and on very different scales. Such differences can affect a model adversely, because a feature with a large numeric range can dominate features with small ranges, especially in algorithms that rely on distances or gradients.

This can bias the predictions and show up as a higher misclassification error or lower accuracy. Thus, it is usually necessary to scale the data prior to modeling.
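To make the effect of mismatched scales concrete, here is a minimal sketch using only the Python standard library. The two records and their (age, income) values are hypothetical and chosen purely for illustration; they are not taken from any dataset in this article.

```python
import math

# Hypothetical records: (age in years, annual income in dollars).
person_a = (25, 50_000)
person_b = (60, 51_000)  # 35 years older, income differs by 1,000

# Euclidean distance computed on the raw, unscaled values.
raw_distance = math.dist(person_a, person_b)

# Share of the squared distance that comes from each feature.
age_part = (person_a[0] - person_b[0]) ** 2
income_part = (person_a[1] - person_b[1]) ** 2
age_share = age_part / (age_part + income_part)

print(f"distance: {raw_distance:.2f}")
print(f"share contributed by age: {age_share:.2%}")
```

Even though the two people differ by 35 years in age, almost all of the distance comes from the income column, simply because income is expressed in larger numbers.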

This is where standardization comes into the picture.

Standardization is a scaling technique that makes the data scale-free by shifting and rescaling each feature so that it has:

  • mean - 0 (zero)
  • standard deviation - 1
[Figure: Standardization]

As a result, every feature in the dataset ends up with zero mean and unit variance.
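The transformation described above amounts to subtracting the mean from each value and dividing by the standard deviation, z = (x - mean) / std. The sketch below works through that arithmetic by hand on a small, made-up list of numbers (the values are an assumption for illustration). It uses the population standard deviation, which is also what StandardScaler uses.

```python
import statistics

# Hypothetical feature values, chosen only for illustration.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mean = statistics.fmean(values)   # arithmetic mean of the feature
std = statistics.pstdev(values)   # population standard deviation

# z = (x - mean) / std for every value
z_scores = [(x - mean) / std for x in values]

print(z_scores)
print(statistics.fmean(z_scores))   # mean of the standardized values
print(statistics.pstdev(z_scores))  # standard deviation of the standardized values
```

After the transformation, the standardized values should have a mean of 0 and a standard deviation of 1, up to floating-point rounding.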

Let us now try to implement the concept of Standardization in the upcoming sections.


Python sklearn StandardScaler() function

The scikit-learn (sklearn) library provides the StandardScaler class, which standardizes data values to zero mean and unit variance.

Syntax:

scaler = StandardScaler()
scaler.fit_transform(data)

In the above syntax, we first create an instance of the StandardScaler class. We then call fit_transform() on that instance, which learns the mean and standard deviation of each feature column and uses them to standardize the data.
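As a rough illustration of what fit_transform() does, the sketch below splits it into its two steps, fit() and transform(), and inspects the statistics the scaler learns through its mean_ and scale_ attributes. The small array is hypothetical data made up for this example.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical data: 4 samples, 2 features on very different scales.
data = np.array([[1.0, 100.0],
                 [2.0, 200.0],
                 [3.0, 300.0],
                 [4.0, 400.0]])

scaler = StandardScaler()

# fit() learns the per-column mean and standard deviation.
scaler.fit(data)
print(scaler.mean_)   # per-feature means
print(scaler.scale_)  # per-feature standard deviations

# transform() applies (x - mean) / std using the learned values.
scaled = scaler.transform(data)

# fit_transform() performs both steps in a single call.
same = StandardScaler().fit_transform(data)
print(np.allclose(scaled, same))
```

Keeping the two steps separate can be handy when the same fitted scaler needs to be applied to new data later, by calling transform() again on that data.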

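Note: Standardization does not require the data to follow a normal distribution, and it does not change the shape of the distribution. That said, many estimators tend to behave best when each standardized feature looks roughly like standard normally distributed data, and because the mean and standard deviation are sensitive to outliers, heavily skewed features may need extra care.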


Standardizing data with StandardScaler() function

Have a look at the below example!

from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler

# Load the Iris dataset and create the scaler
dataset = load_iris()
scaler = StandardScaler()

# Split the independent (feature) and dependent (target) variables
i_data = dataset.data
response = dataset.target

# Standardize each feature column to zero mean and unit variance
scale = scaler.fit_transform(i_data)
print(scale)

Explanation:

  1. Import the required libraries. We import load_iris from sklearn.datasets and StandardScaler from sklearn.preprocessing.
  2. Load the dataset. Here we have used the Iris dataset bundled with the sklearn.datasets module.
  3. Create a StandardScaler object.
  4. Separate the independent variables from the target variable, as shown above.
  5. Standardize the independent variables by calling fit_transform() on them.

Output:

[Figure: Standardization Output]
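Since the printed array is long, one way to confirm that the standardization worked is to check the per-column mean and standard deviation of the result. The following is a small self-contained sketch of that check on the same Iris data; the variable names are chosen here for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler

# Standardize the four Iris feature columns.
features = load_iris().data
scaled = StandardScaler().fit_transform(features)

print(scaled.shape)                      # same shape as the input
print(np.round(scaled.mean(axis=0), 6))  # per-column means, expected to be about 0
print(np.round(scaled.std(axis=0), 6))   # per-column standard deviations, expected to be about 1
```

Each column mean should be zero and each standard deviation one, up to floating-point rounding.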

Conclusion

With this, we have come to the end of this topic. Feel free to comment below if you have any questions.

For more posts related to Python, stay tuned @ Python with JournalDev, and till then, Happy Learning!! :)
