An Introduction to Data Science

December 2017 | 288 pages | SAGE Publications, Inc

An Introduction to Data Science is an easy-to-read, gentle introduction to the world of data science for advanced undergraduate, certificate, and graduate students from a wide range of backgrounds. After introducing the basic concepts of data science, the book builds on these foundations to explain data science techniques from the ground up using the R programming language and RStudio®. Short chapters allow instructors to group concepts together for a semester course and give students a manageable amount of information for each concept. By taking students systematically through the R programming environment, the book takes the fear out of data science and familiarizes students with the environment so they can be successful when performing advanced functions.


The authors cover statistics from a conceptual standpoint, focusing on how to use and interpret statistics, rather than the math behind the statistics. This text then demonstrates how to use data effectively and efficiently to construct models, predict outcomes, visualize data, and make decisions. Accompanying digital resources provide code and datasets for instructors and learners to perform a wide range of data science tasks.  

About the Authors
Introduction: Data Science, Many Skills
What Is Data Science?

The Steps in Doing Data Science

The Skills Needed to Do Data Science

Chapter 1 • About Data
Storing Data—Using Bits and Bytes

Combining Bytes Into Larger Structures

Creating a Data Set in R

Chapter 2 • Identifying Data Problems
Talking to Subject Matter Experts

Looking for the Exception

Exploring Risk and Uncertainty

Chapter 3 • Getting Started With R
Installing R

Using R

Creating and Using Vectors

Chapter 4 • Follow the Data
Understand Existing Data Sources

Exploring Data Models

Chapter 5 • Rows and Columns
Creating Dataframes

Exploring Dataframes

Accessing Columns in a Dataframe

Chapter 6 • Data Munging
Reading a CSV Text File

Removing Rows and Columns

Renaming Rows and Columns

Cleaning Up the Elements

Sorting Dataframes

Chapter 7 • Onward With RStudio®
Using an Integrated Development Environment

Installing RStudio

Creating R Scripts

Chapter 8 • What’s My Function?
Why Create and Use Functions?

Creating Functions in R

Testing Functions

Installing a Package to Access a Function

Chapter 9 • Beer, Farms, and Peas and the Use of Statistics
Historical Perspective

Sampling a Population

Understanding Descriptive Statistics

Using Descriptive Statistics

Using Histograms to Understand a Distribution

Normal Distributions

Chapter 10 • Sample in a Jar
Sampling in R

Repeating Our Sampling

Law of Large Numbers and the Central Limit Theorem

Comparing Two Samples

Chapter 11 • Storage Wars
Importing Data Using RStudio

Accessing Excel Data

Accessing a Database

Comparing SQL and R for Accessing a Data Set

Accessing JSON Data

Chapter 12 • Pictures Versus Numbers
A Visualization Overview

Basic Plots in R

Using ggplot2

More Advanced ggplot2 Visualizations

Chapter 13 • Map Mashup
Creating Map Visualizations With ggplot2

Showing Points on a Map

A Map Visualization Example

Chapter 14 • Word Perfect
Reading in Text Files

Using the Text Mining Package

Creating Word Clouds

Chapter 15 • Happy Words?
Sentiment Analysis

Other Uses of Text Mining

Chapter 16 • Lining Up Our Models
What Is a Model?

Linear Modeling

An Example—Car Maintenance

Chapter 17 • Hi Ho, Hi Ho—Data Mining We Go
Data Mining Overview

Association Rules Data

Association Rules Mining

Exploring How the Association Rules Algorithm Works

Chapter 18 • What’s Your Vector, Victor?
Supervised and Unsupervised Learning

Supervised Learning via Support Vector Machines

Support Vector Machines in R

Chapter 19 • Shiny® Web Apps
Creating Web Applications in R

Deploying the Application

Chapter 20 • Big Data? Big Deal!
What Is Big Data?

The Tools for Big Data



Student Study Site
    • Lab and homework assignments accompany chapter material and are downloadable as R source code.
    • R Code from the book, available as an R source file.
    • Multimedia content includes links to YouTube videos showing demos of using R, audio, data, and web resources.
Instructor Resource Site

Password-protected Instructor Resources include the following:


    • Editable, chapter-specific Microsoft® PowerPoint® slides offer you complete flexibility in easily creating a multimedia presentation for your course. Highlight essential content and features.
    • Lab and homework assignments and their solutions accompany chapter material and are downloadable as R source code.
    • R Code from the book, available as an R source file.
    • Multimedia content includes links to YouTube videos showing demos of using R, audio, data, and web resources that appeal to students with different learning styles and prompt classroom discussion.

The application of data science in decision-making and the introductory knowledge level in exposing students unfamiliar with data science to the world of decision-making using AI and ML.

Mr Ganiyu Otukogbe
Cardiff School of Management, Cardiff Metropolitan University
July 11, 2023

Sample Materials & Chapters


Chapter 6
