Work in progress

Book version 0.4

1 Welcome!

Terence Parr and Jeremy Howard

Copyright © 2018-2019 Terence Parr. All rights reserved.
Please don't replicate on web or redistribute in any way.
This book generated from markup+markdown+python+latex source with Bookish.


This book is a primer on machine learning for programmers trying to get up to speed quickly. You'll learn how machine learning works and how to apply it in practice. We focus on just a few powerful models (algorithms) that are extremely effective on real problems, rather than presenting a broad survey of machine learning algorithms as many books do. Co-author Jeremy used these few models to become the #1 competitor for two consecutive years at This narrow approach leaves lots of room to cover the models, training, and testing in detail, with intuitive descriptions and full code implementations.

2 “Almost all of AI's recent progress is through one type, in which some input data (A) is used to quickly generate some simple response (B).” — Andrew Ng in What Artificial Intelligence Can and Can't Do Right Now Harvard Business Review November 9, 2016

Because of our focus, this book does not cover data science in general nor does it present any case studies of machine learning from a business perspective. We also completely ignore the problem of clustering (unsupervised learning) that attempts to group similar data records together (whereas supervised learning maps an input record to an output value, such as an image to a category like “it's a car”). We find that supervised methods are more often applicable in practice, as corroborated by well-known machine learning researcher Andrew Ng.2

There are a multitude of machine learning models, and choosing an appropriate one requires experience. This book is opinionated in the sense that we have chosen a single powerful model for you, Random Forests™, that will work in the vast majority of cases. {TODO: analogy: It's best not to start out trying 10 different foreign languages; better to focus on one.} We'll also explore k-nearest neighbors, decision trees, and linear regression models along the way in order to explain and motivate our “power tools,” random forests and neural networks. Part of our goal with this book is to show that, while there are lots of details, the overall process of applying machine learning is pretty straightforward. In fact, we'll look at a broadly applicable recipe in [workflow].
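To give a feel for how little code the overall process involves, here is a minimal sketch, assuming scikit-learn is installed. The data is synthetic, generated by `make_classification` purely for illustration, and the parameter values (500 records, 8 features, 100 trees) are arbitrary choices for this example rather than recommendations or the recipe in [workflow].

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic records (X) and known categories (y); made up for illustration only.
X, y = make_classification(n_samples=500, n_features=8, n_informative=4,
                           random_state=0)

# Hold out 20% of the records so we can test on data the model has not seen.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Train a random forest made of 100 decision trees.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Fraction of held-out records whose category the model predicts correctly.
accuracy = model.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

The shape of the code (load data, split it, fit a model, score it on held-out records) is what matters here; the specific numbers do not.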

{TODO: people jump to trendy stuff like deep learning without getting the basics. To get started though it's important to understand about data preparation and get an intuition for how machine learning works.}

1.1 Is this book right for you?

Computer programming is required to do machine learning, and programmers are the primary target audience of this book. In particular, you really need at least one solid year of programming experience, preferably in Python. Python is the most common programming language used in machine learning, so it is the language we use in this book. If you don't already know Python but have sufficient programming experience, you can learn it on the fly while reading. Otherwise, you should spend some time getting comfortable with basic Python before reading this book.

You might be worried that you don't have sufficient knowledge of mathematics to understand machine learning. Don't worry! You'll be fine if you can dredge up some high-school-level algebra and geometry. As a happy coincidence, the models and techniques we explore in this book require very little mathematics background, even when we get to deep learning (neural networks). Some models require sophisticated mathematics to understand (for example, “support vector machines” and “hidden Markov models”) but they are used much less frequently than mathematically simpler but still powerful models like “Random Forests.”

Advanced mathematics is most often necessary when we try to prove a model's correctness, define model performance bounds, compare different models abstractly, design new models, and so on. In other words, to become a machine learning researcher you need serious math chops. Our goal, on the other hand, is to cover only as much mathematics as necessary to get you applying machine learning effectively in your job. As we go along, we'll cover any mathematics and notation needed for each discussion.

3 “To Explain the World: The Discovery of Modern Science” — Steven Weinberg, Harper Collins 2015

Figure 1.1. Pythagoras' theorem was initially expressed geometrically, not algebraically.

Even when mathematics is necessary to understand a model, it's important to remember that the math notation is really just a precise and concise way to express the results of someone's intuitive leap. Consider the Pythagorean theorem we all learned in middle school: $a^2 + b^2 = c^2$. Pythagoras could never have expressed his theorem with such a formula. As Steven Weinberg points out, “The Greeks never learned to write and manipulate algebraic formulas.”3 (Terence highly recommends this book.) Instead, Pythagoras arranged three squares so their edges formed a right triangle (one angle is 90 degrees) and then observed that the square whose edge formed the hypotenuse had the same area as the other two squares combined; see Figure 1.1. It's pretty hard to start from the mathematics and work backwards to the intuition, which is why we're going to spend most of our time with the ideas and mechanisms behind machine learning.
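To connect the geometric picture with the algebraic formula, here is a minimal sketch that checks the "three squares" observation numerically. It assumes a 3-4-5 right triangle, chosen only because the numbers are easy to follow; the variable names are illustrative.

```python
import math

# Legs of a right triangle and its hypotenuse (the classic 3-4-5 triangle).
a, b, c = 3, 4, 5

# Geometric view: the area of the square built on each edge.
area_a = a * a   # square on the first leg
area_b = b * b   # square on the second leg
area_c = c * c   # square on the hypotenuse

# The two smaller squares together cover the same area as the largest one.
print(area_a, "+", area_b, "=", area_a + area_b, "and the big square is", area_c)

# Algebraic view: solve a^2 + b^2 = c^2 for the hypotenuse.
hypotenuse = math.hypot(a, b)
print("hypotenuse:", hypotenuse)
```

Both views say the same thing; the formula is simply a compact way to write down the observation about areas.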

1.2 Tools of the trade

Anaconda, Jupyter notebooks, pandas, NumPy, scikit-learn, matplotlib, JupyterLab.
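As a quick sanity check before diving in, the following sketch reports which of these tools Python can find on your machine. It uses only the standard library; the `TOOLS` table and the `check_tools` helper are names invented for this example. Note that scikit-learn is imported under the name `sklearn`.

```python
import importlib.util

# Display name -> the module name Python actually imports.
TOOLS = {
    "pandas": "pandas",
    "NumPy": "numpy",
    "scikit-learn": "sklearn",
    "matplotlib": "matplotlib",
    "JupyterLab": "jupyterlab",
}

def check_tools(tools=TOOLS):
    """Return {display name: True/False} for whether each package can be found."""
    return {name: importlib.util.find_spec(module) is not None
            for name, module in tools.items()}

status = check_tools()
for name, found in status.items():
    print(f"{name:13s} {'installed' if found else 'MISSING'}")
```

If anything shows up as missing, installing the Anaconda distribution is one straightforward way to get all of these packages at once.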

1.3 What you'll learn

1.4 Supporting materials online

GitHub repository with source code referenced or given in the book; artifacts generated automatically from the book as IPython notebooks. Short videos?

All of the code snippets you see in this book, even the ones to generate figures, can be found in the notebooks generated from this book.

Data used by the various chapters is also available in data.

1.5 Acknowledgements

We thank Yannet Interian, professor of data science at the University of San Francisco, for her conversations and contributions to the techniques described in this book.

We hope you enjoy this book and we wish you good luck on your machine learning journey!

Terence Parr and Jeremy Howard
University of San Francisco