Data Science

A 15-Week Course to Introduce Machine Learning and Intelligent Systems in R

lantz-ml-in-rEvery fall, I teach a survey course for advanced undergraduates that covers one of the most critical themes in data science: intelligent systems. According to the IEEE, these are “systems that perceive, reason, learn, and act intelligently.” While data science is focused on analyzing data (often quite a lot of it) to make effective data-driven decisions, intelligent systems use those decisions to accomplish goals. As more and more devices join the Internet of Things (IoT), collecting data and sharing it with other “things” to make even more complex decisions, the role of intelligent systems will become even more pronounced.

So by the end of my course, I want students to have some practical skills that will be useful in analyzing, specifying, building, testing, and using intelligent systems:

  • Know whether a system they’re building (or interacting with) is intelligent… and how it could be made more intelligent
  • Be sensitized to ethical, social, political, and legal aspects of building and using intelligent systems 
  • Use regression techniques to uncover relationships in data using R (including linear, nonlinear, and neural network approaches)
  • Use classification and clustering methods to categorize observations (neural networks, k-means/KNN, Naive Bayes, support vector machines)
  • Be able to handle structured and unstructured data, using both supervised and unsupervised approaches
  • Understand what “big data” is, know when (and when not) to use it, and be familiar with some tools that help them deal with it

My course uses Brett Lantz’s VERY excellent book, Machine Learning with R (which is now also available in Kindle format), which I provide effusive praise for at

One of the things I like the MOST about my class is that we actually cover the link between how your brain works and how neural networks are set up. (Other classes and textbooks typically just show you a picture of a neuron superimposed with inputs, a summation, an activation, and outputs, implying that “See? They’re pretty much the same!”) But it goes much deeper than this… we actually model error-correction learning and observational learning through the different algorithms we employ. To make this point real, we have an amazing guest lecture every year by Dr. Anne Henriksen, who is also a faculty member in the Department of Integrated Science and Technology at JMU. She also does research in neuroscience at the University of Virginia. After we do an exercise where we use a spreadsheet to iteratively determine the equation for a single layer perceptron’s decision boundary, we watch a video by Dr. Mark Gluck that shows how what we’re doing is essentially error-correction learning… and then he explains the chemistry that supports the process. We’re going to videotape Anne’s lecture this fall so you can see it!

Here is the syllabus I am using for Fall 2015. Please feel free to use it (in full or in part) if you are planning a similar class… but do let me know!

3 replies »

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s