Tag Archives: machine learning

Supplier Quality Management: Seeking Test Data

Do you have, or have you had, a supplier selection problem to solve? I have some algorithms I’ve been working on to help you make better decisions about which suppliers to choose, and how to monitor their performance over time. I’d like to test and refine them on real data. If you have data that you’ve used to select suppliers in the past 10 years, are working with such data right now, or have a colleague who may be able to share it, that’s what I’m interested in sourcing.

Because this data can sometimes be proprietary and confidential, feel free to blind the names or identifying information for the suppliers — or I can do this myself (no suppliers, products, or parts will be named when I publish the results). I just need to be able to tell them apart. Tags like Supplier A or Part1SupplierA are fine. I’d prefer if you blinded the data, but I can also write scripts to do this and have you check them before I move forward.

Desired data format is CSV or Excel. Text files are also OK, as long as they clearly identify the criteria that you used for supplier selection. Email me at myfirstname dot mylastname at gmail if you can help out — and maybe I can help you out too! Thanks.

 

Value Propositions for Quality 4.0

In previous articles, we introduced Quality 4.0, the pursuit of performance excellence as an integral part of an organization’s digital transformation. It’s one aspect of Industry 4.0 transformation towards intelligent automation: smart, hyperconnected(*) agents deployed in environments where humans and machines cooperate and leverage data to achieve shared goals.

Automation is a spectrum: an operator can specify a process that a computer or intelligent agent executes, the computer can make decisions for an operator to approve or adjust, or the computer can make and execute all decisions. Similarly, machine intelligence is a spectrum: an algorithm can provide advice, take action with approvals or adjustments, or take action on its own. We have to decide what value is generated when we introduce various degrees of intelligence and automation in our organizations.

How can Quality 4.0 help your organization? How can you improve the performance of your people, projects, products, and entire organizations by implementing technologies like artificial intelligence, machine learning, robotic process automation, and blockchain?

A value proposition is a statement that explains what benefits a product or activity will deliver. Quality 4.0 initiatives have these kinds of value propositions:

  1. Augment (or improve upon) human intelligence
  2. Increase the speed and quality of decision-making
  3. Improve transparency, traceability, and auditability
  4. Anticipate changes, reveal biases, and adapt to new circumstances and knowledge
  5. Evolve relationships and organizational boundaries to reveal opportunities for continuous improvement and new business models
  6. Learn how to learn; cultivate self-awareness and other-awareness as a skill

Quality 4.0 initiatives add intelligence to monitoring and managing operations – for example, predictive maintenance can help you anticipate equipment failures and proactively reduce downtime. They can help you assess supply chain risk on an ongoing basis, or help you decide whether to take corrective action. They can also help you improve cybersecurity: documenting and benchmarking processes can provide a basis for detecting anomalies, and understanding expected performance can help you detect potential attacks.


(*) Hyperconnected = (nearly) always on, (nearly) always accessible.

What is Quality 4.0?

Image Credit: Doug Buckley of http://hyperactive.to

My first post of the year addresses an idea that’s just starting to gain traction – one you’ll hear a lot more about from me in 2018: Quality 4.0.  It’s not a fad or trend, but a reminder that the business environment is changing, and that performance excellence in the future will depend on how well you adapt, change, and transform in response. Although we started building community around this concept at the ASQ Quality 4.0 Summit on Disruption, Innovation, and Change, held in November 2017 in Dallas, the truly revolutionary work is yet to come.

The term “Quality 4.0” comes from “Industry 4.0” – referring to the “fourth industrial revolution” – originally addressed at the Hannover (Germany) Fair in 2011. That meeting emphasized the increasing intelligence and interconnectedness in “smart” manufacturing systems and reflected on the newest technological innovations in historical context.

In the first industrial revolution (late 1700’s), steam and water power made it possible for production facilities to scale up and expanded the potential locations for production. By the late 1800’s, the discovery of electricity and development of associated infrastructure enabled the development of machines for mass production. In the US, the expansion of railways made it easier to obtain supplies and deliver finished goods. The availability of power also sparked a renaissance in computing, and digital computing emerged from its analog ancestor. The third industrial revolution came at the end of the 1960’s, with the invention of the Programmable Logic Controller (PLC). This made it possible to automate processes like filling and reloading tanks of liquids, turning engines on and off, and controlling sequences of events based on changing environmental conditions.

Although the growth and expansion of the internet accelerated innovation in the late 1990’s and 2000’s, we are just now poised for another industrial revolution. What’s changing?

  • Production & Availability of Information: More information is available because people and devices are producing it at greater rates than ever before. Falling costs of enabling technologies like sensors and actuators are catalyzing innovation in these areas.
  • Connectivity: In many cases, and from many locations, that information is instantly accessible over the internet. Improved network infrastructure is expanding the extent of connectivity, making it more widely available and more robust. (And unlike the 80’s and 90’s, there are far fewer communications protocols that are commonly encountered so it’s a lot easier to get one device to talk to another device on your network.)
  • Intelligent Processing: Affordable computing capabilities (and computing power!) are available to process that information so it can be incorporated into decision making. High-performance software libraries for advanced processing and visualization of data are easy to find, and easy to use. (In the past, we had to write our own… now we can use open-source solutions that are battle tested.)
  • New Modes of Interaction: The ways in which we can acquire and interact with information are also changing, in particular through new interfaces like Augmented Reality (AR) and Virtual Reality (VR), which expand possibilities for training and navigating a hybrid physical-digital environment with greater ease.
  • New Modes of Production: 3D printing, nanotechnology, and gene editing (CRISPR) are poised to change the nature and means of production in several industries. Technologies for enhancing human performance (e.g. exoskeletons, brain-computer interfaces, and even autonomous vehicles) will also open up new mechanisms for innovation in production. (Roco & Bainbridge (2002) describe many of these, and their prescience is remarkable.) New technologies like blockchain have the potential to change the nature of production as well, by challenging ingrained perceptions of trust, control, consensus, and value.

If the first industrial revolution was characterized by steam-powered machines, the second was characterized by electricity and assembly lines. Innovations in computing and industrial automation defined the third industrial revolution. The fourth industrial revolution is one of intelligence: smart, hyperconnected cyber-physical systems in environments where humans and machines cooperate to achieve shared goals, and use data to generate value.

These enabling technologies originate in the physical, digital, and biological domains, and include the following:

  • Information
    • Affordable Sensors and Actuators
    • Big Data infrastructure (e.g. MapReduce, Hadoop, NoSQL databases)
  • Connectivity
    • 5G Networks
    • IPv6 Addresses (which expand the number of devices that can be put online)
    • Internet of Things (IoT)
    • Cloud Computing
  • Processing
    • Predictive Analytics
    • Artificial Intelligence
    • Machine Learning (incl. Deep Learning)
    • Data Science
  • Interaction
    • Augmented Reality (AR)
    • Mixed Reality (MR)
    • Virtual Reality (VR)
    • Diminished Reality (DR)
  • Construction
    • 3D Printing
    • Additive Manufacturing
    • Smart Materials
    • Nanotechnology
    • Gene Editing
    • Automated (Software) Code Generation
    • Robotic Process Automation (RPA)
    • Blockchain

Today’s quality profession was born during the middle of the second industrial revolution, when methods were needed to ensure that assembly lines ran smoothly – that they produced artifacts to specifications, that the workers knew how to engage in the process, and that costs were controlled. As industrial production matured, those methods grew to encompass the design of processes which were built to produce to specifications. In the 1980’s and 1990’s, organizations in the US started to recognize the importance of human capabilities and active engagement in quality as essential, and TQM, Lean, and Six Sigma gained in popularity. 

How will these methods evolve in an adaptive, intelligent environment? The question is largely still open, and that’s the essence of Quality 4.0.

Roco, M. C., & Bainbridge, W. S. (2002). Converging technologies for improving human performance: Integrating from the nanoscale. Journal of Nanoparticle Research, 4(4), 281-295. (http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.465.7221&rep=rep1&type=pdf)

A Simple Intro to Q-Learning in R: Floor Plan Navigation

This example is drawn from “A Painless Q-Learning Tutorial” at http://mnemstudio.org/path-finding-q-learning-tutorial.htm which explains how to manually calculate iterations using the updating equation for Q-Learning, based on the Bellman Equation (image from https://www.is.uni-freiburg.de/ressourcen/business-analytics/13_reinforcementlearning.pdf):
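For reference, the standard Q-Learning update can be written as follows, with a learning rate α and a discount rate γ (the alpha and gamma arguments used in the code later in this post). This is a reconstruction in common notation, so it may differ cosmetically from the version in the linked slides:

```latex
Q(s, a) \leftarrow Q(s, a) + \alpha \left[ r(s, a) + \gamma \max_{a'} Q(s', a') - Q(s, a) \right]
```

Here s is the current state, a is the action taken (in this example, the room you move to), s' is the state you end up in, and r(s, a) is the corresponding entry in the rewards matrix.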

The example explores path-finding through a house:

The question to be answered here is: What’s the best way to get from Room 2 to Room 5 (outside)? Notice that by answering this question using reinforcement learning, we will also know how to find optimal routes from any room to outside. And if we run the iterative algorithm again for a new target state, we can find out the optimal route from any room to that new target state.

Since Q-Learning is model-free, we don’t need to know how likely it is that our agent will move between any room and any other room (the transition probabilities). If you had observed the behavior in this system over time, you might be able to find that information, but in many cases it just isn’t available. So the key for this problem is to construct a Rewards Matrix that explains the benefit (or penalty!) of attempting to go from one state (room) to another.

Assigning the rewards is somewhat arbitrary, but you should give a large positive value to your target state and negative values to states that are impossible or highly undesirable. Here’s the guideline we’ll use for this problem:

  • -1 if “you can’t get there from here”
  • 0 if the destination is not the target state
  • 100 if the destination is the target state

We’ll start constructing our rewards matrix by listing the states we’ll come FROM down the rows, and the states we’ll go TO in the columns. First, let’s fill the diagonal with -1 rewards, because we don’t want our agent to stay in the same place (that is, move from Room 1 to Room 1, or from Room 2 to Room 2, and so forth). The final one gets a 100 because if we’re already in Room 5, we want to stay there.

Next, let’s move across the first row. Starting in Room 0, we only have one choice: go to Room 4. All other possibilities are blocked (-1):

Now let’s fill in the row labeled 1. From Room 1, you have two choices: go to Room 3 (which is not great but permissible, so give it a 0) or go to Room 5 (the target, worth 100)!

Continue moving row by row, determining whether you can’t get there from here (-1), you can but it’s not the target (0), or it’s the target (100). You’ll end up with a final rewards matrix that looks like this:

Now create this rewards matrix in R:

R <- matrix(c(-1, -1, -1, -1, 0, -1,
       -1, -1, -1, 0, -1, 100,
       -1, -1, -1, 0, -1, -1, 
       -1, 0, 0, -1, 0, -1,
        0, -1, -1, 0, -1, 100,
       -1, 0, -1, -1, 0, 100), nrow=6, ncol=6, byrow=TRUE)
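As a cross-check on the hand-typed values, the same matrix can be generated from a list of doors plus the three reward rules. This is only a sketch: the doors list is my reading of which rooms connect (inferred from the matrix above), and the names doors, target, n.rooms and R.built are made up for illustration.

```r
# Assumed door list: each row is a door between two rooms (0-based room labels).
doors <- rbind(c(0, 4), c(1, 3), c(1, 5), c(2, 3), c(3, 4), c(4, 5))
target <- 5          # Room 5 is "outside"
n.rooms <- 6

# Start with -1 everywhere ("you can't get there from here").
R.built <- matrix(-1, nrow = n.rooms, ncol = n.rooms)

# A door can be used in both directions; +1 converts room labels to R indices.
for (k in seq_len(nrow(doors))) {
  a <- doors[k, 1] + 1
  b <- doors[k, 2] + 1
  R.built[a, b] <- 0
  R.built[b, a] <- 0
}

# Any permitted move into the target is worth 100, and so is staying there.
into.target <- R.built[, target + 1] == 0
R.built[into.target, target + 1] <- 100
R.built[target + 1, target + 1] <- 100

print(R.built)
```

For a map this small, typing the matrix by hand is quicker, but a door list is easier to audit when the floor plan grows.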

Now run the Q-Learning code. Notice that we’re calling the target state 6 instead of 5: even though we have a room labeled with a zero, matrix indices in R start at 1, so we have to adjust:

source("https://raw.githubusercontent.com/NicoleRadziwill/R-Functions/master/qlearn.R")

results <- q.learn(R,10000,alpha=0.1,gamma=0.8,tgt.state=6) 
> round(results)
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    0    0    0    0   80    0
[2,]    0    0    0   64    0  100
[3,]    0    0    0   64    0    0
[4,]    0   80   51    0   80    0
[5,]   64    0    0   64    0  100
[6,]    0   80    0    0   80  100

You can read this table of average values to obtain policies. A policy is a “path” through the states of the system:

  • Start at Room 0 (first row, labeled 1): Choose Room 4 (80), then from Room 4 choose Room 5 (100)
  • Start at Room 1: Choose Room 5 (100)
  • Start at Room 2: Choose Room 3 (64), from Room 3 choose Room 1 or Room 4 (80); from 1 or 4 choose 5 (100)
  • Start at Room 3: Choose Room 1 or Room 4 (80), then Room 5 (100)
  • Start at Room 4: Choose Room 5 (100)
  • Start at Room 5: Stay at Room 5 (100)
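The numbers 100, 80, 64 and 51 in the table are not arbitrary. Assuming the printed values are scaled so that the largest is 100, each extra step away from the target multiplies the value by the discount rate gamma = 0.8. A quick check (the variable names are just for illustration):

```r
gamma <- 0.8
steps.from.target <- 0:3

# Value of a move that is 0, 1, 2 or 3 steps away from reaching the target.
scaled <- round(100 * gamma^steps.from.target)
print(scaled)
```

This prints 100 80 64 51, which matches the distinct non-zero values in the results table.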

To answer the original problem, we would take route 2-3-1-5 or 2-3-4-5 to get out the quickest if we started in Room 2. This is easy to see with a simple map, but is much more complicated when the maps get bigger.
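For bigger maps you would not want to trace the route by eye. Here is a minimal sketch of a helper that follows the largest value in each row until it reaches the target. The name greedy.path and its arguments are hypothetical (they are not part of qlearn.R), and the Q matrix is copied from the rounded results above.

```r
# Q matrix copied from the rounded results (rows/columns 1-6 = Rooms 0-5).
Q <- matrix(c( 0,  0,  0,  0, 80,   0,
               0,  0,  0, 64,  0, 100,
               0,  0,  0, 64,  0,   0,
               0, 80, 51,  0, 80,   0,
              64,  0,  0, 64,  0, 100,
               0, 80,  0,  0, 80, 100), nrow = 6, ncol = 6, byrow = TRUE)

# Hypothetical helper: follow the largest Q value from each state to the target.
greedy.path <- function(Q, start.room, target.room, max.steps = 20) {
  path <- start.room
  cs <- start.room + 1                 # convert room label to matrix index
  for (step in seq_len(max.steps)) {
    if (cs == target.room + 1) break   # stop once we are at the target
    cs <- which.max(Q[cs, ])           # ties go to the lowest-numbered room
    path <- c(path, cs - 1)            # store the room label, not the index
  }
  path
}

route <- greedy.path(Q, start.room = 2, target.room = 5)
print(route)
```

Starting in Room 2 this returns 2 3 1 5. Because which.max breaks ties in favor of the first maximum, the equally good route 2-3-4-5 is not reported; you would need to handle ties explicitly to list both.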

Reinforcement Learning: Q-Learning with the Hopping Robot

Overview: Reinforcement learning uses “reward” signals to determine how to navigate through a system in the most valuable way. (I’m particularly interested in the variant of reinforcement learning called “Q-Learning” because the goal is to create a “Quality Matrix” that can help you make the best sequence of decisions!) I found a toy robot navigation problem on the web that was solved using custom R code for reinforcement learning, and I wanted to reproduce the solution in different ways than the original author did. This post describes different ways that I solved the problem described at http://bayesianthink.blogspot.com/2014/05/hopping-robots-and-reinforcement.html

The Problem: Our agent, the robot, is placed at random on a board of wood. There’s a hole at s1, a sticky patch at s4, and the robot is trying to make appropriate decisions to navigate to s7 (the target). The image comes from the blog post linked above.

To solve a problem like this, you can use MODEL-BASED approaches if you know how likely it is that the robot will move from one state to another (that is, the transition probabilities for each action) or MODEL-FREE approaches (you don’t know how likely it is that the robot will move from state to state, but you can figure out a reward structure).

  • Markov Decision Process (MDP) – If you know the states, actions, rewards, and transition probabilities (which are probably different for each action), you can determine the optimal policy or “path” through the system, given different starting states. (If transition probabilities have nothing to do with decisions that an agent makes, your MDP reduces to a Markov Chain.)
  • Reinforcement Learning (RL) – If you know the states, actions, and rewards (but not the transition probabilities), you can still take an unsupervised approach. Just randomly create lots of hops through your system, and use them to update a matrix that describes the average value of each hop within the context of the system.

Solving an RL problem involves finding the optimal value functions (e.g. the Q matrix in Attempt 1) or the optimal policy (the State-Action matrix in Attempt 2). Although there are many techniques for reinforcement learning, we will use Q-learning because we don’t know the transition probabilities for each action. (If we did, we’d model it as a Markov Decision Process and use the MDPtoolbox package instead.) Q-Learning relies on traversing the system in many ways to update a matrix of average expected rewards from each state transition. The equation that it uses is from https://www.is.uni-freiburg.de/ressourcen/business-analytics/13_reinforcementlearning.pdf:

For this to work, all states have to be visited a sufficient number of times, and all state-action pairs have to be included in your experience sample. So keep this in mind when you’re trying to figure out how many iterations you need.

Attempt 1: Quick Q-Learning with qlearn.R

  • Input: A rewards matrix R. (That’s all you need! Your states are encoded in the matrix.)
  • Output: A Q matrix from which you can extract optimal policies (or paths) to help you navigate the environment.
  • Pros: Quick and very easy. Cons: Does not let you set epsilon (% of random actions), so all episodes are determined randomly and it may take longer to find a solution. Can take a long time to converge.
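For context on the epsilon limitation: an epsilon-greedy rule picks a random feasible move with probability epsilon and the best-known move otherwise. The sketch below is not part of qlearn.R. The function choose.next and the toy matrices are hypothetical, and it simply reuses the "reward greater than -1" feasibility test from the qlearn code in this post.

```r
# Hypothetical helper: choose the next state with an epsilon-greedy rule.
# R is the rewards matrix, Q the current Q matrix, cs the current state index.
choose.next <- function(Q, R, cs, epsilon = 0.1) {
  feasible <- which(R[cs, ] > -1)              # moves that are allowed from cs
  if (length(feasible) == 1) return(feasible)  # nothing to choose between
  if (runif(1) < epsilon) {
    # Explore: pick any feasible move at random.
    return(feasible[sample.int(length(feasible), 1)])
  }
  # Exploit: pick the feasible move with the highest Q, breaking ties at random.
  best <- feasible[Q[cs, feasible] == max(Q[cs, feasible])]
  best[sample.int(length(best), 1)]
}

# Toy 3-state example: from state 1 you can go to 2 or 3; state 3 is the target.
R.demo <- matrix(c(-1,  0,   0,
                   -1, -1, 100,
                   -1, -1, 100), nrow = 3, byrow = TRUE)
Q.demo <- matrix(0, nrow = 3, ncol = 3)
Q.demo[1, 3] <- 5                              # pretend we already learned this

greedy.move <- choose.next(Q.demo, R.demo, cs = 1, epsilon = 0)
print(greedy.move)
```

With epsilon = 0 the rule is purely greedy, so it moves to state 3 here; with epsilon = 1 every move is random, which is effectively what Attempt 1 does.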

Set up the rewards matrix as a square matrix, with the states listed in order down the rows and in the same order across the columns:

hopper.rewards <- c(-10, 0.01, 0.01, -1, -1, -1, -1,
         -10, -1, 0.1, -3, -1, -1, -1,
         -1, 0.01, -1, -3, 0.01, -1, -1,
         -1, -1, 0.01, -1, 0.01, 0.01, -1,
         -1, -1, -1, -3, -1, 0.01, 100,
         -1, -1, -1, -1, 0.01, -1, 100,
         -1, -1, -1, -1, -1, 0.01, 100)

HOP <- matrix(hopper.rewards, nrow=7, ncol=7, byrow=TRUE) 
> HOP
     [,1]  [,2]  [,3] [,4]  [,5]  [,6] [,7]
[1,]  -10  0.01  0.01   -1 -1.00 -1.00   -1
[2,]  -10 -1.00  0.10   -3 -1.00 -1.00   -1
[3,]   -1  0.01 -1.00   -3  0.01 -1.00   -1
[4,]   -1 -1.00  0.01   -1  0.01  0.01   -1
[5,]   -1 -1.00 -1.00   -3 -1.00  0.01  100
[6,]   -1 -1.00 -1.00   -1  0.01 -1.00  100
[7,]   -1 -1.00 -1.00   -1 -1.00  0.01  100

Here’s how you read this: the rows represent where you’ve come FROM, and the columns represent where you’re going TO. Each element 1 through 7 corresponds directly to S1 through S7 in the cartoon above. Each cell contains a reward (or penalty, if the value is negative) if we arrive in that state.

The S1 state is bad for the robot… there’s a hole in that piece of wood, so we’d really like to keep it away from that state. Location [1,1] on the matrix tells us what reward (or penalty) we’ll receive if we start at S1 and stay at S1: -10 (that’s bad). Similarly, location [2,1] on the matrix tells us that if we start at S2 and move left to S1, that’s also bad and we should receive a penalty of -10. The S4 state is also undesirable – there’s a sticky patch there, so we’d like to keep the robot away from it. Location [3,4] on the matrix represents the action of going from S3 to S4 by moving right, which will put us on the sticky patch.

Now load the qlearn command into your R session:

qlearn <- function(R, N, alpha, gamma, tgt.state) {
# Adapted from https://stackoverflow.com/questions/39353580/how-to-implement-q-learning-in-r
  Q <- matrix(rep(0,length(R)), nrow=nrow(R))
  for (i in 1:N) {
    cs <- sample(1:nrow(R), 1)
    while (1) {
      next.states <- which(R[cs,] > -1)  # Get feasible actions for cur state
      if (length(next.states)==1)        # There may only be one possibility
        ns <- next.states
      else
        ns <- sample(next.states,1) # Or you may have to pick from a few 
      if (ns > nrow(R)) { ns <- cs }
      # NOW UPDATE THE Q-MATRIX
      Q[cs,ns] <- Q[cs,ns] + alpha*(R[cs,ns] + gamma*max(Q[ns, which(R[ns,] > -1)]) - Q[cs,ns])
      if (ns == tgt.state) break
      cs <- ns
    }
  }
  return(round(100*Q/max(Q)))
}

Run qlearn with the HOP rewards matrix, a learning rate of 0.1, a discount rate of 0.8, and a target state of S7 (the location to the far right of the wooden board). I did 10,000 episodes (in each one, the robot is dropped randomly onto the wooden board and has to get to S7):

r.hop <- qlearn(HOP,10000,alpha=0.1,gamma=0.8,tgt.state=7) 
> r.hop
     [,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,]    0   51   64    0    0    0    0
[2,]    0    0   64    0    0    0    0
[3,]    0   51    0    0   80    0    0
[4,]    0    0   64    0   80   80    0
[5,]    0    0    0    0    0   80  100
[6,]    0    0    0    0   80    0  100
[7,]    0    0    0    0    0   80  100

The Q-Matrix that is presented encodes the best-value solutions from each state (the “policy”). Here’s how you read it:

  • If you’re at s1 (first row), hop to s3 (biggest value in first row), then hop to s5 (go to row 3 and find biggest value), then hop to s7 (go to row 5 and find biggest value)
  • If you’re at s2, go right to s3, then hop to s5, then hop to s7
  • If you’re at s3, hop to s5, then hop to s7
  • If you’re at s4, go right to s5 OR hop to s6, then go right to s7
  • If you’re at s5, hop to s7
  • If you’re at s6, go right to s7
  • If you’re at s7, stay there (when you’re in the target state, the value function will not be able to pick out a “best action” because the best action is to do nothing)

Alternatively, the policy can be expressed as the best action from each of the 7 states: HOP, RIGHT, HOP, RIGHT, HOP, RIGHT, (STAY PUT)
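The action list can also be read off the Q-Matrix programmatically. This sketch assumes the action encoding implied by the board (left moves one state down, right moves one state up, hop moves two states up); the names Q.hop, best.dest, offset and action.labels are illustrative only.

```r
# Q matrix copied from the r.hop output above.
Q.hop <- matrix(c(0, 51, 64,  0,  0,  0,   0,
                  0,  0, 64,  0,  0,  0,   0,
                  0, 51,  0,  0, 80,  0,   0,
                  0,  0, 64,  0, 80, 80,   0,
                  0,  0,  0,  0,  0, 80, 100,
                  0,  0,  0,  0, 80,  0, 100,
                  0,  0,  0,  0,  0, 80, 100), nrow = 7, ncol = 7, byrow = TRUE)

# Best destination from each state; ties go to the lowest-numbered state.
best.dest <- apply(Q.hop, 1, which.max)

# How far the best destination is from the current state.
offset <- best.dest - seq_len(nrow(Q.hop))

# Assumed action encoding: -1 = left, 0 = stay, +1 = right, +2 = hop.
action.labels <- c("-1" = "LEFT", "0" = "STAY PUT", "1" = "RIGHT", "2" = "HOP")
policy <- unname(action.labels[as.character(offset)])
print(policy)
```

This reproduces HOP, RIGHT, HOP, RIGHT, HOP, RIGHT, STAY PUT. At s4 the tie between s5 and s6 is resolved in favor of s5 (RIGHT) only because which.max returns the first maximum.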

Attempt 2: Use ReinforcementLearning Package

I also used the ReinforcementLearning package by Nicolas Proellochs (6/19/2017) described in https://cran.r-project.org/web/packages/ReinforcementLearning/ReinforcementLearning.pdf.

  • Input: 1) a definition of the environment, 2) a list of states, 3) a list of actions, and 4) control parameters alpha (the learning rate; usually 0.1), gamma (the discount rate which describes how important future rewards are; often 0.9 indicating that 90% of the next reward will be taken into account), and epsilon (the probability that you’ll try a random action; often 0.1)
  • Output: A State-Action Value matrix, which attaches a number to how good it is to be in a particular state and take an action. You can use it to determine the highest value action from each state. (It contains the same information as the Q-matrix from Attempt 1, but you don’t have to infer the action from the destination it brings you to.)
  • Pros: Relatively straightforward. Allows you to specify epsilon, which controls the proportion of random actions you’ll explore as you create episodes and explore your environment. Cons: Requires manual setup of all state transitions and associated rewards.

First, I created an “environment” that describes 1) how the states will change when actions are taken, and 2) what rewards will be accrued when that happens. I assigned a reward of -1 to all actions that are not special, i.e. those that do not land on S1, S4, or S7. To be perfectly consistent with Attempt 1, I could have used 0.01 instead of -1, but the results will be similar. The values you choose for rewards are sort of arbitrary, but you do need to make sure there’s a comparatively large positive reward at your target state and “negative rewards” for states you want to avoid or are physically impossible.

my.env <- function(state,action) {
   next_state <- state
   reward <- -1   # default reward for any move that is not special

   if (state == state("s1") && action == "right")  { next_state <- state("s2") }
   if (state == state("s1") && action == "hop")    { next_state <- state("s3") }

   # Landing in the hole at s1 costs -10; landing on the sticky patch at s4 costs -3
   if (state == state("s2") && action == "left")  {
      next_state <- state("s1"); reward <- -10 }
   if (state == state("s2") && action == "right") { next_state <- state("s3") }
   if (state == state("s2") && action == "hop")   {
      next_state <- state("s4"); reward <- -3 }

   if (state == state("s3") && action == "left")  { next_state <- state("s2") }
   if (state == state("s3") && action == "right") {
      next_state <- state("s4"); reward <- -3 }
   if (state == state("s3") && action == "hop")   { next_state <- state("s5") }

   if (state == state("s4") && action == "left")  { next_state <- state("s3") }
   if (state == state("s4") && action == "right") { next_state <- state("s5") }
   if (state == state("s4") && action == "hop")   { next_state <- state("s6") }

   if (state == state("s5") && action == "left")  {
      next_state <- state("s4"); reward <- -3 }
   if (state == state("s5") && action == "right") { next_state <- state("s6") }
   # Arriving at the target s7 earns +10
   if (state == state("s5") && action == "hop")   {
      next_state <- state("s7"); reward <- 10 }

   if (state == state("s6") && action == "left")  { next_state <- state("s5") }
   if (state == state("s6") && action == "right") {
      next_state <- state("s7"); reward <- 10 }

   # Any state/action pair not listed above leaves the robot where it is (reward -1)
   out <- list(NextState = next_state, Reward = reward)
   return(out)
}

Next, I installed and loaded up the ReinforcementLearning package and ran the RL simulation:

install.packages("ReinforcementLearning")
library(ReinforcementLearning)
states <- c("s1", "s2", "s3", "s4", "s5", "s6", "s7")
actions <- c("left","right","hop")
data <- sampleExperience(N=3000,env=my.env,states=states,actions=actions)
control <- list(alpha = 0.1, gamma = 0.8, epsilon = 0.1)
model <- ReinforcementLearning(data, s = "State", a = "Action", r = "Reward", 
      s_new = "NextState", control = control)

Now we can see the results:

> print(model)
State-Action function Q
         hop     right      left
s1  2.456741  1.022440  1.035193
s2  2.441032  2.452331  1.054154
s3  4.233166  2.469494  1.048073
s4  4.179853  4.221801  2.422842
s5  6.397159  4.175642  2.456108
s6  4.217752  6.410110  4.223972
s7 -4.602003 -4.593739 -4.591626

Policy
     s1      s2      s3      s4      s5      s6      s7
  "hop" "right"   "hop" "right"   "hop" "right"  "left" 

Reward (last iteration)
[1] 223

The recommended policy is: HOP, RIGHT, HOP, RIGHT, HOP, RIGHT, (STAY PUT)

If you tried this example and it didn’t produce the same response, don’t worry! Model-free reinforcement learning is done by simulation, and when you used the sampleExperience function, you generated a different set of state transitions to learn from. You may need more samples, or to tweak your rewards structure, or both.

A Newbie’s Install of Keras & Tensorflow on Windows 10 with R

This weekend, I decided it was time: I was going to update my Python environment and get Keras and Tensorflow installed so I could start doing tutorials (particularly for deep learning) using R. Although I used to be a systems administrator (about 20 years ago), I don’t do much installing or configuring so I guess that’s why I’ve put this task off for so long. And it wasn’t unwarranted: it took me the whole weekend to get the install working. Here are the steps I used to get things running on Windows 10, leveraging clues in about 15 different online resources. And yes (I found out the hard way), the order of operations is very important. I do not claim to have nailed the best order of operations here, but this is definitely one that works.

Step 0: I had already installed the tensorflow and keras packages within R, and had been wondering why they wouldn’t work. “Of course!” I finally realized, a few weeks later. “I don’t have Python on this machine, and both of these packages depend on a Python install.” Turns out they also depend on the reticulate package, so install.packages("reticulate") if you have not already.

Step 1: Installed Anaconda3 to C:/Users/User/Anaconda3 (from https://www.anaconda.com/download/)

Step 2: Opened “Anaconda Prompt” from Windows Start Menu. First, to create an “environment” specifically for use with tensorflow and keras in R called “tf-keras” with a 64-bit version of Python 3.5 I typed:

conda create -n tf-keras python=3.5 anaconda

… and then after it was done, I did this:

activate tf-keras

Step 3: Install TensorFlow from Anaconda prompt. Pointing pip at the wheel file https://storage.googleapis.com/tensorflow/windows/cpu/tensorflow-1.1.0-cp35-cp35m-win_amd64.whl I typed this:

pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/windows/cpu/tensorflow-1.1.0-cp35-cp35m-win_amd64.whl

I didn’t know whether this worked or not — it gave me an error saying that it “can not import html5lib”, so I did this next:

conda install -c conda-forge html5lib

I tried to run the pip command again, but there was an error so I consulted https://www.tensorflow.org/install/install_windows. It told me to do this:

pip install --ignore-installed --upgrade tensorflow

This failed, and told me that the pip command had an error. I searched the web for an alternative to that command, and found this, which worked!!

conda install -c conda-forge tensorflow

 

Step 4: From inside the Anaconda prompt, I opened python by typing “python”. Next, I did this, line by line:

import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))

It said “b'Hello, TensorFlow!'”, which means that my Python installation of TensorFlow was functional. (Ctrl-Z then Enter will then get you out of Python and back to the Anaconda prompt.)

Step 5: Install Keras. I tried this:

pip install keras

…but I got the same error message that pip could not be installed or found or imported or something. So I tried this, which seemed to work:

conda install -c conda-forge keras

 

Step 6: Load them up from within R. First, I opened a 64-bit version of R v3.4.1 and did this:

library(tensorflow)
install_tensorflow(conda="tf-keras")

It took a couple minutes but it seemed to work.

library(keras)

 

Step 7: Try a tutorial! I decided to go for https://www.analyticsvidhya.com/blog/2017/06/getting-started-with-deep-learning-using-keras-in-r/ which guides you through developing a classifier for the MNIST handwritten image database — a very popular data science resource. I loaded up my dataset and checked to make sure it loaded properly:

data <- data_mnist()
str(data)
List of 2
 $ train:List of 2
 ..$ x: int [1:60000, 1:28, 1:28] 0 0 0 0 0 0 0 0 0 0 ...
 ..$ y: int [1:60000(1d)] 5 0 4 1 9 2 1 3 1 4 ...
 $ test :List of 2
 ..$ x: int [1:10000, 1:28, 1:28] 0 0 0 0 0 0 0 0 0 0 ...
 ..$ y: int [1:10000(1d)] 7 2 1 0 4 1 4 9 5 9 ...

 

Step 8: Here is the code I used to prepare the data and create the neural network model. This didn’t take long to run at all.

train_x <- data$train$x
train_y <- data$train$y
test_x <- data$test$x
test_y <- data$test$y

# Flatten each 28 x 28 image into a 784-element row and rescale pixels to [0, 1]
train_x <- array(train_x, dim = c(dim(train_x)[1], prod(dim(train_x)[-1]))) / 255

test_x <- array(test_x, dim = c(dim(test_x)[1], prod(dim(test_x)[-1]))) / 255

# One-hot encode the digit labels (10 classes)
train_y <- to_categorical(train_y, 10)
test_y <- to_categorical(test_y, 10)

# Start from an empty sequential model, then add layers to it
model <- keras_model_sequential()

model %>%
  layer_dense(units = 784, input_shape = 784) %>%
  layer_dropout(rate = 0.4) %>%
  layer_activation(activation = 'relu') %>%
  layer_dense(units = 10) %>%
  layer_activation(activation = 'softmax')

model %>% compile(
  loss = 'categorical_crossentropy',
  optimizer = 'adam',
  metrics = c('accuracy')
)
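If the reshaping and one-hot steps feel opaque, they can be tried on a toy array in base R, with no Keras install required. This is only an illustration: one.hot is a hand-rolled stand-in I wrote for this sketch, not the to_categorical function from the keras package.

```r
# Toy stand-in for the MNIST array: 2 "images" of 3 x 4 pixels.
x <- array(1:24, dim = c(2, 3, 4))

# Same reshaping idiom as in the post: keep the first dimension, flatten the rest.
flat <- array(x, dim = c(dim(x)[1], prod(dim(x)[-1])))
print(dim(flat))                 # 2 rows (images), 12 columns (pixels)

# Hand-rolled one-hot encoding for labels 0-9 (illustrative only).
one.hot <- function(labels, num.classes = 10) {
  out <- matrix(0, nrow = length(labels), ncol = num.classes)
  # +1 because label 0 belongs in column 1 (R indexes from 1).
  out[cbind(seq_along(labels), labels + 1)] <- 1
  out
}

y <- one.hot(c(5, 0, 4))
print(y)
```

Each row of flat holds one image's pixels in column-major order, and each row of y has a single 1 in the column for that digit.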

 

Step 9: Train the network. THIS TOOK ABOUT 12 MINUTES on a powerful machine with 64GB high-performance RAM. It looks like it worked, but I don’t know how to find or evaluate the results yet.

model %>% fit(train_x, train_y, epochs = 100, batch_size = 128)
loss_and_metrics <- model %>% evaluate(test_x, test_y, batch_size = 128)

str(model)
Model
___________________________________________________________________________________
Layer (type) Output Shape Param #
===================================================================================
dense_1 (Dense) (None, 784) 615440
___________________________________________________________________________________
dropout_1 (Dropout) (None, 784) 0
___________________________________________________________________________________
activation_1 (Activation) (None, 784) 0
___________________________________________________________________________________
dense_2 (Dense) (None, 10) 7850
___________________________________________________________________________________
activation_2 (Activation) (None, 10) 0
===================================================================================
Total params: 623,290
Trainable params: 623,290
Non-trainable params: 0
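
The Param # column can be checked by hand: a dense layer has one weight per input-output pair plus one bias per output unit, while the dropout and activation layers add none. A quick arithmetic check in base R (dense_params is a hypothetical helper name):

```r
# Parameters in a fully connected layer: weights (inputs x units) + biases (units).
dense_params <- function(n_inputs, n_units) n_inputs * n_units + n_units

dense_1 <- dense_params(784, 784)   # 784 input pixels -> 784 hidden units
dense_2 <- dense_params(784, 10)    # 784 hidden units -> 10 digit classes

c(dense_1 = dense_1, dense_2 = dense_2, total = dense_1 + dense_2)
```

The three values are 615440, 7850, and 623290, matching the summary above.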

 

Step 10: Next, I wanted to try the tutorial at https://cran.r-project.org/web/packages/kerasR/vignettes/introduction.html. Turns out this uses the kerasR package, not the keras package:

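library(kerasR)

# load the MNIST data first; the lines below read from this object
mnist <- load_mnist()
X_train <- mnist$X_train
Y_train <- mnist$Y_train
X_test <- mnist$X_test
Y_test <- mnist$Y_test

dim(X_train)
[1] 60000 28 28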

X_train <- array(X_train, dim = c(dim(X_train)[1], prod(dim(X_train)[-1]))) / 255
X_test <- array(X_test, dim = c(dim(X_test)[1], prod(dim(X_test)[-1]))) / 255

To check what’s in any individual image, type this before the reshaping step above, while X_train is still a 60000 x 28 x 28 array:

image(X_train[1,,])
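
One thing to expect from image(): it puts matrix rows along the x axis and columns along the y axis, so a picture stored row by row comes out rotated 90 degrees counter-clockwise. A common base R idiom is to reverse the row order and transpose before plotting. A minimal sketch on a made-up 3 x 3 matrix, assuming each image is stored row by row (upright is a hypothetical helper name):

```r
# Made-up 3 x 3 "picture": a bright bar along the top row.
m <- matrix(c(1, 1, 1,
              0, 0, 0,
              0, 0, 0), nrow = 3, byrow = TRUE)

# Reverse the row order, then transpose, so image() shows the matrix the way it prints.
upright <- function(mat) t(mat[nrow(mat):1, , drop = FALSE])

z <- upright(m)
z[, 3]   # the original top row now sits in the last column, which image() draws at the top

# Only draw when running interactively, so the sketch also works in a batch script.
if (interactive()) {
  image(z, col = gray.colors(2), axes = FALSE)
}
```

For a digit, the equivalent call would be image(upright(X_train[1, , ])), run while X_train is still three-dimensional.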

At this point, the to_categorical function stopped working. I was supposed to do this but got an error:

Y_train <- to_categorical(mnist$Y_train, 10)

So I did this instead:

mm <- model.matrix(~ Y_train)

Y_train <- to_categorical(mm[,2])
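
A note on this workaround, assuming Y_train is a plain numeric vector of digit labels: model.matrix(~ Y_train) treats the labels as a single numeric predictor, so its second column is just the labels again. Wrapping the labels in factor() is one base R way to get a one-hot matrix without to_categorical. A minimal sketch on made-up labels (y, mm_numeric, and one_hot are hypothetical names):

```r
y <- c(2, 0, 1, 1, 0)   # made-up labels for 5 images, 3 classes (0, 1, 2)

# Numeric labels: the design matrix is an intercept column plus the labels themselves.
mm_numeric <- model.matrix(~ y)
all(mm_numeric[, 2] == y)

# Factor labels, no intercept: one indicator column per class.
one_hot <- model.matrix(~ factor(y, levels = 0:2) - 1)
colnames(one_hot) <- 0:2
one_hot

# Each row has a single 1, in the column of its label.
all(rowSums(one_hot) == 1)
all(max.col(one_hot) - 1 == y)
```

For MNIST the levels would be 0:9, giving one column per digit.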

mod <- Sequential()  # THIS IS THE EXCITING PART WHERE YOU USE KERAS!! :)

But then I tried this, and it was clear I was stuck again — it wouldn’t work:

mod$add(Dense(units = 512, input_shape = dim(X_train)[2]))

Stack Overflow recommended grabbing a version of kerasR from GitHub, so that’s what I did next:

install.packages("devtools")
library(devtools)
devtools::install_github("statsmaths/kerasR")
library(kerasR)

I got an error in R which told me to go to the Anaconda prompt (which I did), and type this:

conda install m2w64-toolchain

Then I went back into R and this worked fantastically:

mod <- Sequential()

mod$add would still not work though, and this is where my patience expired for the evening. I’m pretty happy though: Python is up, keras and tensorflow are up on Python, all three (keras, tensorflow, and kerasR) are up in R, and some tutorials seem to be working.

The Best Book Ever on Machine Learning (and Intelligent Systems) in R

Dear Brett (Lantz),

In short: your book, Machine Learning with R, is the book I’ve been dreaming about for years. Everyone who applies machine learning techniques for their work, teaches applied machine learning at a university, or just loves R and wants to know more about these super cool algorithms should buy and use your book.

I’ve been teaching a course called “Intelligent Systems” (ISAT/CS 344 at JMU) for the past few years. I inherited a syllabus and course description from professors who had taught the course from the mid-1990s until 2009, so I started out following their lead and broadly covering expert/knowledge-based systems, simple neural networks for regression, and some elements of robotics. We used a commercial package to build the expert systems (rather than a declarative language like Prolog), which was fine, but we also used a commercial package for the neural networks. I was unsatisfied for two reasons: first, I knew that far more “stuff” was going on in the world of intelligent systems which we weren’t sharing with our students, and second, I knew there were tons of free packages in the R Statistical Software that could perform the same tasks… and more. I started a yearlong process of soul-searching and creating new materials… determined to bring R to the classroom, along with neural networks for both classification and regression, classification using k-nearest neighbors and Naive Bayes approaches, clustering with k-means, and some text mining and analysis to show students what you could do with unstructured data.

I also wanted to compare and contrast neural network regression with simple linear regression, classification algorithms in general with logistic regression, and share how to evaluate and improve model performance using metrics like precision, recall, and F1. (I mean, who cares about developing an intelligent software system if you can’t evaluate and continually improve its performance?) In addition, I’ve dreamed about adding a module on decision trees, in particular focusing on the C5.0 algorithm. But I haven’t found the time to explore or create new course materials on this topic. So I knew it would be even harder to compile all of my course materials into a book for my students to reference.

But you, in the meantime, have saved my life. I’ve explored tons of books on machine learning and intelligent systems that focus more on the practical applications of the techniques rather than the theory… and I have not found one that meets my standards, until now. In a friendly and conversational manner (that’s not overfriendly, condescending, or flippant) you have managed to cover pretty much all of the topics I want to share in my intelligent systems class — in a way that I’m comfortable with.

Chapters 1 (Introduction to Machine Learning) and 2 (Managing and Understanding Data) provide a great, simplified introduction to what machine learning is all about and highlight the data structures and R commands that might be the most useful for these purposes. Chapters 3 and 4 cover classification… first with k-nearest neighbors, then with Naive Bayes. Chapter 5 covers decision trees and C5.0. Chapter 6 covers regression in general, but with applications to decision trees (yeah!). In Chapter 7 (Black Box Methods – Neural Networks and Support Vector Machines) there’s a great example based on Optical Character Recognition (which will pair nicely with the lab exercise I already use). Chapter 8 covers Apriori, Chapter 9 introduces clustering with k-means, and Chapters 10 and 11 specifically deal with evaluating and improving model performance.

As a cherry on top of the cake that is this book, Chapter 12 provides an overview of the most-used ways to acquire data (e.g. using RCurl, XML, and JSON) and even introduces parallel computing.

I am eternally grateful to you for writing the book that’s been in my head in a way I (think I!) would have written it. It’s not PERFECT (I would have spent more time on concepts like overfitting, and maybe given examples… and maybe some prose on the Turing Test and Reverse Turing Test) — but I can easily use your book as a required text and then provide supplemental materials on the side.

Thank you Brett!

Sincerely and with a world of gratitude,

Nicole
