Reinforcement Learning: Q-Learning with the Hopping Robot

Overview: Reinforcement learning uses “reward” signals to determine how to navigate through a system in the most valuable way. (I’m particularly interested in the variant of reinforcement learning called “Q-Learning” because the goal is to create a “Quality Matrix” that can help you make the best sequence of decisions!) I found a toy robot navigation problem on the web that was solved using custom R code for reinforcement learning, and I wanted to reproduce the solution in different ways than the original author did. This post describes different ways that I solved the problem described at http://bayesianthink.blogspot.com/2014/05/hopping-robots-and-reinforcement.html

The Problem: Our agent, the robot, is placed at random on a board of wood. There’s a hole at s1, a sticky patch at s4, and the robot is trying to make appropriate decisions to navigate to s7 (the target). The image comes from the blog post linked above.

To solve a problem like this, you can use MODEL-BASED approaches if you know how likely it is that the robot will move from one state to another (that is, the transition probabilities for each action) or MODEL-FREE approaches (you don’t know how likely it is that the robot will move from state to state, but you can figure out a reward structure).

  • Markov Decision Process (MDP) – If you know the states, actions, rewards, and transition probabilities (which are probably different for each action), you can determine the optimal policy or “path” through the system, given different starting states. (If transition probabilities have nothing to do with decisions that an agent makes, your MDP reduces to a Markov Chain.)
  • Reinforcement Learning (RL) – If you know the states, actions, and rewards (but not the transition probabilities), you can still take an unsupervised approach. Just randomly create lots of hops through your system, and use them to update a matrix that describes the average value of each hop within the context of the system.

Solving a RL problem involves finding the optimal value functions (e.g. the Q matrix in Attempt 1) or the optimal policy (the State-Action matrix in Attempt 2). Although there are many techniques for reinforcement learning, we will use Q-learning because we don’t know the transition probabilities for each action. (If we did, we’d model it as a Markov Decision Process and use the MDPtoolbox package instead.) Q-Learning relies on traversing the system in many ways to update a matrix of average expected rewards from each state transition. This equation that it uses is from https://www.is.uni-freiburg.de/ressourcen/business-analytics/13_reinforcementlearning.pdf:

For this to work, all states have to be visited a sufficient number of times, and all state-action pairs have to be included in your experience sample. So keep this in mind when you’re trying to figure out how many iterations you need.

Attempt 1: Quick Q-Learning with qlearn.R

  • Input: A rewards matrix R. (That’s all you need! Your states are encoded in the matrix.)
  • Output: A Q matrix from which you can extract optimal policies (or paths) to help you navigate the environment.
  • Pros: Quick and very easy. Cons: Does not let you set epsilon (% of random actions), so all episodes are determined randomly and it may take longer to find a solution. Can take a long time to converge.

Set up the rewards matrix so it is a square matrix with all the states down the rows, starting with the first and all the states along the columns, starting with the first:

hopper.rewards <- c(-10, 0.01, 0.01, -1, -1, -1, -1,
         -10, -1, 0.1, -3, -1, -1, -1,
         -1, 0.01, -1, -3, 0.01, -1, -1,
         -1, -1, 0.01, -1, 0.01, 0.01, -1,
         -1, -1, -1, -3, -1, 0.01, 100,
         -1, -1, -1, -1, 0.01, -1, 100,
         -1, -1, -1, -1, -1, 0.01, 100)

HOP <- matrix(hopper.rewards, nrow=7, ncol=7, byrow=TRUE) 
> HOP
     [,1]  [,2]  [,3] [,4]  [,5]  [,6] [,7]
[1,]  -10  0.01  0.01   -1 -1.00 -1.00   -1
[2,]  -10 -1.00  0.10   -3 -1.00 -1.00   -1
[3,]   -1  0.01 -1.00   -3  0.01 -1.00   -1
[4,]   -1 -1.00  0.01   -1  0.01  0.01   -1
[5,]   -1 -1.00 -1.00   -3 -1.00  0.01  100
[6,]   -1 -1.00 -1.00   -1  0.01 -1.00  100
[7,]   -1 -1.00 -1.00   -1 -1.00  0.01  100

Here’s how you read this: the rows represent where you’ve come FROM, and the columns represent where you’re going TO. Each element 1 through 7 corresponds directly to S1 through S7 in the cartoon above. Each cell contains a reward (or penalty, if the value is negative) if we arrive in that state.

The S1 state is bad for the robot… there’s a hole in that piece of wood, so we’d really like to keep it away from that state. Location [1,1] on the matrix tells us what reward (or penalty) we’ll receive if we start at S1 and stay at S1: -10 (that’s bad). Similarly, location [2,1] on the matrix tells us that if we start at S2 and move left to S1, that’s also bad and we should receive a penalty of -10. The S4 state is also undesirable – there’s a sticky patch there, so we’d like to keep the robot away from it. Location [3,4] on the matrix represents the action of going from S3 to S4 by moving right, which will put us on the sticky patch

Now load the qlearn command into your R session:

qlearn <- function(R, N, alpha, gamma, tgt.state) {
# Adapted from https://stackoverflow.com/questions/39353580/how-to-implement-q-learning-in-r
  Q <- matrix(rep(0,length(R)), nrow=nrow(R))
  for (i in 1:N) {
    cs <- sample(1:nrow(R), 1)
    while (1) {
      next.states <- which(R[cs,] > -1)  # Get feasible actions for cur state
      if (length(next.states)==1)        # There may only be one possibility
        ns <- next.states
      else
        ns <- sample(next.states,1) # Or you may have to pick from a few 
      if (ns > nrow(R)) { ns <- cs }
      # NOW UPDATE THE Q-MATRIX
      Q[cs,ns] <- Q[cs,ns] + alpha*(R[cs,ns] + gamma*max(Q[ns, which(R[ns,] > -1)]) - Q[cs,ns])
      if (ns == tgt.state) break
      cs <- ns
    }
  }
  return(round(100*Q/max(Q)))
}

Run qlearn with the HOP rewards matrix, a learning rate of 0.1, a discount rate of 0.8, and a target state of S7 (the location to the far right of the wooden board). I did 10,000 episodes (where in each one, the robot dropped randomly onto the wooden board and has to get to S7):

r.hop <- qlearn(HOP,10000,alpha=0.1,gamma=0.8,tgt.state=7) 
> r.hop
     [,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,]    0   51   64    0    0    0    0
[2,]    0    0   64    0    0    0    0
[3,]    0   51    0    0   80    0    0
[4,]    0    0   64    0   80   80    0
[5,]    0    0    0    0    0   80  100
[6,]    0    0    0    0   80    0  100
[7,]    0    0    0    0    0   80  100

The Q-Matrix that is presented encodes the best-value solutions from each state (the “policy”). Here’s how you read it:

  • If you’re at s1 (first row), hop to s3 (biggest value in first row), then hop to s5 (go to row 3 and find biggest value), then hop to s7 (go to row 5 and find biggest value)
  • If you’re at s2, go right to s3, then hop to s5, then hop to s7
  • If you’re at s3, hop to s5, then hop to s7
  • If you’re at s4, go right to s5 OR hop to s6, then go right to s7
  • If you’re at s5, hop to s7
  • If you’re at s6, go right to s7
  • If you’re at s7, stay there (when you’re in the target state, the value function will not be able to pick out a “best action” because the best action is to do nothing)

Alternatively, the policy can be expressed as the best action from each of the 7 states: HOP, RIGHT, HOP, RIGHT, HOP, RIGHT, (STAY PUT)

Attempt 2: Use ReinforcementLearning Package

I also used the ReinforcementLearning package by Nicholas Proellochs (6/19/2017) described in https://cran.r-project.org/web/packages/ReinforcementLearning/ReinforcementLearning.pdf.

  • Input: 1) a definition of the environment, 2) a list of states, 3) a list of actions, and 4) control parameters alpha (the learning rate; usually 0.1), gamma (the discount rate which describes how important future rewards are; often 0.9 indicating that 90% of the next reward will be taken into account), and epsilon (the probability that you’ll try a random action; often 0.1)
  • Output: A State-Action Value matrix, which attaches a number to how good it is to be in a particular state and take an action. You can use it to determine the highest value action from each state. (It contains the same information as the Q-matrix from Attempt 1, but you don’t have to infer the action from the destination it brings you to.)
  • Pros: Relatively straightforward. Allows you to specify epsilon, which controls the proportion of random actions you’ll explore as you create episodes and explore your environment. Cons: Requires manual setup of all state transitions and associated rewards.

First, I created an “environment” that describes 1) how the states will change when actions are taken, and 2) what rewards will be accrued when that happens. I assigned a reward of -1 to all actions that are not special, e.g. landing on S1, landing on S4, or landing on S7. To be perfectly consistent with Attempt 1, I could have used 0.01 instead of -1, but the results will be similar. The values you choose for rewards are sort of arbitrary, but you do need to make sure there’s a comparatively large positive reward at your target state and “negative rewards” for states you want to avoid or are physically impossible.

my.env <- function(state,action) {
   next_state <- state
   if (state == state("s1") && action == "right")  { next_state <- state("s2") }
   if (state == state("s1") && action == "hop")    { next_state <- state("s3") }

   if (state == state("s2") && action == "left")  {
	next_state <- state("s1"); reward <- -10 }
   if (state == state("s2") && action == "right") { next_state <- state("s3") }
   if (state == state("s2") && action == "hop")   {
	next_state <- state("s4"); reward <- -3 }

   if (state == state("s3") && action == "left")  { next_state <- state("s2") }
   if (state == state("s3") && action == "right") {
	next_state <- state("s4"); reward <- -3 }
   if (state == state("s3") && action == "hop")   { next_state <- state("s5") }

   if (state == state("s4") && action == "left")  { next_state <- state("s3") }
   if (state == state("s4") && action == "right") { next_state <- state("s5") }
   if (state == state("s4") && action == "hop")   { next_state <- state("s6") }

   if (state == state("s5") && action == "left")  {
	next_state <- state("s4"); reward <- -3 }
   if (state == state("s5") && action == "right") { next_state <- state("s6") }
   if (state == state("s5") && action == "hop")   {
	next_state <- state("s7"); reward <- 10 }

   if (state == state("s6") && action == "left")  { next_state <- state("s5") }
   if (state == state("s6") && action == "right") {
	next_state <- state("s7"); reward <- 10 }

   if (next_state == state("s7") && state != state("s7")) {
        reward <- 10
   } else {
	reward <- -1
   }
   out <- list(NextState = next_state, Reward = reward)
   return(out)
}

Next, I installed and loaded up the ReinforcementLearning package and ran the RL simulation:

install.packages("ReinforcementLearning")
library(ReinforcementLearning)
states <- c("s1", "s2", "s3", "s4", "s5", "s6", "s7")
actions <- c("left","right","hop")
data <- sampleExperience(N=3000,env=my.env,states=states,actions=actions)
control <- list(alpha = 0.1, gamma = 0.8, epsilon = 0.1)
model <- ReinforcementLearning(data, s = "State", a = "Action", r = "Reward", 
      s_new = "NextState", control = control)

Now we can see the results:

> print(model)
State-Action function Q
         hop     right      left
s1  2.456741  1.022440  1.035193
s2  2.441032  2.452331  1.054154
s3  4.233166  2.469494  1.048073
s4  4.179853  4.221801  2.422842
s5  6.397159  4.175642  2.456108
s6  4.217752  6.410110  4.223972
s7 -4.602003 -4.593739 -4.591626

Policy
     s1      s2      s3      s4      s5      s6      s7
  "hop" "right"   "hop" "right"   "hop" "right"  "left" 

Reward (last iteration)
[1] 223

The recommended policy is: HOP, RIGHT, HOP, RIGHT, HOP, RIGHT, (STAY PUT)

If you tried this example and it didn’t produce the same response, don’t worry! Model-free reinforcement learning is done by simulation, and when you used the sampleExperience function, you generated a different set of state transitions to learn from. You may need more samples, or to tweak your rewards structure, or both.)

A Newbie’s Install of Keras & Tensorflow on Windows 10 with R

This weekend, I decided it was time: I was going to update my Python environment and get Keras and Tensorflow installed so I could start doing tutorials (particularly for deep learning) using R. Although I used to be a systems administrator (about 20 years ago), I don’t do much installing or configuring so I guess that’s why I’ve put this task off for so long. And it wasn’t unwarranted: it took me the whole weekend to get the install working. Here are the steps I used to get things running on Windows 10, leveraging clues in about 15 different online resources — and yes (I found out the hard way), the order of operations is very important. I do not claim to have nailed the order of operations here, but definitely one that works.

Step 0: I had already installed the tensorflow and keras packages within R, and had been wondering why they wouldn’t work. “Of course!” I finally realized, a few weeks later. “I don’t have Python on this machine, and both of these packages depend on a Python install.” Turns out they also depend on the reticulate package, so install.packages(“reticulate”) if you have not already.

Step 1: Installed Anaconda3 to C:/Users/User/Anaconda3 (from https://www.anaconda.com/download/)

Step 2: Opened “Anaconda Prompt” from Windows Start Menu. First, to create an “environment” specifically for use with tensorflow and keras in R called “tf-keras” with a 64-bit version of Python 3.5 I typed:

conda create -n tf-keras python=3.5 anaconda

… and then after it was done, I did this:

activate tf-keras

Step 3: Install TensorFlow from Anaconda prompt. Using the instructions at https://storage.googleapis.com/tensorflow/windows/cpu/tensorflow-1.1.0-cp35-cp35m-win_amd64.whl I typed this:

pip install --ignore-installed --upgrade

I didn’t know whether this worked or not — it gave me an error saying that it “can not import html5lib”, so I did this next:

conda install -c conda-forge html5lib

I tried to run the pip command again, but there was an error so I consulted https://www.tensorflow.org/install/install_windows. It told me to do this:

pip install --ignore-installed --upgrade tensorflow

This failed, and told me that the pip command had an error. I searched the web for an alternative to that command, and found this, which worked!!

conda install -c conda-forge tensorflow

 

Step 4: From inside the Anaconda prompt, I opened python by typing “python”. Next, I did this, line by line:

import tensorflow as tf
 hello = tf.constant('Hello, TensorFlow!')
 sess = tf.Session()
 print(sess.run(hello))

It said “b’Hello, TensorFlow!'” which I believe means it works. (Ctrl-Z then Enter will then get you out of Python and back to the Anaconda prompt.) This means that my Python installation of TensorFlow was functional.

Step 5: Install Keras. I tried this:

pip install keras

…but I got the same error message that pip could not be installed or found or imported or something. So I tried this, which seemed to work:

conda install -c conda-forge keras

 

Step 6: Load them up from within R. First, I opened a 64-bit version of R v3.4.1 and did this:

library(tensorflow)
install_tensorflow(conda="tf=keras")

It took a couple minutes but it seemed to work.

library(keras)

 

Step 7: Try a tutorial! I decided to go for https://www.analyticsvidhya.com/blog/2017/06/getting-started-with-deep-learning-using-keras-in-r/ which guides you through developing a classifier for the MNIST handwritten image database — a very popular data science resource. I loaded up my dataset and checked to make sure it loaded properly:

data <- data_mnist()
str(data)
List of 2
 $ train:List of 2
 ..$ x: int [1:60000, 1:28, 1:28] 0 0 0 0 0 0 0 0 0 0 ...
 ..$ y: int [1:60000(1d)] 5 0 4 1 9 2 1 3 1 4 ...
 $ test :List of 2
 ..$ x: int [1:10000, 1:28, 1:28] 0 0 0 0 0 0 0 0 0 0 ...
 ..$ y: int [1:10000(1d)] 7 2 1 0 4 1 4 9 5 9 ...

 

Step 8: Here is the code I used to prepare the data and create the neural network model. This didn’t take long to run at all.

trainx<-data$train$x
trainy<-data$train$y
testx<-data$test$x
testy<-data$test$y

train_x <- array(train_x, dim = c(dim(train_x)[1], prod(dim(train_x)[-1]))) / 255

test_x <- array(test_x, dim = c(dim(test_x)[1], prod(dim(test_x)[-1]))) / 255

train_y<-to_categorical(train_y,10)
test_y<-to_categorical(test_y,10)

model %>% 
layer_dense(units = 784, input_shape = 784) %>% 
layer_dropout(rate=0.4)%>%
layer_activation(activation = 'relu') %>% 
layer_dense(units = 10) %>% 
layer_activation(activation = 'softmax')

model %>% compile(
loss = 'categorical_crossentropy',
optimizer = 'adam',
metrics = c('accuracy')
)

 

Step 9: Train the network. THIS TOOK ABOUT 12 MINUTES on a powerful machine with 64GB high-performance RAM. It looks like it worked, but I don’t know how to find or evaluate the results yet.

model %>% fit(train_x, train_y, epochs = 100, batch_size = 128)
 loss_and_metrics <- model %>% evaluate(test_x, test_y, batch_size = 128)

str(model)
Model
___________________________________________________________________________________
Layer (type) Output Shape Param #
===================================================================================
dense_1 (Dense) (None, 784) 615440
___________________________________________________________________________________
dropout_1 (Dropout) (None, 784) 0
___________________________________________________________________________________
activation_1 (Activation) (None, 784) 0
___________________________________________________________________________________
dense_2 (Dense) (None, 10) 7850
___________________________________________________________________________________
activation_2 (Activation) (None, 10) 0
===================================================================================
Total params: 623,290
Trainable params: 623,290
Non-trainable params: 0

 

Step 10: Next, I wanted to try the tutorial at https://cran.r-project.org/web/packages/kerasR/vignettes/introduction.html. Turns out this uses the kerasR package, not the keras package:

X_train <- mnist$X_train
Y_train <- mnist$Y_train
X_test <- mnist$X_test
Y_test <- mnist$Y_test

> dim(X_train)
[1] 60000 28 28

X_train <- array(X_train, dim = c(dim(X_train)[1], prod(dim(X_train)[-1]))) / 255
X_test <- array(X_test, dim = c(dim(X_test)[1], prod(dim(X_test)[-1]))) / 255

To check and see what’s in any individual image, type:

image(X_train[1,,])

At this point, the to_categorical function stopped working. I was supposed to do this but got an error:

Y_train <- to_categorical(mnist$Y_train, 10)

So I did this instead:

mm <- model.matrix(~ Y_train)

Y_train <- to_categorical(mm[,2])

mod <- Sequential()  # THIS IS THE EXCITING PART WHERE YOU USE KERAS!! :)

But then I tried this, and it was clear I was stuck again — it wouldn’t work:

mod$add(Dense(units = 512, input_shape = dim(X_train)[2]))

Stack Overflow recommended grabbing a version of kerasR from GitHub, so that’s what I did next:

install.packages("devtools")
library(devtools)
devtools::install_github("statsmaths/kerasR")
library(kerasR)

I got an error in R which told me to go to the Anaconda prompt (which I did), and type this:

conda install m2w64-toolchain

Then I went back into R and this worked fantastically:

mod <- Sequential()

mod$Add would still not work though, and this is where my patience expired for the evening. I’m pretty happy though — Python is up, keras and tensorflow are up on Python, all three (keras, tensorflow, and kerasR) are up in R, and some tutorials seem to be working.

Where Do Z-Score Tables Come From? (+ how to make them in R)

z-score-table-iconEvery student learns how to look up areas under the normal curve using Z-Score tables in their first statistics class. But what is less commonly covered, especially in courses where calculus is not a prerequisite, is where those Z-Score tables come from: by evaluating the integral of the equation for the bell-shaped normal curve, usually from -Inf to the z-score of interest. This is the same thing that the R command pnorm does when you provide it with a z-score. Here is the slide presentation I put together to explain the use and origin of the Z-Score table, and how it relates to pnorm and qnorm (the command that lets you input an area to find the z-score at which the area to the left is swiped out). It’s free to use under Creative Commons, and is part of the course materials that is available for use with this 2015 book.

One of the fun things I did was to make my own z-score table in R. I don’t know why anyone would WANT to do this — they are easy to find in books, and online, and if you know how to use pnorm and qnorm, you don’t need one at all. But, you can, and here’s how.

First, let’s create a z-score table just with left-tail areas. Using symmetry, we can also use this to get any areas in the right tail, because the area to the left of any -z is the same as any area to the right of any +z. Even though the z-score table contains areas in its cells, our first step is to create a table just of the z-scores that correspond to each cell:

c0 <- seq(-3.4,0,.1)
c1 <- seq(-3.41,0,.1)
c2 <- seq(-3.42,0,.1)
c3 <- seq(-3.43,0,.1)
c4 <- seq(-3.44,0,.1)
c5 <- seq(-3.45,0,.1)
c6 <- seq(-3.46,0,.1)
c7 <- seq(-3.47,0,.1)
c8 <- seq(-3.48,0,.1)
c9 <- seq(-3.49,0,.1)
z <- cbind(c0,c1,c2,c3,c4,c5,c6,c7,c8,c9)
z

 c0 c1 c2 c3 c4 c5 c6 c7 c8 c9
 [1,] -3.4 -3.41 -3.42 -3.43 -3.44 -3.45 -3.46 -3.47 -3.48 -3.49
 [2,] -3.3 -3.31 -3.32 -3.33 -3.34 -3.35 -3.36 -3.37 -3.38 -3.39
 [3,] -3.2 -3.21 -3.22 -3.23 -3.24 -3.25 -3.26 -3.27 -3.28 -3.29
 [4,] -3.1 -3.11 -3.12 -3.13 -3.14 -3.15 -3.16 -3.17 -3.18 -3.19
 [5,] -3.0 -3.01 -3.02 -3.03 -3.04 -3.05 -3.06 -3.07 -3.08 -3.09
 [6,] -2.9 -2.91 -2.92 -2.93 -2.94 -2.95 -2.96 -2.97 -2.98 -2.99
 [7,] -2.8 -2.81 -2.82 -2.83 -2.84 -2.85 -2.86 -2.87 -2.88 -2.89
 [8,] -2.7 -2.71 -2.72 -2.73 -2.74 -2.75 -2.76 -2.77 -2.78 -2.79
 [9,] -2.6 -2.61 -2.62 -2.63 -2.64 -2.65 -2.66 -2.67 -2.68 -2.69
[10,] -2.5 -2.51 -2.52 -2.53 -2.54 -2.55 -2.56 -2.57 -2.58 -2.59
[11,] -2.4 -2.41 -2.42 -2.43 -2.44 -2.45 -2.46 -2.47 -2.48 -2.49
[12,] -2.3 -2.31 -2.32 -2.33 -2.34 -2.35 -2.36 -2.37 -2.38 -2.39
[13,] -2.2 -2.21 -2.22 -2.23 -2.24 -2.25 -2.26 -2.27 -2.28 -2.29
[14,] -2.1 -2.11 -2.12 -2.13 -2.14 -2.15 -2.16 -2.17 -2.18 -2.19
[15,] -2.0 -2.01 -2.02 -2.03 -2.04 -2.05 -2.06 -2.07 -2.08 -2.09
[16,] -1.9 -1.91 -1.92 -1.93 -1.94 -1.95 -1.96 -1.97 -1.98 -1.99
[17,] -1.8 -1.81 -1.82 -1.83 -1.84 -1.85 -1.86 -1.87 -1.88 -1.89
[18,] -1.7 -1.71 -1.72 -1.73 -1.74 -1.75 -1.76 -1.77 -1.78 -1.79
[19,] -1.6 -1.61 -1.62 -1.63 -1.64 -1.65 -1.66 -1.67 -1.68 -1.69
[20,] -1.5 -1.51 -1.52 -1.53 -1.54 -1.55 -1.56 -1.57 -1.58 -1.59
[21,] -1.4 -1.41 -1.42 -1.43 -1.44 -1.45 -1.46 -1.47 -1.48 -1.49
[22,] -1.3 -1.31 -1.32 -1.33 -1.34 -1.35 -1.36 -1.37 -1.38 -1.39
[23,] -1.2 -1.21 -1.22 -1.23 -1.24 -1.25 -1.26 -1.27 -1.28 -1.29
[24,] -1.1 -1.11 -1.12 -1.13 -1.14 -1.15 -1.16 -1.17 -1.18 -1.19
[25,] -1.0 -1.01 -1.02 -1.03 -1.04 -1.05 -1.06 -1.07 -1.08 -1.09
[26,] -0.9 -0.91 -0.92 -0.93 -0.94 -0.95 -0.96 -0.97 -0.98 -0.99
[27,] -0.8 -0.81 -0.82 -0.83 -0.84 -0.85 -0.86 -0.87 -0.88 -0.89
[28,] -0.7 -0.71 -0.72 -0.73 -0.74 -0.75 -0.76 -0.77 -0.78 -0.79
[29,] -0.6 -0.61 -0.62 -0.63 -0.64 -0.65 -0.66 -0.67 -0.68 -0.69
[30,] -0.5 -0.51 -0.52 -0.53 -0.54 -0.55 -0.56 -0.57 -0.58 -0.59
[31,] -0.4 -0.41 -0.42 -0.43 -0.44 -0.45 -0.46 -0.47 -0.48 -0.49
[32,] -0.3 -0.31 -0.32 -0.33 -0.34 -0.35 -0.36 -0.37 -0.38 -0.39
[33,] -0.2 -0.21 -0.22 -0.23 -0.24 -0.25 -0.26 -0.27 -0.28 -0.29
[34,] -0.1 -0.11 -0.12 -0.13 -0.14 -0.15 -0.16 -0.17 -0.18 -0.19
[35,] 0.0 -0.01 -0.02 -0.03 -0.04 -0.05 -0.06 -0.07 -0.08 -0.09

Now that we have slots for all the z-scores, we can use pnorm to transform all those values into the areas that are swiped out to the left of that z-score. This part is easy, and only takes one line. The remaining three lines format and display the z-score table:

zscore.df <- round(pnorm(z),4)
row.names(zscore.df) <- sprintf("%.2f", c0)
colnames(zscore.df) <- seq(0,0.09,0.01)
zscore.df

 0 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
-3.40 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0002
-3.30 0.0005 0.0005 0.0005 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.0003
-3.20 0.0007 0.0007 0.0006 0.0006 0.0006 0.0006 0.0006 0.0005 0.0005 0.0005
-3.10 0.0010 0.0009 0.0009 0.0009 0.0008 0.0008 0.0008 0.0008 0.0007 0.0007
-3.00 0.0013 0.0013 0.0013 0.0012 0.0012 0.0011 0.0011 0.0011 0.0010 0.0010
-2.90 0.0019 0.0018 0.0018 0.0017 0.0016 0.0016 0.0015 0.0015 0.0014 0.0014
-2.80 0.0026 0.0025 0.0024 0.0023 0.0023 0.0022 0.0021 0.0021 0.0020 0.0019
-2.70 0.0035 0.0034 0.0033 0.0032 0.0031 0.0030 0.0029 0.0028 0.0027 0.0026
-2.60 0.0047 0.0045 0.0044 0.0043 0.0041 0.0040 0.0039 0.0038 0.0037 0.0036
-2.50 0.0062 0.0060 0.0059 0.0057 0.0055 0.0054 0.0052 0.0051 0.0049 0.0048
-2.40 0.0082 0.0080 0.0078 0.0075 0.0073 0.0071 0.0069 0.0068 0.0066 0.0064
-2.30 0.0107 0.0104 0.0102 0.0099 0.0096 0.0094 0.0091 0.0089 0.0087 0.0084
-2.20 0.0139 0.0136 0.0132 0.0129 0.0125 0.0122 0.0119 0.0116 0.0113 0.0110
-2.10 0.0179 0.0174 0.0170 0.0166 0.0162 0.0158 0.0154 0.0150 0.0146 0.0143
-2.00 0.0228 0.0222 0.0217 0.0212 0.0207 0.0202 0.0197 0.0192 0.0188 0.0183
-1.90 0.0287 0.0281 0.0274 0.0268 0.0262 0.0256 0.0250 0.0244 0.0239 0.0233
-1.80 0.0359 0.0351 0.0344 0.0336 0.0329 0.0322 0.0314 0.0307 0.0301 0.0294
-1.70 0.0446 0.0436 0.0427 0.0418 0.0409 0.0401 0.0392 0.0384 0.0375 0.0367
-1.60 0.0548 0.0537 0.0526 0.0516 0.0505 0.0495 0.0485 0.0475 0.0465 0.0455
-1.50 0.0668 0.0655 0.0643 0.0630 0.0618 0.0606 0.0594 0.0582 0.0571 0.0559
-1.40 0.0808 0.0793 0.0778 0.0764 0.0749 0.0735 0.0721 0.0708 0.0694 0.0681
-1.30 0.0968 0.0951 0.0934 0.0918 0.0901 0.0885 0.0869 0.0853 0.0838 0.0823
-1.20 0.1151 0.1131 0.1112 0.1093 0.1075 0.1056 0.1038 0.1020 0.1003 0.0985
-1.10 0.1357 0.1335 0.1314 0.1292 0.1271 0.1251 0.1230 0.1210 0.1190 0.1170
-1.00 0.1587 0.1562 0.1539 0.1515 0.1492 0.1469 0.1446 0.1423 0.1401 0.1379
-0.90 0.1841 0.1814 0.1788 0.1762 0.1736 0.1711 0.1685 0.1660 0.1635 0.1611
-0.80 0.2119 0.2090 0.2061 0.2033 0.2005 0.1977 0.1949 0.1922 0.1894 0.1867
-0.70 0.2420 0.2389 0.2358 0.2327 0.2296 0.2266 0.2236 0.2206 0.2177 0.2148
-0.60 0.2743 0.2709 0.2676 0.2643 0.2611 0.2578 0.2546 0.2514 0.2483 0.2451
-0.50 0.3085 0.3050 0.3015 0.2981 0.2946 0.2912 0.2877 0.2843 0.2810 0.2776
-0.40 0.3446 0.3409 0.3372 0.3336 0.3300 0.3264 0.3228 0.3192 0.3156 0.3121
-0.30 0.3821 0.3783 0.3745 0.3707 0.3669 0.3632 0.3594 0.3557 0.3520 0.3483
-0.20 0.4207 0.4168 0.4129 0.4090 0.4052 0.4013 0.3974 0.3936 0.3897 0.3859
-0.10 0.4602 0.4562 0.4522 0.4483 0.4443 0.4404 0.4364 0.4325 0.4286 0.4247
0.00 0.5000 0.4960 0.4920 0.4880 0.4840 0.4801 0.4761 0.4721 0.4681 0.4641

You can also draw a picture to go along with your z-score table, so that people remember which area they are looking up:

x <- seq(-4,4,0.1)
y <- dnorm(x)
plot(x,dnorm(x),type="l", col="black", lwd=3)
abline(v=-1,lwd=3,col="blue")
abline(h=0,lwd=3,col="black")
polygon(c(x[1:31],rev(x[1:31])), c(rep(0,31),rev(y[1:31])), col="lightblue")

It looks like this:

z-score-table-icon-small

In the slides, code to produce a giant-tail z-score table is also provided (where the areas are > 50%).

Taking a Subset of a Data Frame in R

I just wrote a new chapter for my students describing how to subset a data frame in R. The full text is available at https://docs.google.com/document/d/1K5U11-IKRkxNmitu_lS71Z6uLTQW_fp6QNbOMMwA5J8/edit?usp=sharing but here’s a preview:

Let’s load in ChickWeight, one of R’s built in datasets. This contains the weights of little chickens at 12 different times throughout their lives. The chickens are on different diets, numbered 1, 2, 3, and 4. Using the str command, we find that there are 578 observations in this data frame, and two different categorical variables: Chick and Diet.

> data(ChickWeight)
> head(ChickWeight)
  weight Time Chick Diet
1     42    0     1    1
2     51    2     1    1
3     59    4     1    1
4     64    6     1    1
5     76    8     1    1
6     93   10     1    1
> str(ChickWeight)
Classes ‘nfnGroupedData’, ‘nfGroupedData’, ‘groupedData’ and 'data.frame':      578 obs. of  4 variables:
 $ weight: num  42 51 59 64 76 93 106 125 149 171 ...
 $ Time  : num  0 2 4 6 8 10 12 14 16 18 ...
 $ Chick : Ord.factor w/ 50 levels "18"<"16"<"15"<..: 15 15 15 15 15 15 15 15 15 15 ...
 $ Diet  : Factor w/ 4 levels "1","2","3","4": 1 1 1 1 1 1 1 1 1 1 ...
 - attr(*, "formula")=Class 'formula' length 3 weight ~ Time | Chick
  .. ..- attr(*, ".Environment")=<environment: R_EmptyEnv> 
 - attr(*, "outer")=Class 'formula' length 2 ~Diet
  .. ..- attr(*, ".Environment")=<environment: R_EmptyEnv> 
 - attr(*, "labels")=List of 2
  ..$ x: chr "Time"
  ..$ y: chr "Body weight"
 - attr(*, "units")=List of 2
  ..$ x: chr "(days)"
  ..$ y: chr "(gm)"

Get One Column: Now that we have a data frame named ChickWeight loaded into R, we can take subsets of these 578 observations. First, let’s assume we just want to pull out the column of weights. There are two ways we can do this: specifying the column by name, or specifying the column by its order of appearance. The general form for pulling information from data frames is data.frame[rows,columns] so you can get the first column in either of these two ways:

ChickWeight[,1]   		# get all rows, but only the first column
ChickWeight[,c("weight")]	# get all rows, and only the column named “weight”

Get Multiple Columns: If you want more than one column, you can specify the column numbers or the names of the variables that you want to extract. If you want to get the weight and diet columns, you would do this:

ChickWeight[,c(1,4)]   		# get all rows, but only 1st and 4th columns
ChickWeight[,c("weight","Diet")]	# get all rows, only “weight” & “Diet” columns

If you want more than one column and those columns are next to each other, you can do this:

ChickWeight[,c(1:3)]

Get One Row: You can get the first row similarly to how you got the first column, and any other row the same way:

ChickWeight[1,]   		# get first row, and all columns
ChickWeight[82,]   		# get 82nd row, and all columns

Get Multiple Rows: If you want more than one row, you can specify the row numbers you want like this:

> ChickWeight[c(1:6,15,18,27),] 
   weight Time Chick Diet      
1      42    0     1    1   
2      51    2     1    1 
3      59    4     1    1    
4      64    6     1    1    
5      76    8     1    1 
6      93   10     1    1    
15     58    4     2    1    
18    103   10     2    1 
27     55    4     3    1    

What Protests and Revolutions Reveal About Innovation

The following book review will appear in an issue of the Quality Management Journal later this year:

The End of Protest: A New Playbook for Revolution.   2016.  Micah White.  Toronto, Ontario, Canada. Alfred A. Knopf Publishing.  317 pages.

You may wonder why I’m reviewing a book written by the creator of the Occupy movement for an audience of academics and practitioners who care about quality and continuous improvement in organizations, many of which are trying to not only sustain themselves but also (in many cases) to make a profit. The answer is simple: by understanding how modern social movements are catalyzed by decentralized (and often autonomous) interactive media, we will be better able to achieve some goals we are very familiar with. These include 1) capturing the rapidly changing “Voice of the Customer” and, in particular, gaining access to its silent or hidden aspects, 2) promoting deep engagement, not just in work but in the human spirit, and 3) gaining insights into how innovation can be catalyzed and sustained in a truly democratic organization.

This book is packed with meticulously researched cases, and deeply reflective analysis. As a result, is not an easy read, but experiencing its modern insights in terms of the historical context it presents is highly rewarding. Organized into three sections, it starts by describing the events leading up to the Occupy movement, the experience of being a part of it, and why the author feels Occupy fell short of its objectives. The second section covers several examples of protests, from ancient history to modern times, and extracts the most important strategic insight from each event. Next, a unified theory of revolution is presented that reconciles the unexpected, the emotional, and the systematic aspects of large-scale change.

The third section speaks directly to innovation. Some of the book’s most powerful messages, the principles of revolution, are presented in Chapter 14. “Understanding the principles behind revolution,” this chapter begins, “allows for unending tactical innovation that shifts the paradigms of activism, creates new forms of protest, and gives the people a sudden power over their rulers.” If we consider that we are often “ruled” by the status quo, then these principles provide insight into how we can break free: short sprints, breaking patterns, emphasizing spirit, presenting constraints, breaking scripts, transposing known tactics to new environmental contexts, and proposing ideas from the edge. The end result is a masterful work that describes how to hear, and mobilize, the collective will.

 

Reviewed by

Dr. Nicole M. Radziwill

 

The Value of Defining Context

Image Credit: Doug Buckley of http://hyperactive.to

Image Credit: Doug Buckley of http://hyperactive.to

The most important stage of problem-solving in organizations is often one of the earliest: getting everyone on the same page by defining the concepts, processes, and desired outcomes that are central to understanding the problem and formulating a solution. (“Everyone” can be the individuals on a project team, or the individuals that contribute actions to a process, or both.) Too often, we assume that the others around us see and experience the world the same way we do. In many cases, our assessments are not too far apart, which is how most people can get away with making this assumption on a regular basis.

In fact, some people experience things so differently that they don’t even “picture” anything in their minds. Can you believe it?

I first realized this divergence in the work context a few years ago, when a colleague and I were advising a project at a local social services office. We asked our students to document the process that was being used to process claims. There were nearly ten people who were part of this claims-processing activity, and our students interviewed all of them, discovering that each person had a remarkably different idea about the process that they were all engaged in! No wonder the claims processing time was nearly two months long.

We helped them all — literally — get onto the same page, and once they all had the same mental map of the process, time-in-system for each claim dropped to 10 days. (This led us to the quantum-esque conclusion that there is no process until it is observed.)

Today, I read about how mathematician Keith Devlin revolutionized the process of intelligence gathering after 9/11 using this same approach… by going back to one of the first principles he learned in his academic training:

So what had I done? Nothing really — from my perspective. My task was to find a way of analyzing how context influences data analysis and reasoning in highly complex domains involving military, political, and social contexts. I took the oh-so-obvious (to me) first step. I need to write down as precise a mathematical definition as possible of what a context is. It took me a couple of days…I can’t say I was totally satisfied with it…but it was the best I could do, and it did at least give me a firm base on which to start to develop some rudimentary mathematical ideas.

The fairly large group of really smart academics, defense contractors, and senior DoD personnel spent the entire hour of my allotted time discussing that one definition. The discussion brought out that all the different experts had a different conception of what a context is — a recipe for disaster.

What I had given them was, first, I asked the question “What is a context?” Since each person in the room besides me had a good working concept of context — different ones, as I just noted — they never thought to write down a formal definition. It was not part of what they did. And second, by presenting them with a formal definition, I gave them a common reference point from which they could compare and contrast their own notions. There we had the beginnings of disaster avoidance.

Getting people to very precisely understand the definitions, concepts, processes, and desired outcomes that are central to a problem might take some time and effort, but it is always extremely valuable.

When you face a situation like this in mathematics, you spend a lot of time going back to the basics. You ask questions like, “What do these words mean in this context?” and, “What obvious attempts have already been ruled out, and why?” More deeply, you’d ask, “Why are these particular open questions important?” and, “Where do they see this line of inquiry leading?”

(You can read the full article about Devlin, and more important lessons from mathematical thinking, Here.)

View story at Medium.com

Using xda with googlesheets in R

Image Credit: Doug Buckley of http://hyperactive.to

Image Credit: Doug Buckley of http://hyperactive.to

Want to do a quick, exploratory data analysis in R of your data that’s stored in a spreadsheet on Google Drive? You’re in luck, because now you can use the new xda package in conjunction with Jenny Bryan‘s googlesheets. There are some quirks, though, and that’s what this post is all about.

Before proceeding, you should review this recent article from R-Bloggers called “Introducing xda”.

First, be sure to install the googlesheets and xda packages. Although googlesheets is on CRAN, xda is not, and you’ll have to bring it in directly from github. You can actually do the same for googlesheets if you like:

install.packages("devtools")
library(devtools)
install_github("jennybc/googlesheets")
install_github("ujjwalkarn/xda")
library(googlesheets)
library(xda)

Next, you’ll have to show R how to access your Google spreadsheet. While you are looking at your spreadsheet, go to File -> Publish to the Web. The URL that’s in the text box is the one you want to capture. Just to make sure it works, copy and paste it into a new browser address window and see if you can display your spreadsheet in your browser.

If you want to import the data at https://docs.google.com/spreadsheets/d/1DO0ksD8d-rn_j2Yn7DQKZDPBrhrvZTpgszewxokjWKU/pubhtml into R, for example, you’ll need to know the spreadsheet’s key. That’s the long string of unintelligible numbers and letters between the “d” and the “pubhtml”. So, my key would be “1DO0ksD8d-rn_j2Yn7DQKZDPBrhrvZTpgszewxokjWKU” — which you’ll see in this next block of code:

> my.gs <- gs_key("1DO0ksD8d-rn_j2Yn7DQKZDPBrhrvZTpgszewxokjWKU") 
> my.data <- gs_read(my.gs) # Retrieves data from googlesheets and places it into an R object. 
> my.df <- as.data.frame(my.data)  # Important! xda needs you to extract only the data in a data frame.

Now, you can access your data. Try head(my.df) to make sure you’ve imported it properly.

Next, it’s time for exploratory data analysis. There are three commands available:

  • numSummary – takes a data frame as an argument, provides descriptive statistics, quantiles, and missing data info for quantitative variables
  • charSummary – takes a data frame as an argument, provides counts, missing data info, and number of unique factors for quantitative variables
  • bivariate – takes a data frame and two quantitative variables as an argument, and performs a quick bivariate analysis (giving this categorical variables, or giving this one categorical and one quantitative variable, will throw an error)

Here’s what happens when you run those commands on the data you just loaded in from your Google spreadsheet:

> numSummary(my.df)
               n   mean    sd median   max  min  mode miss miss%    1%   5%   25%   50%   75%   90%   95%   99%
obs          200 100.50 57.88  100.5 200.0  1.0   1.0    0     0  2.99 11.0  50.8 100.5 150.2 180.1 190.0 198.0
heartrates   200  73.01  7.43   73.0  96.3 53.4  71.2    0     0 56.49 61.2  68.4  73.0  77.4  82.6  85.7  90.3
systolics    200 139.27 29.27  138.0 221.0 59.0 139.0    0     0 79.98 96.0 117.0 138.0 160.0 177.2 188.1 205.0
diastolics   200  87.76  9.74   87.7 116.4 62.4  85.2    0     0 66.01 72.2  81.9  87.7  93.7 100.3 104.5 108.3
bmis         200  25.53  3.06   25.0  33.1 18.4  24.7    0     0 19.00 21.0  23.5  25.0  27.7  29.6  31.2  32.8
ages         200  44.41 14.59   45.0  70.0 18.0  30.0    0     0 18.00 22.0  32.0  45.0  57.0  64.1  67.0  70.0
heartpm      200  72.26  3.55   72.2  83.8 64.2  74.2    0     0 64.72 66.2  69.8  72.2  74.2  76.4  78.7  81.4
fitnesslevel 200   2.62  1.17    3.0   4.0  1.0   4.0    0     0  1.00  1.0   2.0   3.0   4.0   4.0   4.0   4.0

> charSummary(my.df)
          n miss miss% unique
genders 200    0     0      2
smokers 200    0     0      2
group   200    0     0      4

> bivariate(my.df,'heartrates','bmis')
     bin_bmis min_heartrates max_heartrates mean_heartrates
1   (18.3,22]          53.40          85.60           72.80
2   (22,25.7]          55.70          90.70           72.87
3 (25.7,29.4]          60.30          96.30           73.45
4 (29.4,33.1]          56.50          90.30           72.46

Observations:

  • There is a fourth “Plot” command but I couldn’t get it to work on any googlesheetsdata. The xda package is looking for class(range) to be anything other than “function”, which it was for every sheet I attempted to load.
  • There really should be an extra column in xda that displays the enumeration of all the unique values for the factors. It felt great to know how many unique values there were, but I would love to be reminded of what they are too, unless there are too many of them.

Please share your experiences using xda & googlesheets together in the comments! Thanks!

« Older Entries Recent Entries »