Category Archives: JIT/Manufacturing

Value Propositions for Quality 4.0

In previous articles, we introduced Quality 4.0, the pursuit of performance excellence as an integral part of an organization’s digital transformation. It’s one aspect of Industry 4.0 transformation towards intelligent automation: smart, hyperconnected(*) agents deployed in environments where humans and machines cooperate and leverage data to achieve shared goals.

Automation is a spectrum: an operator can specify a process that a computer or intelligent agent executes, the computer can make decisions for an operator to approve or adjust, or the computer can make and execute all decisions. Similarly, machine intelligence is a spectrum: an algorithm can provide advice, take action with approvals or adjustments, or take action on its own. We have to decide what value is generated when we introduce various degrees of intelligence and automation in our organizations.

How can Quality 4.0 help your organization? How can you improve the performance of your people, projects, products, and entire organization by implementing technologies like artificial intelligence, machine learning, robotic process automation, and blockchain?

A value proposition is a statement that explains what benefits a product or activity will deliver. Quality 4.0 initiatives have these kinds of value propositions:

  1. Augment (or improve upon) human intelligence
  2. Increase the speed and quality of decision-making
  3. Improve transparency, traceability, and auditability
  4. Anticipate changes, reveal biases, and adapt to new circumstances and knowledge
  5. Evolve relationships and organizational boundaries to reveal opportunities for continuous improvement and new business models
  6. Learn how to learn; cultivate self-awareness and other-awareness as a skill

Quality 4.0 initiatives add intelligence to monitoring and managing operations – for example, predictive maintenance can help you anticipate equipment failures and proactively reduce downtime. They can help you assess supply chain risk on an ongoing basis, or help you decide whether to take corrective action. They can also help you improve cybersecurity: documenting and benchmarking processes can provide a basis for detecting anomalies, and understanding expected performance can help you detect potential attacks.


(*) Hyperconnected = (nearly) always on, (nearly) always accessible.

What is Quality 4.0?

Image Credit: Doug Buckley of http://hyperactive.to

My first post of the year addresses an idea that’s just starting to gain traction – one you’ll hear a lot more about from me in 2018: Quality 4.0.  It’s not a fad or trend, but a reminder that the business environment is changing, and that performance excellence in the future will depend on how well you adapt, change, and transform in response. Although we started building community around this concept at the ASQ Quality 4.0 Summit on Disruption, Innovation, and Change, held in November 2017 in Dallas, the truly revolutionary work is yet to come.

The term “Quality 4.0” comes from “Industry 4.0” – referring to the “fourth industrial revolution” – originally addressed at the Hannover (Germany) Fair in 2011. That meeting emphasized the increasing intelligence and interconnectedness in “smart” manufacturing systems and reflected on the newest technological innovations in historical context.

In the first industrial revolution (late 1700’s), steam and water power made it possible for production facilities to scale up and expanded the potential locations for production. By the late 1800’s, the discovery of electricity and the development of associated infrastructure enabled machines for mass production. In the US, the expansion of railways made it easier to obtain supplies and deliver finished goods. The availability of power also sparked a renaissance in computing, and digital computing emerged from its analog ancestor. The third industrial revolution came at the end of the 1960’s, with the invention of the Programmable Logic Controller (PLC). This made it possible to automate processes like filling and reloading tanks of liquids, turning engines on and off, and controlling sequences of events based on changing environmental conditions.

Although the growth and expansion of the internet accelerated innovation in the late 1990’s and 2000’s, we are just now poised for another industrial revolution. What’s changing?

  • Production & Availability of Information: More information is available because people and devices are producing it at greater rates than ever before. Falling costs of enabling technologies like sensors and actuators are catalyzing innovation in these areas.
  • Connectivity: In many cases, and from many locations, that information is instantly accessible over the internet. Improved network infrastructure is expanding the extent of connectivity, making it more widely available and more robust. (And unlike in the 80’s and 90’s, far fewer communications protocols are in common use, so it’s a lot easier to get one device to talk to another device on your network.)
  • Intelligent Processing: Affordable computing capabilities (and computing power!) are available to process that information so it can be incorporated into decision making. High-performance software libraries for advanced processing and visualization of data are easy to find, and easy to use. (In the past, we had to write our own… now we can use open-source solutions that are battle tested.)
  • New Modes of Interaction: The ways in which we acquire and interact with information are also changing, in particular through new interfaces like Augmented Reality (AR) and Virtual Reality (VR), which expand possibilities for training and for navigating a hybrid physical-digital environment with greater ease.
  • New Modes of Production: 3D printing, nanotechnology, and gene editing (CRISPR) are poised to change the nature and means of production in several industries. Technologies for enhancing human performance (e.g. exoskeletons, brain-computer interfaces, and even autonomous vehicles) will also open up new mechanisms for innovation in production. (Roco & Bainbridge (2002) describe many of these, and their prescience is remarkable.) New technologies like blockchain have the potential to change the nature of production as well, by challenging ingrained perceptions of trust, control, consensus, and value.

If the first industrial revolution was characterized by steam-powered machines, the second was characterized by electricity and assembly lines. Innovations in computing and industrial automation defined the third industrial revolution. The fourth industrial revolution is one of intelligence: smart, hyperconnected cyber-physical systems in environments where humans and machines cooperate to achieve shared goals, and use data to generate value.

These enabling technologies originate in the physical, digital, and biological domains, and include the following:

  • Information
    • Affordable Sensors and Actuators
    • Big Data infrastructure (e.g. MapReduce, Hadoop, NoSQL databases)
  • Connectivity
    • 5G Networks
    • IPv6 Addresses (which expand the number of devices that can be put online)
    • Internet of Things (IoT)
    • Cloud Computing
  • Processing
    • Predictive Analytics
    • Artificial Intelligence
    • Machine Learning (incl. Deep Learning)
    • Data Science
  • Interaction
    • Augmented Reality (AR)
    • Mixed Reality (MR)
    • Virtual Reality (VR)
    • Diminished Reality (DR)
  • Construction
    • 3D Printing
    • Additive Manufacturing
    • Smart Materials
    • Nanotechnology
    • Gene Editing
    • Automated (Software) Code Generation
    • Robotic Process Automation (RPA)
    • Blockchain

Today’s quality profession was born during the middle of the second industrial revolution, when methods were needed to ensure that assembly lines ran smoothly – that they produced artifacts to specifications, that the workers knew how to engage in the process, and that costs were controlled. As industrial production matured, those methods grew to encompass the design of processes which were built to produce to specifications. In the 1980’s and 1990’s, organizations in the US started to recognize human capabilities and active engagement as essential to quality, and TQM, Lean, and Six Sigma gained in popularity.

How will these methods evolve in an adaptive, intelligent environment? The question is largely still open, and that’s the essence of Quality 4.0.

Roco, M. C., & Bainbridge, W. S. (2002). Converging technologies for improving human performance: Integrating from the nanoscale. Journal of Nanoparticle Research, 4(4), 281-295. (http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.465.7221&rep=rep1&type=pdf)

Where is Quality Management Headed?

Image Credit: Doug Buckley of http://hyperactive.to

[This post is in response to ASQ’s February topic for the Influential Voices group, which asks: Where do you plan to take your career in 2016? What’s your view of careers in quality today—what challenges is this field facing? How can someone starting out in quality succeed?]

We are about to experience a paradigm shift in production, operations, and service: a shift that will have direct consequences on the principles and practice of design, development, and quality management. This “fourth industrial revolution” of cyber-physical systems will require more people in the workforce to understand quality principles associated with co-creation of value, and to develop novel business models. New technical skills will become critical for a greater segment of workers, including embedded software, artificial intelligence, data science, analytics, Big Data (and data quality), and even systems integration. 

Over the past 20 years, we moved many aspects of our work and our lives online. And in the next 20 years, the boundaries between the physical world and the online world will blur — to a point where the distinction may become unnecessary.

Here is a vignette to illustrate the kinds of changes we can anticipate. Imagine the next-generation FitBit, the personalized exercise assistant that keeps track of the number of steps you walk each day. As early as 2020, this device will not only automatically track your exercise patterns, but will also automatically integrate that information with your personal health records. Because one-size-fits-all diet strategies have recently been shown to be largely unfounded, and researchers like Kevin Hall, Eran Elinav, and Eran Segal have found that the only truly effective diets are the ones customized to your body’s nutritional preferences [1], your FitBit and your health records will be able to talk to your food manager application to design the perfect diet for you (given your targets and objectives). Furthermore, to make it easy for you, your applications will also autonomously communicate with your refrigerator and pantry (to monitor how much food you have available), your local grocery store, and your calendar app so that food deliveries will show up when and only when you need to be restocked. You’re amazed that you’re spending less on food, less of it is going to waste, and you never have to wonder what you’re going to make for dinner. Your local grocery store is also greatly rewarded, not only for your loyalty, but because it can anticipate the demand from you and everyone else in your community – and create specials, promotions, and service strategies that are targeted to your needs (rather than just what the store guesses you need).

Although parts of this example may seem futuristic, the technologies are already in place. What is missing is our ability to link the technologies together using development processes that are effective and efficient – and in particular, our ability to coordinate and engage the people who will help make it happen. This is a job for quality managers and others who study production and operations management.

As the Internet of Things (IoT) and pervasive information become commonplace, the fundamental nature and character of how quality management principles are applied in practice will be forced to change. As Eric Schmidt, former Chairman of Google, explains:  “the new age of artificial intelligence is beginning, and it’s a big deal.” [2] Here are some ways that this shift will impact researchers and practitioners interested in quality:

  • Strategic deployment of IoT technologies will help us simultaneously improve our use of enterprise assets, reduce waste, promote sustainability, and coordinate people and machines to more effectively meet strategic goals and operational targets.
  • Smart materials, embedded in our production and service ecosystems, will change our views of objects from inert and passive to embedded and engaged. For example, MIT has developed a “smart band-aid” that communicates with a wound, provides visual indicators of the healing process, and delivers medication as needed. [3] Software developers will need to know how to make this communication seamless and reliable in a variety of operations contexts.
  • Our technologies will be able to proactively anticipate the Voice of the Customer, enabling us to meet not only their stated and implied needs, but also their emergent needs and hard-to-express desires. Similarly, will the nature of customer satisfaction change as IoT becomes more pervasive?
  • Cloud and IoT-driven Analytics will make more information available for powerful decision-making (e.g. real-time weather analytics), but will come with its own set of challenges: how to find the data, how to assess data quality, and how to select and store data with likely future value to decision makers. This will be particularly challenging since analytics has not been a historical focus among quality managers. [4]
  • Smart, demand-driven supply chains (and supply networks) will leverage Big Data, and engage in automated planning, automatic adjustment to changing conditions or supply chain disruptions like war or extreme weather events, and self-regulation.
  • Smart manufacturing systems will implement real time communication between people, machines, materials, factories and warehouses, supply chain partners, and logistics partners using cloud computing. Production systems will adapt to demand as well as environmental factors, like the availability of resources and components. Sustainability will be a required core capability of all organizations that produce goods.
  • Cognitive manufacturing will implement manufacturing and service systems capable of perception, judgment, and improving quality autonomously – without the delays associated with human decision-making or the detection of issues.
  • Cybersecurity will be recognized as a critical component of all of the above. For most (if not all) of these next generation products and production systems, quality will not be possible without addressing information security.
  • The nature of quality assurance will also change, since products will continue to learn (and not necessarily meet their own quality requirements) after purchase or acquisition, until the consumer has used them for a while. In a December 2015 article I wrote for Software Quality Professional, I ask “How long is the learning process for this technology, and have [product engineers] designed test cases to accommodate that process after the product has been released? The testing process cannot find closure until the end of the ‘burn-in’ period when systems have fully learned about their surroundings.” [5]
  • We will need new theories for software quality practice in an era where embedded artificial intelligence and technological panpsychism (autonomous objects with awareness, perception, and judgment) are the norm.

How do we design quality into a broad, adaptive, dynamically evolving ecosystem of people, materials, objects, and processes? This is the extraordinarily complex and multifaceted question that we, as a community of academics and practitioners, must together address.

Just starting out in quality? My advice is to get a technical degree (science, math, or engineering), which will provide you with a solid foundation for understanding the new modes of production that are on the horizon. Industrial engineering, operations research, industrial design, and mechanical engineering are great fits for someone who wants a career in quality, as are statistics, data science, manufacturing engineering, and telecommunications. Cybersecurity and intelligence will become increasingly central to quality management, so these are also good directions to take. Or, consider applying for an interdisciplinary program like JMU’s Integrated Science and Technology where I teach. We’re developing a new 21-credit sector right now where you can study EVERYTHING in the list above! Also, certifications are a plus, but in addition to completing training programs be sure to get formally certified by a professional organization to make sure that your credentials are widely recognized (e.g. through ASQ and ATMAE).

 

References

[1] http://www.huffingtonpost.com/entry/no-one-size-fits-all-diet-plan_564d605de4b00b7997f94272
[2] https://www.washingtonpost.com/news/innovations/wp/2015/09/15/what-eric-schmidt-gets-right-and-wrong-about-the-future-of-artificial-intelligence/
[3] http://news.mit.edu/2015/stretchable-hydrogel-electronics-1207
[4] Evans, J. R. (2015). Modern Analytics and the Future of Quality and Performance Excellence. The Quality Management Journal, 22(4), 6.
[5] Radziwill, N. M., Benton, M. C., Boadu, K., & Perdomo, W. (2015). A Case-Based Look at Integrating Social Context into Software Quality. Software Quality Professional, December 2015.

Free Speech in the Internet of Things (IoT)

Image Credit: from "Reclaim Democracy" at http://reclaimdemocracy.org/who-are-citizens-united/

IF YOUR TOASTER COULD TALK, IT WOULD HAVE THE RIGHT TO FREE SPEECH. Image Credit: from “Reclaim Democracy” at http://reclaimdemocracy.org/who-are-citizens-united/

By the end of 2016, Gartner estimates that over 6.4 BILLION “things” will be connected to one another in the nascent Internet of Things (IoT). As innovation yields new products, services, and capabilities that leverage this ecosystem, we will need new conceptual models to ensure quality and support continuous improvement in this environment.

I wasn’t thinking about quality or IoT this morning… but instead, was trying to understand why so many people on Twitter and Facebook are linking Justice Scalia’s recent death to Citizens United. (I’d heard of Citizens United, but quite frankly, thought it was a soccer team. Embarrassing, I know.) I was surprised to find out that instead, Citizens United is a conservative U.S. political organization best known for its role in the 2010 Supreme Court Case Citizens United v. FEC.

That case removed many restrictions on political spending. With the super-rich “donating more than ever before to individual campaigns,” the “enormous” chasm in wealth has given them the power to steer the economic and political direction of the United States and undermine its democracy. Interesting, sure… but what’s more interesting to me is that the Citizens United case, according to this source:

  • Strengthened First Amendment protection for corporations, 
  • Affirmed that Money = Speech, and
  • Affirmed that Non-Persons have the right to free speech.

The article goes on to state that “if your underpants could talk, they would be protected by free speech.”

Not too long ago, a statement like this would just be silly. But today, with immersive IoT looming, this isn’t too far-fetched. 

  • What will the world look (and feel) like when everything you interact with has a “voice”?
  • How will the “Voice of the Customer” be heard when all of that customer’s stuff ALSO has a voice?
  • What IS the “Voice of the Customer” in a world like this?

If Japan Can, Why Can’t We? A Retrospective

June 24, 1980 is kind of like July 4, 1776 for quality management… that’s the pivotal day that NBC News aired its one-hour, 16-minute documentary called “If Japan Can, Why Can’t We?” introducing W. Edwards Deming and his methods to the American public. The video has been unavailable for years, but as of just last week, it’s been posted on YouTube. So my sophomore undergrads in Production & Operations Management took a step back in time to get a taste of the environment in the manufacturing industry in the late 1970’s, and watched it during class this week.

The last time I watched it was in 1997, in a graduate industrial engineering class. It didn’t feel quite as dated as it does now, nor did I have extensive industry experience as a lens through which to view the interviews. But what did surprise me is that the challenges they were facing aren’t that much different from the ones we face today — and the groundbreaking good advice from Deming is still good advice today.

  • Before 1980, it was common practice to produce a whole bunch of stuff and then check and see which ones were bad, and throw them out. The video provides a clear and consistent story around the need to design quality in to products and processes, which then reduces (or eliminates) the need to inspect bad quality out.
  • It was also common to tamper with a process that was just exhibiting random variation. As one of the line workers in the documentary said, “We didn’t know. If we felt like there might be a problem with the process, we would just go fix it.” Deming’s applications of Shewhart’s methods made it clear that there is no need to tamper with a process that’s exhibiting only random variation.
  • Both workers and managers seemed frustrated with the sheer volume of regulations they had to address, and noted that it served to increase costs, decrease the rate of innovation, and disproportionately hurt small businesses. They noted that there was a great need for government and industry to partner to resolve these issues, and that Japan was a model for making these interactions successful.
  • Narrator Lloyd Dobyns remarked that “the Japanese operate by consensus… we, by competition.” He made the point that one reason Japanese industrial reforms were so powerful and positive was that their culture naturally supported working together towards shared goals. He cautioned managers that they couldn’t just drop in statistical quality control and expect a rosy outcome: improving quality is a cultural commitment, and the methods are not as useful in the absence of buy-in and engagement.

The video also sheds light on ASQ’s November question to the Influential Voices, which is: “What’s the key to talking quality with the C-Suite?” Typical responses include: think at the strategic level; create compelling arguments using the language of money; learn the art of storytelling and connect your case with what is important to the executives.

But I think the answer is much more subtle. In the 1980 video, workers comment on how amazed their managers were when Deming proclaimed that management was responsible for improving productivity. How could that be??!? Many managers at that time were convinced that if a productivity problem existed, it was because the workers didn’t work fast enough, or with enough skill — or maybe they had attitude problems! Certainly not because the managers were not managing well. Simple techniques like improving training programs and establishing quality circles (which demonstrated values like increased transparency, considering all ideas, putting executives on the factory floor so they could learn and appreciate the work being done, increasing worker participation and engagement, encouraging work/life balance, and treating workers with respect and integrity) were already yielding benefits in some U.S. companies. But surprisingly, these simple techniques were not widespread, and not common sense.

Just like Deming advocated, quality belongs to everyone. You can’t go to a CEO and suggest that there are quality issues that he or she does not care about. More likely, the CEO believes that he or she is paying a lot of attention to quality. They won’t like it if you accuse them of not caring, or not having the technical background to improve quality. The C-Suite is in a powerful position where they can, through policies and governance, influence not only the actions and operating procedures of the system, but also its values and core competencies — through business model selection and implementation. 

What you can do, as a quality professional, is acknowledge and affirm their commitment to quality. Communicate quickly, clearly, and concisely when you do. Executives have to find the quickest ways to decompose and understand complex problems in rapidly changing external environments, and then make decisions that affect thousands (and sometimes, millions!) of people. Find examples and stories from other organizations who have created huge ripples of impact using quality tools and technologies, and relate them concretely to your company.

Let the C-Suite know that you can help them leverage their organization’s talent to achieve their goals, then continually build their trust.

The key to talking quality with the C-suite is empathy.

 

You may also be interested in “Are Deming’s 14 Points Still Valid?” from Nov 19, 2012.

Control Charts in R: A Guide to X-Bar/R Charts in the qcc Package

Statistical process control provides a mechanism for measuring, managing, and controlling processes. There are many different flavors of control charts, but if data are readily available, the X-Bar/R approach is often used. The following PDF describes X-Bar/R charts, shows you how to create them in R and interpret the results, and uses the fantastic qcc package developed by Luca Scrucca. Please let me know if you find it helpful!

Creating and Interpreting X-Bar/R Charts in R
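
If you just want a quick preview before opening the PDF, here’s a minimal sketch using simulated data (the diameters below are made up: 25 samples of 5 measurements each from a process with mean 10 and standard deviation 0.1):

library(qcc)

set.seed(42)
# 25 samples of 5 measurements each (rows = samples, columns = measurements)
diameters <- matrix(rnorm(25 * 5, mean = 10, sd = 0.1), nrow = 25, ncol = 5)

# X-Bar chart: monitors the sample means (between-sample variation)
xbar_chart <- qcc(diameters, type = "xbar")

# R chart: monitors the sample ranges (within-sample variation)
r_chart <- qcc(diameters, type = "R")

summary(xbar_chart)  # center line, control limits, and any out-of-control points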

A Simple Intro to Bayesian Change Point Analysis

The purpose of this post is to demonstrate change point analysis by stepping through an example of the technique in R presented in Rizzo’s excellent, comprehensive, and very mathy book, Statistical Computing with R, and then showing alternative ways to process this data using the changepoint and bcp packages. Much of the commentary is simplified, and that’s on purpose: I want to make this introduction accessible if you’re just learning the method. (Most of the code is straight from Rizzo who provides a much more in-depth treatment of the technique. I’ve added comments in the code to make it easier for me to follow, and that’s about it.)

The idea itself is simple: you have a sample of observations from a Poisson (counting) process (where events occur randomly over a period of time). You probably have a chart that shows time on the horizontal axis, and how many events occurred on the vertical axis. You suspect that the rate at which events occur has changed somewhere over that range of time… either the event is increasing in frequency, or it’s slowing down — but you want to know with a little more certainty. (Alternatively, you could check to see if the variance has changed, which would be useful for process improvement work in Six Sigma projects.)
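
To make the setup concrete, here’s a tiny simulated example (entirely made-up data, not the coal mining data used below): counts generated at a rate of about 3 events per period for the first 40 periods, then at a rate of about 1 event per period for the rest of the series.

set.seed(1)
# simulate a counting process whose rate drops after period 40
y_sim <- c(rpois(40, lambda = 3), rpois(72, lambda = 1))
plot(y_sim, type = "h", xlab = "time period", ylab = "number of events")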

You want to estimate the rate at which events occur BEFORE the shift (mu), the rate at which events occur AFTER the shift (lambda), and the time when the shift happens (k). To do it, you can apply a Markov Chain Monte Carlo (MCMC) sampling approach to estimate the population parameters at each possible k, from the beginning of your data set to the end of it. The values you get at each time step will be dependent only on the values you computed at the previous timestep (that’s where the Markov Chain part of this problem comes in). There are lots of different ways to hop around the parameter space, and each hopping strategy has a fancy name (e.g. Metropolis-Hastings, Gibbs, “reversible jump”).

In one example, Rizzo (p. 271-277) uses a Markov Chain Monte Carlo (MCMC) method that applies a Gibbs sampler to do the hopping – with the goal of figuring out the change point in number of coal mine disasters from 1851 to 1962. (Looking at a plot of the frequency over time, it appears that the rate of coal mining disasters decreased… but did it really? And if so, when? That’s the point of her example.) She gets the coal mining data from the boot package. Here’s how to get it, and what it looks like:

library(boot)                    # contains the coal mining disaster dates
data(coal)
y <- tabulate(floor(coal[[1]]))  # count the number of disasters in each calendar year
y <- y[1851:length(y)]           # keep the years from 1851 onward
barplot(y, xlab = "years", ylab = "frequency of disasters")

(Figure: barplot of the frequency of coal mining disasters by year.)

First, we initialize all of the data structures we’ll need to use:

# initialization
n <- length(y)          # number of data elements (years) to process
m <- 1000               # target length of the chain
# set up blank 1000-element arrays for mu, lambda, and k first,
# so the starting values assigned below aren't overwritten
mu <- lambda <- k <- numeric(m)
L <- numeric(n)         # likelihood fxn has one slot per year
k[1] <- sample(1:n, 1)  # pick 1 random year to start at
mu[1] <- 1
lambda[1] <- 1
b1 <- 1
b2 <- 1

Here are the models for prior (hypothesized) distributions that she uses, based on the Gibbs sampler approach:

  • mu comes from a Gamma distribution with shape parameter of (0.5 + the sum of all your frequencies UP TO the point in time, k, you’re currently at) and a rate of (k + b1)
  • lambda comes from a Gamma distribution with shape parameter of (0.5 + the sum of all your frequencies AFTER the point in time, k, you’re currently at) and a rate of (n – k + b2), where n is the total number of years in the data set
  • b1 comes from a Gamma distribution with a shape parameter of 0.5 and a rate of (mu + 1)
  • b2 comes from a Gamma distribution with a shape parameter of 0.5 and a rate of (lambda + 1)
  • a likelihood function L is also provided, and is a function of k, mu, lambda, and the sum of all the frequencies up until that point in time, k
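
Restating those bullets in compact notation (my own summary, written with R-style indexing so it lines up with Rizzo’s code below), the distributions sampled at each step are:

mu     ~ Gamma(shape = 0.5 + sum(y[1:k]),     rate = k + b1)
lambda ~ Gamma(shape = 0.5 + sum(y[(k+1):n]), rate = n - k + b2)
b1     ~ Gamma(shape = 0.5,                   rate = mu + 1)
b2     ~ Gamma(shape = 0.5,                   rate = lambda + 1)
L[k]   is proportional to exp((lambda - mu) * k) * (mu / lambda)^sum(y[1:k])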

At each iteration, you pick a value of k to represent a point in time where a change might have occurred. You slice your data into two chunks: the chunk that happened BEFORE this point in time, and the chunk that happened AFTER this point in time. Using your data, you apply a Poisson Process with a (Hypothesized) Gamma Distributed Rate as your model. This is a pretty common model for this particular type of problem. It’s like randomly cutting a deck of cards and taking the average of the values in each of the two cuts… then doing the same thing again… a thousand times. Here is Rizzo’s (commented) code:

# start at 2, so you can use initialization values as seeds
# and go through this process once for each of your m iterations
for (i in 2:m) {
  kt <- k[i-1]  # start with the year chosen in the previous iteration
  # set your shape parameter to pick mu from, based on the characteristics
  # of the early ("before") chunk of your data
  r <- 0.5 + sum(y[1:kt])
  # now use it to pick mu
  mu[i] <- rgamma(1, shape = r, rate = kt + b1)
  # if you're at the end of the time periods, set your shape parameter
  # to 0.5 + the sum of all the frequencies; otherwise, set the shape
  # parameter that you will use to pick lambda based on the later ("after")
  # chunk of your data
  if (kt + 1 > n) r <- 0.5 + sum(y) else r <- 0.5 + sum(y[(kt+1):n])
  lambda[i] <- rgamma(1, shape = r, rate = n - kt + b2)
  # now use the mu and lambda values that you got to set b1 and b2 for the next iteration
  b1 <- rgamma(1, shape = 0.5, rate = mu[i] + 1)
  b2 <- rgamma(1, shape = 0.5, rate = lambda[i] + 1)
  # for each year, find the value of the LIKELIHOOD function, which you will
  # then use to determine what year to hop to next
  for (j in 1:n) {
    L[j] <- exp((lambda[i] - mu[i]) * j) * (mu[i] / lambda[i])^sum(y[1:j])
  }
  L <- L / sum(L)
  # determine which year to hop to next
  k[i] <- sample(1:n, prob = L, size = 1)
}

Knowing the distributions of mu, lambda, and k from hopping around our data will help us estimate values for the true population parameters. At the end of the simulation, we have an array of 1000 values of k, an array of 1000 values of mu, and an array of 1000 values of lambda — we use these to estimate the real values of the population parameters. Typically, algorithms that do this automatically throw out a whole bunch of them in the beginning (the “burn-in” period) — Rizzo tosses out 200 observations — even though some statisticians (e.g. Geyer) say that the burn-in period is unnecessary:

> b <- 201 # treats time until the 200th iteration as "burn-in"
> mean(k[b:m])
[1] 39.765
> mean(lambda[b:m])
[1] 0.9326437
> mean(mu[b:m])
[1] 3.146413

The change point happened between the 39th and 40th observations (that is, around 1890, since the series starts in 1851), the arrival rate before the change point was about 3.14 disasters per year, and the rate after the change point was about 0.93 disasters per year. (Cool!)

After I went through this example, I discovered the changepoint package, which let me run through a similar process in just a few lines of code. Fortunately, the results were very similar! I chose the “AMOC” method, which stands for “at most one change”. Other methods are available which can help identify more than one change point (PELT, BinSeg, and SegNeigh – although I got an error message every time I attempted that last method).

> results <- cpt.mean(y,method="AMOC")
> cpts(results)
cpt 
 36 
> param.est(results)
$mean
[1] 3.2500000 0.9736842
> plot(results,cpt.col="blue",xlab="Index",cpt.width=4)

(Figure: plot of the changepoint AMOC results for the coal mining data, with the detected change point marked.)

I decided to explore a little further and found even MORE change point analysis packages! So I tried this example using bcp (which I presume stands for “Bayesian Change Point”) and voila… the output looks very similar to that of the previous two methods!
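
Here’s a minimal sketch of how bcp can be applied to the same yearly counts, y, from above (assuming the bcp package is installed; the package offers many more options than shown here):

library(bcp)

# run the Bayesian change point analysis on the yearly disaster counts
fit <- bcp(y)

# plot posterior means and the posterior probability of a change point at each year
plot(fit)

# positions (years counted from 1851) with the highest posterior probability of a change
head(order(fit$posterior.prob, decreasing = TRUE))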

(Figure: bcp output for the coal mining data, showing posterior means and the posterior probability of a change point by year.)

It’s at this point that the HARD part of the data science project would begin… WHY? Why does it look like the rate of coal mining accidents decreased suddenly? Was there a change in policy or regulatory requirements in Britain, where this data was collected? Was there some sort of mass exodus away from working in the mines, so that the number of opportunities for a mining disaster to occur declined? Don’t know… the original paper from 1979 doesn’t reveal the true story behind the data.

There are also additional resources on R Bloggers that discuss change point analysis.

(Note: If I’ve missed anything, or haven’t explained anything right, please provide corrections and further insights in the comments! Thank you.)
