Category Archives: Data Science

Top 10 Business Books You Should Read in 2020


I read well over a hundred books a year, and review many for Quality Management Journal and Software Quality Professional. Today, I’d like to bring you my TOP 10 PICKS out of all the books I read in 2019. First, let me affirm that I loved all of these books — it was really difficult to rank them. The criteria I used were:

  1. Is the topic related to quality or improvement? The book had to focus on making people, process, or technology better in some way. (So even though Greg Satell’s Cascades provided an amazing treatment of how to start movements, which is helpful for innovation, it wasn’t as closely related to the themes of quality and improvement I was targeting.)
  2. Did the book have an impact on me? In particular, did it transform my thinking in some way?
  3. Finally, how big is the audience that would be interested in this book? (Although some of my picks are amazing for niche audiences, they will be less amazing for people who are not part of that group; they were ranked lower.)
  4. Did I read it in 2019? (Unfortunately, several amazing books I read at the end of 2018 like Siva Vaidhyanathan’s Antisocial Media.)

#10 – Understanding Agile Values & Principles (Duncan)

Duncan, Scott. (2019). Understanding Agile Values & Principles. An Examination of the Agile Manifesto. InfoQ, 106 pp. Available from https://www.infoq.com/minibooks/agile-values-principles

The biggest obstacle in agile transformation is getting teams to internalize the core values, and apply them as a matter of habit. This is why you see so many organizations do “fake agile” — do things like introduce daily stand-ups, declare themselves agile, and wonder why the success isn’t pouring in. Scott goes back to the first principles of the Agile Manifesto from 2001 to help leaders and teams become genuinely agile.

#9 – Risk-Based Thinking (Muschara)

Muschara, T. (2018). Risk-Based Thinking: Managing the Uncertainty of Human Error in Operations. Routledge/Taylor & Francis: Oxon and New York. 287 pages.

Risk-based thinking is one of the key tenets of ISO 9001:2015, which became the authoritative version in September 2018. Although clause 8.5.3 from ISO 9001:2008 indirectly mentioned risk, it was not a driver for identifying and executing preventive actions. The new emphasis on risk depends upon the organizational context (clause 4.1) and the needs and expectations of “interested parties” or stakeholders (clause 4.2).

Unfortunately, the ISO 9001 revision does not provide guidance for how to incorporate risk-based thinking into operations, which is where Muschara’s new book fills the gap. It’s detailed and complex, but practical (and includes immediately actionable elements) throughout. For anyone struggling with the new focus of ISO 9001:2015, this book will help you bring theory into practice.

#8 – The Successful Software Manager (Fung)

Fung, H. (2019). The Successful Software Manager. Packt Publishing, Birmingham UK, 433 pp.

There lots of books on the market that provide technical guidance to software engineers and quality assurance specialists, but little information to help them figure out how (and whether) to make the transition from developer to manager. Herman Fung’s new release fills this gap in a complete, methodical, and inspiring way. This book will benefit any developer or technical specialist who wants to know what software management entails and how they can adapt to this role effectively. It’s the book I wish I had 20 years ago.

#7 – New Power (Heimans & Timms)

Heiman, J. & Timms, H. (2018). New Power: How Power Works in Our Hyperconnected World – and How to Make it Work For You. Doubleday, New York, 325 pp.

As we change technology, the technology changes us. This book is an engaging treatise on how to navigate the power dynamics of our social media-infused world. It provides insight on how to use, and think in terms of, “platform culture”.

#6 – A Practical Guide to the Safety Profession (Maldonado)

Maldonado, J. (2019). A Practical Guide to the Safety Profession: The Relentless Pursuit (CRC Focus). CRC Press: Taylor & Francis, Boca Raton FL, 154 pp.

One of the best ways to learn about a role or responsibility is to hear stories from people who have previously served in those roles. With that in mind, if you’re looking for a way to help make safety management “real” — or to help new safety managers in your organization quickly and easily focus on the most important elements of the job — this book should be your go-to reference. In contrast with other books that focus on the interrelated concepts in quality, safety, and environmental management, this book gets the reader engaged by presenting one key story per chapter. Each story takes an honest, revealing look at safety. This book is short, sweet, and high-impact for those who need a quick introduction to the life of an occupational health and safety manager.

# 5 – Data Quality (Mahanti)

Mahanti, R. (2018). Data Quality: Dimensions, Measurement, Strategy, Management and Governance. ASQ Quality Press, Milwaukee WI, 526 pp.

I can now confidently say — if you need a book on data quality, you only need ONE book on data quality. Mahanti, who is one of the Associate Editors of Software Quality Professional, has done a masterful job compiling, organizing, and explaining all aspects of data quality. She takes a cross-industry perspective, producing a handbook that is applicable for solving quality challenges associated with any kind of data.

Throughout the book, examples and stories are emphasized. Explanations supplement most concepts and topics in a way that it is easy to relate your own challenges to the lessons within the book. In short, this is the best data quality book on the market, and will provide immediately actionable guidance for software engineers, development managers, senior leaders, and executives who want to improve their capabilities through data quality.

#4 – The Innovator’s Book (McKeown)

McKeown, M. (2020). The Innovator’s Book: Rules for Rebels, Mavericks and Innovators (Concise Advice). LID Publishing, 128 pp.

Want to inspire your teams to keep innovation at the front of their brains? If so, you need a coffee table book, and preferably one where the insights come from actual research. That’s what you’ve got with Max’s new book. (And yes, it’s “not published yet” — I got an early copy. Still meets my criteria for 2019 recommendations.)

#3 – The Seventh Level (Slavin)

Slavin, A. (2019). The Seventh Level: Transform Your Business Through Meaningful Engagement with Customer and Employees. Lioncrest Publishing, New York, 250 pp.

For starters, Amanda is a powerhouse who’s had some amazing marketing and branding successes early in her career. It makes sense, then, that she’s been able to encapsulate the lessons learned into this book that will help you achieve better customer engagement. How? By thinking about engagement in terms of different levels, from Disengagement to Literate Thinking. By helping your customers take smaller steps along this seven step path, you can make engagement a reality.

#2 – Principle Based Organizational Structure (Meyer)

Meyer, D. (2019). Principle-Based Organizational Structure: A Handbook to Help You Engineer Entrepreneurial Thinking and Teamwork into Organizations of Any Size. NDMA, 420 pp.

This is my odds-on impact favorite of the year. It takes all the best practices I’ve learned over the past two decades about designing an organization for laser focus on strategy execution — and packages them up into a step-by-step method for assessing and improving organizational design. This book can help you fix broken organizations… and most organizations are broken in some way.

#1 Story 10x (Margolis)

Margolis, M. (2019). Story 10x: Turn the Impossible Into the Inevitable. Storied, 208 pp.

You have great ideas, but nobody else can see what you see. Right?? Michael’s book will help you cut through the fog — build a story that connects with the right people at the right time. It’s not like those other “build a narrative” books — it’s like a concentrated power pellet, immediately actionable and compelling. This is my utility favorite of the year… and it changed the way I think about how I present my own ideas.


Hope you found this list enjoyable! And although it’s not on my Top 10 for obvious reasons, check out my Introductory Statistics and Data Science with R as well — I released the 3rd edition in 2019.

KPIs vs Metrics: What’s the Difference? And Why Does it Matter?

Years ago I consulted for an organization that had an enticing mission, a dynamic and highly qualified workforce of around 200 people, and an innovative roadmap that was poised to make an impact — estimated to be ~$350-500M (yes really, that big). But there was one huge problem.

As engineers, the leadership could readily provide information about uptime and Service Level Agreements (SLAs). But they had no idea whether they were on track to meet strategic goals — or even whether they would be able to deliver key operations projects — at all! We recommended that they focus on developing metrics, and provided some guidelines for the types of metrics that might help them deliver their products and services — and satisfy their demanding customers.

Unfortunately, we made a critical mistake.

They were overachievers. When we came back six months later, they had nearly a thousand metrics. (A couple of the guys, beaming with pride, didn’t quite know how to interpret our non-smiling faces.)

“So tell us… what are your top three goals for the year, and are you on track to meet them?” we asked.

They looked at each other… then at us. They looked down at their papers. They glanced at each other again. It was in that moment they realized the difference between KPIs and metrics.

  • KPIs are KEY Performance Indicators. They have meaning. They are important. They are significant. And they relate to the overall goals of your business.
  • One KPI is associated with one or more metrics. Metrics are numbers, counts, percentages, or other values that provide insight about what’s happened in the past (descriptive metrics), what is happening right now (diagnostic metrics), what will happen (predictive metrics or forecasts), or what should happen (prescriptive metrics or recommendations).

For the human brain to be able to detect and respond to patterns in organizational performance, limit the number of KPIs!

A good rule of thumb is to select 3-5 KPIs (but never more than 8 or 9!) per logical division of your organization. A logical division can be a functional area (finance, IT, call center), a product line, a program or collection of projects, or a collection of strategic initiatives.

Or, use KPIs and metrics to describe product performance, process performance, customer satisfaction, customer engagement, workforce capability, workforce capacity, leadership performance, governance performance, financial performance, market performance, and how well you are executing on the action plans that drive your strategic initiatives (strategy performance). These logical divisions come from the Baldrige Excellence Framework.

Similarly, try to limit the number of projects and initiatives in each functional area — and across your organization. Work gets done more easily when people understand how all the parts of your organization relate to one another.

What happened to the organization from the story, you might ask? Within a year, they had boiled down their metrics into 8 functional areas, were working on 4 strategic initiatives, and had no more than 5 KPIs per functional area. They found it really easy to monitor the state of their business, and respond in an agile and capable way. (They were still collecting lots more metrics, but they only had to dig into them on occasion.)

Remember… metrics are helpful, but:

KPIs are KEY!!

You don’t have thousands of keys to your house… and you don’t want thousands of KPIs. Take a critical look at what’s most important to your business, and organize that information in a way that’s accessible. You’ll find it easier to manage everything — strategic initiatives, projects, and operations.

easyMTS: My First R Package (Story, and Results)

This weekend I decided to create my first R package… it’s here! easyMTS makes it possible to create and evaluate a Mahalanobis-Taguchi System (MTS) for pseudo-classification:

https://github.com/NicoleRadziwill/easyMTS

Although I’ve been using R for 15 years, developing a package has been the one thing slightly out of reach for me. Now that I’ve been through the process once, with a package that’s not completely done (but at least has a firm foundation, and is usable to some degree), I can give you some advice:

  • Make sure you know R Markdown before you begin.
  • Some experience with Git and Github will be useful. Lots of experience will be very, very useful.
  • Write the functions that will go into your package into a file that you can source into another R program and use. If your programs work when you run the code this way, you will have averted many problems early.

The process I used to make this happen was:

I hope you enjoy following along with my process, and that it helps you write packages too. If I can do it, so can you!

My First R Package (Part 3)

After refactoring my programming so that it was only about 10 lines of code, using 12 functions I wrote an loaded in via the source command, I went through all the steps in Part 1 of this blog post and Part 2 of this blog post to set up the R package infrastructure using testthis in RStudio. Then things started humming along with the rest of the setup:

> use_mit_license("Nicole Radziwill")
✔ Setting active project to 'D:/R/easyMTS'
✔ Setting License field in DESCRIPTION to 'MIT + file LICENSE'
✔ Writing 'LICENSE.md'
✔ Adding '^LICENSE\\.md$' to '.Rbuildignore'
✔ Writing 'LICENSE'

> use_testthat()
✔ Adding 'testthat' to Suggests field in DESCRIPTION
✔ Creating 'tests/testthat/'
✔ Writing 'tests/testthat.R'
● Call `use_test()` to initialize a basic test file and open it for editing.

> use_vignette("easyMTS")
✔ Adding 'knitr' to Suggests field in DESCRIPTION
✔ Setting VignetteBuilder field in DESCRIPTION to 'knitr'
✔ Adding 'inst/doc' to '.gitignore'
✔ Creating 'vignettes/'
✔ Adding '*.html', '*.R' to 'vignettes/.gitignore'
✔ Adding 'rmarkdown' to Suggests field in DESCRIPTION
✔ Writing 'vignettes/easyMTS.Rmd'
● Modify 'vignettes/easyMTS.Rmd'

> use_citation()
✔ Creating 'inst/'
✔ Writing 'inst/CITATION'
● Modify 'inst/CITATION'

Add Your Dependencies

> use_package("ggplot2")
✔ Adding 'ggplot2' to Imports field in DESCRIPTION
● Refer to functions with `ggplot2::fun()`
> use_package("dplyr")
✔ Adding 'dplyr' to Imports field in DESCRIPTION
● Refer to functions with `dplyr::fun()`

> use_package("magrittr")
✔ Adding 'magrittr' to Imports field in DESCRIPTION
● Refer to functions with `magrittr::fun()`
> use_package("tidyr")
✔ Adding 'tidyr' to Imports field in DESCRIPTION
● Refer to functions with `tidyr::fun()`

> use_package("MASS")
✔ Adding 'MASS' to Imports field in DESCRIPTION
● Refer to functions with `MASS::fun()`

> use_package("qualityTools")
✔ Adding 'qualityTools' to Imports field in DESCRIPTION
● Refer to functions with `qualityTools::fun()`

> use_package("highcharter")
Registered S3 method overwritten by 'xts':
  method     from
  as.zoo.xts zoo 
Registered S3 method overwritten by 'quantmod':
  method            from
  as.zoo.data.frame zoo 
✔ Adding 'highcharter' to Imports field in DESCRIPTION
● Refer to functions with `highcharter::fun()`

> use_package("cowplot")
✔ Adding 'cowplot' to Imports field in DESCRIPTION
● Refer to functions with `cowplot::fun()`

Adding Data to the Package

I want to include two files, one data frame containing 50 observations of a healthy group with 5 predictors each, and another data frame containing 15 observations from an abnormal or unhealthy group (also with 5 predictors). I made sure the two CSV files I wanted to add to the package were in my working directory first by using dir().

> use_data_raw()
✔ Creating 'data-raw/'
✔ Adding '^data-raw$' to '.Rbuildignore'
✔ Writing 'data-raw/DATASET.R'
● Modify 'data-raw/DATASET.R'
● Finish the data preparation script in 'data-raw/DATASET.R'
● Use `usethis::use_data()` to add prepared data to package

> mtsdata1 <- read.csv("MTS-Abnormal.csv") %>% mutate(abnormal=1)
> usethis::use_data(mtsdata1)
✔ Creating 'data/'
✔ Saving 'mtsdata1' to 'data/mtsdata1.rda'

> mtsdata2 <- read.csv("MTS-Normal.csv") %>% mutate(normal=1)
> usethis::use_data(mtsdata2)
✔ Saving 'mtsdata2' to 'data/mtsdata2.rda'

Magically, this added my two files (in .rds format) into my /data directory. (Now, though, I don’t know why the /data-raw directory is there… maybe we’ll figure that out later.) I decided it was time to commit these to my repository again:

Following the instruction above, I re-knit the README.Rmd and then it was possible to commit everything to Github again. At which point I ended up in a fistfight with git, again saved only by my software engineer partner who uses Github all the time:

I think it should be working. The next test will be if anyone can install this from github using devtools. Let me know if it works for you… it works for me locally, but you know how that goes. The next post will show you how to use it 🙂

install.packages("devtools")
install_github("NicoleRadziwill/easyMTS")

SEE WHAT WILL BECOME THE easyMTS VIGNETTE –>

My First R Package (Part 2)

In Part 1, I set up RStudio with usethis, and created my first Minimum Viable R Package (MVRP?) which was then pushed to Github to create a new repository.

I added a README:

> use_readme_rmd()
✔ Writing 'README.Rmd'
✔ Adding '^README\\.Rmd$' to '.Rbuildignore'
● Modify 'README.Rmd'
✔ Writing '.git/hooks/pre-commit'

Things were moving along just fine, until I got this unkind message (what do you mean NOT an R package???!! What have I been doing the past hour?)

> use_testthat()
Error: `use_testthat()` is designed to work with packages.
Project 'easyMTS' is not an R package.

> use_mit_license("Nicole Radziwill")
✔ Setting active project to 'D:/R/easyMTS'
Error: `use_mit_license()` is designed to work with packages.
Project 'easyMTS' is not an R package.

Making easyMTS a Real Package

I sent out a tweet hoping to find some guidance, because Stack Overflow and Google and the RStudio community were coming up blank. As soon as I did, I discovered this button in RStudio:

The first time I ran it, it complained that I needed Rtools, but that Rtools didn’t exist for version 3.6.1. I decided to try finding and installing Rtools anyway because what could I possibly lose. I went to my favorite CRAN repository and found a link for Rtools just under the link for the base install:

I’m on Windows 10, so this downloaded an .exe which I quickly right-clicked on to run… the installer did its thing, and I clicked “Finish”, assuming that all was well. Then I went back into RStudio and tried to do Build -> Clean and Rebuild… and here’s what happened:

IT WORKED!! (I think!!!)

It created a package (top right) and then loaded it into my RStudio session (bottom left)! It loaded the package name into the package console (bottom right)!

I feel like this is a huge accomplishment for now, so I’m going to move to Part 3 of my blog post. We’ll figure out how to close the gaps that I’ve invariably introduced by veering off-tutorial.

GO TO PART 3 –>

My First R Package (Part 1)

(What does this new package do? Find out here.)

I have had package-o-phobia for years, and have skillfully resisted learning how to build a new R package. However, I do have a huge collection of scripts on my hard drive with functions in them, and I keep a bunch of useful functions up on Github so anyone who wants can source and use them. I source them myself! So, really, there’s no reason to package them up and (god forbid) submit them to CRAN. I’m doing fine without packages!

Reality check: NO. As I’ve been told by so many people, if you have functions you use a lot, you should write a package. You don’t even have to think about a package as something you write so that other people can use. It is perfectly fine to write a package for an audience of one — YOU.

But I kept making excuses for myself until very recently, when I couldn’t find a package to do something I needed to do, and all the other packages were either not getting the same answers as in book examples OR they were too difficult to use. It was time.

So armed with moral support and some exciting code, I began the journey of a thousand miles with the first step, guided by Tomas Westlake and Emil Hvitfeldt and of course Hadley. I already had some of the packages I needed, but did not have the most magical one of all, usethis:

install.packages("usethis")

library(usethis)
library(roxygen2)
library(devtools)

Finding a Package Name

First, I checked to see if the package name I wanted was available. It was not available on CRAN, which was sad:

> available("MTS")
Urban Dictionary can contain potentially offensive results,
  should they be included? [Y]es / [N]o:
1: Y
-- MTS -------------------------------------------------------------------------
Name valid: ✔
Available on CRAN: ✖ 
Available on Bioconductor: ✔
Available on GitHub:  ✖ 
Abbreviations: http://www.abbreviations.com/MTS
Wikipedia: https://en.wikipedia.org/wiki/MTS
Wiktionary: https://en.wiktionary.org/wiki/MTS

My second package name was available though, and I think it’s even better. I’ve written code to easily create and evaluate diagnostic algorithms using the Mahalanobis-Taguchi System (MTS), so my target package name is easyMTS:

> available("easyMTS")
-- easyMTS ------------------------------------------------------------
Name valid: ✔
Available on CRAN: ✔ 
Available on Bioconductor: ✔
Available on GitHub:  ✔ 
Abbreviations: http://www.abbreviations.com/easy
Wikipedia: https://en.wikipedia.org/wiki/easy
Wiktionary: https://en.wiktionary.org/wiki/easy
Sentiment:+++

Create Minimum Viable Package

Next, I set up the directory structure locally. Another RStudio session started up on its own; I’m hoping this is OK.

> create_package("D:/R/easyMTS")
✔ Creating 'D:/R/easyMTS/'
✔ Setting active project to 'D:/R/easyMTS'
✔ Creating 'R/'
✔ Writing 'DESCRIPTION'
Package: easyMTS
Title: What the Package Does (One Line, Title Case)
Version: 0.0.0.9000
Authors@R (parsed):
    * First Last <first.last@example.com> [aut, cre] (<https://orcid.org/YOUR-ORCID-ID>)
Description: What the package does (one paragraph).
License: What license it uses
Encoding: UTF-8
LazyData: true
✔ Writing 'NAMESPACE'
✔ Writing 'easyMTS.Rproj'
✔ Adding '.Rproj.user' to '.gitignore'
✔ Adding '^easyMTS\\.Rproj$', '^\\.Rproj\\.user$' to '.Rbuildignore'
✔ Opening 'D:/R/easyMTS/' in new RStudio session
✔ Setting active project to '<no active project>'

Syncing with Github

use_git_config(user.name = "nicoleradziwill", user.email = "nicole.radziwill@gmail.com")

browse_github_token()

This took me to a page on Github where I entered my password, and then had to go down to the bottom of the page to click on the green button that said “Generate Token.” They said I would never be able to see it again, so I gmailed it to myself for easy searchability. Next, I put this token where it is supposed to be locally:

edit_r_environ()

A blank file popped up in RStudio, and I added this line, then saved the file to its default location (not my real token):

GITHUB_PAT=e54545x88f569fff6c89abvs333443433d

Then I had to restart R and confirm it worked:

github_token()

This revealed my token! I must have done the Github setup right. Finally I could proceed with the rest of the git setup:

> use_github()
✔ Setting active project to 'D:/R/easyMTS'
Error: Cannot detect that project is already a Git repository.
Do you need to run `use_git()`?
> use_git()
✔ Initialising Git repo
✔ Adding '.Rhistory', '.RData' to '.gitignore'
There are 5 uncommitted files:
* '.gitignore'
* '.Rbuildignore'
* 'DESCRIPTION'
* 'easyMTS.Rproj'
* 'NAMESPACE'
Is it ok to commit them?

1: No
2: Yeah
3: Not now

Selection: use_github()
Enter an item from the menu, or 0 to exit
Selection: 2
✔ Adding files
✔ Commit with message 'Initial commit'
● A restart of RStudio is required to activate the Git pane
Restart now?

1: No way
2: For sure
3: Nope

Selection: 2

When I tried to commit to Github, it was asking me if the description was OK, but it was NOT. Every time I said no, it kicked me out. Turns out it wanted me to go directly into the DESCRIPTION file and edit it, so I did. I used Notepad because this was crashing RStudio. But this caused a new problem:

Error: Uncommited changes. Please commit to git before continuing.

This is the part of the exercise where it’s great to be living with a software engineer who uses git and Github all the time. He pointed me to a tiny little tab that said “Terminal” in the bottom left corner of RStudio, just to the right of “Console”. He told me to type this, which unstuck me:

THEN, when I went back to the Console, it all worked:

> use_git()
> use_github()
✔ Checking that current branch is 'master'
Which git protocol to use? (enter 0 to exit) 

1: ssh   <-- presumes that you have set up ssh keys
2: https <-- choose this if you don't have ssh keys (or don't know if you do)

Selection: 2
● Tip: To suppress this menu in future, put
  `options(usethis.protocol = "https")`
  in your script or in a user- or project-level startup file, '.Rprofile'.
  Call `usethis::edit_r_profile()` to open it for editing.
● Check title and description
  Name:        easyMTS
  Description: 
Are title and description ok?

1: Yes
2: Negative
3: No

Selection: 1
✔ Creating GitHub repository
✔ Setting remote 'origin' to 'https://github.com/NicoleRadziwill/easyMTS.git'
✔ Pushing 'master' branch to GitHub and setting remote tracking branch
✔ Opening URL 'https://github.com/NicoleRadziwill/easyMTS'

This post is getting long, so I’ll split it into parts. See you in Part 2.

GO TO PART 2 –>

« Older Entries