
A Decade of PhD: What I’ve Learned from Academia and Industry

Today is Cinco de Mayo! It’s also the 10th Anniversary of my PhD defense… something I carefully timed for late afternoon on this day in 2009. (I wanted to make sure I could celebrate the joyful occasion — or drown my sorrows — with 2-for-1 margaritas. Fortunately, the situation was liquid joy; unfortunately, I still got a hangover.)

I’m writing this post to share what I’ve learned about the value of getting a PhD (is there value?) and the applicability of PhD-level work to industry. If you’re considering more education, maybe this will help you decide whether it’s the right choice. If you’re in industry and trying to figure out whether to hire PhDs, some of what I write here might help. But first, some background!

I never even thought I’d get a PhD — it certainly didn’t happen out of intent or design. My family was poor and I studied ridiculously hard so I could “escape it.” I didn’t think I was smart enough for a PhD, even though I started college at 16 taking half undergrad classes and half grad classes in meteorology. I aced my grad classes and very maturely ignored my required classes, so I got kicked out. (At the same time, I wasn’t really fitting in with people… my roommate called me “Nerdcole”.) When I was let back in, the department head wouldn’t let me take any grad classes, so I got bored and burned out… not surprising, since I was supporting myself and working three jobs to make that happen. I quit school to work at an e-commerce startup when I was 18. A few months later, thanks to (good) peer pressure, I took three credit-by-exam tests to see if they would get me over the finish line, and thanks to some side skills I had picked up in vector calculus and statistics, it worked and I got the BS. But I was still left with a pretty bad GPA, and even worse self-esteem, and I was convinced no one would ever let me into grad school.

I figured I’d focus on industry and help companies grow. There was no other choice.

The Back Story

After spending a couple of years building web sites and storefronts (a huge feat in 1995 and 1996!) I took a job at a national lab as a systems analyst, supporting older scientists and engineers and helping them get work done. The main lesson I learned during this time was that alignment between strategy and objectives doesn’t come for free (teams of people have to spend dedicated time on it), and most people are really disorganized. There had to be a better way to get work done.

A few years later, I was a traveling Solutions Architect, parachuted once or twice a month into CRM software implementation fiascos around the globe. My job was to figure out what to do to turn these jobs around — was it a people problem? An architecture problem? A training problem? A systems thinking problem? A little of everything? I had a couple weeks to make a recommendation, and then I was on to the next project (results were usually pretty good). But since this required evaluating technology decisions in the context of business and financial constraints, my boss suggested that I use the tuition benefits offered by my job to get an MBA. I had taken 9 credits of science and industrial engineering classes since I’d graduated, so I contacted two of the local MBA schools to see if they’d accept me and my credits. Sure enough, one of them did! I took evening classes for a year and a half, and eventually ended up with an MBA. But I never thought I could (or would) go farther — I’m not that smart, I’d tell myself. Also, it’s expensive. Also, a PhD would probably make me less marketable. (All lies, spoken by a lack of confidence.)

Shortly thereafter, the travel started to get to me (I was flying at least three days a week), so I looked for an opportunity to grow and cultivate a software development organization. (That’s how I ended up in Data Management at NRAO.) A little management led to a lot of management. A few years later one of the organization’s leaders said it was “too bad I didn’t have a PhD” — because in a highly scientific and technical organization, it would give me more credibility and make me a better leader.

“Will you pay for it?” I asked. “Sure,” they said. I just had to find a suitable program that wouldn’t require me to go full time. I’ve always loved learning, and I couldn’t resist the temptation of free education — even if it meant I’d have to balance the demands of a challenging full-time job and a first-time baby at the same time. That’s how much I love learning, just for learning’s sake! I still didn’t think a PhD had that much value, unless you were studying to be a lab scientist or you were dead set on becoming a historian and teaching for the rest of your life. None of these personas was me, but the free education thing sold me, and I didn’t really think about how relevant this step was to my career direction until much later.

Fortunately, I found the perfect program for me — a hybrid academic/practitioner PhD that would help me develop the research and analytical skills to solve practical problems in business and industrial technology.

The next few years were pretty rough, and by the time I got my PhD, I was in my 14th year of post-college professional employment. First lesson learned: it’s probably not the best move to start PhD coursework when you have a three-month old. I have no idea how I made it through.

Shortly after graduating, the impacts of the financial crisis hit our federally funded organization and I was able to segue into a second career as a college professor, teaching data science and manufacturing/EHSQ classes. For the past year, I’ve been back in industry (maybe permanently; we’ll see) and have a better sense of the value of PhDs in industry.

Value of Getting a PhD

There are lots of reasons I’m happy with the time I spent getting a PhD, other than the fact that it helped me get an entirely new job when the economy was down:

  • First and foremost, I’m a better critical thinker. It’s now my nature to look at all parts of a problem, examine the interactions between them, and make sure I have all the background information needed to start working on it.
  • I’m a better writer too. I look at reports and presentations I wrote years ago, and can see all the holes and places where I made assumptions that weren’t valid.
  • I developed a new appreciation for clarity. Researchers want to make sure their messages, methodologies, and models are clear and unambiguous… through that contrast, I was able to recognize that in industry, there’s often pressure to skip due diligence and move fast to perform. This pressure leads to ambiguity, which tends towards what I call “intellectual waste” – people assuming that they see a problem or a project in the same way because they haven’t taken the time to guarantee clarity.
  • It’s easier for me to quickly determine whether information might be true or false, or whether there are gaps that need to be closed before moving forward. (It’s possible that this skill is more from grading and evaluating student work… something that’s orders of magnitude harder than it seems.)
  • I realized that words matter. Really thinking about how one person will respond to a word or phrase, and whether it conveys the meaning that you intend, is a craft — that’s enhanced by working with collaborators.
  • And although I knew this one prior to the PhD, I found that data matters. Where did your data come from? Can you access the original? What kind of people (or instruments) gathered it? Can you trust them? The quality of your data — and the suitability of the methods you choose — will impact the quality and integrity of the conclusions you generate from it. Awareness of these factors is essential.

Value of Caution

One of the biggest lessons was the most surprising. Early on in the PhD program I was told that my opinion didn’t count — regardless of how many years of experience I had. Every statement I made had to be backed up and cited, preferably using material that had been peer-reviewed by other qualified people. At first I was kind of offended by this… didn’t these academics have any sense of the value of actual real-world employment? Apparently not.

But something funny happened as I developed the habit of looking for solid references, distilling their messages, and citing them accurately: I became more careful. And in the evolution of my caution and attention to detail, the quality of my work — ANY work — improved tremendously. I was able to learn from what other people had discovered, and anticipate (and resolve) problems in advance. I learned that “standing on the shoulders of giants” actually means figuring out when solved problems already exist so you don’t waste time reinventing wheels.

Something else funny happened as soon as I graduated: all of a sudden, people were asking me for my opinion. But the habit of due diligence was so ingrained that I couldn’t express my opinion… I was compelled to back it up with facts!

(I think this was the point all along. Go figure.)

The beauty of going through the entire messy process of PhD coursework and comps and research and defense and editing — the entire end-to-end process, not cutting out in the middle anywhere — is that it gave me the discipline and process to root out accurate and complete answers to problems. Or, at the very least, to call out the gaps that stand in the way of getting there.

There’s a lot of pressure in industry to move fast, but due diligence is still critical for accurate self-assessment and effective cross-functional communication. Slowing down and figuring out how you know what you know — and making sure everyone is literally on the same page — can help your organization achieve its goals more quickly.

Value of PhDs to Industry

So employers (especially in tech) — should you hire PhDs? Yes. Here’s why:

  • PhDs are trained to find gaps in knowledge and understanding. Is your strategic plan grounded in reality, or is it just wishful thinking? Are your Project Charters well scoped, budgeted, and planned out? Is your workforce prepared to carry out your strategic initiatives?
  • Many PhDs with experience teaching undergrads are great at making complex topics accessible to other audiences. This is fantastic for training, cross-training, and marketing.
  • PhDs love research and writing, and can help you with gathering and interpreting data and content marketing.
  • PhDs love learning. Want to be on the cutting edge? They’re great in R&D… they can help you distill new insights from research papers and interpret and apply them accurately.
  • If you want to do AI or machine learning, or anything that uses Big Data, make sure you have at least one PhD statistician with practical analytical experience. They can prevent you from spending millions on dead ends and help you apply Occam’s Razor to avoid unnecessary complexity (the kind that can lead to technical debt later).

Will there be drawbacks? Sure. The habit of caution may need to be tempered somewhat — you don’t have to probe to the bottom of an issue to generate useful information that a business can use to make progress.

Bottom line… don’t be afraid of PhDs! We are mere mortals who just happen to have spent several years trying to figure out how to get to the core — the fundamental truth — of a complex problem. As a result, we know how to approach problems like this — problems that many businesses have lots of. (We are not overqualified at all… we just have an extra skill set in something you desperately need, even if you don’t yet realize you need it.)

Getting a PhD was challenging, frustrating, and maddening at times (especially the final part, getting the camera-ready text ready for ProQuest). I never planned to do it, but I’d totally do it again. I think my only regret is that I got a PhD in a hybrid business/industrial engineering discipline… it allowed me the freedom to pursue my interests, but if I were at the same crossroads now, I’d get a PhD in statistics to complement my MBA. Overall, this is a pretty tiny regret.

The Connected, Intelligent, Automated Industry 4.0 Supply Chain

ASQ’s March Influential Voices Roundtable asks this question: “Investopedia defines end-to-end supply chain (or ‘digital supply chain’) as a process that refers to the practice of including and analyzing each and every point in a company’s supply chain – from sourcing and ordering raw materials to the point where the good reaches the end consumer. Implementing this practice can increase process speed, reduce waste, and decrease costs.

“In your experience, what are some best practices for planning and implementing this style of supply chain to ensure success?”

Supply chains are the lifeblood of any business, impacting everything from the quality, delivery, and costs of a business’s products and services to customer service and satisfaction and, ultimately, profitability and return on assets.

Stank, T., Scott, S., & Hazen, B. (2018, April). A savvy guide to the digital supply chain: How to evaluate and leverage technology to build a supply chain for the digital age. Whitepaper, Haslam School of Business, University of Tennessee.

Industry 4.0 enabling technologies like affordable sensors, more ubiquitous internet connectivity and 5G networks, and reliable software packages for developing intelligent systems have started fueling a profound digital transformation of supply chains. Although the transformation will be a gradual evolution, spanning years (and perhaps decades), the changes will reduce or eliminate key pain points:

  • Connected: Lack of visibility keeps 84% of Chief Supply Chain Officers up at night. More sources of data and enhanced connectedness to information will alleviate this issue.
  • Intelligent: 87% of Chief Supply Chain Officers say that managing supply chain disruptions proactively is a huge challenge. Intelligent algorithms and prescriptive analytics can make this more actionable.
  • Automated: 80% of all data that could enable supply chain visibility and traceability is “dark” or siloed. Automated discovery, aggregation, and processing will ensure that knowledge can be formed from data and information.

Since the transformation is just getting started, best practices are few and far between — but recommendations do exist. Stank et al. (2018) created a digital supply chain maturity rubric whose highest levels reflect what they consider recommended practices. I like these suggestions because they span technical systems and management systems:

  • Gather structured and unstructured data from customers, suppliers, and the market using sensors and crowdsourcing (presumably including social media)
  • Use AI & ML to “enable descriptive, predictive, and prescriptive insights simultaneously” and support continuous learning
  • Digitize all systems that touch the supply chain: strategy, planning, sourcing, manufacturing, distribution, collaboration, and customer service
  • Add value by improving efficiency, visibility, security, trust, authenticity, accessibility, customization, customer satisfaction, and financial performance
  • Use just-in-time training to build new capabilities for developing the smart supply chain

One drawback of these suggestions is that they provide general (rather than targeted) guidance.

A second recommendation is to plan initiatives that align with your level of digital supply chain maturity. Soosay & Kannusamy (2018) studied 360 firms in the Australian food industry and identified four stages:

  • Stage 1 – Computerization and connectivity. Sharing data across the supply chain ecosystem requires that it be stored in locations accessible by partners. Cloud-based systems are one option. Make sure authentication and verification are carefully implemented.
  • Stage 2 – Visibility and transparency. Adding new sensors and making that data accessible provides new visibility into the supply chain. Key enabling technologies include GPS, time-temperature integrators and data loggers.
  • Stage 3 – Predictive capability. Access to real-time data from supply chain partners will increase the reliability and resilience of the entire network. Enterprise Resource Planning (ERP), Manufacturing Execution Systems (MES), and radio frequency identification (RFID) tagging are enablers at this stage.
  • Stage 4 – Adaptability and self-learning. At this stage, partners plan and execute the supply chain collaboratively. Through Vendor Managed Inventory (VMI), responsibility for replenishment can even be directly assumed by the supplier.

Traceability is also gaining prominence as a key issue, and permissioned blockchains provide one way to make this happen with sensor data and transaction data. Recently, the IBM Food Trust has demonstrated the practical value provided by the Hyperledger blockchain infrastructure for this purpose. Their prototypes have helped to identify supply chain bottlenecks that might not otherwise have been detected.

What should you do in your organization? Any way to enhance information sharing between members of the supply chain ecosystem — or more effectively synthesize and interpret it — should help your organization shift towards the end-to-end vision. Look for opportunities in both categories.


References for Connected, Intelligent, Automated stats:
  1. IBM. (2018, February). Global Chief Supply Chain Officer Study. Available from this URL
  2. Geriant, J. (2015, October). The Changing Face of Supply Chain Risk Management. SCM World.
  3. IBM & IDC. (2017, March). The Thinking Supply Chain. Available from this URL

Imperfect Action is Better Than Perfect Inaction: What Harry Truman Can Teach Us About Loss Functions (with an intro to ggplot)

One of the heuristics we use at Intelex to guide decision making is former US President Truman’s advice that “imperfect action is better than perfect inaction.” What it means is — don’t wait too long to take action, because you don’t want to miss opportunities. Good advice, right?

When I share this with colleagues, I often hear a response like: “that’s dangerous!” To which my answer is “well sure, sometimes, but it can be really valuable depending on how you apply it!” The trick is: knowing how and when.

Here’s how it can be dangerous. For example, statistical process control (SPC) exists to keep us from tampering with processes — from taking imperfect action based on random variation, which will not only get us nowhere, but can exacerbate the problem we were trying to solve. The secret is to apply Truman’s heuristic based on an understanding of exactly how imperfect is OK with your organization, based on your risk appetite. And this is where loss functions can help.
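As a quick aside on the SPC point, here’s a minimal sketch of what “random variation you shouldn’t act on” looks like. (The qcc package is my choice for illustration, not part of the original discussion; install it first if needed.)

# Minimal sketch, assuming the qcc package is installed: install.packages("qcc")
library(qcc)

# Simulate 20 subgroups of 5 observations from a stable process.
# The xbar chart should show every point within the control limits --
# random variation only, so "imperfect action" here would just be tampering.
set.seed(42)
stable.process <- matrix(rnorm(100, mean = 10, sd = 1), ncol = 5)
qcc(stable.process, type = "xbar")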

Along the way, we’ll demonstrate how to do a few important things related to plotting with the ggplot2 package in R, gradually adding in new elements to the plot so you can see how it’s layered, including:

  • Plot a function based on its equation
  • Add text annotations to specific locations on a ggplot
  • Draw horizontal and vertical lines on a ggplot
  • Draw arrows on a ggplot
  • Add extra dots to a ggplot
  • Eliminate axis text and axis tick marks

What is a Loss Function?

A loss function quantifies how unhappy you’ll be based on the accuracy or effectiveness of a prediction or decision. In the simplest case, you control one variable (x) which leads to some cost or loss (y). For the case we’ll examine in this post, the variables are:

  • How much time and effort you put in to scoping and characterizing the problem (x); we assume that time+effort invested leads to real understanding
  • How much it will cost you (y); can be expressed in terms of direct costs (e.g. capex + opex) as well as opportunity costs or intangible costs (e.g. damage to reputation)

Here is an example of what this might look like, if you have a situation where overestimating (putting in too much x) OR underestimating (putting in too little x) are both equally bad. In this case, x=10 is the best (least costly) decision or prediction:

(Plot of a typical squared loss function.)
# describe the equation we want to plot
parabola <- function(x) ((x-10)^2)+10  

# initialize ggplot with a dummy dataset
library(ggplot2)
p <- ggplot(data = data.frame(x=0), mapping = aes(x=x)) 

p + stat_function(fun=parabola) + xlim(-2,23) + ylim(-2,100) +
     xlab("x = the variable you can control") + 
     ylab("y = cost of loss ($$)")

In regression (and other techniques where you’re trying to build a model to predict a quantitative dependent variable), mean square error is a squared loss function that helps you quantify error. It captures two facts: the farther away you are from the correct answer, the worse the error is — and both overestimating and underestimating are bad (which is why you square the values). Across this and related techniques, the loss function captures these characteristics:

(Figure: characteristics of common loss functions, from http://www.cs.cornell.edu/courses/cs4780/2015fa/web/lecturenotes/lecturenote10.html)
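As a quick numeric illustration (toy numbers of my own, not from the lecture notes), squared loss means a prediction that misses by 4 hurts sixteen times more than one that misses by 1:

# Mean square error (MSE) as a squared loss, on made-up numbers:
actual    <- c(10, 12, 14, 16)
predicted <- c(11, 12, 10, 17)
mean((actual - predicted)^2)   # (1 + 0 + 16 + 1) / 4 = 4.5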

Not all loss functions have that general shape. For classification, for example, the 0-1 loss function tells the story that if you get a classification wrong (x < 0) you incur all the penalty or loss (y=1), whereas if you get it right (x > 0) there is no penalty or loss (y=0):

# set up data frame of red points
d.step <- data.frame(x=c(-3,0,0,3), y=c(1,1,0,0))

# note that the loss function really extends to x=-Inf and x=+Inf
ggplot(d.step) + geom_step(mapping=aes(x=x, y=y), direction="hv") +
     geom_point(mapping=aes(x=x, y=y), color="red") + 
     xlab("y* f(x)") + ylab("Loss (Cost)") +  
     ggtitle("0-1 Loss Function for Classification")

Use the Loss Function to Make Strategic Decisions

So let’s get back to Truman’s advice. Ideally, we want to choose the x (the amount of time and effort to invest into project planning) that results in the lowest possible cost or loss. That’s the green point at the nadir of the parabola:

p + stat_function(fun=parabola) + xlim(-2,23) + ylim(-2,100) + 
     xlab("Time Spent and Information Gained (e.g. person-weeks)") + ylab("$$ COST $$") +
     annotate(geom="text", x=10, y=5, label="Some Effort, Lowest Cost!!", color="darkgreen") +
     geom_point(aes(x=10, y=10), colour="darkgreen")

Costs get higher as we move up the x-axis:

p + stat_function(fun=parabola) + xlim(-2,23) + ylim(-2,100) + 
     xlab("Time Spent and Information Gained (e.g. person-weeks)") + ylab("$$ COST $$") +
     annotate(geom="text", x=10, y=5, label="Some Effort, Lowest Cost!!", color="darkgreen") +
     geom_point(aes(x=10, y=10), colour="darkgreen") +
     annotate(geom="text", x=0, y=100, label="$$$$$", color="green") +
     annotate(geom="text", x=0, y=75, label="$$$$", color="green") +
     annotate(geom="text", x=0, y=50, label="$$$", color="green") +
     annotate(geom="text", x=0, y=25, label="$$", color="green") +
     annotate(geom="text", x=0, y=0, label="$ 0", color="green")

And time+effort grows as we move along the x-axis (we might spend minutes on a problem at the left of the plot, or weeks to years by the time we get to the right hand side):

p + stat_function(fun=parabola) + xlim(-2,23) + ylim(-2,100) + 
     xlab("Time Spent and Information Gained (e.g. person-weeks)") + ylab("$$ COST $$") +
     annotate(geom="text", x=10, y=5, label="Some Effort, Lowest Cost!!", color="darkgreen") +
     geom_point(aes(x=10, y=10), colour="darkgreen") +
     annotate(geom="text", x=0, y=100, label="$$$$$", color="green") +
     annotate(geom="text", x=0, y=75, label="$$$$", color="green") +
     annotate(geom="text", x=0, y=50, label="$$$", color="green") +
     annotate(geom="text", x=0, y=25, label="$$", color="green") +
     annotate(geom="text", x=0, y=0, label="$ 0", color="green") +
     annotate(geom="text", x=2, y=0, label="minutes\nof effort", size=3) +
     annotate(geom="text", x=20, y=0, label="months\nof effort", size=3)

Planning too Little = Planning too Much = Costly

What this means is — if we don’t plan, or we plan just a little bit, we incur high costs. We might make the wrong decision! Or miss critical opportunities! But if we plan too much — we’re going to spend too much time, money, and/or effort compared to the benefit of the solution we provide.


p + stat_function(fun=parabola) + xlim(-2,23) + ylim(-2,100) + 
     xlab("Time Spent and Information Gained (e.g. person-weeks)") + ylab("$$ COST $$") +
     annotate(geom="text", x=10, y=5, label="Some Effort, Lowest Cost!!", color="darkgreen") +
     geom_point(aes(x=10, y=10), colour="darkgreen") +
     annotate(geom="text", x=0, y=100, label="$$$$$", color="green") +
     annotate(geom="text", x=0, y=75, label="$$$$", color="green") +
     annotate(geom="text", x=0, y=50, label="$$$", color="green") +
     annotate(geom="text", x=0, y=25, label="$$", color="green") +
     annotate(geom="text", x=0, y=0, label="$ 0", color="green") +
     annotate(geom="text", x=2, y=0, label="minutes\nof effort", size=3) +
     annotate(geom="text", x=20, y=0, label="months\nof effort", size=3) +
     annotate(geom="text",x=3, y=85, label="Little (or no) Planning\nHIGH COST", color="red") +
     annotate(geom="text", x=18, y=85, label="Paralysis by Planning\nHIGH COST", color="red") +
     geom_vline(xintercept=0, linetype="dotted") + geom_hline(yintercept=0, linetype="dotted")

The trick is to FIND THAT CRITICAL LEVEL OF TIME and EFFORT invested to gain information and understanding about your problem… and then if you’re going to err, make sure you err towards the left — if you’re going to make a mistake, make the mistake that costs less and takes less time to make:

arrow.x <- c(10, 10, 10, 10)
arrow.y <- c(35, 50, 65, 80)
arrow.x.end <- c(6, 6, 6, 6)
arrow.y.end <- arrow.y
d <- data.frame(arrow.x, arrow.y, arrow.x.end, arrow.y.end)

p + stat_function(fun=parabola) + xlim(-2,23) + ylim(-2,100) + 
     xlab("Time Spent and Information Gained (e.g. person-weeks)") + ylab("$$ COST $$") +
     annotate(geom="text", x=10, y=5, label="Some Effort, Lowest Cost!!", color="darkgreen") +
     geom_point(aes(x=10, y=10), colour="darkgreen") +
     annotate(geom="text", x=0, y=100, label="$$$$$", color="green") +
     annotate(geom="text", x=0, y=75, label="$$$$", color="green") +
     annotate(geom="text", x=0, y=50, label="$$$", color="green") +
     annotate(geom="text", x=0, y=25, label="$$", color="green") +
     annotate(geom="text", x=0, y=0, label="$ 0", color="green") +
     annotate(geom="text", x=2, y=0, label="minutes\nof effort", size=3) +
     annotate(geom="text", x=20, y=0, label="months\nof effort", size=3) +
     annotate(geom="text",x=3, y=85, label="Little (or no) Planning\nHIGH COST", color="red") +
     annotate(geom="text", x=18, y=85, label="Paralysis by Planning\nHIGH COST", color="red") +
     geom_vline(xintercept=0, linetype="dotted") + 
     geom_hline(yintercept=0, linetype="dotted") +
     geom_vline(xintercept=10) +
     geom_segment(data=d, mapping=aes(x=arrow.x, y=arrow.y, xend=arrow.x.end, yend=arrow.y.end),
     arrow=arrow(), color="blue", size=2) +
     annotate(geom="text", x=8, y=95, size=2.3, color="blue",
     label="we prefer to be\non this side of the\nloss function")

Moral of the Story

The moral of the story is… imperfect action can be expensive, but perfect action is ALWAYS expensive. Spend less to make mistakes and learn from them, if you can! This is one of the value drivers for agile methodologies… agile practices can help improve communication and coordination so that the loss function is minimized.

## FULL CODE FOR THE COMPLETELY ANNOTATED CHART ##
# If you change the equation for the parabola, annotations may shift and be in the wrong place.
library(ggplot2)

# describe the equation we want to plot
parabola <- function(x) ((x-10)^2)+10

# initialize ggplot with a dummy dataset
p <- ggplot(data = data.frame(x=0), mapping = aes(x=x))

my.title <- expression(paste("Imperfect Action Can Be Expensive. But Perfect Action is ", italic("Always"), " Expensive."))

arrow.x <- c(10, 10, 10, 10)
arrow.y <- c(35, 50, 65, 80)
arrow.x.end <- c(6, 6, 6, 6)
arrow.y.end <- arrow.y
d <- data.frame(arrow.x, arrow.y, arrow.x.end, arrow.y.end)

p + stat_function(fun=parabola) + xlim(-2,23) + ylim(-2,100) + 
     xlab("Time Spent and Information Gained (e.g. person-weeks)") + ylab("$$ COST $$") +
     annotate(geom="text", x=10, y=5, label="Some Effort, Lowest Cost!!", color="darkgreen") +
     geom_point(aes(x=10, y=10), colour="darkgreen") +
     annotate(geom="text", x=0, y=100, label="$$$$$", color="green") +
     annotate(geom="text", x=0, y=75, label="$$$$", color="green") +
     annotate(geom="text", x=0, y=50, label="$$$", color="green") +
     annotate(geom="text", x=0, y=25, label="$$", color="green") +
     annotate(geom="text", x=0, y=0, label="$ 0", color="green") +
     annotate(geom="text", x=2, y=0, label="minutes\nof effort", size=3) +
     annotate(geom="text", x=20, y=0, label="months\nof effort", size=3) +
     annotate(geom="text",x=3, y=85, label="Little (or no) Planning\nHIGH COST", color="red") +
     annotate(geom="text", x=18, y=85, label="Paralysis by Planning\nHIGH COST", color="red") +
     geom_vline(xintercept=0, linetype="dotted") + 
     geom_hline(yintercept=0, linetype="dotted") +
     geom_vline(xintercept=10) +
     geom_segment(data=d, mapping=aes(x=arrow.x, y=arrow.y, xend=arrow.x.end, yend=arrow.y.end),
     arrow=arrow(), color="blue", size=2) +
     annotate(geom="text", x=8, y=95, size=2.3, color="blue",
     label="we prefer to be\non this side of the\nloss function") +
     ggtitle(my.title) +
     theme(axis.text.x=element_blank(), axis.ticks.x=element_blank(),
     axis.text.y=element_blank(), axis.ticks.y=element_blank()) 

Now sometimes you need to make this investment! (Think nuclear power plants, or constructing aircraft carriers or submarines.) Don’t get caught up in getting your planning investment perfectly optimized — but do be aware of the trade-offs, and go into the decision deliberately, based on the risk level (and regulatory nature) of your industry, and your company’s risk appetite.

Designing Experiences for Authentic Engagement: The Design for STEAM Canvas

As Industry 4.0 and Digital Transformation efforts bear their first fruits, capabilities, business models, and the organizations that embody them are transforming. A century ago, we thought of organizations as machines to be rigidly designed and controlled. In the latter part of the 20th century, organizations were thought of as knowledge to be cultivated, shared, and expanded. But “as intelligent systems gain traction, we are once again at a crossroads – where organizations must create complete and meaningful experiences” for their customers, stakeholders, and employees.

Read our new paper in the STEAM Journal

How do you design those complete, meaningful, and radically engaging experiences? To provide a starting point, check out “Design for STEAM: Creating Participatory Art with Purpose” by my former student Nick Kamienski and me. It was just published today by the STEAM Journal.

“Participatory Art” doesn’t just mean creating things that are pretty to look at in your office lobby or tradeshow booth. It means finding ways to connect with your audience in ways that help them find meaning, purpose, and self-awareness – the ultimate ingredient for authentic engagement.

Designing experiences to make this happen is challenging, but totally within reach. Learn more in today’s new article!

Object of Type Closure is Not Subsettable

I started using R in 2004. I started using R religiously on the day of the annular solar eclipse in Madrid (October 3, 2005) after being inspired by David Hunter’s talk at ADASS.

It took me exactly 4,889 days to figure out what this vexing error means, even though trial and error helped me move past it nearly every time it happened! I’m so moved by what I learned (and embarrassed that it took me so long to make sense of it) that I have to share it with you.

This error happens when you’re trying to treat a function like a list, vector, or data frame. To fix it, start treating the function like a function.

Let’s take something that’s very obviously a function. I picked the sampling distribution simulator from a 2015 blog post I wrote. Cut and paste it into your R console:

sdm.sim <- function(n, src.dist=NULL, param1=NULL, param2=NULL) {
   r <- 10000  # Number of replications/samples - DO NOT ADJUST
   # This produces a matrix of observations with
   # n columns and r rows. Each row is one sample:
   my.samples <- switch(src.dist,
      "E" = matrix(rexp(n*r, param1), r),
      "N" = matrix(rnorm(n*r, param1, param2), r),
      "U" = matrix(runif(n*r, param1, param2), r),
      "P" = matrix(rpois(n*r, param1), r),
      "B" = matrix(rbinom(n*r, param1, param2), r),
      "G" = matrix(rgamma(n*r, param1, param2), r),
      "X" = matrix(rchisq(n*r, param1), r),
      "T" = matrix(rt(n*r, param1), r))
   all.sample.sums <- apply(my.samples, 1, sum)
   all.sample.means <- apply(my.samples, 1, mean)
   all.sample.vars <- apply(my.samples, 1, var)
   par(mfrow=c(2,2))
   hist(my.samples[1,], col="gray", main="Distribution of One Sample")
   hist(all.sample.sums, col="gray", main="Sampling Distribution\nof the Sum")
   hist(all.sample.means, col="gray", main="Sampling Distribution\nof the Mean")
   hist(all.sample.vars, col="gray", main="Sampling Distribution\nof the Variance")
}

The right thing to do with this function is use it to simulate a bunch of distributions and plot them using base R. You write the function name, followed by parentheses enclosing the four arguments the function needs to work. This call will generate a normal distribution with a mean of 20 and a standard deviation of 3, along with three sampling distributions, using a sample size of 100 and 10,000 replications:

sdm.sim(100, src.dist="N", param1=20, param2=3)

(You should get four plots, arranged in a 2×2 grid.)

But what if we tried to treat sdm.sim like a list, and call the 3rd element of it? Or what if we tried to treat it like a data frame, and we wanted to call one of the variables in the column of the data frame?

> sdm.sim[3]
Error in sdm.sim[3] : object of type 'closure' is not subsettable

> sdm.sim$values
Error in sdm.sim$values : object of type 'closure' is not subsettable

SURPRISE! Object of type closure is not subsettable. This happens because sdm.sim is a function, and its data type is (shockingly) something called “closure”:

> class(sdm.sim)
[1] "function"

> typeof(sdm.sim)
[1] "closure"

I had read this before on Stack Overflow a whole bunch of times, but it never really clicked until I saw it like this! And now that I had a sense for why the error was occurring, it turned out to be super easy to reproduce with functions in base R or functions in any familiar packages you use:

> ggplot$a
Error in ggplot$a : object of type 'closure' is not subsettable

> table$a
Error in table$a : object of type 'closure' is not subsettable

> ggplot[1]
Error in ggplot[1] : object of type 'closure' is not subsettable

> table[1]
Error in table[1] : object of type 'closure' is not subsettable

As a result, if you’re pulling your hair out over this problem, check and see where in your rogue line of code you’re treating something like a non-function. Or maybe you picked a variable name that competes with a function name. Or maybe you got your operations out of order. In any case, change your notation so that your function is doing function things, and your code should start working.
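Here’s a hypothetical example of both failure modes (the names below are made up for illustration):

# Failure mode 1: forgetting the parentheses on a function call, then
# trying to subset the function itself instead of its result.
get.data <- function() data.frame(a = 1:3, b = 4:6)
# get.data$a     # Error: object of type 'closure' is not subsettable
get.data()$a     # call the function first, then subset the result: 1 2 3

# Failure mode 2: using a name that belongs to a function as if it were
# data. Base R's mean() is a function, so subsetting it fails.
# mean[1]        # Error: object of type 'closure' is not subsettable
mean(c(1, 2, 3)) # treat the function like a function: 2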

But as Luke Smith pointed out, this is not true for functional sequences (which you can also write as functions). Functional sequences are those chains of commands you write when you’re in tidyverse mode, all strung together with %>% pipes (the pipe and the functional-sequence behavior come from the magrittr package):

Luke’s code that you can cut and paste (and try), with the top line that got cut off by Twitter, is:

library(dplyr)   # provides group_by, mutate, filter, summarize, and the %>% pipe

fun1 <- . %>% 
    group_by(col1) %>% 
    mutate(n=n+h) %>% 
    filter(n==max(n)) %>% 
    ungroup() %>% 
    summarize(new_col=mean(n))

fun2 <- fun1[-c(2,5)] 

Even though Luke’s fun1 and fun2 are of type closure, they are subsettable because they contain a sequence of functions:

> typeof(fun1)
[1] "closure"

> typeof(fun2)
[1] "closure"

> fun1[1]
Functional sequence with the following components:

 1. group_by(., col1)

Use 'functions' to extract the individual functions. 

> fun1[[1]]
function (.) 
group_by(., col1)

Don’t feel bad! This problem has plagued all of us for many, many hours (me: years), and yet it still happens to us more often than we would like to admit. Awareness of this issue will not prevent you from attempting things that give you this error in the future. It’s such a popular error that there have been memes and sad valentines written about it.


(Also, if you’ve made it this far, FOLLOW THESE GOOD PEOPLE ON TWITTER: @stephdesilva @djnavarro @lksmth – They all share great information on data science in general, and R in particular. Many thanks too to all the #rstats crowd who shared in my glee last night and didn’t make me feel like an idiot for not figuring this out for ALMOST 14 YEARS. It seems so simple now.

Steph is also a Microsoft whiz who you should definitely hire if you need anything R+ Microsoft. Thanks to all of you!)

Lack of Alignment is an Organizational Disease. Here are the Symptoms.

Streamlines on a field. Created using the pracma package in R.

Like a champion rowing team, your organization needs to make sure everyone is working together, engaged in synchronized work and active collaboration, and not working at cross-purposes.

But like risk management, working on alignment can seem like a luxury. No one really has time to slow down and make sure everyone’s moving in the same direction. And besides, alignment just happens naturally if each functional area knows what they’re supposed to be working on… right?

Neither of these statements is, of course, true. Synchronizing people and processes – and making sure they’re aware of the needs and desires of real customers instead of cardboard personas – takes dedicated effort and a commitment from senior leaders. There are other critical impacts too: lack of alignment negatively impacts not only project outcomes but also professional relationships and the bottom line.

An Example of Diagnosing Misalignment

Although alignment is a many-to-many problem, and requires you to look at relationships between people in all your functional areas, a January 2018 survey from Altify examined one part of the organizational puzzle: alignment between sales and marketing. This is a big one, because sales teams use marketing materials to understand and sell the product or service your company offers. Their survey of 422 enterprise-level executives and sales leaders showed that:

  • 74% of marketers thought they understood customer needs, but only 44% of the sales people in their organizations agreed
  • 71% of marketers thought sales and marketing were aligned, but only 59% of the sales people in their organizations agreed

These differences may seem small, but they reveal a lack of alignment between sales and marketing. One group thinks they “get it” – while people in the other group are just shaking their heads.

Symptoms of Misalignment

…include things like:

  • Vague Feelings of Fear. Your organization has a strategic plan (knows WHAT it wants to do), but there is little to no coordination regarding HOW people across the organization will accomplish strategic objectives. You know what KPIs you’re supposed to deliver on, but you don’t know how exactly you’re supposed to work with anything in your power or control to “move the needle.”
  • Ivory Tower Syndrome. You’re in a meeting and get the visceral sense that things aren’t clear, or that different people have different expectations for a project or initiative. But you’re too nervous or uncertain to ask for clarification – or maybe you do ask, but you get an equally unclear answer. Naturally, you assume that everyone in the room is smarter than you (particularly the managers) so you shut up and hope that it makes sense later. The reality is that you may be picking up on a legitimate problem that’s going to be problematic for the organization later on.
  • Surprises. A department committed you to a task, but you weren’t part of that decision. Once you find out about it, the task just may not get done. Alternatively, you’ll have to adjust your workload and reset expectations with the stakeholders who will now be disappointed that you can’t meet their needs according to the original schedule. Or maybe work evenings and weekends to get the job done on time. Either way, it’s not pleasant for anyone.
  • Emergencies. How often are you called on to respond to something that’s absolutely needed by close of business today? How often are you expected to drop everything and take care of it? How often do you have to work nights and weekends to make sure you don’t fall behind?
  • Lead Balloons. In this scenario, key stakeholders are called into projects at the 11th hour, when they are unable to guide or influence the direction of an initiative. The initiative becomes a “dead man walking” that’s doomed to an untimely end, but since the organization has sunk time and effort into it, people will push ahead anyway.
  • Cut Off at the Pass. Have you ever been working on a project and found out – somewhere in the middle of doing it – that some other person or team has been working on the same thing? Or maybe they’ve been working on a different project, but it’s ultimately at cross purposes with yours. Whichever way this situation works out, your organization ends up with a pile of waste and potential rework.
  • Not Writing Things Down. You have to make sure everyone is literally on the same page, seeing the world in a similar enough way to know they are pursuing the same goals and objectives. If you don’t write things down, you may be at the mercy of cognitive biases later. How do you know that your goals and objectives are aligned with your overall company strategy? Can you review written minutes after key meetings? Are your organization’s strategic initiatives written down and agreed to by decision makers? Do you implement project charters that all stakeholders have to sign off on before work can commence? What practices do you use to get everyone on the same page?

How do you fix it?

That’s the subject for more blog posts that will be coming this spring – as well as what causes misalignment in the first place (hint: it’s individual behaviors on an organizational scale). The good news is – misalignment can be fixed, and the degree of alignment can be measured and continuously improved. Sign up to follow this blog so you don’t miss the rest of the story.

What other symptoms of misalignment have you experienced?

Selecting Pages from a PDF to Make a New (Smaller) PDF

Sometimes small, simple tasks perplex me. Today’s challenge: I’m on Windows 10, and have a 53 page PDF of a journal. I need to make a NEW PDF that only contains pages 26 to 43 (my article) so I can send my article to a researcher who is requesting it. I know you can do this with Acrobat, but I don’t have Acrobat, and still would like to figure out how to make the smaller PDF. Here’s what I learned how to do today:

The Easy Way

  1. Go to https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/
  2. Find the GREEN button that says “Download PDFtk FREE” and click it
  3. After the download finishes, right click on the .exe file and Run it
  4. When the installer starts, click on all the default options all the way through “Finish”
  5. When installation is finished, go to the search box in the bottom left of your screen
  6. Type “cmd” and hit Enter to open the terminal window
  7. Navigate to the directory that contains your original PDF. I first typed D: to get to my auxiliary hard drive, and then cd Scratch to get to D:\Scratch where my full journal PDF was stored.
  8. Use this code:
    pdftk yourlargefilename.pdf cat 26-43 output youroutputfilename.pdf

    (substituting YOUR filenames and YOUR starting and ending page numbers for the ones shown above)

  9. Launch a File Explorer window and navigate to the directory you used in Step 7 above. Open the PDF file and check to make sure it contains only the pages you expect.
  10. There are LOTS more things you can do with PDFtk from the Windows command line, like you did in Step 8; a couple of examples are sketched just below this list, and lots of other options are described at https://www.pdflabs.com/docs/pdftk-cli-examples/
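For instance, two more one-liners based on the documented examples (the filenames here are mine and purely hypothetical):

    pdftk fileA.pdf fileB.pdf cat output combined.pdf

    (merges two PDFs into one)

    pdftk full-journal.pdf burst

    (splits a PDF into one file per page)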

After I finished, I stumbled upon another way that didn’t require downloading a free program, and only uses Google Chrome. (Sometimes those free programs bother me. What a great way to infiltrate computers… offer a totally useful utility completely for free. Consequently, my advice to you is to download it at your own risk. Including these instructions is in no way a guarantee from me that PDFtk is safe.)

 

The Even Easier Way

  1. Open Google Chrome
  2. Type Ctrl-O (that’s the letter O, not the number zero)
  3. Select the large PDF file that you want to snip
  4. Your PDF will open in the browser… click on the beginning and ending pages, and capture the page numbers
  5. Click the print icon in the far upper right corner of your browser
  6. Click the “Change” button to change destination to “Microsoft Print to PDF”
  7. Click the second radio button under Pages, and specify the start and end pages separated by a dash (for me, 26-43)
  8. Click Print, and select a filename for your new, snipped file
  9. After the PDF is generated, navigate to the directory you saved it in during Step 8. Open the file and check it to make sure the pages are as you expect.

 

The Sad News

I tried to use the staplr package in R to snip my PDFs, but I couldn’t get it to work. Will try again some other time 🙁
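For the record, this is the sort of thing I was attempting: a sketch using staplr’s select_pages(), which wraps the same PDFtk toolkit (so PDFtk, or Java, must be available on your system). The file paths are hypothetical, and since it didn’t work for me, treat it as untested:

# Untested sketch with the staplr package (it wraps the PDFtk toolkit).
# select_pages() keeps only the listed pages; the paths below are hypothetical.
library(staplr)
select_pages(selpages = 26:43,
             input_filepath  = "D:/Scratch/full-journal.pdf",
             output_filepath = "D:/Scratch/my-article.pdf")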
