Dirk de Bruyn Ouboter, Statistically Speaking

Dirk de Bruyn Ouboter

Platform Lead for Data Capture, Syngenta

Meg Hermes

User Reference Manager, JMP

Dirk de Bruyn Ouboter is Platform Lead for Data Capture at Syngenta’s R&D site in Stein, Switzerland. After completing a PhD in chemistry at the University of Basel, Dirk worked as a postdoctoral fellow at the Swiss biotechnology company Actelion (now a part of Johnson & Johnson), where his research centered on the production and evaluation of API nanoformulations using analytical methods.

In 2012, he joined Syngenta in measurement science and process performance and has since led process technology initiatives driving digital transformation, first in Syngenta’s chemical process technology and now at the physical-to-digital interface in R&D IT.

Meg: I’m curious about the point at which you realized that analytics could add a lot of value to your work – the point that ultimately led you to where you are now: playing an active role in Syngenta’s analytics transformation.

Dirk: When I was a child, I always wanted to become an inventor, so I’m really an explorer by nature. But looking back, I think there was also a point in my career when I realized that if I could gather lots of data and find some correlates, I could learn a lot and do so relatively quickly. I became really fascinated with what I was finding, and once I started making discoveries with data, I wanted to share those discoveries with other people – even if only to share my excitement!

Having data, seeing something in it, recognizing the hidden patterns, drawing conclusions, and then sharing that fascination with others – not only the findings but also the approach! I wanted to show them how easy it is to find gold nuggets in their data. And practically speaking, I knew that there was so much more information out there that I wouldn’t be able to process by myself. I wanted other people to also feel that sense of excitement about using the information they would discover.

Meg: Did anyone raise objections? Were there hurdles?

Dirk: Not when it came to sharing the excitement or results – my colleagues and our management have always been very much on board with the value analytics brings to the organization. The main challenge was just getting access to the right tools. When I was in my second month with Syngenta, someone told me that I should get a JMP license, and I couldn't. Our internal corporate processes made it difficult, and I had to wait for almost a full year. I looked into who in Switzerland had licenses and then realized there were already a few hundred users at Syngenta globally. So I requested a license grouping for Switzerland. In getting my own license, I ended up getting licenses for many others as well – ha!

Once I had JMP … [statistics] just felt so different. It wasn’t about the equations anymore. I was so much more motivated to really explore one excitement after the other. And another benefit was that the process of acquiring a license brought me into contact with many other users.

Of course, not everyone was on board at the outset – we’d had Minitab since the 1990s, and with STAVEX, there was also a proprietary design of experiments platform developed by Syngenta’s predecessor. And of course, there were also Excel users. But I asked them, “Have you ever tried to do a histogram in Excel?” They would say, "Yeah, I can do that. Give me 5 or 10 minutes." Then I’d show them JMP and how they could now do that very same thing in 3 seconds. On the design of experiments side, I had one colleague who said, "Yeah, I started doing DOE 30 years ago, and I can still do it all in Excel. I can give it to you tomorrow, don't worry." I just [shook my head].

I think what all these people have in common is that they weren't curious yet. But whenever there was someone excited about seeing and learning something new, they immediately jumped on it.

Meg: Were there any early proofs of concept that helped you demonstrate the business case for analytics initiatives?

Dirk: At the time when I was very intensively using JMP, I had lots of access to data from manufacturing sites with historians. We were storing time series data – every 10 seconds if you wanted it. There were always simple examples related to drying or temperatures – something anyone could understand even if they're not an engineer or statistician.

But what I realized is that it is usually best to choose an example close to the user. If I want to convince someone in chemistry, in research, I need to take and show them something from chemistry. If it was someone from formulation working on milling, I know there's a good milling example.

Of course, this doesn’t always work … and there are people who will continue to think [analytics] is unnecessary. And to be fair, the conclusions often aren’t that surprising. They may just confirm what you already know.

I often also share my own story: When I did my PhD, I didn't use stats apart from maybe some summary statistics now and then. It was more math than the stats behind it. Then someone came to me and asked, "What's your result?" I replied, "I got 90% yield." They said: "Fantastic. How often did you repeat it?" I replied, "Three times." "When did you get 90%?" "The third time," I said. "OK, but if you had gotten 90% the first time, how often would you have repeated it? Once? And would that be right?"

During my PhD studies, I hardly used statistics at all. I didn't have a plan for it. I didn't know the true power of statistics yet.

Meg: If you could go back to that period in your career, what would you have done differently?

Dirk: If I had only known statistics was so easy, that's the first thing I would have done! JMP really changed stats for me.

At university, you often start with three years of stats. At first, you focus on how to calculate and do averages … then comes hypothesis testing, which is usually when you lose interest! It's not visual. I wish that, at the beginning of my studies, someone had shown me a visual approach – introducing that concept of discovery.

There was a great example [one of my mentors shared with me]: He said, "If you go buy 6 eggs, how many do you look at?" I replied, “Well, I look at each of them to make sure they’re not cracked.” “And if you go buy 150 grapes, how many do you taste?” I was like, "Maybe 1 or 2." It just goes to show how important it is to think about sample size and how many you should be testing.

Meg: When you were still working on your PhD, were you aware of the industry standards for applied statistics?

Dirk: Yes, absolutely. I studied statistical design of experiments in my second year of graduate studies. But the only thing I remember was those cubes and that they were supposed to help me get more results with fewer experiments. Everything else I didn't understand because the course was so focused on equations and heavy math. Of course, I studied chemical engineering, so it's not that we weren’t good with math but [stats just wasn’t that interesting]. I knew there was an Excel plugin, but I wasn't aware that specialized software existed.

I was also teaching stats to some extent during my PhD – teaching things like error propagation, which was quite important, for example. The hardest thing was always explaining to students why they shouldn’t use the standard bars you get as error bars in Excel. And I think using error bars in Excel this way actually does a lot of damage at universities as well. It completely skews students' understanding of the statistics behind them.

Meg: You talked a little bit about how getting access to JMP was something of a turning point in your career. I’m wondering if you could speak more about how you’ve gotten value from the resources and relationship with JMP – beyond the software itself.

Dirk: The annual highlight for me is always Discovery Summit, because when I’m there I’m amongst lots of people who share the excitement of discovering information in data and learning about new use cases, possibilities, and ideas. I usually come back from Discovery Summit with a book full of new ideas to test with my data, or data I haven't used yet but know the use cases for. That's definitely the highlight.

One thing I realized when I stepped into a coaching role here at Syngenta is that there are, of course, people who don’t know JMP exists, but even worse is that there are quite a few people who know it exists but don't know about all the support they could be getting. They don't know that there's so much more to JMP … it’s not just off-the-shelf software.

Meg: What do you think is the most important advice to give someone who is just starting off with JMP?

Dirk: I always tell people when they start [using JMP for the first time] to spend that first hour discovering the magic behind the red triangles. If you miss that, you miss the point! Of course, there’s also that one-hour "Getting Started" on-demand training, which also motivated me to look into the Mastering JMP series.

Mastering JMP is what got me really excited about applying what I was learning to my own data. As for the Community, I usually just use it as a look-up resource. It's a good way to find out how others did things [when faced with a similar data challenge].

So: Discover the magic behind the red triangles and pick at least three sessions from the Mastering JMP series. That’s a good place to start.

For me now as an ambassador or advocate at Syngenta, I’m facing Karpman’s Drama Triangle. When someone comes to me asking whether something can be done in JMP, I always have to decide: Am I going to be the rescuer, or am I going to be the coach? And my decision – whether I'm going to do it for them or coach them to do it themselves – influences whether they are going to be the victim who always needs help, or the survivor who learns and then can survive and propagate knowledge on their own.

Whenever possible, I try to take on the coaching role rather than that of the rescuer. But it's sometimes really hard when you know something is just two clicks away and it’s taking someone three weeks to discover it!

Meg: What advice would you give to students hoping to pursue a career in industry as you've done?

Dirk: Forget the equations – they're done in the background – and focus on riding the data wave in JMP. Go with the flow.

Maybe I’ll also add one more thing. I would really recommend that students take the time to go through the Statistical Thinking for Industrial Problem Solving (STIPS) program because as a student, you're still used to having a series of lessons that might only make sense at the end. And running through that course is a good investment to not only get trained but to also shape the way you think. For a student, this is almost a must-do.