Phil Kay

Phil Kay is a learning manager for JMP Statistical Discovery, a subsidiary of SAS. His job is to understand the science and engineering challenges and provide guidance on data analytic solutions for industrial organisations around the world.

Previously, Phil was a key scientist in the development of numerous processes for the manufacture of colorants for digital printing at FujiFilm Imaging Colorants. Phil has a master’s degree in applied statistics with a dissertation on Design of Experiments. He also has a master’s and PhD in chemistry.

He is a Fellow of the Royal Statistical Society, a Chartered Chemist, and a member of the committee for the Process Chemistry and Technology Group with the Royal Society of Chemistry.

Phil loves showing people how data analytics enables better science. Follow Phil Kay, Evangelist for Data Analytics, on LinkedIn.

Chemists will always need to adapt their skills to new ways of doing science. When I was an undergraduate, I was taught how to make melting point tubes by drawing out glass capillaries over a Bunsen burner flame. I remember this because I was really bad at it; the sharps bin was full of my handiwork. Thankfully, the exercise was already outdated even then and there was no need to test my manual dexterity beyond that.

At one time, most chemists would have at least needed rudimentary glassblowing skills. But since most of their glassware requirements were common to many thousands of other scientists, companies soon emerged to serve that market with off-the-shelf products. Today, it would be absurd to make your own Liebig condenser before you can carry out a distillation, or your own reagents or instruments for that matter.

There is a modern trend in research skills that I feel is equally absurd: the notion that learning to code will be essential for the future of science. This idea is currently doing the rounds as more science moves from in vitro to in silico. Coding is certainly useful, but it is not a prerequisite for doing science, any more than glassblowing is. The idea is also unhelpful because it emphasises the wrong part of research: focusing on how to make something, rather than why to make it. Perhaps most importantly, it ignores the far greater need for data skills.

On a need-to-code basis

As R&D becomes more digital, there will be less hands-on lab work, the argument goes. In the lab of the future, routine practical tasks will be automated and researchers will instead spend their time coding those machines and their data workflows. Hence, we should send everyone on coding bootcamps.

I know a lot of scientists who love the idea of learning new digital skills, including coding – I’m one of them. I’ve learned enough to do useful things with code at various times in my career, but only when I have had to.

Recently, I was interested in a novel machine learning method called self-validated ensemble modeling (SVEM), which promises to be uniquely useful for analysing the smaller datasets that we typically produce in industrial R&D experiments. The algorithm involves looping through hundreds of cycles of an analysis routine, which you would not want to do manually. I spent a few hours writing some code to do it and the metaphorical broken code bin was soon full. Yet it was a fun way to gain a deeper understanding of SVEM, just as you might gain a greater appreciation of the form and function of flasks and funnels by trying to make your own.
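For readers curious what such a loop looks like, here is a minimal Python sketch of the general idea. It is not the code mentioned above, nor the JMP Pro implementation. It assumes the commonly described form of SVEM: each cycle draws a random number per run, turns it into a pair of anti-correlated training and validation weights, fits candidate models with the training weights, keeps the one with the lowest validation-weighted error, and finally averages the kept models. The candidate set here (an intercept plus the first k factors) is a deliberate simplification, and the names `svem_fit`, `weighted_fit`, `weighted_sse` and `solve` are hypothetical helpers invented for illustration.

```python
import itertools
import math
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def weighted_fit(rows, y, w, k):
    """Weighted least squares with an intercept and the first k factors."""
    design = [[1.0] + list(r[:k]) for r in rows]
    size = k + 1
    xtwx = [[sum(wi * d[i] * d[j] for wi, d in zip(w, design))
             for j in range(size)] for i in range(size)]
    xtwy = [sum(wi * d[i] * yi for wi, d, yi in zip(w, design, y))
            for i in range(size)]
    return solve(xtwx, xtwy)


def weighted_sse(rows, y, w, coef):
    """Weighted sum of squared errors; coef may cover only the first factors."""
    sse = 0.0
    for wi, r, yi in zip(w, rows, y):
        pred = coef[0] + sum(c * x for c, x in zip(coef[1:], r))
        sse += wi * (yi - pred) ** 2
    return sse


def svem_fit(rows, y, n_boot=200, seed=0):
    """Average the validation-selected model over n_boot reweighted cycles."""
    rng = random.Random(seed)
    p = len(rows[0])
    total = [0.0] * (p + 1)
    for _ in range(n_boot):
        # Clamp away from 0 and 1 so both logarithms stay finite.
        u = [min(max(rng.random(), 1e-12), 1 - 1e-12) for _ in y]
        w_train = [-math.log(ui) for ui in u]        # large when u is small
        w_valid = [-math.log(1.0 - ui) for ui in u]  # large when u is large
        fits = [weighted_fit(rows, y, w_train, k) for k in range(p + 1)]
        best = min(fits, key=lambda coef: weighted_sse(rows, y, w_valid, coef))
        for i, c in enumerate(best):  # factors left out of best contribute 0
            total[i] += c
    return [t / n_boot for t in total]


# Made-up example: a two-level, three-factor design where only the first
# factor matters (true intercept 2, true effect 3, a little noise).
rows = [list(r) for r in itertools.product([-1.0, 1.0], repeat=3)]
noise = random.Random(42)
y = [2.0 + 3.0 * r[0] + noise.gauss(0.0, 0.1) for r in rows]
coef = svem_fit(rows, y)
print([round(c, 2) for c in coef])
```

Because every run appears in both the training and the validation role, only with different weights, no data has to be held back, which is the appeal for small designed experiments. Averaging coefficients is equivalent to averaging predictions here only because every candidate model is linear.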

From my limited explorations it looks like SVEM is useful. But I don’t think anyone should write their own code for it. You could risk using mine, but even I don’t use it anymore. That’s because proper software developers have since done a much better job: the most recent version of JMP Pro contains a simple interface that lets anyone analyse their data with SVEM in just a few clicks. The great thing about code is that one person can produce something once that can be endlessly re-used by any number of other people.

Focus on skills that add value

Most of the tasks that I needed to write code for in the past can now be done in this point-and-click manner, and there has been an explosion in the number of commercial no-code or low-code lab automation and data software solutions in recent years. Just as companies emerged to mass produce glassware, software companies have been set up by people who understand the challenges. Working with the scientific community, their dedicated developers have created easy-to-use tools specifically to meet these needs. Learning how to build your own digital tools will be a waste of time for most people.

Yet digital transformation – and developing the skills needed to achieve it – remains one of the biggest challenges that organisations face today. I talk to scientists from companies big and small all around the world, and nobody has yet cracked it. One approach that I have seen work is for organisations to develop a small number of enthusiasts into in-house experts who can code bespoke solutions to streamline data workflows for their colleagues. However, this is only valuable as a later step, after the more important groundwork has been done to build the foundations for a data-driven culture.

The first step is to recognise the need for better data skills across the scientific workforce. The job of the chemist in the future will be less about making samples and more about generating data that can be turned into useful insight. The most successful organisations are already raising the baseline data literacy of all their staff and focussing on the key skills and software tools that will help their scientists to adapt to this change in the paradigm.

The chemist of the future does not need to be a coder any more than the chemist of today needs to be a glassblower. They will need to be skilled in visualisation to help them quickly explore their data and communicate insights. They will need an understanding of statistical modelling and the fundamentals of machine learning to extract maximum insight from both small and large data. And they will need to use statistical design of experiments to produce the most valuable data. JMP have created a free online training resource, Statistical Thinking for Industrial Problem Solving, that will give you an introduction to all of these topics. Find out more in the Chemistry World Design of Experiments collection in partnership with JMP.
