Jean-Vincent Le Bé @ Nestlé | Data Science Hangout
We were recently joined by Jean-Vincent Le Bé, Data Science Expert at Nestlé, to chat about enabling the emergence of knowledge and discoveries from data by the use of analytics, predictive modeling, and visualization. Jean-Vincent Le Bé is a physicist with a PhD in Neuroscience and an analytical mind who can rapidly adapt to a variety of new fields and techniques. As a Data Scientist at the Nestlé System Technology Centre, he took part in the development and industrialization of food and beverage systems (that involve dispensing machines) with design of experiments, machine learning and computer simulations. In his position at Nestlé Research, Jean-Vincent develops artificial intelligence and machine learning methodologies for product development and food science. _______ ► Subscribe to Our Channel Here: https://bit.ly/2TzgcOu Follow Us Here: Website: https://www.posit.co LinkedIn: / posit-software Twitter: / posit_pbc To join future data science hangouts, add to your calendar here: pos.it/dsh (All are welcome! We'd love to see you!) Thanks for hanging out with us!
Transcript
This transcript was generated automatically and may contain errors.
Welcome to the Data Science Hangout, everyone. The first one since PositConf. For all those that I met, it was great meeting you. If this is your first Data Science Hangout, welcome. If this is your 100th or more, welcome. We're happy to have you. I'm filling in for Rachel Dempsey today. She's ordinarily the host. She's taking a well-deserved vacation. So I'll do my best.
This is a casual, open environment for data scientists, data science leaders, interns, students, what have you. So we want it to be a free-flowing conversation, super welcoming. So use your best judgment. Be nice. There's lots of different ways to participate. You can chat in. Hannah will share a Slido link where you can ask anonymous questions. You can raise your hand. You can turn your video on. You can just sort of speak. So it's super casual, whatever works for you.
I'm really happy to introduce our leader today, Jean-Vincent from Nestlé. So I'll hand it over to him to do an intro, and we'll get started. Okay, great. Thank you very much for the nice intro. And yeah, just to build on what you just said.
So I've been attending, I would say, some of these Hangouts. Honestly, the timing sometimes clashes, because it's at the end of the day here, because I'm in Europe. So that's why sometimes they're a bit difficult to attend. But I just want to reiterate that it's really a pleasure for us to be here. And I'm honored to have been asked to share my experience with you. And I'm sharing, then, this experience as a person. So just to make sure that everyone is also aligned on this: I'm not here representing Nestlé as a Nestlé employee, I would say, but I'm happy to share my experience throughout my different companies also, and more recently, Nestlé, of course.
Jean-Vincent's background
So speaking of which, I actually graduated as a physicist. So I did physics at the Swiss Federal Institute of Technology in Lausanne. And then after that, I did a PhD in neuroscience, where I was exploring the connections between neurons, doing more biophysics, and that would be through electrophysiology. That's where I started to do some data analytics, or rather was heavily involved in both analyzing the signal to get some relevant features out of it, and then doing statistics to get what would be the message or the learnings from that.
Basically, I was exploring how neurons were connecting and disconnecting over a 12-hour period with actual slices of tissue. Then after that, I moved to industry, and I started to work as a process engineer at Valtronic Technology, which is a company in Switzerland, with also a production site in Solon in the US, in Ohio. I worked there as a process engineer in microelectronics, doing assembly for medical devices. We were working on external medical devices, but also some implants, and that was a bit more challenging. And that's where this neuroscience background was also interesting for the company, with neural stimulators.
I've been there successively process engineer, then head of engineering and development, and then head of technology, taking care of the portfolio of technologies and interaction with customers. Valtronic is a service company, so we didn't have any direct product, but we would develop and industrialize products for customers. Then after seven years in that company, I moved to Nestlé as a data scientist.
Then there, I worked on different projects in the System Technology Centre, located in Orbe, Switzerland. This is the centre where we develop all the so-called systems. A system is the conjunction of the machine (the dispensing machine), the packaging, which is functional to some extent, more or less, but sometimes more than less, and the product. One of the best known is Nespresso. I would say the Nespresso system, with the Vertuo line that you have in the US, or we also have Dolce Gusto and different things like this that are around us.
I worked there really on different projects as a data scientist: with design of experiments, with statistical process control, and also with automating data analyses coming from the labs. The lab would generate a lot of different files, and we would need some apps, and those would be Shiny apps, that would take in and analyze the data to extract the knowledge from it, and then drive the development.
An aside, or say part of my activities at Nestlé, that could be interesting, and that's why I'm sharing it with you, because there might be some questions around that: very early at Nestlé, I joined the network. We have knowledge networks in Nestlé R&D, where we connect different people from different sites. Just to give you an idea, I don't know exactly how many sites, but I think we have sites all over the globe, and sometimes the scientists there are the only one of their kind, or maybe two of them. And it's good to connect so that we can share experience, share tricks, share knowledge, and also get some coaching and mentoring from each other.
Size of the data science group
So, what is the size of the data science group at Nestlé? Well, I didn't mention that, so that's a good question. I would say that it depends how you define data science. I will start from my close community, the network: we are typically around 50. That would be for R&D, 50 to 60, though the list is not completely exhaustive, and some of them moved into data science more recently.
And these people, I would say, would be more really in development, kind of hard coders, I would say, people using either R or Python or both. But when you extend also to the many analysts who would be using low-code or no-code solutions, it gets bigger. And when you look beyond R&D, a number that came up recently is something like around 1,000. That's 1,000, knowing that Nestlé is 300,000 people around the world, including all the factories, of course.
And then you have very diverse approaches, because we have people in the business who would be more looking at, you know, dashboards. I would say it's really around data visualization, which already teaches a lot. I mean, basically just showing what is there. How many times have I had people coming to me with an Excel sheet saying, oh, we see this and this and this in the numbers. And I say, yeah, well, try plotting your data. And that's one of the things we say in the course we give: really, plot the data. That's the first thing to do.
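The classic illustration of "plot the data first" is Anscombe's quartet, which ships with base R: four x/y sets whose summary statistics are nearly identical but whose shapes are completely different. A quick tidyverse sketch (the reshaping pattern follows the tidyr documentation example):

```r
library(tidyverse)

# Reshape anscombe's x1..x4 / y1..y4 columns into a long table
# with one (x, y) pair per row, tagged by set
long <- anscombe |>
  pivot_longer(everything(),
               names_to = c(".value", "set"),
               names_pattern = "(.)(.)")

# The summary statistics look almost the same for all four sets
stats <- long |>
  group_by(set) |>
  summarise(mean_x = mean(x), mean_y = mean(y), r = cor(x, y))
print(stats)

# The plot immediately reveals four very different shapes
p <- ggplot(long, aes(x, y)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, formula = y ~ x) +
  facet_wrap(~ set)
```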
So this is already very valuable, and here typically using Power BI types of solutions. We have many people in the business doing this. There is also a lot of activity in the supply chain, from what I've heard, for all the demand planning and this kind of thing. So there have been things like this in various fields of Nestlé. That's why, when you extend beyond R&D, it goes up to that number.
No, it's primarily the 50, I would say, because it's really much more around doing the same activity, as we've seen here in the Hangouts. So we have some commonalities across the different activities of data scientists. But they can also be quite different. I think that when you are treating, I don't know, data for supply chain, or human resources data, or data coming from a mass spectrometer, it's not exactly the same, even if somehow the same thing is behind it.
But where we are more than just calculating machines is that, as people, we have some knowledge about the domain. Typically, you know, I now think a lot about coffee because I worked on coffee projects. And when I was there, I was sometimes connecting the dots, because people come with a request and they have a certain perspective from their project. And then, as a data scientist, you can say, well, look, I understand what you would like to do, and I can understand it because I also know the context and the science behind it to some extent. I'm not pretending I'm at their level, but to some extent, we know this.
Nestlé as a company
Well, as I've just said, to give a number, it's around 300,000 people around the world. And it has this particularity that it's Swiss-based. The headquarters are in Vevey in Switzerland, by Lake Geneva, which is called Lac Léman locally. And it's the Swiss spirit in the sense that it is decentralized: even though there is a center in Vevey, there is always a will to have products that are adapted to the local market.
So there is some sort of autonomy that is also given to the different markets. What we call a market is not necessarily a country: if it's a big country, it's just that country, but it can also be a group of countries. And basically, well, it's, I think, the number one food and beverage company, and our main products are around coffee. We are producing a lot of Nescafé, typically, that is one brand. You also have Nespresso, as I mentioned, Purina in the pet food, and there is also all the chocolate.
But there are also products that are typical to certain regions. Typically, I think Milo is South American. So there are many brands that I'm even still discovering existed. So it's pretty good. But really, we are supplying food and beverage around the world, I would say.
Head of technology vs. data scientist
I'm doing more hands-on things as a data scientist. That was one of the goals. You know, when you are head of technology, the activity is a lot around supervising people, not supervising in terms of people management, but more in terms of the technical outcome that they are producing. And there was also a lot of activity around interaction with customers or with prospects, so I would then work with the salespeople.
In that regard, I'm not doing this anymore at all as a data scientist, because I'm involved in the projects, I would say, really much more hands-on. And that means there is no interaction with, I would say, a customer who would have a specific project. It would be more internal customers and internal projects that are being shaped and developed.
Using Shiny in the enterprise
Yeah, so my question was around, I think you mentioned something about collecting some data files from the lab using Shiny. So I was just wondering if you could talk about the process of identifying Shiny as the correct tool to do this, and whether you had any interaction with stakeholders to get approval for using it. And as an add-on to that, what development challenges did you meet, if any, when using Shiny in the enterprise?
So basically, why use Shiny? It's because we're using R to start with. For the data analysis that I've been doing, we have an internal package called Nestat that I participated in developing, with some code around design of experiments analysis. So just to say that we have a rather intense activity using R as the software for doing our analysis. Especially, you know, from the lab you can have more and more automatic or semi-automatic equipment that would generate a lot of data, like autosamplers, like robotic systems and things like this.
And when it comes to that, you have a lot of files, which are not so different; they have the same structure, but there are a lot of them. And, you know, copy-pasting them into Excel and trying to do something out of that is kind of painful. So very quickly, when I come to this situation, I would start coding in R to do the loading, automatic data wrangling, put things together, have a nice graph with nice colors using ggplot, and so on. I'm a tidyverse fan, by the way.
And then with this, you have your code. And then you get your colleague from the lab or from project management saying, hey, that's very nice, but now I have new files and I would like to do it again, the same analysis and the same display, and then over and over and over. So after some time you say, hey, look, I can just package that into a nice web interface, and then you can do it all on your own. And if there is anything going wrong, then you can just let me know.
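A rough sketch of that "same structure, lots of files" pattern in the tidyverse. The file names and columns here are made up for illustration; the key idea is that `readr::read_csv()` accepts a whole vector of paths and can tag each row with its source file via the `id` argument:

```r
library(tidyverse)

# Simulate a batch of lab exports: several CSVs with identical columns
# (columns and names are hypothetical)
dir <- tempfile("lab_runs_")
dir.create(dir)
for (i in 1:3) {
  write_csv(
    tibble(time = 1:5, pressure = rnorm(5, mean = 9)),
    file.path(dir, sprintf("run_%02d.csv", i))
  )
}

# Read every file into one tidy table, tagging rows with their source
files <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)
runs  <- read_csv(files, id = "file", show_col_types = FALSE)

# One combined table instead of repeated Excel copy-paste,
# ready for a quick overlay plot
p <- ggplot(runs, aes(time, pressure, colour = basename(file))) +
  geom_line()
```

Wrapping the same pipeline in a Shiny `fileInput()` is then mostly a matter of swapping `list.files()` for the uploaded file paths.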
Another thing that can happen, to give you an example of why we've been doing this: it's no sort of secret that a coffee machine is pushing water through a system, and with water you also have some pressure. So you have curves of pressure against time, or temperature against time, this kind of thing. And then we are interested in what is the maximum here, what is the plateau here, what is the local minimum, and what is the variability. So all these features are extracted, and then the teams would run their own analysis on the features. But you need to extract these features first. And a typical Shiny app would be something where they could upload the raw data and then get back a CSV file with the features analyzed and extracted.
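As a minimal sketch of that kind of feature extraction: the function name, the plateau heuristic, and the simulated curve below are all assumptions for illustration, not Nestlé's actual method.

```r
# Hypothetical feature extraction from one pressure-vs-time curve,
# the kind of computation the Shiny app would run per uploaded file
extract_features <- function(time, pressure) {
  data.frame(
    peak        = max(pressure),
    t_peak      = time[which.max(pressure)],
    # crude plateau estimate: mean over the last quarter of the curve
    plateau     = mean(tail(pressure, length(pressure) %/% 4)),
    variability = sd(pressure)
  )
}

# Simulated curve: pressure rises, overshoots briefly, settles on a plateau
t <- seq(0, 30, by = 0.1)
p <- 9 * (1 - exp(-t)) + 2 * exp(-(t - 3)^2)

feats <- extract_features(t, p)
# In the app, one such row per file would be collected and offered
# back to the user with downloadHandler() / write.csv()
```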
So now, how did we manage to have this in the company? I would say a key thing is to have a good relationship with IT, because they are the guardians of IT compliance. And it's really important, and for very good reasons, especially in a company as exposed as big companies are, and I would say that goes for any big company.
And how it went: at some point, you can also have Shiny apps that you just deploy locally on the computer, with your colleague having R installed. We started this way sometimes, by the way: having kind of a batch file to just start the process, so they double-click on it, it automatically opens R and Shiny behind, and they get the page. But then it's sitting on their computer, and if we want to update anything, we need to go to their computer again.
So there were some discussions already from the center, where people were saying, you know, that would be interesting. And then, connecting with IT, you manage to put it into a sufficiently secure and closed environment, but still available on the intranet. And then more recently we moved to Azure, and it goes into Azure with Azure solutions. That's also where you can manage the access, and you can manage the visibility of the virtual machine. So now the server is a virtual machine.
And, yeah, the thing is to have IT counterparts that are aligned with this. So basically, I could describe the needs that we had as programmers and data scientists, and then they came back with some proposals, and then we moved together to have something that's up and running.
Yeah, as we said, it's having the right people, but it's also a matter of good compromise, I would say, because of course you are not in the world of open source. Typically our code is versioned in a Git system, but in-house: we would not put our code on GitHub, because it's a private company. So you need to have this understanding, and then also explain the needs that we have. Typically in R&D we have some needs that are inherent to research and development: we cannot predict what we will discover, because it's research and development.
Building a good relationship with IT
I would say, yeah, a good relationship with IT is, like any human relationship, about understanding the other's perspective. Really, you know, in the very vast majority of cases, when someone is bothering us with some constraints, it's not for pleasure or just to make our life worse. It's because they need to deliver on something, and they are also accountable for something. And then we need to make sure that we speak the same language, or at least have a common ground of understanding. I would say that would be one of the key things.
I remember, for instance, there might be things that you are not aware of, typically on some of the data access or systems access. You say, you know, I would like to have it open, in the spirit of sharing: sharing knowledge, sharing science, sharing everything. And they say, yeah, but you see, this and this can happen, and actually it did already happen, and you didn't know that. So listen also to the stories they have to tell, about where all this framework that was put in place comes from.
And afterwards, it's even better, because once it's well set up, then you can really work in a confident manner, in a safe environment. So, yeah, I would say understand and listen to the other, but also be able to explain your perspective and your needs in the most objective way, I would say.
Career paths and supervision
That's a good question, because that's exactly one of the reasons I moved to Nestlé. So I think it's inevitable to do some sort of supervision. Now, the great thing I have at Nestlé is that we have a path where we can develop expertise. And that's what I've chosen. I'm not developing into managing more and more people, having a group and a department and an institute or whatever. I'm developing the expertise in data science.
But still, with a certain level of seniority, people come to you and ask you for help, ask you for feedback, ask you, you know, can you have a look at what I've done and then tell me? And people also ask you, well, from your perspective, what strategic decision would you recommend? At some point, you take on this span as well when you grow in your career. So that's why I say it's inevitable to do some sort of management.
So I would say I don't have direct reports as of now, but I have, you know, indirect or dotted-line types of relationships. I may supervise some students sometimes, or cases like this. But it's more, yeah, something like a coaching and counselor type of activity. Still with the responsibility that goes with it, because I've sometimes had the feedback where, yeah, a senior leader would ask a project manager who is presenting the project, so these are the conclusions, oh, did you check it with Jean-Vincent? And if they say yes, then it's, oh, OK, that's right. So you're kind of the validation of the expertise.
Coaching beginners to code
Yeah, sure. So first I would say the key thing is with the person starting: well, if you are interested in coding, then that's a very good first step. The first barrier is that some people are scared by the code and some are not. So if you are scared, I would recommend you do something else. But if you are not, then it's a good thing.
And then I would say, yeah, maybe two or three tricks, but maybe they're kind of obvious. You can have some example code. You rarely start from nowhere, so you always have something of more or less good quality. I mean, well, I also started coding at some point in my life; I was not born coding. So sometimes other people would give me some part of code that was okay, and then I could start reading it and understanding it, and then adapting it. So a first thing is to have this example code and then adapt it.
And then something that really helps, I would say, is to not hesitate to have regular sessions, like weekly, where you just review the code. And if you have a question, then you can go to Stack Overflow, yes. I'm not sure that ChatGPT is doing a very good job; I've had some hallucinations. So double-check with the internet as well.
Then it's good to have also these sessions: let's take an hour or two hours sometime in the week, we sit together, explain me your code. That's where the different intentions behind the code come out. I would say also: don't hesitate to put comments, even for yourself. This is what I'm doing here, this is what I'm doing there. You know, just a few notes.
And then someone more experienced can propose some alternative way of doing the analysis, or a more efficient way. Because, yeah, sometimes it happened to me that, you know, I get to review code with someone and then you say, yeah, but you can do it a lot more efficiently this way. And it's not only a matter of being computationally efficient. Sometimes that matters because the code takes several minutes or hours to compute, so you'd better have something more efficient that really dramatically reduces that.
But sometimes it also helps the code to be clearer. If you have elegant code, then in the end you also have a better understanding of what is happening, you keep track, and you avoid hidden mistakes. Because in code, you can have cases where the code doesn't work and gives you an error. That's kind of easy: you have a capital letter somewhere that was not supposed to be there, or you don't have the right package. Okay, that's quickly fixed. But I would say the most prominent error is when the code runs everything and then outputs something that it is not supposed to output. And if you cannot spot that, that's pretty complicated. That's when, with clear and elegant code, you can more easily follow, I would say, what is going on.
And sometimes what can be good also is to set some challenges. I set myself one at some point. I said: from now on, my code in R has zero for loops. I did everything without for loops. First, it's more efficient. And sometimes it helps you to think a bit differently, with apply, typically with everything around apply and maps.
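A tiny before-and-after of that zero-for-loop challenge: the loop below and its `purrr::map_dbl()` replacement compute the same per-column statistic on a built-in dataset and give identical results.

```r
library(purrr)

# Loop version: range (max - min) of each column of mtcars
ranges_loop <- numeric(0)
for (col in names(mtcars)) {
  ranges_loop[col] <- max(mtcars[[col]]) - min(mtcars[[col]])
}

# Loop-free version: map over the columns of the data frame
# (base R's sapply() would work the same way)
ranges_map <- map_dbl(mtcars, \(x) max(x) - min(x))

identical(ranges_loop, ranges_map)  # TRUE
```

The functional version is shorter, names the result automatically, and makes the "one value per column" intent explicit, which is the "think a bit differently" part of the exercise.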
Tools, freedom, and working with IT
How much freedom do you have regarding the tools that you use? As much as I can justify the need, I would say. It's true we cannot use just anything; we need to use approved tools. But I have a lot of them, I would say more than enough tools to use.
Sometimes I come across things that I would like to implement. And then, yeah, the process is we can come with a proposal, but we first need to explain: this answers my needs because of this and this and this. And then IT would come back and tell you, well, maybe you can use this, which we already have in-house and which I wasn't aware of. And sometimes that happened: I was not aware of something, and then I have a new tool I can use that actually answers the needs.
Coming back to IT security: I would really first talk to an IT representative. I wouldn't come saying, I have this new tool, it is great, please buy it, because that doesn't work. I would say, hey, you know, I've come across this challenge in my work, do you know anything about that? I have this, by the way. And then in the conversation, you can maybe suggest it and see whether they know about it or not. And you can say, I may have a potential solution, but this is really what I need to have answered first in terms of need.
Well, first, on some of these free tools, IT is doing the job in terms of scouting and being aware of what is going on. Typically we have the example, you know what happened with Samsung. So very early, we had some communication internally saying, we know there is this coming out, please be careful about this. I think it's okay if it's an open-source and free tool that doesn't require admin rights to install on your machine; you can do it, but then you're responsible for it. I think this freedom always comes with responsibility.
Aspirations for the future
No, it's really that I'm really interested in, you know, basically trying to extract knowledge and understanding of phenomena from the data, and we have more and more data. So really going in, and also being aware of what is emerging. I mean, until recently we were doing some statistics and some linear regression, and now we have transformer architectures that can be applied beyond text in some cases. Getting more of these things, and trying to connect the dots between the different things that are available or that emerge from academic research.
Then, by doing this, in terms of personal development, typically on the expertise path, you can start to also give lectures in some universities or in some schools, to share your knowledge there and get feedback from different people. So I would say that would be a natural evolution: to reach beyond the company for this. Which is, by the way, aligned with the company.
Advice for data scientists and leaders
You have many different angles here, but if I had to pick one: I would say the field is evolving very, very fast. Be curious about new things, and always try them, with moderation of course, because you also need to deliver and to be efficient. So I would say, yeah, be curious about what is coming up and the new discoveries, and then don't hesitate to test these new things out.
And also keep the right balance between having new things and still setting aside the time for the people that you are maybe leading to learn about them, while still delivering what they have to do in everyday life. Because, you know, sometimes you have these new things that you want to try, but the projects must still move on. So I would say, keep the balance between what you are doing and what you want to do as well. There is a good balance to find between delivering with what you know already and keeping the curiosity and openness to try new things that may also open new opportunities.
Are there any particular avenues or channels that you use to stay up to date and learn new trends? I'm not subscribed to a specific journal. I would say there is no one channel; it's really a variety. Sometimes I get something on LinkedIn. Some other time, we have a specific challenge to solve in a project, and then there is a company that we can work with, and that would be the thread I would pull to, you know, better understand it. So I would say it's really more an opportunistic than a systematic way.
Thank you so much for your participation and leadership today. If folks want to ask you more questions or get in contact with you, what's the best way to connect with you? I would say LinkedIn. And don't hesitate to join the Posit Hangouts. Because, you know, I tend to not accept contacts if I don't know the person, if I haven't met them; it's kind of a rule I've set for myself, because otherwise you get overwhelmed. So don't hesitate to connect, but leave a word, and then most probably I'll come back to you saying, let's have a chat.
It was nice seeing everyone. Next week it'll be the regularly scheduled program with Rachel. But thank you so much. I hope everyone has a great rest of the day and week. Thanks so much. Thank you everyone for your questions. See you all.