Interview with Data Scientist - Philippe Van Impe

Philippe Van Impe is a Founding Partner of the Brussels Data Science Community, a large open community of specialists in data and business. The community's activities aim to bridge the gap between businesses and academia, through regular meetups, training and ‘Data for Good’ projects, where members contribute their skills to work on projects with NGOs, public institutions and startups.

Philippe is also the Founder of the European Data Innovation Hub. The Hub connects and supports data professionals throughout Europe to share and discuss best practices in open data, big data and data innovation.

Philippe is the CEO and Founder of DigitYser.org, the digital flagship of Brussels where communities gather to learn, co-create and implement digital solutions focused on Blockchain, AI, IOT & VR.

Philippe is a Social Entrepreneur simultaneously pursuing both a financial & a social return on investment.

Source: LinkedIn

Interview conducted by Alan Mitchell & Martin Ganchev.

Hi, Philippe. Could you give us a brief introduction of yourself, what you do, what your background is, what you’re doing now?

Yes. My name is Philippe Van Impe. I have been involved in data science for the last 4 years.

I started the ‘Data Science Community of Belgium’ in 2014 and we started by using meetup.com and organizing evening events to understand what data science was all about. After that, we grew very quickly and now we are over 3,700 data scientists. We have a LinkedIn group and we also organize a yearly ‘Data Innovation Summit’.

What does a Data Innovation Summit involve?

So, the first year we organised it was 2015 and the topic was ‘Data Science in Belgium’. It was a general topic, and we had 500 people attending. The year after, because the feedback was that there were many men the first year, we organised a ‘Women in Tech’ data science event. At this event, we had 40 women speakers.

In 2017 we focused around ‘Data for Good’, the idea of using data to build a better world.

In 2019, the event took place on the 26th of June. We talked about AI and focused a bit more on educating management.

Today there’s no longer a lack of data science education. The only shortage is on the people side: corporate leaders and managers who need to understand how to use data science, analytics and algorithms. If you want this transformation to work, the issue is not on the technical side anymore. It’s more on the management side of things. I’m not talking about project management, I’m talking about corporate people who are deciding what projects they will do next. What they lack is an understanding of analytics.

Where does this lack stem from, why is there a lack on this side of things?

The topic is quite new. Before, IT just automated everything that was already done by hand and there’s not a lot of added value in that anymore.

Now, the data science part and the AI part are much more valuable; the leverage effect that a data science project can have is much bigger. That’s why decisions must be made at board level, and to make a good decision at board level you have to understand what it’s all about.

And I think that’s the area that needs to be focused on at the moment. Our problem is that these people don’t know it because when they studied 20 or 30 years ago there was no data science. There was some statistics, but that was not such an impactful subject. Today they are all in their 50s, like I am, but they don’t know what is happening and they don’t know what is possible. One of our main challenges today is to inform directors, managers and senior managers of the value of doing analytical projects and the importance of this transformation.

What steps do you plan to help this process in the future?

We have our own space in Brussels, Belgium. It’s a place called DigitYser and it’s a meeting place for communities... for people involved in data, data science, big data, machine learning. It’s their meeting place.

We have a 2000sqm office location, we also have a lot of training rooms and we organise executive lunches to talk about these different topics.

Top managers are very busy. These people will not be able to spend 3 or 4 days on training, but they can come for lunch to talk about one specific topic. For example, we recently held a lunch on the ethics of algorithms. We did a lunch on data protection, data ERP. These are things we are doing to inform management.

Along with that, we have our training tracks where we have data science camps which take place each year. It’s a 6-month process starting in July with coding-camps which allow students to become certified in SAS, SQL, Python, R and Microsoft Azure. Within this, we have a leaderboard and we select the 25 best and most active students and we put them in the Data Science Bootcamp.

The data science bootcamp is an activity where they will only see projects from A to Z.

Each time, there is a 2-day project: one about logistics optimisation and one about building recommendation engines. In the first step they understand how to program, and in the second step they have to understand the business issues and how to set it up.

So, that’s the data science boot camp we are organizing.

As well, we are very active in allowing junior data scientists to come and work here in Brussels on different projects that we have in the pipeline. Each year we do a big ‘Data for Good’ project. Last year we did the ‘Dengue Hack’.

Dengue is a mosquito-borne disease and we have the denguehack.org organization that we set up. There were 7 workshops and one big hackathon to identify where the next dengue outbreak will be.

This year we are doing the ‘HIV Hack’ and you can find information on the HIV Hack LinkedIn page. What we are going to do is work on identifying the areas where a lot of HIV drug resistance is appearing.

HIV drug resistance appears when somebody does not take their medication on a regular basis for different reasons: sometimes there are stock outages, sometimes they sell the medication to someone who has more money. There are many reasons.

What we want to do with the ‘HIV Hack’ is to build a method to put all these sensitive areas on the map.

It’s going to be a visualization exercise where we want to build maps of different countries where we will be able to see - ‘oh, in this region there is a lot of HIV drug resistance appearing’.

This is so that governments can act and understand why there is so much drug resistance appearing in that region. Then, instead of running a general marketing campaign, they can run a focused marketing campaign in the regions where the problems are greater.

We’ll be working on this for a year and we are launching on the 22nd of March. We will do workshops before the summer and on the last weekend of September, we are doing a 2-day hackathon. The interesting part is, at the end of the hackathon we will identify teams who have built the best method to visualize those areas and we are going to send those teams to the countries.

We are going one step further than last year: we are sending teams to the countries to train the local teams to work with this system and do their own local hackathon.

These are things we are doing on the data science side.

Other things - we are opening local data science communities in different universities of Belgium. Communities in Ghent, Antwerp and Leuven, and now also Liege, have been opened.

So, we now have these centralised teams working on different data science topics of importance.

This weekend we are doing a hackathon using machine learning in the music area, and a Belgian pop star is coming here. We will have a team of data scientists trying to see how we can use data science to make better music.

Well, that sounds fascinating. Can you tell me more about that?

It will be about understanding what makes good music and using machine learning in order to make better music or to make hits. I don’t even know how far they can possibly go with that. I know the best data scientists in Belgium have signed up, so I think it’s going to be a very important and very fruitful hackathon.

Through DigitYser you work closely with blockchain, how do you see the relationship between this and data science developing in the future?

The technology around blockchain is still very young and it requires a lot of manual work in order to make the smart contract. To build a smart contract, you would imagine it’s all about using your phone and shaking it and sending it to somebody but it’s not the case. Using a smart contract is more like that green screen programming we did in the 80s and it’s a very long process to build a smart contract.

I think that will be an area where data scientists will be involved. It’s an evolution: where today all the smart contracts are built in JavaScript, tomorrow they will evolve to use Python. So, we could very well see data science people and data analysts switch towards working with smart contracts, maybe using Python.

With a smart contract, we agree to do something at a certain time or to deliver something and a third party then confirms that thing has been done. There will be some data science there, and I see people from the data science community involved very much in the blockchain area. A lot of them are interested in the IoT world and in the VR world as well.

Can you give us more information on how you are using VR and AR?

The space here is so big we invited other communities to use it. That’s why the VR community came and the blockchain community was added and the IoT community came and now you see these people work together and start to build solutions together.

There is always that area where two technologies touch each other, and people can work together on these things.

We recently built a Smart Virtual meeting application where people could have the feeling that they were in a meeting with somebody else, but it’s virtual. Behind that, there is some intelligence and some speech recognition so that the person could say ‘I want to attend that meeting in Brussels’. We did that for one of the big pharmaceutical companies. We did a hackathon on that.

Which tools and software are essential for your team (coding languages, data management tools, visualization tools)?

We believe a lot in open tools. I see guys working more with Python and the guys who work on analytics are not letting go of the R environment. So, R and Python are still the driving software applications. I see them still working a lot on their local portable PC or Mac. I don’t see them working a lot in the cloud. Definitely, on the data science side, I haven’t seen them working in the cloud.

I see the importance of new technologies that allow streaming data to be analysed. I see the need for nanosecond decision making in the media, where somebody is on the website and the system decides in a nanosecond what kind of advertising it should present. That’s where it becomes interesting: you take this nanosecond decision about what advertising this person is going to see, and if you add data protection to that and put it in a smart contract, you end up with data protection in a smart contract on streaming data. It’s going to be fun to get that working.

We know that blockchain technology is not a good platform for nanosecond transactions, so that’s an issue there.

In which industries do the companies you typically work with operate? Where is big data the ‘biggest’ and most challenging?

A lot of the companies you see there are consulting organisations like training partners across many industries. Mainly I see a big demand from the tech world where the business model is being transformed. Also, where companies are expecting much more data science magic to happen in order to make their new business processes more efficient. So, a lot of tech companies, insurance, banks are taking data science people.

So, when you look at it you could say any industry where the business model is changing because it is becoming more virtual would require data science experts. The more classical industries, the brick and mortar industries, continue to work with brick and mortar and require fewer data science experts.

Can you think of a situation when you have worked with a given company and you felt especially proud of what you helped them achieve?

The 'Dengue hack' was a good example of that.

Also, we are training data scientists every year and you see them evolve into the data science world, going from knowing a little to knowing more and more, getting a job and coming back to visit us years later. That’s another thing that makes me very proud.

Every time we conduct these interviews, we finish with some nerdery. What is the one nerdy thing you would like to share with the world? It doesn’t have to be data science related.

I don’t have a joke as such, but I will leave them with an open invitation here in Brussels and they are very welcome at DigitYser. Also, I would call out for help for the ‘HIV hack’.
