We need answers! We have data!

Abstract: Data Science is even more relevant to the business when the economy takes a hit. You need to make informed decisions based on data. You need questions answered. You have more data than you know, and if you need more, no field is better positioned than Data Science to get you useful data. This article covers the preliminary steps where a Data Science consultancy can help you.

When the going gets tough

Andreas Weith CC-SA4

Every business on the planet operates on decisions, even when the business is unaware that it is making these decisions in absentia.

In many cases, when the economy is doing well, companies tend to do well, even when sub-optimal decisions are made ("a rising tide lifts all boats", and all that). When the economy takes a downturn, however, business decisions and processes get more scrutiny.

What is hiding underneath it all? Questions get more defined, problems bubble up to the surface. People start saying:
"We need answers!"
When that happens, people will meet. Somebody might say "Data Science can solve this with Data", to which somebody else will reply:
"We don't have any data. I've heard on LinkedIn that you are not ready for Data Science if you don't have data."
Or maybe a variation on the theme. And perhaps you've wondered if this is a reasonable approach (I mean, no data, there is no way you can do Data Science, right? LOL!). Actually, it is not. Some articles that touch on this point are not talking about Data Science or Data Scientists. They are focusing only on Data Modelers or, even more precisely, Machine Learning Engineers, which is only one of the many facets of Data Science.

For more information on why you need end-to-end data science, see the Harvard Business Review article on the subject.

We (don't) have Data...

First, let's get this myth out of the way. It is highly unlikely you do not have any data at all. You have customers, you invoice them, you do have data. You have sales records, you do have data. You have a marketing website, it has logs, you do have data. You have PBX telephony logs, you do have data. You have emails with customer interactions, you do have data. You get invoices, you scan them, you do have data.

These are only a few examples, but I'm sure you get the point. Perhaps you didn't consider these as data sources. This is where a Data Science consultancy can help in identifying the data you already have. We've seen a lot of scenarios over the years...

But wait. What if you don't have any of these, or the data you have cannot be used to answer your business questions or solve your business problems?

We (can) have Data...

Data Science is not about algorithms or metrics. To the business, it is irrelevant whether the answer to a question comes from a Monte Carlo simulation, a linear regression, a Random Forest classifier, or a Convolutional Neural Network. Data Science is about solving business problems. But it does require some data.

"Data Science is not about algorithms or metrics. It is about solving business problems - Francois Dion"

If your business doesn't have any data relevant to the problem at hand, a boutique Data Science consultancy is a good partner to advise you on the acquisition of relevant data. Too many times, we've seen customers acquire expensive third-party data, especially for demographics, only to discover that the data didn't help any decision making, or worse, increased the errors of the prediction models they were using.

Sometimes, acquiring the right data can only be done through a survey. This requires preparing a protocol for the survey, designing a questionnaire, testing it, conducting the interviews, compiling the results and analyzing them. Dion Research can design and conduct these for you and provide you with the results.

At other times, the data is not readily usable. It might require some computer vision or natural language processing in order to be fully leveraged (for example, customer interactions through email, social media, etc.), or it might only become usable through a combination of several data science steps.

It is also possible that the data you need is available through a commercial data set or a publicly available data source. Again, through specific experiments, we can help you identify which sources should be of interest and which should be ignored.

We (do) have Data...

As data is identified, and sources are collected, transformed, adapted, or created through instrumentation, polls, surveys, simulations and the like, there is another important step that needs to be covered before going any further: Data Quality.

This issue of Data Quality goes beyond what is usually done in Data Governance or Data Management. This was covered in part in "What Grade of Data are you using?". The main thrust is that data quality has to be part of the whole Question-to-Answer pipeline. When it comes to surveys, the questions and answers have to be evaluated in terms of quality. And when we collect data, whatever the source, we once again profile the data and measure its quality.

This measure is what helps a Data Scientist select an appropriate model, the right algorithm, the cross-validation protocol, the hypothesis tests, etc. Dion Research is able to provide all these services by leveraging the automated Data Quality of its flagship product, VISUAI.
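To make "profile the data and measure its quality" a bit more concrete, here is a minimal sketch in Python. It is not how VISUAI works; it only illustrates the kind of per-column measures (completeness, number of distinct values, most frequent value) a first profiling pass might produce. The `profile` function, the list of values treated as missing, and the invoice records are all assumptions made up for this illustration.

```python
from collections import Counter

# Values treated as "missing". This list is an assumption for the sketch;
# real data sets usually need their own, source-specific list.
MISSING = {None, "", "NA", "N/A"}


def profile(records):
    """Return simple per-column quality measures for a list of dict records."""
    columns = sorted({key for row in records for key in row})
    total = len(records)
    report = {}
    for col in columns:
        values = [row.get(col) for row in records]
        present = [v for v in values if v not in MISSING]
        counts = Counter(present)
        report[col] = {
            # Share of rows where the column holds a usable value.
            "completeness": len(present) / total if total else 0.0,
            # Number of different usable values seen in the column.
            "distinct": len(counts),
            # Most frequent usable value, or None if the column is empty.
            "most_common": counts.most_common(1)[0][0] if counts else None,
        }
    return report


# Hypothetical invoice records, invented for illustration only.
invoices = [
    {"customer": "A-001", "amount": "120.50", "region": "East"},
    {"customer": "A-002", "amount": "", "region": "East"},
    {"customer": "A-001", "amount": "89.00", "region": None},
    {"customer": "A-003", "amount": "N/A", "region": "West"},
]

for column, measures in profile(invoices).items():
    print(column, measures)
```

Even a crude report like this starts a useful conversation: a column that is only half complete, such as the invented `amount` field here, may need cleaning or a different collection process before anyone builds a model on top of it.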

We have Data!

At this point, we can truly say "We have data". By working as a team, your business and our consultancy can not only transform your company into one that makes decisions based on data, but also get it ready for the next phase of Data Science: Automating the automation.

This will be the subject of a future blog post where we discuss how enterprises can continuously make informed decisions based on current data.

Francois Dion

Chief Data Scientist


About Dion Research LLC: We are a boutique Data Science consultancy, established in 2011. As we do end-to-end Data Science, we can help you solve business problems every step of the way. Get in touch for more information.