Here are 8 Tips for Avoiding Unrealistic Data Science Project Expectations


Organizations across all industries are collecting more data than ever, and looking to data scientists and analysts to glean insights that will help improve business. However, with all of the hype around big data, it’s easy for data science project expectations to spiral out of control.

“Companies are super excited that data will solve every problem that they have,” said Andrea Danyluk, a professor of computer science at Williams College and co-chair of the Association for Computing Machinery’s taskforce on data science. “It very well may be that data and data science will solve many of their problems and will move their business forward. But with every project you do, you should sit back and think very hard about the specific data you’re collecting and the potential implications about what that’s going to mean.”

For example, this means considering potential biases within the data itself, and how those biases could impact your business moving forward, Danyluk said.

Ultimately, “data science is not a silver bullet,” said Dave McCarthy, vice president of Internet of Things (IoT) provider Bsquare. “Instead it’s the highly advanced and ongoing mathematical analysis of extremely large data sets in search of unique and actionable insights.”

Here are eight tips on how your organization can avoid setting unrealistic data science project expectations.

1. Start small

Start with a small, low-risk project, said Meta S. Brown, business analytics consultant and author of Data Mining for Dummies. This means something that you aren’t very worried about at the moment, but that has a high chance of yielding success.

“One of the most common places to do that that most organizations are not really doing is testing something in your email,” Brown said. For example, most email newsletter vendors offer the ability to test alternative versions of an email. You could start testing your subject lines and seeing which produce more opens and clicks.

“That’s as low-risk as you possibly can go—you have nothing to lose, and you don’t have to spend any money, because your vendor already provides the technical capabilities,” Brown said. “And you might find out that, hey, this subject line works better than that subject line. It’s a good example of something that might be right there for you to do, and where you could start to show value.”
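To make the subject-line test concrete, here is a minimal sketch of how the results might be compared once the vendor reports how many emails were sent and opened for each version. It uses a standard two-proportion z-test, which is one common choice and not something Brown prescribes. The function name and the counts are hypothetical, invented only for illustration.

```python
import math


def compare_open_rates(opens_a, sent_a, opens_b, sent_b):
    """Two-proportion z-test on open rates for subject lines A and B."""
    rate_a = opens_a / sent_a
    rate_b = opens_b / sent_b
    # Pooled open rate under the assumption that both lines perform the same.
    pooled = (opens_a + opens_b) / (sent_a + sent_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / sent_a + 1 / sent_b))
    if std_err == 0:
        raise ValueError("open rates are all 0% or all 100%; nothing to compare")
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_a, rate_b, z, p_value


# Hypothetical counts, not real campaign data.
rate_a, rate_b, z, p_value = compare_open_rates(
    opens_a=200, sent_a=1000, opens_b=250, sent_b=1000
)
print(f"A: {rate_a:.1%}  B: {rate_b:.1%}  z = {z:.2f}  p = {p_value:.4f}")
```

A small p-value suggests the gap in open rates is unlikely to be chance alone. With only a few hundred recipients per version, even a visible gap may not clear that bar, which is one more reason to keep expectations modest on a first project.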

2. Create an analytics plan and process

Organizations need an analytics process, Brown said. “When people complain that analysts are not solving the right problems or giving them the right information, that’s a reflection of a process problem,” she added.

The process can begin by gaining agreement on what in the organization is a problem, and choosing a small problem that everyone can define and agree to work on, Brown said. Then, you have to evaluate whether you have the data to solve it.

3. Ignore the trends

Avoid starting with a flashy project, Brown said. “Don’t worry about what’s cool. Worry about what’s cost-effective for you,” she added. “The cool factor can be a really big problem.”

4. Don’t obsess over tools

When it comes to a data-driven project, “tools are the last thing you should think about,” Brown said. However, companies need to determine which particular products are important and spend the money when they need to, instead of spending a lot of time waiting or seeking another solution, she added.

5. Understand the computational limits

While data analysis can improve many processes, “there are mathematically provably things that cannot be done unconditionally,” Danyluk said. “It’s a wonderful thing to think that one field would be able to do everything to solve the world’s problems with data. But there are things that cannot be done through the end—unless we have a completely different framework for how we think about computation, it’s just not going to happen.”

Read the source article in TechRepublic.