How to Ask Good Data Questions

    Nelson Tang

Have you ever spent weeks building a great data product, only to find that you solved the wrong problem? Or built a Cadillac when all they needed was a bike?

Enter a secret from project management: you gotta nail the requirements, or the 'problem to be solved'. The analyst needs to ask the right questions to elicit the true requirements from the stakeholder, who often doesn't understand your domain or might be obscuring the true nature of their problem.

Caitlin Hudon is an awesome person and crowdsourced some responses in her Twitter thread on this topic:

Writing an internal "how to ask a good data question" guide for both technical and non-technical stakeholders. What would yours include?

- Caitlin Hudon 👩🏼‍💻 (@beeonaposy) January 8, 2020

Where a new analyst should begin

Here's a great framework from Laura Ellis (@LittleMissData) to start with:

I wrote something along this lines showing ppl how to break down a business problem into a tangible data problem if it helps! I teach ppl this way at work.

- Laura Ellis (@LittleMissData) January 9, 2020

Mini checklist

Here are a few one-liners that I loved and that you can use as a checklist at the start of a data project. I wish I had asked these at the start of a few projects where I missed the mark.

  1. If you had the answer, what action would you take?
    • or: If you get a perfect 100% confidence answer back, what would you do with that information?
    • Who will take action using this data? What will they do?
  2. Who else may find this useful?
  3. What are you doing today? What are you trying to do that you can't today?
  4. Can you make a decision with imperfect data?
    • If I can give you an 80% confidence answer in a day, vs a 95% confidence answer in two weeks, which would you prefer?
  5. How do you want to receive the results?
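If it helps to make the checklist reusable, here is a minimal Python sketch of an intake form built from the five top-level questions above. The names `INTAKE_QUESTIONS` and `unanswered` are my own illustration, and the sample answers are made up; treat it as a starting point rather than a finished tool.

```python
# Hypothetical intake template. The question wording comes from the checklist
# above; everything else (names, structure) is an assumption for illustration.
INTAKE_QUESTIONS = [
    "If you had the answer, what action would you take?",
    "Who else may find this useful?",
    "What are you doing today? What are you trying to do that you can't today?",
    "Can you make a decision with imperfect data?",
    "How do you want to receive the results?",
]


def unanswered(answers: dict[str, str]) -> list[str]:
    """Return the checklist questions that have no answer or only a blank one."""
    return [q for q in INTAKE_QUESTIONS if not answers.get(q, "").strip()]


if __name__ == "__main__":
    # Made-up answers from an imaginary kickoff meeting.
    answers = {
        INTAKE_QUESTIONS[0]: "Pause the campaign if conversion is below target.",
        INTAKE_QUESTIONS[4]: "A one-page summary by email.",
    }
    for question in unanswered(answers):
        print("Still need to ask:", question)
```

Running it lists the questions you have not covered yet, which is a cheap way to notice a gap before the project starts.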

Other responses that are worth some additional reading

Jeremy Howard's Project Checklist blog post: a pretty big questionnaire that he used over decades of consulting work. The key takeaway is that you need to head off problems like initial constraints and resource limitations early, and understand how the project is going to benefit the company.

Do you have metrics about your data? Introducing Data Meta-Metrics.

1000% agreed. I've done some projects like this where the data wasn't great, and relied on data meta-metrics to help communicate availability and quality back up to stakeholders. (Like this:

- Caitlin Hudon 👩🏼‍💻 (@beeonaposy) January 8, 2020

Caitlin added scores for Relevance, Trustworthiness, and Reliability to describe the quality of the data or the analysis. I liked how she defined the scales. Once you see it, it's hard to imagine seeing data without meta-metrics.
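To make the idea concrete, here is a rough Python sketch of attaching those three scores to a result. The 1 to 5 scale, the `MetaMetrics` class, and the label format are my assumptions for illustration; Caitlin's post defines its own scales, which I'm not reproducing here.

```python
from dataclasses import dataclass

# Assumption: every score uses a 1-5 scale. The original post defines its own.
SCALE_MIN, SCALE_MAX = 1, 5


@dataclass(frozen=True)
class MetaMetrics:
    """Hypothetical container for the three scores named above."""

    relevance: int
    trustworthiness: int
    reliability: int

    def __post_init__(self) -> None:
        # Reject scores outside the assumed scale so bad labels fail loudly.
        for name, value in vars(self).items():
            if not SCALE_MIN <= value <= SCALE_MAX:
                raise ValueError(
                    f"{name} must be between {SCALE_MIN} and {SCALE_MAX}, got {value}"
                )

    def label(self) -> str:
        """One-line summary to put next to a chart or table."""
        return (
            f"Relevance {self.relevance}/{SCALE_MAX} | "
            f"Trustworthiness {self.trustworthiness}/{SCALE_MAX} | "
            f"Reliability {self.reliability}/{SCALE_MAX}"
        )


if __name__ == "__main__":
    print(MetaMetrics(relevance=4, trustworthiness=2, reliability=3).label())
```

The point of the label is that the stakeholder sees the caveats in the same place as the number, instead of in a footnote they never read.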

Someone recommended a book called "Optimizing Data-to-Learning-to-Action". It looks promising at first, since it advocates for the use of data to drive decisions and actions. Aspirational to be sure, but reading through the preview pages was a waste of time, as it's all advocacy and no clear suggestions yet. It's likely to be just another consulting framework that is too vague to be useful. But yeah, the core idea holds: you should work backward from the decisions to the data, estimate the flow of value along the chain, and identify potential issues. I love OODA (observe, orient, decide, act) and agree that you should build things that help you make decisions. I'm still looking for more books that have case studies.

Julia Evans covers a related issue in her blog post How to ask good questions. It's geared more towards asking for help or advice than scoping a data project, but I'm including it here because it's good general advice.