I was a data scientist in a startup when a product manager walked up to my desk and asked, “What is unsupervised learning?”. I gave them a textbook definition and it helped us increase the revenue and make our customers happier. Soon, I got promoted.
This story obviously never happened, but if you google “data science interview”, this is the type of question you’re going to come across 🤷.
How much does this “what is (some ML concept)” format tell you about one's skills?
Say you're hiring a Product Data Scientist, and part of their job is going to be designing experiments. On a stats exam, you can expect questions about p-values and distributions 🎓. But if a data scientist is expected to make data-driven decisions, why not provide them with the context and ask how they'd measure the impact of a new feature? Eventually, after a discussion about why it has to be an experiment and what success would look like, it'll come down to the duration and stats tests, among other criteria.
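To make "it'll come down to the duration" concrete, here is a minimal sketch of the kind of back-of-the-envelope calculation such a discussion might end with. It assumes a conversion-rate metric, a two-sided two-proportion z-test, and a steady stream of eligible users split evenly between groups. The function names, the default significance level and power, and the traffic figures are all illustrative assumptions, not something prescribed by the interview format.

```python
import math
from statistics import NormalDist


def sample_size_per_group(baseline_rate, min_detectable_lift, alpha=0.05, power=0.8):
    """Approximate users needed per group for a two-sided two-proportion z-test.

    baseline_rate: current conversion rate, e.g. 0.10
    min_detectable_lift: absolute lift we want to detect, e.g. 0.02
    (Both the metric and the defaults are assumptions for illustration.)
    """
    p1 = baseline_rate
    p2 = baseline_rate + min_detectable_lift
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_beta = NormalDist().inv_cdf(power)           # critical value for power
    variance = p1 * (1 - p1) + p2 * (1 - p2)       # sum of per-group Bernoulli variances
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def experiment_days(n_per_group, daily_eligible_users, groups=2):
    """Days needed to collect the sample, assuming constant daily traffic."""
    return math.ceil(groups * n_per_group / daily_eligible_users)


if __name__ == "__main__":
    n = sample_size_per_group(baseline_rate=0.10, min_detectable_lift=0.02)
    days = experiment_days(n, daily_eligible_users=1000)
    print(f"{n} users per group, roughly {days} days")
```

In an interview, the formula matters less than the follow-ups it invites: where the baseline rate comes from, who decides the smallest lift worth detecting, and whether the test should run for full weeks to cover weekday and weekend behaviour.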
When working on a data science mock interview plot for a free interview preparation tool, I based it on my experiences while working at Uber and other startups. It includes some fond (and not so fond) memories such as A/B tests resulting in metrics going in opposite directions 🤯 or constantly balancing between working on long-term modelling projects and urgent ad-hoc requests.
Reviewing the first 500 mock interviews led me to a conversational format. It guides a candidate through these situations and lets them highlight their problem-solving skills.
I realised that a mix of open-ended hypothetical and behavioural questions works best. It’s also a good way to help candidates demonstrate their business sense and communication skills, in addition to technical knowledge.
Five Rules of Data Science Interview Questions
🧐 Provide clear expectations and don't anticipate a single correct answer
Often, interviewers expect to hear a certain solution or narrative in an answer but fail to articulate it when phrasing the question or giving hints. Think of some vague hypothetical situation that can be addressed in many different ways (e.g. "how would you predict user churn?"). Depending on the candidate's background, they might focus on certain things they're the most familiar with but not what the interviewer has in mind.
🧑‍🏫 Let it be as close to real work as possible
Unless you're looking for someone to help you cheat in exams, the fact that someone has memorized a complex formula says little about their value to the team. Unfortunately, it has become almost standard to leave just a few minutes at the end of an interview to let the candidate ask about the team and its projects. Interview questions relevant to day-to-day activities also give the candidate more insight into the kind of work your team does.
🤯 Avoid questions that require a lot of context-setting
Interviewing time is limited and you don't want to waste it setting the stage. Asking about niche topics relevant only to a specific industry might also give an unfair advantage to a candidate who happened to work on a similar problem recently. It's worth focusing on the skills that take a long time to acquire rather than something that can be learned in the first week at work.
🥱 If it can be Googled, don't ask it
... because they'll have a chance to look things up at work. According to Glassdoor, there's a link between interview difficulty and candidate satisfaction. As an interviewer, you’re trying to get the most signal about a candidate’s skills in a short time. Ideally, each series of questions should be a mini-project that allows a candidate to demonstrate their business sense, clear communication and technical depth. This approach also allows you to give hints and unblock them without spoiling the entire answer.
🤺 Know what skills are important for the role and focus on them
The goal of an interview is to assess how well a person would perform in a certain role, not to find out what they don't know. I used to ask a tricky SQL question that half the candidates failed, until I realised that those who didn't pass it were in fact familiar with the topic. They simply failed to notice a certain edge case (which is normal during an interview), and spotting it wasn't the goal of the assessment. In other words, have a clear picture of the skills and competencies critical for the role and focus on them.
At the end of the day, interviews are the most outcome-defining part of the hiring process. Keeping them fair and consistent is key to building a strong team. Good questions make interviews engaging and fun, and remind us why we chose to become data scientists in the first place.