Need a Data Scientist? Try Building a ‘DataScienceStein’
Organizations are finding that hiring a qualified data scientist is a real challenge. Experienced data scientists are expensive and are usually employed elsewhere. This combination of high demand and low supply is creating a divide between the ‘haves’ and the ‘have-nots’: the larger, financially rich organizations in the ‘sexy’ industries are best placed to attract and hire data scientists, while smaller companies have to make do without one.
Organizations are looking at new approaches to finding data scientists. Some attract them with more than money, such as autonomy and development opportunities. Others train current staff to become more data literate through professional development programs. Once trained, these individuals typically must work 12 to 24 months at the organization or pay back the amount spent on their training.
There is another approach that should be considered. It involves building your data scientist out of a team of people currently on staff or readily available in the marketplace. This is the DataScienceStein approach, modeled after the monster in Mary Shelley’s Frankenstein, which was built from several human parts. In this case, the DataScienceStein is built from a team with a variety of skills.
Some of the key skills most often required of a data scientist include:
- Data integration
- Advanced analytics
- Data visualization
- Industry or subject matter expertise
- Communication skills
- Programming skills
Other skills may be needed depending on the industry, the company and the maturity of the organization’s analytics function.
To cover these skills, a DataScienceStein team would need to include a:
- Data analyst who is responsible for gathering the data in an easily accessible location for use by other members of the team;
- Business analyst who understands the business and the data most relevant to the business leaders;
- Modeler responsible for creating and executing statistical models;
- Programmer to write the scripts in the preferred language to prepare, integrate and analyze the data for the modeler and the visualization specialist to use and the industry expert to review;
- Visualization specialist to translate the data results into visually engaging charts and diagrams;
- Subject matter or industry expert to provide insight into the data and the industry, and perspective on the results of the models.
Larger teams and/or larger projects could also require additional programmers to write the code to access the data, quantitative analysts to help write scripts to access the data, and project managers to keep everything on track.
Two roles that are extremely important to the effectiveness of the team are the subject matter or industry expert and the visualization specialist. Depending on the industry or business, having a person with extensive experience will help the team deliver more valuable analysis. This person provides the reasonableness check by determining whether the results of the analysis ‘make sense’. The importance of these experts should not be underestimated.
The visualization specialist is responsible for turning the data findings into graphics that non-technical people can easily understand and that tell a story about the insights the data generated.
The DataScienceStein approach has several advantages over hiring a traditional data scientist, including:
- It broadens the applicant pool. Because the organization is searching for specific skills across several people rather than for all of the skills in one person, it has a better chance of identifying and hiring people to fill the positions. For example, data science is a fairly new function, so it will take a while for supply to catch up with demand. Modelers, on the other hand, have been around for decades. The chances of finding a qualified modeler are many times better than those of finding a data scientist.
- Cross-functional learning. Members will naturally pick up skills from each other, which will make them more effective at the work they are individually tasked to perform. This cross-functional learning benefits not only the DataScienceStein team but also the individuals, who gain valuable skills to help them in their careers.
- Key knowledge is disseminated. Years ago, when data was hard to access and the tools were harder to use, there was typically one person with the skills to perform the analysis for an organization. If that person was hit by a bus on the way home, all of that knowledge would go with them. Now that the data is more accessible and the tools are easier to use, several people can access the data and perform the analysis. A cross-functional team is even better situated to share this knowledge with each other and with those outside the team, creating the opportunity to accumulate the information for others to easily access and use.
The main obstacle to the success of a DataScienceStein team, or of any data science project, is a lack of leadership. Without clear direction and authority, the projects may join the vast nebula of projects launched by the organization that end up incomplete or ineffective because no executive leadership was provided. Leaders must have a vision for the team, be engaged with its members and support them as needed.
Once you have decided to take this approach, the only decision left is whether to build a Boris Karloff DataScienceStein, who brings terror to the villagers, or a Peter Boyle DataScienceStein from Mel Brooks’ “Young Frankenstein”, who can sing and dance for the audience.