In my discussions with many people about data science and artificial intelligence, I often hear something to the effect of, “Start-ups do not need data science. Let’s focus on capturing users by building the features that our users want.” Data science seldom makes it onto the list of priorities for most founders while they are working on their start-ups.

Most of these discussions revolve around the same reasons for not adopting data science: it is portrayed as expensive (mega-infrastructure!), it takes up too much time, it is very challenging (it needs expertise, and experts are very rare), and it can only work when there are HUGE amounts of data.

I hold a different opinion. Yes, start-ups do not need “sophisticated” machine learning initially, but the early stage is the right time to consider and prepare for data science capabilities in the organization.

Data Collection and Management Processes

Start-ups usually work with a “cleaner sheet of paper” than large enterprises, so this is the right opportunity to discuss what data to collect, the quality of the data to collect, at which stage of the business process the data should be collected, and so on. By having such discussions early on, data collection can be worked into the business processes easily, before those processes extend and get more complex, becoming a “big bowl of spaghetti”. It is easier to fix a car while it is moving slowly than to fix a giant car (a big enterprise) moving at high speed.

For example, most start-ups want to ramp up popular features quickly, so they need a data-driven approach to determine popularity (e.g. conducting A/B testing). Resources such as data (at which point in the business process the data should be collected, and at what granularity) and infrastructure (which database to use) can be discussed up front. This lets start-ups analyze the captured data quickly and either confirm that a feature is popular or decisively move on to other worthy pursuits when the analysis shows otherwise.
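To make the A/B testing idea concrete, here is a minimal sketch of one common way to compare two variants of a feature: a two-proportion z-test on conversion counts. The function name, its parameters and the sample numbers are assumptions made for illustration, and a real experiment would also need a sensible sample size and a significance threshold decided before the data is collected.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    conv_a / conv_b: number of users who used the feature in each variant.
    n_a / n_b: number of users exposed to each variant.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the assumption that both variants are equal
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("conversion rate is 0% or 100% in both variants")
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: 120 of 1000 users converted on A, 150 of 1000 on B
z, p = two_proportion_z_test(120, 1000, 150, 1000)
print(f"z = {z:.2f}, p = {p:.3f}")
```

A small p-value suggests the difference between the variants is unlikely to be noise. Even a few thousand users can be enough for this kind of check when the difference is reasonably large.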

Secondly, it takes time to collect data. Good quality data does not magically appear. It requires planning, from data collection and data quality to data storage and retrieval. Collecting data at the right quality level can cut down a lot of the data preparation work that is required before any analysis. Time is also an essential ingredient in collecting enough data.
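As a rough illustration of building quality into collection, the sketch below checks each event record at the point of capture rather than cleaning it up later. The field names (`user_id`, `event`, `timestamp`) and the rules are hypothetical; each start-up would define its own based on the discussions described above.

```python
from datetime import datetime

# Hypothetical schema: fields every collected event is expected to carry
REQUIRED_FIELDS = ("user_id", "event", "timestamp")

def validate_event(record):
    """Return a list of problems found in one event record (empty if clean)."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not record.get(field):
            problems.append(f"missing {field}")
    ts = record.get("timestamp")
    if ts:
        try:
            datetime.fromisoformat(ts)
        except (TypeError, ValueError):
            problems.append("timestamp is not ISO 8601")
    return problems

# A clean record passes; a bad one is flagged before it reaches storage
print(validate_event({"user_id": "u1", "event": "signup",
                      "timestamp": "2024-01-05T10:00:00"}))
print(validate_event({"user_id": "", "event": "signup",
                      "timestamp": "yesterday"}))
```

Rejecting or flagging bad records at this stage is one way to reduce the preparation work needed before analysis.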

With data collected very early on, start-ups can learn about the impact of their strategy and conserve resources (resources are precious in start-ups, right?) if the impact is not going to be positive or significant.

Thirdly, by starting data collection early, the start-up builds up one of the critical resources needed for artificial intelligence capabilities later on, if it moves through several rounds of funding. This might change, though, if we see further development of approaches like AlphaGo Zero, which learned through self-play rather than from collected human game data.

Data science needs HUGE amounts of data?

This misconception was likely brought about by the term “Big Data”, which was used extensively to create urgency among companies to adopt data science.

If start-ups want to tap into their data for value immediately, the first thing to do is to set up a reporting process, or perhaps establish an operational dashboard and a strategic dashboard. Decide on the relevant metrics for each dashboard based on the current business strategy.

For the operational dashboard, the start-up can have the metrics refreshed more frequently than for the strategic dashboard. The key here is for everyone in the start-up to understand how operations are currently doing: are we at the stipulated service level for our customers, is there a drop in user experience in critical areas, and so on. With that, the start-up can move its limited resources to the right areas to sustain operations at the right service level.

The strategic dashboard is more for the start-up to understand whether its current business model and business strategy (such as capturing users or extending the usage of existing users) are working.

These two dashboards do not need huge amounts of data, since the data captured can be processed immediately for insights (allowing a higher refresh frequency). They can help start-ups manage their operations and strategy quickly and effectively, ensuring limited resources are used in the areas that have the largest positive impact.
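To show how little data an operational dashboard needs, here is a small sketch that turns a handful of event records into two daily metrics: active users and the share of requests served within a target time. The record fields (`user_id`, `timestamp`, `response_seconds`) and the two-second target are assumptions for illustration only.

```python
from collections import defaultdict

def daily_metrics(events, target_seconds=2.0):
    """Summarize events per day: active users and share of responses within target."""
    users = defaultdict(set)
    within = defaultdict(int)
    total = defaultdict(int)
    for e in events:
        day = e["timestamp"][:10]  # "YYYY-MM-DD" prefix of an ISO timestamp
        users[day].add(e["user_id"])
        total[day] += 1
        if e["response_seconds"] <= target_seconds:
            within[day] += 1
    return {
        day: {
            "active_users": len(users[day]),
            "service_level": within[day] / total[day],
        }
        for day in sorted(total)
    }

# Hypothetical event log with only three records
events = [
    {"user_id": "u1", "timestamp": "2024-01-05T09:00:00", "response_seconds": 1.0},
    {"user_id": "u2", "timestamp": "2024-01-05T09:05:00", "response_seconds": 3.0},
    {"user_id": "u1", "timestamp": "2024-01-06T10:00:00", "response_seconds": 0.5},
]
for day, metrics in daily_metrics(events).items():
    print(day, metrics)
```

The same function can be rerun whenever new events arrive, which is what makes a frequent refresh cheap.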

Data science is expensive?

Data science need not be expensive. A start-up should not commit a lot of funds to tools without having a good long-term usage plan. My suggestion is to plan out the data science use cases that the start-up wants to work on and research the tools that are available, then see whether it makes sense to go for open source or enterprise tools. Only commit to purchasing tools when it makes absolute business sense, that is, when the value produced by these tools exceeds their cost. I strongly believe that infrastructure should grow together with the value produced by the use of data science in the start-up. Buying enterprise tools immediately without a good plan is likely to result in a huge waste of resources that are scarce in the start-up environment.

As previously mentioned, the types of analysis or machine learning done at the initial stages of a start-up need not be complicated, so start-ups could offer interns the opportunity to carry out this work, giving them relevant experience that can greatly benefit their careers in the long run. This creates a win-win situation. The start-up gets workable use cases and understands the value of data science at a low cost, which may include the nice surprise of discovering data science talent along the way. The interns get to practice what they have learned in their undergraduate studies and see the strengths and weaknesses of their current set of skills. To ensure that this win-win situation creates the largest impact, having a mentor to guide the intern(s) will be beneficial. More importantly, the mentor needs to have practical experience and to have worked on data science projects before.

Access to data science resources

Most of the data scientists I have met are always on the lookout for interesting challenges to work on, assuming they are adequately paid. In other words, what attracts data scientists is never salary alone but also the kind of challenges provided. So if a start-up can provide good challenges and a supportive environment, it can attract its fair share of data scientists.

VCs and angel investors may want to hire a data scientist (in a permanent role or on a consultative basis) to work on the data science projects provided or identified by the start-ups in their portfolios.

In conclusion

Start-ups should start thinking about building data science capabilities as early as possible. The greatest benefit of starting early is the amount of data collected, since data does not appear with a snap of the fingers. Planning early allows the start-up to collect good quality data, iterate quickly and move up the data science learning curve much earlier than its competition.

Infrastructure should grow together with the amount of value derived from the use cases. More importantly, the cost of implementing use cases should move in tandem with business value. This creates sustainable momentum for adopting data science in start-ups.

Start-ups do not need huge amounts of data at the start. They can start gaining insights from whatever data they have captured and use these insights to conserve resources and focus on more critical areas.