The term ‘Data Science’ is increasingly used in business today. Over the last five or so years it has risen from relative obscurity to the forefront of strategy, influencing how companies operate and plan for the future.
From our experience, people use the term data science to cover a whole host of activities. We think it is best described as the process of using different tools and algorithms, combined with the fundamentals of Machine Learning, to discover hidden patterns in data.
What’s the point?
Ultimately, data science is about unlocking the hidden value in data. Sometimes a data discovery provides a valuable insight to a business and the story ends there. In more complex scenarios, the data can be used to identify trends, and with these trends we can start to build predictive models.
Predictive models aren’t about seeing the future; they are about giving you the ability to forecast outcomes for important problems, and this is all powered by data. The goal of a data scientist is to understand the challenges currently facing a business and, where those challenges suit a data-driven approach, to use insights and predictive models to provide potential solutions.
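To make the step from trend to forecast concrete, here is a minimal sketch in Python. It assumes the simplest possible case, a straight-line trend fitted by ordinary least squares, and the function names and monthly booking figures are made up purely for illustration. Real predictive models are usually far richer than this.

```python
def fit_linear_trend(values):
    """Fit y = intercept + slope * x by ordinary least squares, with x = 0, 1, 2, ...

    Needs at least two observations.
    """
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast(values, steps_ahead=1):
    """Extend the fitted straight line beyond the last observation."""
    slope, intercept = fit_linear_trend(values)
    return intercept + slope * (len(values) - 1 + steps_ahead)


# Hypothetical monthly booking counts, invented for this example.
monthly_bookings = [100, 110, 120, 130]
next_month = forecast(monthly_bookings)
print(next_month)
```

The forecast here is only as good as the assumption that the trend continues, which is exactly why such a model gives you an estimate rather than a view of the future.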
How do businesses use it?
There are countless examples of businesses using data science, in everything from fraud detection in financial services to predictive maintenance in manufacturing. The basic principles can be applied to any area where data is core to a commercial operation. The key is finding the cases where this may not be obvious!
Imagine a business that runs a booking platform for hotels. It examines its data and finds that hotels showing a scenic window view are perceived better by travellers. This is an insight from the data and could be leveraged to increase bookings.
However, the question of which images a hotel presents to potential customers could be taken further. A predictive model could be built to rank the images in terms of attractiveness and order them accordingly, so that customers are shown the best images of the hotel first. This is an insight-driven solution, and one which Expedia successfully employed to drive a significant increase in revenue [1].
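The ordering step can be sketched in a few lines of Python. This is an illustration only and not Expedia's implementation: the HotelImage class, the file names and the scores are all hypothetical, and we assume the attractiveness scores have already been produced by some trained model.

```python
from dataclasses import dataclass


@dataclass
class HotelImage:
    filename: str
    score: float  # hypothetical attractiveness score from a model, higher is better


def rank_images(images):
    """Return the images ordered from highest to lowest predicted score."""
    return sorted(images, key=lambda img: img.score, reverse=True)


# Made-up images and scores, invented for this example.
images = [
    HotelImage("lobby.jpg", 0.41),
    HotelImage("window_view.jpg", 0.93),
    HotelImage("bathroom.jpg", 0.27),
]
ranked_names = [img.filename for img in rank_images(images)]
print(ranked_names)
```

The hard part in practice is the model that produces the scores. Once those exist, showing customers the best images first is a simple sort.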
Why now?
The rise of Data Science has gathered pace over the last few years, with data being heralded as the new oil in terms of resource value [2]. A data economy is only now becoming possible, and this is because of three things:
- Infrastructure to store and access data presented a major challenge for companies in the big data era prior to ~2010 [3]. Although the sheer volume of data produced is still increasing, the cost of storage has dropped significantly over the last decade. Cloud has been the main driver of this, with platforms like Microsoft Azure giving you easy access to almost unlimited processing power at an affordable cost. Because these platforms can provide high levels of processing power for short periods, businesses can run complex programs and models and only pay for the power when they use it.
- Tools have evolved to process large volumes of data in real time, or near enough, meaning data-focused products and services can be easily deployed. Cloud platforms now provide easily accessible offerings to achieve this, with Azure Analysis Services being one such example.
- There has been huge growth in Machine Learning theory, applications and publicly available tools, kick-started by the Deep Learning revolution in 2012. Again, this is predominantly available through cloud providers due to the high level of processing power required.
In other words, we are now in a place where huge, complex, multi-source datasets are easily accessible and can be processed very quickly. This is matched by a growing understanding of Machine Learning, allowing Data Science practitioners to extract maximum value from these datasets. Businesses are now gearing up to use the historical data they have collected in ways they never imagined were possible.
What lies ahead?
Harvard Business Review famously called the humble Data Scientist “the sexiest job of the 21st century” in 2012 [4]. Given that we are heading towards a data-centric economy, with more visibility of the discipline than ever before and record levels of investment in Machine Learning / AI research, it appears the statement is still true! The volume of data produced per day is predicted to grow year on year, and it will be increasingly important that industry is equipped to exploit this valuable resource.
About the author
Iain Rodger is a Data Scientist with over 5 years’ experience across multiple sectors. Iain specialises in Image Processing, Data Analysis, Machine Learning and Predictive Analytics.
As a Data Scientist, Iain is one of the leaders of our Intelligence practice and is responsible for working with customers on large data projects. His role is to combine the analysis of companies’ data with machine learning tools, helping to build predictive models that deliver real value to a business.
If you’d like to unlock the power of your data, our data scientists would love to help. Please get in touch.
References / Links
[1] https://blogs.nvidia.com/blog/2017/08/28/expedia-deep-learning-picks-hotel-photos/
[4] https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century