Written on 9th September 2021 - 4 minutes

DataOps – how to access your data and release untapped value

Ian Russell, Director of Operations, and Jon Stace, Director of Technology, discuss how to get the most from your data and examine the approach adopted at Software Solved. This was presented as a video at the TechExeter Tech and Digital Conference 2021.

Data access challenges

One of the biggest challenges facing both operations and infrastructure teams is gaining fast and efficient access to data throughout the development lifecycle. This poses different challenges internally and externally, and speed is often a key driver in enabling efficient business decisions.

How we previously dealt with data requests

Previously, Software Solved dealt with data requests using the Waterfall process: Analysis, Design, Implement, Test, Acceptance, Production, Support. This was time-consuming and went through the entire scope of requirements and functionality before delivering anything.

Why DataOps?

After some research, it was decided the focus would be on the DataOps approach. It combines Agile, DevOps and Statistical Process Control (SPC) and fits well with our continuous delivery approach, promoting improvement in the speed and accuracy of analytics. DataOps attempts to integrate real-time analytics and the Continuous Integration / Continuous Delivery (CI/CD) process into a collaborative team approach. This matters because access to stable data in an evolving environment can be challenging.

The Value Pipeline

The Value Pipeline is broken down into two different streams/pipelines: Value and Innovation. Both attempt to extract value from the processes as well as the systems. The Value Pipeline focuses on ensuring the hygiene of data passing through the process. Data enters the pipeline and moves through into production, with several quality checks and balances along the way to increase confidence. Production is generally a series of stages: access, transforms, models, visualisations, and reports.

[Diagram: The Value Pipeline]

As the diagram above illustrates, a new feature undergoes development before it can be deployed to production systems. This creates a feedback loop which spurs new questions and ideas for enhanced analytics. During the development of new features, the code changes but the data is kept constant.
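To illustrate the idea of changing code while holding data constant, here is a minimal Python sketch. It is not our production tooling: the snapshot, the field names and both functions are hypothetical, chosen only to show how a fixed data sample lets you attribute any difference in output to the code change alone.

```python
from collections import defaultdict

# Hypothetical fixed snapshot: during development the data stays constant,
# so any change in output must come from the code change.
FIXED_SNAPSHOT = [
    {"region": "north", "sales": 120.0},
    {"region": "south", "sales": 80.0},
    {"region": "north", "sales": 30.0},
]


def total_sales_v1(rows):
    """Existing logic (assumed): total sales per region."""
    totals = {}
    for row in rows:
        totals[row["region"]] = totals.get(row["region"], 0.0) + row["sales"]
    return totals


def total_sales_v2(rows):
    """Candidate rewrite under development; it should behave identically."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["sales"]
    return dict(totals)


def outputs_match(old_fn, new_fn, rows):
    """Run both versions over the same fixed data and compare the results."""
    return old_fn(rows) == new_fn(rows)


if __name__ == "__main__":
    print(outputs_match(total_sales_v1, total_sales_v2, FIXED_SNAPSHOT))
```

A check like this could run in a sandbox as part of CI, so a new feature is only promoted once it agrees with the existing behaviour on known data.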

DataOps Process

The end-to-end process in practice in comparison to DevOps (CI/CD):

Sandbox Environments – Isolated environments that enable testing without affecting the application, system, or platform on which they run.

Orchestration – Automation of the data factory/pipeline process e.g., container deployment, runtime processes, data transfers/integration etc.

Why do it this way? 

This allows you to integrate data from different sources, automating their ingestion, loading and availability. You can control the storage of data with its different versions over time, centralise the management of metadata, and manage requests, authorisation and access permissions to data. On top of this, you can apply analytical, reporting and dashboarding techniques to monitor and track what is happening throughout the platform.

Tapping into The Value Pipeline

Access – We pull and ingest raw data from multiple sources via a replication engine at regular intervals, so that it's as close to real life as possible.

Transform – Generally an ETL process that standardises the data set, sorts it and validates that it is correct, to get it into a state where it can be analysed, e.g. filtering, aggregation, mapping values etc.

Model – This is when we can start to model the data and interrogate it, to analyse it or ask the questions we want to gain value from.

Visualise/Report – Then we can visualise and report using several different technologies.

Orchestration and automation allow us to access the value of the data much faster and with higher levels of quality.
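The four stages above can be sketched as a chain of small functions. This is a simplified illustration in plain Python, not the Azure Data Factory implementation: the sample rows, the `COUNTRY_MAP` lookup and every function name are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical mapping used to standardise country values.
COUNTRY_MAP = {"uk": "GB", "united kingdom": "GB", "fr": "FR"}


def access():
    """Access: stand-in for ingesting raw rows from a replicated source."""
    return [
        {"customer": " Acme ", "country": "UK", "amount": "100.50"},
        {"customer": "Bolt", "country": "United Kingdom", "amount": "20"},
        {"customer": "Acme", "country": "uk", "amount": "not-a-number"},
        {"customer": "Core", "country": "FR", "amount": "75"},
    ]


def transform(raw_rows):
    """Transform: validate, filter, map values and sort the data set."""
    clean = []
    for row in raw_rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # filtering: drop rows that fail validation
        country = COUNTRY_MAP.get(row["country"].strip().lower())
        if country is None:
            continue  # filtering: drop rows with an unmapped country
        clean.append(
            {"customer": row["customer"].strip(), "country": country, "amount": amount}
        )
    return sorted(clean, key=lambda r: r["customer"])


def model(clean_rows):
    """Model: aggregate to answer a question, here total amount per country."""
    totals = defaultdict(float)
    for row in clean_rows:
        totals[row["country"]] += row["amount"]
    return dict(totals)


def report(totals):
    """Visualise/Report: render the result as simple text lines."""
    return [f"{country}: {total:.2f}" for country, total in sorted(totals.items())]


def run_pipeline():
    """Orchestration: run every stage in order with no manual steps."""
    return report(model(transform(access())))


if __name__ == "__main__":
    for line in run_pipeline():
        print(line)
```

Keeping each stage as a separate function means an orchestrator can schedule, test and monitor the stages individually.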

Our tech approach and how it’s evolved​

Originally, we used traditional tools and approaches, like Excel and SQL Server (SSIS / SSRS). These days we are using ADF, which gives us:

  • Azure Data Factory
  • Orchestration
  • IaC / Sandboxes
  • CI/CD
  • Statistical Process Control (SPC)
  • Azure Monitor for metrics
  • Testing
  • Inputs
  • Outputs / logic
  • Row counts
  • Warnings of variance
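As an illustration of how row counts and warnings of variance relate to SPC, the sketch below applies a common control-chart rule: flag a value that falls more than three standard deviations from the historical mean. The function names, the three-sigma default and the sample history are assumptions for the example, not a description of our Azure Monitor setup.

```python
from statistics import mean, stdev


def control_limits(history, sigmas=3.0):
    """Return (lower, upper) control limits from historical row counts."""
    centre = mean(history)
    spread = stdev(history)  # sample standard deviation; needs 2+ values
    return centre - sigmas * spread, centre + sigmas * spread


def check_row_count(history, todays_count, sigmas=3.0):
    """Return 'OK' or a warning if today's count is outside the limits."""
    lower, upper = control_limits(history, sigmas)
    if todays_count < lower or todays_count > upper:
        return (
            f"WARNING: row count {todays_count} is outside "
            f"the expected range [{lower:.0f}, {upper:.0f}]"
        )
    return "OK"


if __name__ == "__main__":
    # Hypothetical row counts from previous pipeline runs.
    previous_runs = [1000, 1020, 980, 1010, 990]
    print(check_row_count(previous_runs, 1005))
    print(check_row_count(previous_runs, 400))
```

In a pipeline, a check like this would typically run after each load, with warnings routed to whichever monitoring tool is in use.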

What does this mean for customers?

This means better access to data and all the below:

[Image: What does this mean for customers?]

What does the future hold?

  • IoT and the ever-expanding sources of data
  • Devices have new ways to communicate data
  • Data integration – an explosion of new types of data in great volumes has demolished the assumption that you can master big data through a single platform
  • AI and machine learning to supplement human talent – with more data in a variety of formats to deal with, organisations will need to take advantage of advancements in automation (AI and machine learning) to augment human talent
  • Working with SMEs and non-technical users – self-service

If you’d like to find out more about how we can help you gain insights and knowledge from your data, explore Elevate or please get in touch.

This article was written by our Director of Technology, Jon Stace, and Director of Operations, Ian Russell.
