Written on 23rd February 2023 - 3 minutes

The pros and cons of building a data warehouse vs collecting data from discrete sources


Setting up a data warehouse, a central repository of data, can be daunting. Your business could be using a multitude of different data sources at any one time, and the idea of bringing them all together in one place can feel like too much to consider, before even counting the time and cost of implementation.

But equally, continuing to use the same data collection methods that have been employed every month, quarter or year for as long as anyone can remember doesn’t feel at all progressive. It’s time-consuming and fiddly. Surely there’s a better way to do this?

We looked at the pros and cons of building a data warehouse, either a custom-built repository or an off-the-shelf one such as Power BI, and then at the pros and cons of using separate data sources, to see which really works better for the users involved.


Pros of building/using a data warehouse


A data warehouse establishes a centralised repository for large amounts of data in a stable environment. The benefits are huge: improved data quality, improved reporting, improved business processes and decision making, and actionable insights.

The data can be retrieved quickly and, if there are errors, they can be identified and corrected easily.
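To illustrate what identifying and correcting errors in one place can look like, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for a warehouse. The orders table, its columns and the sample rows are all made up for illustration; a real warehouse would have its own schema and tooling.

```python
import sqlite3


def build_store():
    """Create an in-memory database holding hypothetical order data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders "
        "(source TEXT, customer TEXT, order_date TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?, ?)",
        [
            ("crm", "Acme", "2023-02-23", 120.50),
            ("shop", "Cog", "2023-02-25", -45.50),  # negative amount: likely an error
            ("shop", "Bolt", None, 80.00),          # missing date
        ],
    )
    conn.commit()
    return conn


def find_errors(conn):
    """One query checks every source at once for obviously bad rows."""
    cur = conn.execute(
        "SELECT rowid, source, customer FROM orders "
        "WHERE amount < 0 OR order_date IS NULL ORDER BY rowid"
    )
    return cur.fetchall()


def fix_amount_sign(conn):
    """Correct negative amounts in place; returns how many rows changed."""
    cur = conn.execute("UPDATE orders SET amount = ABS(amount) WHERE amount < 0")
    conn.commit()
    return cur.rowcount
```

Because everything sits in one table, the check and the fix are each a single statement rather than a hunt through several separate exports.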




Cons of building/using a data warehouse


The time needed to get the integration up and running is often underestimated. Retrieving, cleaning and uploading the data is a time-consuming process, and both time and people need to be allocated to it. There can be compatibility problems and issues with the source systems, and access has to be set up so that only certain team members can see confidential information.

There is also the maintenance cost to consider. This will cover all security issues, but how it is charged depends on the option chosen: for a custom-built warehouse it will be based on the amount of data being held, while an off-the-shelf option would carry a fixed monthly fee.


Pros of using separate sources


This is how it’s always been done. Everyone knows the score and knows how to retrieve the data as quickly as possible. Each data source might have its own challenges, but repetition has taught the team members involved how to overcome them. It’s certainly a far cheaper option. Most of these sources may already be paid for, and paying to place the data elsewhere would mean a further increase in outgoings.

Each of these sources is secure in its own right, so keeping the data secure is a matter of ensuring the business’s internal system is secure, which is a procedure that’s already in place.

The data can be trusted as it is being retrieved directly from the source.




Cons of using separate sources


Extracting the data can be a challenge. Where and how to download it varies with each source, and the exports can arrive in different formats that then need to be amalgamated.
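As a rough idea of what that amalgamation involves, the sketch below combines two exports into one common layout using only Python's standard library. The two sources, their column names, the date formats and the sample values are all hypothetical; every real source will need its own mapping.

```python
import csv
import io
import json
from datetime import datetime

# Hypothetical exports: one system produces CSV with UK-style dates,
# the other produces JSON with ISO dates and amounts in pence.
CRM_CSV = "Customer,Order Date,Total\nAcme,23/02/2023,120.50\nBolt,24/02/2023,80.00\n"
SHOP_JSON = '[{"customer_name": "Cog", "ordered_at": "2023-02-25", "amount_pence": 4550}]'


def rows_from_crm(text):
    """Map the CSV export's columns onto the common layout."""
    for row in csv.DictReader(io.StringIO(text)):
        parsed = datetime.strptime(row["Order Date"], "%d/%m/%Y")
        yield {
            "source": "crm",
            "customer": row["Customer"],
            "order_date": parsed.date().isoformat(),
            "amount": round(float(row["Total"]), 2),
        }


def rows_from_shop(text):
    """Map the JSON export's fields onto the same layout."""
    for item in json.loads(text):
        yield {
            "source": "shop",
            "customer": item["customer_name"],
            "order_date": item["ordered_at"],
            "amount": round(item["amount_pence"] / 100, 2),
        }


def combine():
    """Merge both sources into one list, ordered by date."""
    rows = list(rows_from_crm(CRM_CSV)) + list(rows_from_shop(SHOP_JSON))
    return sorted(rows, key=lambda r: r["order_date"])
```

Each extra source means another mapping function like these, which is part of why the manual approach gets harder as sources are added.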

It’s a time-consuming process. Actually inputting data from ten or twenty different sources into a report takes a long time. Coupled with sharing the report with team members so they can add their own data, this creates issues with data integrity: some data can be out of date by the time the report is ready, and with other users involved there is the possibility of duplication and manual errors. Ultimately this method isn’t scalable. If another twenty data sources were added into the mix, compiling reports would take more time and energy than would be worthwhile.

In summary, there’s no right or wrong answer. Businesses at different life stages and in different financial positions will take different approaches. But given the advantages that cloud computing offers, we would suggest that relying on separate data sources is not a long-term strategy, and that a plan should be in place to review how data is stored and retrieved. This would help the business optimise its reporting and analysis procedures and, in turn, save time and cost.

If you would like to find out more about data warehousing, then please get in touch.


