When Developing System Architectures, Think About Data Integration

by Olga Belokurskaya, February 16, 2010
The right architectural approach to SOA and EAI, implemented in advance, can save you months of refactoring later.



Build systems around business data

David Linthicum

“The data, and integration strategies around the data, is something that most figure is there, will be there, and requires very little thinking and planning.” This is what I’ve read today on ebizQ in a great post written by David Linthicum.

This again supports the idea that data, “the biggest companies’ value,” is still neglected too often. The result is data integration and quality issues that produce inconsistent information and undermine the whole idea of a clear view of enterprise data, that is, the information that matters for business decisions.

In fact, very often, when it comes to designing and developing an architecture, all the attention is focused on the technical side of the process. As a result, data integration strategies become an afterthought, which makes it difficult to meet business requirements. The message is that provisions for data integration should be made while the enterprise systems architecture is being designed.

The importance of well-thought-out architectural decisions was also considered in another article from David published earlier, where he focused on the staging area of a data integration system.


The staging approach to data integration

According to David, not many consider using a staging area when integrating data. However, it is a good fit when integration operations are complex and high-value, for example when they involve many large data sets. A staging area makes it possible to perform complex operations on data that are normally difficult to do with direct integration approaches.

Enterprise data warehouse architecture using staging

David lists the following benefits of a staging approach to data integration:

  • The ability to perform more complex operations on data, including complete transformation of semantics and the data content using any number of dimensions since, in essence, you operate on an intermediary database that you control completely.
  • The ability to leverage more coarse-grained and complex data sets that may not always repeat.
  • An informational focus that supports valuable information externalization approaches, including business intelligence.
  • More flexibility around business cycles, data processing cycles, widely dispersed systems, and hardware and network limitations, where it may not be feasible to extract all operational databases at the same time.
  • The ability to better support complex database functions, including replication, cleansing, and aggregation.
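
To make the idea of "an intermediary database that you control completely" more concrete, here is a minimal sketch of a staging step. It is not taken from David's article. It assumes two hypothetical source extracts (a CRM list and a billing list), uses an in-memory SQLite database as the staging area, and all table and function names are invented for illustration. Raw rows are landed first, and cleansing and aggregation then run inside the staging database.

```python
import sqlite3


def load_staging(conn, crm_rows, billing_rows):
    """Land raw extracts as-is in staging tables (hypothetical schema)."""
    conn.executescript("""
        CREATE TABLE stg_crm (customer_id TEXT, email TEXT);
        CREATE TABLE stg_billing (customer_id TEXT, amount_cents INTEGER);
        CREATE TABLE customer_summary (
            customer_id TEXT PRIMARY KEY,
            email TEXT,
            total_cents INTEGER
        );
    """)
    conn.executemany("INSERT INTO stg_crm VALUES (?, ?)", crm_rows)
    conn.executemany("INSERT INTO stg_billing VALUES (?, ?)", billing_rows)


def transform(conn):
    """Cleanse and aggregate inside the staging database, then publish."""
    conn.execute("""
        INSERT INTO customer_summary
        SELECT c.customer_id, c.email, COALESCE(SUM(b.amount_cents), 0)
        FROM (
            -- Cleansing: trim keys, normalize e-mail, collapse duplicates.
            SELECT TRIM(customer_id) AS customer_id,
                   MAX(LOWER(TRIM(email))) AS email
            FROM stg_crm
            GROUP BY TRIM(customer_id)
        ) AS c
        -- Aggregation: total billing per cleansed customer key.
        LEFT JOIN stg_billing AS b
               ON TRIM(b.customer_id) = c.customer_id
        GROUP BY c.customer_id, c.email
    """)
    conn.commit()


def run_demo():
    """Run the sketch on made-up sample data and return the published rows."""
    crm = [("C1", " Ann@Example.com "),
           (" C1 ", "ann@example.com"),
           ("C2", "bob@example.com")]
    billing = [("C1", 1200), ("C1 ", 300), ("C3", 999)]
    conn = sqlite3.connect(":memory:")
    load_staging(conn, crm, billing)
    transform(conn)
    rows = conn.execute(
        "SELECT customer_id, email, total_cents "
        "FROM customer_summary ORDER BY customer_id"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in run_demo():
        print(row)
```

Because the raw extracts sit in tables of their own, the two sources do not have to be read at the same moment, and the cleansing rules can change without touching the operational systems. In this sketch the billing row for C3 has no CRM record and is left out of the published table. A real pipeline would need an explicit rule for such orphans.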

So, before implementing a complex enterprise-grade architecture, dig into the requirements your critical data poses and choose the integration approaches that fit them. In some cases, you may need a staging area, ELT (instead of ETL), or techniques such as data federation.



The post is written by Olga Belokurskaya with contributions from Alex Khizhniak.