Solving Data Integration Issues when Moving to the Cloud
Why is migration to the cloud so hard?
Gartner has recently named cloud computing among the top 10 strategic technologies for 2010. However, that doesn’t mean moving to the cloud immediately, experts say. Instead, they suggest exploring ways to approach the cloud: starting to use cloud services, deploying private cloud environments, developing cloud-based apps, etc.
Why are experts warning against hurrying into the cloud? One reason is that making certain applications work effectively in the cloud, especially legacy ones, is quite difficult. There are several factors behind this (according to this article at CIO.com), and the variety of cloud platforms is one of them. Data integration and data migration workflows may differ from platform to platform, as may costs, supported data services, etc.
Then, the means of data integration, such as the APIs provided by a cloud vendor, may differ from those used in a company’s systems, so code or logic may have to be rewritten to bridge the gap. Another essential difficulty is that many legacy apps and tools may be incompatible with the newer technology stacks that cloud platforms run on. In a nutshell, additional effort will inevitably be needed.
At the same time, SMBs are more flexible in adopting emerging technologies for numerous reasons, including organizational ones. What’s more, small companies have relatively small amounts of data to integrate between on-premises systems and the cloud. Large enterprises, by contrast, have different data integration requirements. So, it’s no surprise that enterprises are not hurrying into the cloud and are waiting until the blockers are resolved.
Security is also among the major cloud migration challenges. There was a lot of talk this year about some clouds lacking transparency. However, this doesn’t mean that data migration to the cloud is a bad idea. It only means that you need to decide carefully which apps (and which data) to move to the cloud.
Thus, while cloud computing has been named strategic, there is a lot to be done about data integration, security, compatibility issues, etc., before migrating legacy applications to the cloud. First of all, ways must be found to address those issues and ensure the apps work efficiently after migration.
Issues when you are already integrating
Companies that have moved their data to the cloud sooner or later realize that it has to be integrated or synchronized with the data in their on-premises enterprise applications.
Data integration between the cloud and in-house apps and systems is probably one of the biggest challenges. Simply uploading operational data to the cloud may not bring much value: without synchronizing it with on-premises data, an enterprise will just have two separate environments, one of them containing obsolete data.
In this endeavor, there are several factors that complicate data integration with the cloud:
- Interoperability is a pain point for cloud platforms. They evolve fast and compete with one another, and interoperability is not taken into account. This complicates data integration for enterprises, since the choice of tools depends on the platform. (So, if more than one platform is chosen, as in a hybrid cloud, there’s a problem.)
- Another thing is that cloud-based data integration requires new architectural approaches to ETL and other data integration mechanisms. Thus, there will be a need for additional data synchronization solutions.
- Furthermore, different cloud platforms provide different levels of security and different SLAs. Since data integration between a cloud and on-premises apps means moving sensitive data between them (and the amount of data will keep increasing), clear security standards should be developed to protect the data across all the systems.
Therefore, the ability to integrate (or synchronize) data between on-premises and cloud systems and tools is something a company should provide for before completely migrating to the cloud. This is often overlooked, which leads to data integration problems later, especially as the amount of data keeps growing. So, this should not be “an afterthought.”
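To make the synchronization problem concrete, here is a minimal sketch of one possible reconciliation rule: compare records on both sides by a shared key and copy whichever version was modified last. The `Record` type, its fields, and the `plan_sync` function are hypothetical names invented for this illustration, and "newest timestamp wins" is an assumed policy, not something any particular cloud platform prescribes.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Record:
    key: str              # identifier shared by both systems (assumed to exist)
    value: str            # payload, simplified to a single string
    updated_at: datetime  # last modification time


def plan_sync(on_prem, cloud):
    """Return (to_cloud, to_on_prem): records each side is missing or holds stale.

    Both arguments are dicts mapping key -> Record.
    """
    to_cloud, to_on_prem = [], []
    for key in on_prem.keys() | cloud.keys():
        local, remote = on_prem.get(key), cloud.get(key)
        if remote is None or (local is not None and local.updated_at > remote.updated_at):
            to_cloud.append(local)       # cloud copy is missing or older
        elif local is None or remote.updated_at > local.updated_at:
            to_on_prem.append(remote)    # on-premises copy is missing or older
        # equal timestamps: both sides are treated as already in sync
    return to_cloud, to_on_prem
```

A real integration would also have to handle deletions, clock differences between systems, and conflicting edits, which is part of why synchronization should be planned before the migration rather than after it.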
When migrating data to any new location, whether it is a cloud or not, ensuring data quality is also essential. Data should not be taken to the cloud “as is,” replicating existing issues. Instead, before starting the migration, provisions should be made for data quality. Make sure you’ve checked that your information is accurate and complete, and that duplicates have been found and cleaned up. Otherwise, inaccuracies will be carried over to the cloud, which will make the data inconvenient to work with later.
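As an illustration of such a pre-migration check, the sketch below flags incomplete rows and likely duplicates in a list of records. The function name `quality_report`, the dict-per-row layout, and the normalization rule (trim whitespace, ignore case) are assumptions made for the example. Accuracy itself usually needs a reference source and is not covered here.

```python
def _norm(value):
    """Normalize a field for comparison: None becomes '', text is trimmed."""
    return "" if value is None else str(value).strip()


def quality_report(rows, required_fields, key_fields):
    """Summarize incomplete rows and duplicate keys in a list of dict rows."""
    incomplete = []   # indexes of rows with a missing or blank required field
    duplicates = []   # indexes of rows whose normalized key was already seen
    seen = set()
    for index, row in enumerate(rows):
        if any(_norm(row.get(field)) == "" for field in required_fields):
            incomplete.append(index)
        key = tuple(_norm(row.get(field)).lower() for field in key_fields)
        if key in seen:
            duplicates.append(index)
        else:
            seen.add(key)
    return {"incomplete": incomplete, "duplicates": duplicates}
```

For example, `quality_report(rows, required_fields=["email", "name"], key_fields=["email"])` treats two customers whose e-mail addresses differ only in letter case or trailing spaces as duplicates.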
What to do?
- Firewalls: there should be a way to externalize and consume data that does not comply with port 80 limitations.
- The speed of moving data should be considered so that transformation and routing mechanisms can be tuned to perform properly.
- Provide for maintenance and support for cloud systems.
- Governance is important when you oversee all the integration points.
- And security, of course, which is still a big issue for cloud systems.
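The point about data transfer speed in the list above can be made concrete with a back-of-the-envelope estimate. The sketch below is only arithmetic: the function name and the default `efficiency` of 0.7 (the share of nominal bandwidth assumed to be usable in practice) are illustrative assumptions, not measured values.

```python
def estimated_transfer_hours(data_gb, link_mbps, efficiency=0.7):
    """Rough hours needed to move data_gb gigabytes over a link_mbps link."""
    if data_gb < 0 or link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("size must be >= 0, rate > 0, efficiency in (0, 1]")
    megabits = data_gb * 8 * 1000                    # 1 GB = 8,000 megabits (decimal units)
    seconds = megabits / (link_mbps * efficiency)    # usable rate in Mbit/s
    return seconds / 3600
```

Under these assumptions, moving 500 GB over a 100 Mbit/s link takes roughly 15.9 hours, which is the kind of figure that determines whether a nightly synchronization window is realistic.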
On top of that, planning, provisions for data integration, and the choice of technologies should be thought through before migrating any systems to the cloud, so that issues can be addressed if they occur. Many enterprises still do not have a clear data integration strategy and therefore get stuck when it comes, for instance, to synchronizing data between their cloud-based and on-premises systems. The move to the cloud should be prepared, including the data integration needs of every system, the architecture requirements, the data sources that will need to be synchronized, etc.
As for cost savings, yes, they are one of the goals of moving to the cloud in many cases. But they won’t come immediately. Initially, data migration will demand investment: either a data migration tool will be necessary, or you will need to pay for a data migration service. Open-source solutions are available today, as well as widely adopted proprietary tools. What matters is that no data migration is free, due to hidden costs. However, when deciding what to choose, you shouldn’t be driven merely by cutting costs. Instead, follow your business requirements (including processes, systems, policies, etc.) and then choose the tool that provides the best way to achieve the corporate goals.
Start small: backing up data
For instance, you may start by backing up noncritical data to a cloud to explore its capabilities. One option for backing this information up to a secure remote location is Amazon Simple Storage Service (S3), which lets you easily store and retrieve virtually any amount of data anytime, anywhere. Amazon S3 uses the same highly scalable, reliable, fast, and inexpensive data storage infrastructure that Amazon.com uses to run its own global network of websites.
In this scenario, Apatar’s Amazon S3 connector can help corporate users and DBAs back up data from CRM/ERP systems or databases, uploading selected documents to web storage at a specified time on a regular basis.
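For readers who want to see what such a backup job does in principle, here is an independent sketch (not Apatar’s connector) that selects files changed since the last run and hands each one to an upload function. The names `files_changed_since` and `back_up`, the `backup` key prefix, and the idea of passing the uploader in as a parameter are all assumptions made for this example. Scheduling the job "at a specified time" would be left to an external scheduler such as cron.

```python
from datetime import datetime
from pathlib import Path


def files_changed_since(folder, since):
    """Return files under folder modified at or after `since`, sorted by path."""
    cutoff = since.timestamp()
    return sorted(
        path for path in Path(folder).rglob("*")
        if path.is_file() and path.stat().st_mtime >= cutoff
    )


def back_up(folder, since, upload, prefix="backup"):
    """Call upload(local_path, object_key) for each changed file; return the keys."""
    keys = []
    for path in files_changed_since(folder, since):
        key = f"{prefix}/{path.relative_to(folder).as_posix()}"
        upload(str(path), key)
        keys.append(key)
    return keys


# With the boto3 library installed and credentials configured, the uploader
# could be (bucket name is hypothetical):
#   s3 = boto3.client("s3")
#   back_up("exports", last_run, lambda path, key: s3.upload_file(path, "my-backup-bucket", key))
```

Passing the uploader as a parameter keeps the file-selection logic testable without network access and makes it easy to swap S3 for another storage service later.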
Make sure you follow the security recommendations described above, as well as introduce privacy policies, logging, etc. Be consistent: create a clear long-term strategy, a realistic integration plan closely tied with existing processes, and a set of requirements for both developers and business users. Your cloud data integration project should have a set of goals and priorities, a budget, and possible milestones/deadlines.
Don’t migrate to the cloud “all at once.” Make it a gradual shift instead, ensuring successful data integration between the moved systems and the rest of the enterprise. These are the recommendations from experts for avoiding data integration issues in the cloud.