ETL vs. ESB from Apatar’s Point of View

by Renat KhasanshynMarch 16, 2007
In short, ETL is a “pull” approach, working on user’s demand or on schedule, while ESB is a “push” model. Read on for more.

(Featured image credit)

A few months ago, prior to the release of the Apatar Community Preview, Matt Asay asked me via email whether Apatar competes with ESB products, specifically with MuleSource and ServiceMix. After I responded to Matt by email, I thought about posting my response to a blog, and Matt said, “Go ahead.” In fact, two people asked me a similar question during the MySQL User Conference, which reminded me about the email that I’m posting below.

The short answer is that Apatar is an ETL (Extract, Transform, and Load) technology; ETL does not compete, but compliments ESB products across different information integration scenarios. ServiceMix and Mule are ESB products. ESB is a standards-oriented (in the SOA age) EAI/EII technology. Therefore, I will address the comparison question, “ETL vs. ESB” from an ETL vs. EAI/EII point of view. EAI, by the way, is a term coined by Dave Linthicum in his book (published in 1999) called, “Enterprise Application Integration.”

ETL is geared toward data movement, typically in batch modes across the enterprise. It is “pull” technology and works on user’s demand or on schedule.

ESB is a “push” technology, sending messages when they occur.

ETL is a “pull” technology, works on demand/on schedule.
ESB is a “push” technology.
ETL cannot time-out, decay, or issue transactions to front-office applications during transformation processes.
ESB is capable of timing and decaying data in queues, escalating information content to the right decision-maker on that piece of content.
ETL is fully scalable, capable of loading massive batches of data in parallel.
ESB is not suitable for massive volumes of data because of its service bus architecture (by network, and source system speed to X transactions per second).
ETL can hook to ESB/EAI middleware as just another feed, if desired.
ESB’s primary job is to integrate applications, opposed to Data Migration, Replication, Data Warehousing, and BI.


I can also refer you to a comparison table I found here.


Batch snapshots
Real time
Real time
Unit of work
Set of transactions committed within an ETL cycle interval
Single business transaction
Single business transaction
Historical Record
Persistent Auxiliary Tables
No. Transactions applied directly to applications’ tables
No. Virtual database
Managerial reporting, trend analysis, multi-dimensional aggregation
Near real time synchronization of operational data where transaction commitment is dependent on state of related transactions
Near real time decision making based on most current information in operational systems. No update.
What it’s not
Not source of record. Does not support transaction processing
Not appropriate for ad-hoc analysis and reporting
Not a virtual data warehouse


At the end of the day, I think that business users have to consider their unique requirements, and pick the technology accordingly. There are different horses for different courses. Be it Apatar, Mule ESB, DataStage, Yahoo! Pipes–whatever works.


In my opinion, as more vendors enter the information integration space, they are confusing things even more because of the differences in technology they are offering. No two are alike, but all are calling themselves data integration vendors. Go figure.