Remember the childhood story of the Thirsty Crow? It is about a crow that collected pebbles and dropped them into a pitcher to raise the water level to the brim so it could drink easily.
It is a short, simple tale about using logic to achieve a goal.
It also gives us a wonderful moral – Where there is a will there is a way!
Image source: Google Images
Why am I relating this short and wonderful tale to my blog? Because many IT professionals use a variety of techniques to fetch the data an organization needs, applying similar logic to bring data and applications together.
Integrating data and applications across the enterprise has been a long-standing goal of many organizations that want to become successful. Until recently, however, the technology available to help achieve this goal was limited. Fortunately, we now have three technologies to support it. I call them the three big Elephants, the three E's:
Enterprise application integration (EAI),
Enterprise information integration (EII) and
Extract, transform and load (ETL)
Imagine a thirsty crow writing its first two computer programs, using code (the pebbles) to raise enterprise data to a level where it can be consumed. It struggles initially with the resulting integration, but the crow's intelligence eventually brings the water (the data) up to the mark for consumption.
Leaving the story aside, let's start with definitions of the terms and what differentiates EII, EAI and ETL.
EII – Enterprise Information Integration. Crudely defined, it is a middle-tier query server, but it is much more than that. It contains a metadata layer with consolidated business definitions, and it can usually communicate through web services, database connections, or XQuery/XPath (XML translation). It relies heavily on the metadata layer to define "how and where" to get its data. It is a PULL engine: it waits for a request, splits the query (if it has to) across multiple heterogeneous source systems, gathers (mostly transactional) data sets, merges them (again relying on the metadata layer for integration rules), and returns the result to the requestor, which could be a web service, a BI query tool, Excel, or some other front end (such as EAI or a message queuing system). EII usually sits seamlessly between the requestor and the scattered data sets. You can see it as a framework for real-time integration of disparate data types from multiple sources inside and outside an enterprise, providing a universal data access layer through pull technology or on-demand capabilities. The target for EII is a person, via a dashboard or a report.
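To make the pull idea concrete, here is a minimal sketch in Python. It is only an illustration, not how any EII product works internally: the in-memory SQLite database, the `BILLING_SERVICE` dictionary, the `METADATA` mapping and the function names are all hypothetical stand-ins I made up for a relational source, a web service and the metadata layer.

```python
import sqlite3

# Hypothetical source 1: a relational system (in-memory SQLite stands in for it).
def make_crm():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)", [(1, "Acme"), (2, "Globex")]
    )
    return conn

# Hypothetical source 2: a non-relational system (a dict stands in for a web service).
BILLING_SERVICE = {1: {"balance": 250.0}, 2: {"balance": 0.0}}

def fetch_name(crm, customer_id):
    row = crm.execute(
        "SELECT name FROM customers WHERE id = ?", (customer_id,)
    ).fetchone()
    return row[0] if row else None

def fetch_balance(_crm, customer_id):
    record = BILLING_SERVICE.get(customer_id)
    return record["balance"] if record else None

# Assumed "metadata layer": maps a business field to the function that knows
# how and where to get it.
METADATA = {"customer_name": fetch_name, "open_balance": fetch_balance}

def federated_query(crm, customer_id, fields):
    """On request, pull each field from its own source and merge the results."""
    return {field: METADATA[field](crm, customer_id) for field in fields}

if __name__ == "__main__":
    crm = make_crm()
    print(federated_query(crm, 1, ["customer_name", "open_balance"]))
```

Nothing is copied or stored ahead of time here; the data stays in the sources and is only fetched and merged when the requestor asks for it.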
EAI – Enterprise Application Integration. The target for this technology is usually an application. This one has been around for a while. In layman's terms, EAI connects your SAP or Salesforce system to another application such as JDA, or Oracle Financials to your SAP systems, and vice versa. Most EAI systems are PUSH driven: a transaction happens in your enterprise application, and an EAI listener "sees" it and pushes it out over the bus or to a centralized queue for distribution to other applications. Most EAI engines are "workflow" and "process flow" driven rather than on-demand. EII, by contrast, is typically used to collect related information from disparate systems. In some ways, it can be thought of as a souped-up join engine that happens to handle non-relational data as well as relational data. EAI is really a glue layer between applications that should talk to each other but don't.
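The push behaviour can be sketched in a few lines of Python. This is a toy in-process bus written under my own assumptions, not the API of any real EAI product: `MessageBus`, `OrderSystem`, `FinanceSystem` and the `order.created` topic are hypothetical names used only for illustration.

```python
from collections import defaultdict

class MessageBus:
    """Toy bus: delivers each published message to every subscriber of a topic."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        for handler in self._subscribers[topic]:
            handler(message)

class OrderSystem:
    """Hypothetical source application: pushes an event when a transaction happens."""

    def __init__(self, bus):
        self.bus = bus
        self.orders = []

    def create_order(self, order_id, amount):
        order = {"order_id": order_id, "amount": amount}
        self.orders.append(order)
        self.bus.publish("order.created", order)  # the push

class FinanceSystem:
    """Hypothetical target application: reacts to the event, never polls for it."""

    def __init__(self):
        self.invoices = []

    def on_order_created(self, order):
        self.invoices.append(
            {"invoice_for": order["order_id"], "amount": order["amount"]}
        )

if __name__ == "__main__":
    bus = MessageBus()
    finance = FinanceSystem()
    bus.subscribe("order.created", finance.on_order_created)
    OrderSystem(bus).create_order("SO-1", 99.5)
    print(finance.invoices)
```

The two applications never call each other directly; the bus is the glue layer, which is the point of EAI. A real bus would also add queuing, retries and guaranteed delivery, which this sketch leaves out.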
ETL – Extract, Transform and Load, sometimes known in its ELT variant (extract, load THEN transform). The target for ETL technology is a database such as a data warehouse, data mart or operational data store. ETL/ELT is a PUSH technology, usually geared towards huge volumes and highly parallel, repetitive tasks that run on a schedule or continuously. These tools are a kind of heartbeat of many integration systems around the world today: they feed massive amounts of data from point A to point B in a timely fashion, and they are responsible for performing that task on a consistent and repeatable basis. They handle massive transformations (sometimes in the database, sometimes in a stream). ETL is quite different from the other two. Its most common use is to populate a data mart or warehouse for use by analytical applications. This involves converting data from a system optimized for transaction processing to one designed to support dimensional analysis and ad hoc querying. Another common use is to collect several data sources into a single data store that can be archived or used for auditing purposes. Unlike the other two technologies, ETL isn't really intended to work with real-time information and is used to build systems where real-time access is inappropriate. Finally, another common front end used with EII systems is good old-fashioned reporting.
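The extract, transform and load steps can be sketched as a small repeatable batch job in Python. Treat it as an assumption-laden toy: `SOURCE_ORDERS`, the `fact_sales` table and the cleaning rules are invented for illustration, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Hypothetical transactional source with messy values, as operational data often has.
SOURCE_ORDERS = [
    {"order_id": 1, "customer": " acme ", "amount": "100.50", "date": "2023-01-05"},
    {"order_id": 2, "customer": "Globex", "amount": "20.00", "date": "2023-01-05"},
    {"order_id": 3, "customer": "ACME", "amount": "9.50", "date": "2023-01-06"},
]

def extract():
    """Extract: read all rows from the source system."""
    return list(SOURCE_ORDERS)

def transform(rows):
    """Transform: normalize customer names and convert amounts to numbers."""
    return [
        (r["order_id"], r["customer"].strip().title(), float(r["amount"]), r["date"])
        for r in rows
    ]

def load(conn, rows):
    """Load: write rows to the warehouse table; re-running does not duplicate them."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_sales ("
        "order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL, sale_date TEXT)"
    )
    conn.executemany("INSERT OR REPLACE INTO fact_sales VALUES (?, ?, ?, ?)", rows)
    conn.commit()

def run_etl(conn):
    load(conn, transform(extract()))

if __name__ == "__main__":
    warehouse = sqlite3.connect(":memory:")
    run_etl(warehouse)
    for row in warehouse.execute(
        "SELECT customer, SUM(amount) FROM fact_sales GROUP BY customer ORDER BY customer"
    ):
        print(row)
```

In practice a scheduler would call something like `run_etl` every night or every hour; keying the load on `order_id` is one simple way to keep the job consistent and repeatable.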
Image source: SlideShare
Where EAI, EII and ETL Fit Into Your Architecture –
EAI is most useful when you need to connect applications in real time for business process automation. Another practical use for EAI is making a change (typically to a small set of records) in one application and reflecting it in other applications. This technology is very good at ensuring that the change is captured and delivered reliably to the appropriate application or system.
EII is most useful when you need to create a common gateway with one access point and one access language to disparate data sources. These tools give end users or applications more flexible, ad hoc access to data without requiring permanence or a long-term purpose. They can access XML, LDAP, flat files and other non-relational data in addition to traditional relational databases, and they can publish relational data as XML or web services data. EII is particularly useful for supplementing master data warehouse (DW) data with additional or real-time detail (e.g., combining historical data with the current situation).
In addition to knowing when to use these technologies, you should also understand the challenges that come with all of them. Above all, they require that your implementers have a thorough understanding of the data requirements for both strategic and tactical decision making. With ETL, this ensures that the appropriate data is extracted, transformed and loaded, ready for use by analysts directly or for consumption by an EII server. With EII, it ensures that the views you design and build meet the analysts' reporting requirements. In all cases, understanding your data sources and requirements is a necessary step and is worth the significant time it can take.
Finally, it is important to constantly monitor the performance and efficiency of these technologies in your particular infrastructure, especially when it runs on the cloud and can be expanded at any point.
Any comments and feedback are appreciated.
Have a safe and happy holiday season, all my SCN friends and followers!
Disclaimer – These are my personal opinions and thoughts. They do not represent any formal opinions, POVs, inputs, product roadmaps, etc. from my current or past employers, partners or clients.