Channel: The Cloud Computing Forum » SOA

Understanding Data Federation Patterns


There is broad agreement in the industry that data services have a transformational influence on enterprise data-centric architectures. Current analysis points to three important scenarios in which data can be exposed as reusable services. The first is simple data access: many off-the-shelf SOA, BPM, and database products, and even development tools, include basic functionality for accessing data sources. A solution that is to live up to the data services name, however, must achieve more than simple data access. The second is data consolidation: data services can be generated from bulk data scenarios that use ETL to populate a data hub. The third is data federation, the capability to aggregate information across multiple sources into a single view. Federation leaves data at the source and consolidates information virtually, much as an enterprise service bus (ESB) virtualizes messages, but for data. It allows companies to aggregate data across multiple sources into a real-time view that can be reused as a service.

Data federation aims to join data from multiple heterogeneous sources efficiently, leaving the data in place and creating no redundant copies. The pattern supports data operations against an integrated, transient (virtual) view while the real data stays stored in multiple diverse sources. The source data remains under the control of the source systems and is pulled on demand for federated access.
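The pattern can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: two in-memory SQLite databases stand in for independent source systems (a hypothetical CRM and a hypothetical billing system), and the function customer_totals plays the role of the federation server. It is not a real federation product, only a toy showing that the view is built on demand and nothing is copied into a central store.

```python
import sqlite3


def make_source(ddl, insert_sql, rows):
    """Create an in-memory SQLite database standing in for one source system."""
    conn = sqlite3.connect(":memory:")
    conn.execute(ddl)
    conn.executemany(insert_sql, rows)
    conn.commit()
    return conn


# Hypothetical source 1: a CRM system that owns the customer records.
crm = make_source(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)",
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Acme"), (2, "Globex")],
)

# Hypothetical source 2: a billing system that owns the invoices.
billing = make_source(
    "CREATE TABLE invoices (customer_id INTEGER, amount REAL)",
    "INSERT INTO invoices VALUES (?, ?)",
    [(1, 100.0), (1, 50.0), (2, 75.0)],
)


def customer_totals(crm_conn, billing_conn):
    """Federated (virtual) view: query each source on demand, join in memory."""
    customers = crm_conn.execute(
        "SELECT id, name FROM customers ORDER BY id"
    ).fetchall()
    # The aggregation is pushed down to the billing source, so only one row
    # per customer travels back instead of every invoice.
    totals = dict(
        billing_conn.execute(
            "SELECT customer_id, SUM(amount) FROM invoices GROUP BY customer_id"
        ).fetchall()
    )
    return [
        {"id": cid, "name": name, "total": totals.get(cid, 0.0)}
        for cid, name in customers
    ]


view = customer_totals(crm, billing)
```

Because the view is transient, a later call to customer_totals reflects any change made in a source since the previous call. The cost is that every call sends sub-queries to both sources.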

Risk areas when applying the data federation pattern

  • Integration scenarios that require complex transformations to build the integrated view will hurt response time, because with this approach the transformations run each time the view is requested.
  • Source servers may suffer from the extra workload of returning data requested in a federated query. To process a request against the integrated view, the federation server sends sub-operations to the integrated sources. The more complex those sub-operations are and the more often they are sent, the more additional workload the source servers must manage.
  • Scenarios in which large intermediate result sets are moved from the source servers to the federation server may have significant performance implications.
  • Applications that require a relatively high degree of availability of the integrated data may not be good candidates for this pattern. The availability of the integrated data depends wholly on the availability of every federation and source server involved in the process, as well as on the availability, capacity and responsiveness of the network.
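The availability risk can be made concrete with a little arithmetic. The sketch below assumes that the integrated view needs every component to be up at the same time and that failures are independent, which real systems only approximate. The component names and availability figures are hypothetical.

```python
from math import prod


def federated_availability(component_availabilities):
    """Availability of the integrated view when all components must be up.

    Assumes independent failures, so the individual availabilities multiply.
    """
    return prod(component_availabilities)


# Hypothetical availability figures, for illustration only.
components = {
    "federation_server": 0.999,
    "source_a": 0.995,
    "source_b": 0.995,
    "network": 0.999,
}

overall = federated_availability(components.values())
print(f"Integrated view availability: {overall:.4f}")
```

With these figures the integrated view is available roughly 98.8% of the time, lower than any single component, and each additional federated source lowers the figure further.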

