Data Provisioning and Integration

Sybase Data Federation leaves distributed data in place and provisions it to users and applications across the organization. The Data Federation environment is called a data services layer or data grid.

Data provisioning—the process of making data available in an orderly and secure way to users, application developers, and applications—is a significant challenge for large, distributed organizations. With many widely varying demands for data, geographically distributed users and data sources, production systems that must be insulated from uncontrolled access, and concerns about intellectual property and confidential data, careful data provisioning is more important and more difficult than ever before.

If everyone needed data in the same format from a single data source, and that format happened to be the way the data was already stored, there would be no data integration challenge. However, some applications expect relational data; others need data in XML form. Still others aggregate sales data across multiple departments, or integrate data from different systems to obtain a single view of the customer. In each of these cases, data of multiple types must be combined to produce a single result.

These varied requirements pose significant challenges to application developers, who must spend time writing code to access and transform data rather than writing business logic. Developers must also know where the data resides, and a change in its location typically breaks the application. Sybase Data Federation shields developers from this issue: when data moves, only the service definitions must be changed, not the application code.
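
The indirection described above can be illustrated with a minimal Python sketch. It assumes a service definition is simply a mapping from a logical name to a physical location. The names SERVICE_DEFINITIONS, resolve, and fetch_customers, and the host names, are hypothetical and are not part of the Data Federation API.

```python
# Hypothetical sketch: names and structure are illustrative only,
# not the Data Federation API.

# Service definitions map a logical service name to a physical location.
SERVICE_DEFINITIONS = {
    "customers": {"host": "db-east.example.com", "table": "crm.customers"},
}


def resolve(service_name):
    """Return the physical location currently registered for a service."""
    return SERVICE_DEFINITIONS[service_name]


def fetch_customers():
    """Application code: refers to the data only by its logical name."""
    location = resolve("customers")
    # A real client would open a connection here; the sketch only reports
    # which location it would read from.
    return f"reading {location['table']} from {location['host']}"


before_move = fetch_customers()

# The data moves: only the service definition changes, not fetch_customers().
SERVICE_DEFINITIONS["customers"]["host"] = "db-west.example.com"
after_move = fetch_customers()
```

The application function is identical before and after the move. Only the entry in the definitions table is edited.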

Data Federation implements a federated approach to data provisioning and integration, which leaves distributed data in place and provisions it to users and applications across the organization. Federated solutions are generally easier to implement than “big bang” solutions that require moving data into a central repository. Federated solutions minimize costs associated with disrupting data, users, applications, and administrators.

When you install Data Federation software, you create a unified, low-overhead system for provisioning distributed data across departments, locations, and companies. This system is called a data services layer or data grid. The data services layer’s scope can be small or large. It can serve one department or an entire extended enterprise.

Data Federation retrieves data from multiple sources of different types, tailors it in ways users and developers need, and makes it available securely across the organization. Users and applications access data through standard interfaces.
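
To make the idea of combining sources of different types concrete, the following Python sketch joins a relational table with an XML document to produce one record per customer. It is only an illustration under stated assumptions: an in-memory SQLite database stands in for a production database, a string stands in for an XML source, and the table, element, and function names are invented for the example. It does not show how Data Federation itself performs the integration.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Source 1: a relational table (in-memory SQLite stands in for a database).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)", [(1, "Acme"), (2, "Globex")]
)

# Source 2: an XML document (stands in for a file or an XML feed).
ORDERS_XML = """
<orders>
  <order customer_id="1" total="250.00"/>
  <order customer_id="1" total="100.00"/>
  <order customer_id="2" total="75.50"/>
</orders>
"""


def customer_view():
    """Combine both sources into one record per customer."""
    totals = {}
    for order in ET.fromstring(ORDERS_XML.strip()).iter("order"):
        cid = int(order.get("customer_id"))
        totals[cid] = totals.get(cid, 0.0) + float(order.get("total"))
    rows = conn.execute("SELECT id, name FROM customers ORDER BY id")
    return [
        {"id": cid, "name": name, "order_total": totals.get(cid, 0.0)}
        for cid, name in rows
    ]


view = customer_view()
```

A consumer of customer_view() sees a single tailored result and does not need to know that part of it came from a table and part from XML.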

Figure 1. Data retrieval
Data Federation accesses data from databases, file systems, and other sources.

A data grid provides:

How is this accomplished? First, install a set of server components on your existing network, creating a data grid that you can think of as a large catalog you might use to “shop” for data. At the beginning, the catalog is empty. Then, one by one, individual data owners “publish” their data for others to use, creating entries in the catalog. At the same time, they establish access rights for each catalog entry, specifying who can read the data, who can update the data, and so on.

In creating entries in the data catalog, data owners do not create replicas of their data. Instead, they create a link from the data catalog entry to data that exists somewhere—in a production database, an operational data store, a data warehouse, or a file server—wherever data is currently stored and managed.

The data catalog’s entries are arranged in a hierarchy much like any other directory structure, with one important difference—the data catalog is location independent. Users and developers do not need to know where data is physically stored to find and use it. They have one place to go to retrieve all data available to them.
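
The catalog model described above (an initially empty catalog, entries published by data owners, access rights per entry, links rather than replicas, and a location-independent hierarchy) can be sketched in a few lines of Python. The class names, method names, paths, and source strings below are hypothetical and chosen only for illustration. They are not the Data Federation API.

```python
# Hypothetical sketch of a data catalog; not the Data Federation API.

class CatalogEntry:
    def __init__(self, source, readers, writers):
        self.source = source         # link to where the data lives; no copy is made
        self.readers = set(readers)  # users allowed to read the data
        self.writers = set(writers)  # users allowed to update the data


class DataCatalog:
    def __init__(self):
        self.entries = {}  # the catalog starts out empty

    def publish(self, path, source, readers=(), writers=()):
        """A data owner adds an entry under a hierarchical path."""
        self.entries[path] = CatalogEntry(source, readers, writers)

    def browse(self, prefix):
        """List entry paths under a prefix, as in a directory listing."""
        return sorted(p for p in self.entries if p.startswith(prefix))

    def locate(self, path, user):
        """Return the linked source if the user may read the entry."""
        entry = self.entries[path]
        if user not in entry.readers:
            raise PermissionError(f"{user} may not read {path}")
        return entry.source


catalog = DataCatalog()
catalog.publish("/sales/emea/orders", source="warehouse-db:orders_emea",
                readers={"ana", "raj"}, writers={"raj"})
catalog.publish("/hr/employees", source="fileserver:/exports/employees.csv",
                readers={"lee"})
```

Users address data by catalog path, such as /sales/emea/orders. The physical source string is stored in the entry and can change without affecting the path that users see.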

But what about the impact on a data source as new users and applications add to its load? Data owners can control the load on operational systems by making explicit which queries are allowed to run against a data store, and by controlling cache coherence windows or scheduling queries to run at specific times or at a set frequency. Data Federation has rich caching and scheduling capabilities that make this process easy to administer and transparent to the consuming users and applications.
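
As a rough illustration of how an allowed-query list and a cache coherence window limit load on a source, consider the Python sketch below. It assumes a coherence window means "serve the cached result until it is older than N seconds". The class WindowedCache, its parameters, and the query name are hypothetical, and a fake clock is used so the behavior is easy to follow. This is not how Data Federation implements caching.

```python
import time


class WindowedCache:
    """Serve cached results until they are older than the coherence window."""

    def __init__(self, run_query, allowed, window_seconds, clock=time.monotonic):
        self.run_query = run_query            # function that hits the real source
        self.allowed = set(allowed)           # queries the data owner permits
        self.window_seconds = window_seconds  # maximum age of a cached result
        self.clock = clock
        self.cached = {}                      # query name -> (timestamp, result)

    def get(self, query_name):
        if query_name not in self.allowed:
            raise PermissionError(f"query {query_name!r} is not allowed")
        now = self.clock()
        hit = self.cached.get(query_name)
        if hit is not None and now - hit[0] < self.window_seconds:
            return hit[1]                     # fresh enough: source is not touched
        result = self.run_query(query_name)   # stale or missing: query the source
        self.cached[query_name] = (now, result)
        return result


source_calls = []


def run_query(name):
    source_calls.append(name)  # record each time the source is actually queried
    return f"result of {name}"


fake_now = [0.0]  # controllable clock for the example
cache = WindowedCache(run_query, allowed={"daily_sales"},
                      window_seconds=60, clock=lambda: fake_now[0])

cache.get("daily_sales")  # first request: goes to the source
fake_now[0] = 30.0
cache.get("daily_sales")  # inside the window: served from the cache
fake_now[0] = 90.0
cache.get("daily_sales")  # window expired: goes to the source again
```

Three requests reach the consumer, but the source is queried only twice. A longer window trades freshness for lower load on the operational system.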
