Monday, February 16, 2009

Not All Data Integration Connectors Are Alike

Connectors are a vital part of any data integration solution. No matter what data sources you're integrating—databases, applications, flat files, Web services, etc.—it's awfully handy to have a preconfigured connector or at least a template to minimize the amount of hand-coding required to move data out of or into a particular data source.

The importance of connectors is perfectly clear to customers. In news stories such as "SaaS Integration: Real-World Problems, And How CIOs Are Solving Them," which appeared in InformationWeek in October, customers are blunt about their expectations regarding connectors: vendors need to have a lot of them, one for every piece of middleware being integrated, and vendors had better know how to make them work.

[H.B. Fuller CIO Steven] John is asking SaaS vendors lots of questions related to middleware, such as whether they have developed plug-ins for a specific middleware package and whether they have direct experience implementing that middleware. "If they say no to either, it's a strike against them," he says.

Recognizing the importance of connectors to prospects, most integration vendors parade their list of connectors on their Web sites.

And certainly, if you walk the tradeshow floor at events like the O'Reilly Web 2.0 conference or the Enterprise 2.0 Conference in Boston, you'll find vendors rattling off the names of the connectors they have.

"SAP? Oh, yeah. We've got a connector for that."

I've written before about the misleading simplicity of this approach. Data connectors aren't like Converse sneakers. You can't simply amass a bunch of them (red, orange, purple, black), and assume you have what you need for every occasion.

Different data sources have different security and access requirements. Applications integrating with protected data need to ensure that the data remains protected. You certainly don't want to bypass all the security and access controls protecting, say, an SAP ERP system, simply so that social platform users can pull ERP data into their wiki pages. These requirements become even more pressing when you're dealing with data in the cloud, where it's outside the perimeter of an internally secured and controlled data center.

Another integration requirement, above mere "connectivity," is transformation. Data might need to be transformed ("groomed") or narrowed before being presented to a group. For example, if I'm pulling in sales numbers from the Tokyo office, I'd probably like to see the amounts in yen converted to dollars. It would be nice to have the integration solution do this, so that I know I'm using a tested conversion tool that the company officially endorses.
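To make the Tokyo example concrete, here is a minimal sketch of a transformation step in Python. Everything in it is an assumption made for illustration: the exchange rate is a made-up placeholder, and the function and field names (`yen_to_dollars`, `transform_sales`, `amount_yen`, `amount_usd`) are hypothetical rather than part of any vendor's product.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical rate for illustration only. A real integration would read the
# company-approved rate from an agreed, audited source.
JPY_PER_USD = Decimal("90.00")


def yen_to_dollars(amount_yen, jpy_per_usd=JPY_PER_USD):
    """Convert a yen amount to dollars, rounded to whole cents."""
    dollars = Decimal(amount_yen) / jpy_per_usd
    return dollars.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def transform_sales(records, jpy_per_usd=JPY_PER_USD):
    """Return copies of the sales records with a dollar amount added."""
    return [
        {**rec, "amount_usd": yen_to_dollars(rec["amount_yen"], jpy_per_usd)}
        for rec in records
    ]


# Sample data, invented for this sketch.
tokyo_sales = [
    {"region": "Tokyo", "amount_yen": 4500000},
    {"region": "Tokyo", "amount_yen": 1234567},
]
converted = transform_sales(tokyo_sales)
```

Keeping the conversion in one shared function, fed by one approved rate, is the point: every report that pulls Tokyo numbers gets the same answer.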

Beyond merely connecting, then, connectors may need to work as part of an integration solution that supports access controls, transformation, auditability, and orchestration (e.g., before providing data set X, ensure that operation Y is complete, so that X is valid and up-to-date).
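As a rough sketch of how those pieces fit around a connector, the Python below gates a data request on an access check and on a prerequisite operation, and records every attempt in an audit log. The in-memory permission table, the operation names, and the `fetch_dataset` function are all hypothetical stand-ins invented for this example, not any product's API.

```python
class AccessDenied(Exception):
    """Raised when the user is not allowed to read the dataset."""


class NotReady(Exception):
    """Raised when the prerequisite operation has not finished."""


# Hypothetical in-memory stand-ins for a permissions store, a job tracker,
# and an audit trail.
PERMISSIONS = {"alice": {"sales_summary"}, "bob": set()}
COMPLETED_OPERATIONS = {"nightly_ledger_close"}
AUDIT_LOG = []


def fetch_dataset(user, dataset, requires, loader):
    """Serve a dataset only if the user may read it and `requires` is done."""
    if dataset not in PERMISSIONS.get(user, set()):
        AUDIT_LOG.append((user, dataset, "denied"))
        raise AccessDenied(f"{user} may not read {dataset}")
    if requires not in COMPLETED_OPERATIONS:
        AUDIT_LOG.append((user, dataset, "not_ready"))
        raise NotReady(f"{requires} has not completed")
    AUDIT_LOG.append((user, dataset, "served"))
    return loader()


# Data set X (sales_summary) is served only after operation Y
# (nightly_ledger_close) is complete.
rows = fetch_dataset(
    "alice", "sales_summary", "nightly_ledger_close",
    loader=lambda: [{"total": 100}],
)
```

A request from `bob`, or a request made before `nightly_ledger_close` finishes, would raise an exception and still leave an entry in `AUDIT_LOG`.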

The NetSuite Example

The other fallacy of the "Converse sneaker" approach to connectors is that it assumes all integration endpoints and APIs are more or less alike. There's a DB interface or a middleware API. You write to it. You're done.

Not always.

Some APIs are more interactive and complex.

NetSuite, the hosted provider of mid-market CRM, ERP, and accounting solutions, now has 6,600 active customers. So lots of people have reason to connect to NetSuite data.

Integrating with NetSuite, however, is more involved than integrating with other SaaS applications. Why? Because NetSuite provides a great deal of flexibility in customizing its basic record schema and in defining custom records. You can query metadata to discover some, but not all, aspects of the customization. So a connector cannot rely solely on an automated process for discovering how an account has been customized and for accessing its data.

Ideally, a connector should mask as much of this complexity as possible from the IT user creating the integration. A good NetSuite connector will take advantage of the flexibility of the NetSuite API without exposing that complexity to the person building or using the integration.

SnapLogic, an open source data integration company, offers a NetSuite connector that automates as much of this discovery as possible, while supporting integration that works with custom NetSuite records. Here's an explanation from the SnapLogic documentation pages:

NetSuite allows customization of its schema by allowing users to define custom record types and by adding custom fields to existing records. This extension package can automatically discover all the custom record types in a NetSuite account at install time. However, NetSuite does not provide interfaces which allow the extension package to discover custom fields have been added to all existing record types. The extension package is able to do this discovery for some kinds of records (like entity records, item records and CRM records), but not all. For this reason, a manual approach has been provided for specifying the custom fields of records. The user can use the NetSuite UI to browse the records and find custom fields that are of interest. The user can then use the utility: netsuite/resources/customize_resources.py (provided by the extensions package) to manually add custom fields to the SnapLogic Resources that represent a given NetSuite record types.

The connector automatically discovers as much of the account's data schema as possible, then supports one-time additions for custom data. Once this setup work is done, the user has a collection of ready-to-use, snap-together building blocks for building integration pipelines.
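The discover-then-supplement pattern can be sketched in a few lines of Python. This is not the SnapLogic utility or the NetSuite API; the schema dictionaries, the record and field names, and the `merge_custom_fields` function are hypothetical, chosen only to show how automatically discovered fields and manually declared custom fields might be combined.

```python
import copy


def merge_custom_fields(discovered, manual):
    """Return a new schema: discovered fields plus manually declared ones.

    Discovered definitions win if a field appears in both places.
    """
    merged = copy.deepcopy(discovered)
    for record_type, fields in manual.items():
        if record_type not in merged:
            raise KeyError(f"unknown record type: {record_type}")
        for name, field_type in fields.items():
            merged[record_type].setdefault(name, field_type)
    return merged


# What automated discovery found (invented sample data).
discovered_schema = {
    "customer": {"entityId": "string", "email": "string"},
    "salesOrder": {"tranId": "string", "total": "decimal"},
}

# What the user added by hand after browsing the account's records
# (hypothetical custom field).
manual_custom_fields = {
    "salesOrder": {"custbody_region": "string"},
}

schema = merge_custom_fields(discovered_schema, manual_custom_fields)
```

The manual step happens once per custom field; after that, the merged schema can drive the same reusable components as any fully discovered record.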

Because the connector reads the NetSuite schema, generates components, and supports customizations, it's able to provide NetSuite customers with a flexible solution for integrating NetSuite with other applications and data sources.

A more perfunctory connector would look just as good in a checklist of available connectors, but it wouldn't serve users nearly as well.
