At the Enterprise 2.0 conference in Boston, there was a lot of talk about data. By applying Web 2.0 technology and practices (blogs, wikis, social networks, tagging, RSS, etc.), Enterprise 2.0 would transform enterprise IT infrastructure and foster the collaboration and knowledge-sharing promised by earlier technology practices such as knowledge management. In this new era, users would at last be able to find data easily and discover who else in the company has similar interests and pertinent knowledge. Through collaboration platforms such as Microsoft SharePoint and Jive Software Clearspace, data previously buried in email messages and PC desktops would be published on company blogs and wikis, where it could be found, read, and elaborated upon by coworkers and, if appropriate, by partners and customers.
The software companies creating these portals recognize that a lot of valuable data isn't found in email or Word documents; instead, it's distributed across data centers and departments in databases and data warehouses. So the portal vendors talk about being able to access Oracle and SAP and other enterprise data sources, in order to pull this data into the collaboration platform.
But as I talked to vendors, I found their views of data access in many cases to be overly simplistic. Their premise seemed to be that all one needs to do is attach a connector to a data source and suck the data out, much as one might stick a straw into a paper cup and extract whatever concoction is sloshing about inside.
If you talk to data integration experts in data centers, or if you talk to security officers for Fortune 1000 companies, you quickly discover that the requirements for data access are much more varied and nuanced. It's rare that you'll actually want to simply extract data and, say, stage it in an Excel worksheet on a server where it can be accessed by a homogeneous group of authorized users. More likely, you'll want to apply access controls before the data even reaches a collaboration server, and you'll need several different views of the data, based on business needs and permissions.
Instead of simply extracting data, it's more useful to think in terms of data access, data transformation, and data delivery. The tables below compare these approaches.
First, here’s the kind of straightforward data access that software vendors often talk about.
Table 1: Simple Data Access
Data Source | Data Access |
Customer database | Post customer records as Excel spreadsheet for SharePoint |
Next, here's a more realistic scenario, at least for organizations operating under security policies or industry regulations that mandate data security and data governance.
Table 2: Data Access with Support for Data Transformation and Data Delivery
Data Source | Data Access | Data Transformation | Data Delivery |
Customer database | Query customer records, presenting only columns 1, 2, 5, and 7 | Convert dollars to euros | Post results to spreadsheet or Web page accessed by EMEA marketing group |
Customer database | Query customer records, collecting columns 1-5 and 7-9 | Add a unique ID to each record for use in this project | Post results to portal used by Private Client Group |
Customer database | Query customer records, returning columns 1-4 | | Make this query executable for customer service agents working on the customer service portal |
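To make Table 2 concrete, here is a minimal sketch of the three rows in Python. It is an illustration under stated assumptions, not any vendor's implementation: the `customers` table, its nine column names, the audience keys, the `PCG-` ID format, and the exchange rate are all hypothetical stand-ins, and an in-memory SQLite database plays the role of the customer database.

```python
import sqlite3

# Hypothetical exchange rate, for illustration only.
USD_TO_EUR = 0.75

# Hypothetical nine-column table standing in for "columns 1-9" in Table 2.
ALL_COLUMNS = ["id", "name", "email", "region", "balance_usd",
               "ssn", "segment", "account_manager", "signup_date"]


def to_euros(row):
    """Transformation for row 1: convert the dollar balance to euros."""
    out = dict(row)
    out["balance_eur"] = round(out.pop("balance_usd") * USD_TO_EUR, 2)
    return out


def add_project_id(row):
    """Transformation for row 2: add a project-specific unique ID."""
    out = dict(row)
    out["project_id"] = f"PCG-{row['id']:05d}"
    return out


# One entry per audience: (columns exposed, optional transformation).
VIEWS = {
    # Columns 1, 2, 5, and 7
    "emea_marketing": (["id", "name", "balance_usd", "segment"], to_euros),
    # Columns 1-5 and 7-9 (column 6 is never exposed)
    "private_client_group": (ALL_COLUMNS[:5] + ALL_COLUMNS[6:], add_project_id),
    # Columns 1-4, no transformation
    "customer_service": (ALL_COLUMNS[:4], None),
}


def build_demo_db():
    """Create an in-memory database with two made-up customer records."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, email TEXT, "
        "region TEXT, balance_usd REAL, ssn TEXT, segment TEXT, "
        "account_manager TEXT, signup_date TEXT)"
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
        [
            (1, "Acme Corp", "ops@acme.example", "EMEA", 1000.0,
             "000-00-0001", "enterprise", "J. Doe", "2007-01-15"),
            (2, "Globex", "it@globex.example", "EMEA", 250.5,
             "000-00-0002", "smb", "R. Roe", "2007-03-02"),
        ],
    )
    return conn


def fetch_view(conn, audience):
    """Return the records an audience may see, already transformed."""
    if audience not in VIEWS:
        raise PermissionError(f"no view defined for audience {audience!r}")
    columns, transform = VIEWS[audience]
    # Column names come from the fixed VIEWS mapping, never from user input.
    sql = f"SELECT {', '.join(columns)} FROM customers ORDER BY id"
    rows = [dict(r) for r in conn.execute(sql)]
    if transform is not None:
        rows = [transform(r) for r in rows]
    return rows


if __name__ == "__main__":
    db = build_demo_db()
    for audience in VIEWS:
        print(audience, fetch_view(db, audience))
```

The point of the sketch is that access (which columns), transformation (what happens to each record), and delivery (which audience gets the result) are decided before any data leaves the source, rather than after a full extract has been posted somewhere.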
In enterprises operating with strict security and compliance controls, it's rare for data to be simply dumped from a database and made broadly accessible. Policy compliance requires tighter controls over data access (permission to extract the data from its source) and data delivery (the presentation of data to specific users).
Businesses (and software vendors) ought to recognize the critical importance of data transformation: changing, reformatting, or editing data to suit its particular purpose and audience. There's no point in delivering too much data, or financial results in dollars when they should be in yen, or raw data from three sources that end users have to combine for themselves through machinations with spreadsheets. In the real world of harried workers overloaded with information, data transformation is an essential capability for any effective solution for data management and knowledge sharing.
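As a small illustration of the "three sources" problem, the sketch below merges records from three sources into one record per customer before delivery, so end users don't have to do it in a spreadsheet. The source names (`crm`, `billing`, `support`), the field names, and the `customer_id` key are hypothetical; a real integration tool would also have to handle conflicting fields and mismatched keys, which this sketch does not.

```python
from collections import defaultdict


def combine_sources(*sources):
    """Merge records from several sources into one record per customer_id.

    If two sources supply the same field, the later source wins.
    """
    merged = defaultdict(dict)
    for records in sources:
        for record in records:
            merged[record["customer_id"]].update(record)
    return [merged[key] for key in sorted(merged)]


# Made-up sample data from three hypothetical sources.
crm = [
    {"customer_id": 1, "name": "Acme Corp"},
    {"customer_id": 2, "name": "Globex"},
]
billing = [
    {"customer_id": 1, "balance_usd": 1200.0},
    {"customer_id": 2, "balance_usd": 450.0},
]
support = [
    {"customer_id": 1, "open_tickets": 3},
]

report = combine_sources(crm, billing, support)
for row in report:
    print(row)
```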
Just as Enterprise 2.0 frees workers from the clutter of irrelevant email messages, so flexible data access and transformation practices can ensure that the right users receive the right data at the right time. The goal should be to get everyone all the data they need—and nothing more.
Conclusion
At the Enterprise 2.0 Conference, it was obvious that software vendors of collaboration and community platforms have made clear progress developing attractive, usable front ends. Now it's time to apply that same energy and thoughtfulness to developing the back end (data access, transformation, and delivery) in order to realize the full vision of business-ready data platforms for Enterprise 2.0.
Postscript: Since I wrote this blog post back in June, SnapLogic, an open source data integration vendor and a client of mine, formed a partnership with MindTouch, an open source wiki company, to create a Customer Relationship Management (CRM) solution building on the kind of custom-tailored data access, transformation, and delivery I described above. In the SnapLogic-MindTouch solution (summarized with a diagram here), CRM applications such as Salesforce.com and SugarCRM are extended with collaborative dashboards based on MindTouch's wiki platform. The wiki is configured with SnapLogic data integration pipelines, enabling CRM users to securely access financial data and customer support records for prospects and customers. No tell-all spreadsheets insecurely posted on servers. Instead, a wealth of account-specific data is made available to authorized users.
I expect we will see more partnerships like this one in the coming months.