Microsoft’s Excel is THE de facto spreadsheet program; pretty much everyone turns to Excel if they need to do something with numbers.
Tax returns, finances, holiday charts, sales figures, bonus calculations, project progress and many, many more are all common uses for Excel sheets in thousands of businesses all over the world. This means there is an almost incalculable amount of business data buried in .xls/x files, where only a handful of people know the document exists and even fewer know where it is stored. That’s usually how it goes!
This problem isn’t limited to Excel files. There has been a huge proliferation of Word documents, Project files, PDFs, SharePoint sites and even SQL databases, all strewn across an organization’s estate with little discoverability. This makes it hard for users to act properly upon data, and also makes it much more difficult for administrators to update and change parts of their infrastructure. Microsoft have decided it’s time to do something about this, and so Project Barcelona was born.
There are 3 “buckets” of components to this project: crawlers, an Index server, and tools.
Microsoft will provide crawlers for various Microsoft products including:
- SQL Server
- Reporting Services
- Analysis Services
These will extract metadata & “Enterprise Dataflow” information, which will be indexed in the Barcelona server. If a data source cannot be crawled, a “declarative way of describing the metadata and dataflow information” will be provided. A neat feature is support for auto-discovery of new target sources, which will reduce the amount of ongoing user/admin interaction needed to make it a success.
The Index server will be the cache for all the metadata & dataflow info collected. It will also allow querying, augmenting and annotating of the data via an exposed API.
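To make the crawler-to-index flow a little more concrete, here is a rough Python sketch of the idea. It is purely illustrative: Microsoft has not published the Barcelona API in anything quoted here, so every name in it (`MetadataRecord`, `Crawler`, `StaticCrawler`, `IndexServer`, and the `ingest`, `query` and `annotate` methods) is my own invention, as are the example source names. It only shows the general shape of crawlers feeding metadata into a central cache that can then be queried and annotated.

```python
from dataclasses import dataclass, field
from typing import Iterable, Protocol


@dataclass
class MetadataRecord:
    """One discovered data asset. All fields are hypothetical."""
    source: str  # where the asset lives, e.g. "sql://sales-db"
    name: str    # e.g. a table or report name
    kind: str    # e.g. "table", "report", "cube"
    annotations: dict[str, str] = field(default_factory=dict)


class Crawler(Protocol):
    """Anything that can yield metadata records counts as a crawler."""
    def crawl(self) -> Iterable[MetadataRecord]: ...


class StaticCrawler:
    """Stand-in for a declarative description of a source that cannot be crawled."""
    def __init__(self, records: list[MetadataRecord]) -> None:
        self._records = records

    def crawl(self) -> Iterable[MetadataRecord]:
        return list(self._records)


class IndexServer:
    """In-memory cache of metadata, keyed by (source, name)."""
    def __init__(self) -> None:
        self._records: dict[tuple[str, str], MetadataRecord] = {}

    def ingest(self, crawler: Crawler) -> int:
        """Store everything the crawler finds; return how many records were seen."""
        count = 0
        for record in crawler.crawl():
            self._records[(record.source, record.name)] = record
            count += 1
        return count

    def query(self, text: str) -> list[MetadataRecord]:
        """Case-insensitive substring match on the asset name."""
        needle = text.lower()
        return [r for r in self._records.values() if needle in r.name.lower()]

    def annotate(self, source: str, name: str, key: str, value: str) -> None:
        """Attach a user-supplied note to an already indexed asset."""
        self._records[(source, name)].annotations[key] = value


# Example usage with made-up sources
index = IndexServer()
index.ingest(StaticCrawler([
    MetadataRecord("sql://sales-db", "BonusCalculations", "table"),
    MetadataRecord("ssrs://reports", "HolidayChart", "report"),
]))
index.annotate("sql://sales-db", "BonusCalculations", "owner", "finance team")
matches = index.query("bonus")
```

Here `matches` holds the single `BonusCalculations` record, now carrying the `owner` annotation. A real index would obviously need persistence, security and far richer search, but the crawler / index / annotate split is the part that matters for this discussion.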
Initially there will be 2 tools:
- An Admin experience to manage the crawlers & Index server.
- A DBA experience for things such as renaming columns, retiring servers etc.
The overall architecture will look like this:
This is a great idea, something that will be very useful to a huge number of companies across all sectors and in many different ways. I can definitely see this fitting into my discussions with customers very well. I have multiple conversations each week with companies hoping to discover:
- What they have
- Where it is
- What it costs
- How it’s used
when it comes to software licensing and also hardware. Being able to do the same thing for data is surely something that customers already want?
There are some quotes from Andrew Conrad of the Project Barcelona team that show Microsoft’s thoughts and plans for this:
“although we are designing the first iteration of the product to be a DBA/ ETL developer solution, we believe that the long term value will grow significantly beyond this”
“developers can plug in their own crawlers or metadata providers.”
“we will support metadata augmentation and have rich annotation support (both crawler support and via server API) which will allow producers and consumers of the system to leverage the crawlers and Index server in ways we haven’t even thought about”
“One of our goals for Project Barcelona is customer driven innovation”
We “really want to work with the community on the design and feature prioritization”
“we strongly believe we will need significant feedback before landing on the right design and feature set. Hence, to accelerate the feedback loop, in addition to shipping a number of CTP releases, we plan on being very transparent on our design plans via this blog”
These give a great insight into how Project Barcelona will progress and it looks good to me. A clear focus on customer insights, ideas and requests is always positive.
To read more on this interesting project, which will surely one day become a product that gets RTM’d, head over to:
You can also follow them on Twitter @projbarcelona
One last thing: the team are jokingly calling this “Marauder’s Map”, a reference to the Harry Potter map that shows the location of each & every person!