Project Madison is a new Data Warehousing solution from Microsoft, born out of their purchase of DATAllegro in August 2008. It is a:
“highly scalable data warehouse appliance that delivers performance at low cost through a massively parallel processing (MPP) architecture.”
Its official name is “Microsoft SQL Server 2008 R2 Parallel Data Warehouse” (I’d have kept Madison myself) and its aim is to make Datacenters easily scalable from “Terabytes to Petabytes”.
Massively Parallel Processing (MPP)
Most traditional architectures use Symmetric Multi-Processing (SMP). This means that all queries are processed in one physical instance of the database, so the CPU, memory & storage limitations of that one box all limit the speed & scale of the implementation.
Madison and its MPP approach get around that nicely, as large tables are partitioned over multiple physical nodes. Each node has its own CPU, storage and memory and its own running instance of SQL Server…this is a patented approach known as “Ultra Shared Nothing” 🙂 Everything is mirrored as well for HA and redundancy.
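To make the partitioning idea a bit more concrete, here is a minimal Python sketch of a shared-nothing layout. It is an illustration only, not how Madison is implemented: the node count, the `customer_id` distribution key, the modulo placement rule and all of the function names are my own assumptions. Each "node" is just a list standing in for a separate server that works on its own slice of the table, and the final answer is built from the per-node partial results.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_NODES = 4  # hypothetical cluster size, chosen for the example


def distribute(rows, key, num_nodes):
    """Spread rows over nodes using a simple modulo on the distribution key."""
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        nodes[row[key] % num_nodes].append(row)
    return nodes


def local_sum(node_rows, column):
    """Work done on one node, using only that node's own slice of the data."""
    return sum(row[column] for row in node_rows)


def parallel_sum(nodes, column):
    """Ask every node for its partial result, then combine the partials."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = list(pool.map(lambda rows: local_sum(rows, column), nodes))
    return sum(partials)


# A tiny made-up "sales" table: customer ids 1..8, amount = id * 10.
sales = [{"customer_id": i, "amount": i * 10} for i in range(1, 9)]

nodes = distribute(sales, "customer_id", NUM_NODES)
total = parallel_sum(nodes, "amount")
```

No node ever looks at another node's rows, which is the "shared nothing" part. Scaling out in this toy model just means raising `NUM_NODES` and redistributing the rows.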
Its use of Industry Standard hardware helps keep the costs down and gives a much lower TCO (Total Cost of Ownership) than current DW (Data Warehousing) offerings. If you need to scale you can simply buy some more servers (HP DL380s, IBM Xwhatevers etc) and add them into the environment…no more needing to purchase a whole new appliance and write off the previous one. Definitely a good point for CFOs and their kind 🙂
Architecture
Madison’s approach to data storage makes it quicker, more reliable & more responsive to the needs of a business, or even to the needs of individual departments within a business. If you have multiple separate but related companies under a single umbrella (or you’re a big enterprise that has internal departments the same size as a small company!) Madison is definitely something you should take a look at.
Here, each Business Unit has its own Data Mart, making it easier, quicker and cheaper for them to store and access their data, while a single “Golden” copy of the data in the central reservoir resolves many issues. There is also great high availability here, as spokes or hubs can back each other up.
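The hub and spoke idea can be sketched in a few lines of Python. This is a toy model under my own assumptions, not Madison's actual mechanism: the `business_unit` field, the sample rows and the `build_spoke` helper are all made up for illustration. The hub holds the “Golden” copy, each spoke is a Data Mart derived from it, and a lost spoke can be rebuilt from the hub.

```python
# The hub: a single "Golden" copy of every row (sample data, invented here).
hub = [
    {"order_id": 1, "business_unit": "retail", "amount": 120},
    {"order_id": 2, "business_unit": "wholesale", "amount": 900},
    {"order_id": 3, "business_unit": "retail", "amount": 75},
    {"order_id": 4, "business_unit": "online", "amount": 40},
]


def build_spoke(hub_rows, business_unit):
    """Derive one Data Mart by copying only that unit's rows from the hub."""
    return [dict(row) for row in hub_rows if row["business_unit"] == business_unit]


# One spoke per Business Unit found in the hub.
units = sorted({row["business_unit"] for row in hub})
spokes = {unit: build_spoke(hub, unit) for unit in units}

# Simulate losing a spoke, then restoring it from the golden copy.
del spokes["retail"]
spokes["retail"] = build_spoke(hub, "retail")
```

Each Business Unit queries only its own, much smaller, spoke, while the hub remains the one place where the full picture lives.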
This next image does a great job of showing the difference between Madison and current DW solutions:
Much more flexible 🙂 It’s also going to be fast. One example I saw was:
“625K rows returned in 11 seconds from 1 trillion row table”
That’s amazing!
You can see in the diagram below that it plugs into Office and also “BI Tools”, which is surely SharePoint. This backs up what I’ve heard: that SharePoint Online will support Madison too!
Learn more over at:
http://www.microsoft.com/sqlserver/2008/en/us/parallel-data-warehouse.aspx