There are some pretty awesome BI jobs

I started out my career as a programmer. Why did I become a programmer? To make games, of course! This is something everyone knows: every programmer has a secret (or not so secret) craving to make a game. Now you can get the best of both worlds! Check out these job openings from Riot, makers of League of Legends:


This reminds me of a Kimball seminar I attended, where they talked about how they analyzed game patterns in Call of Duty.


Is the data warehouse going the way of the Dodo?

The Dodo went extinct because it could not adapt. Maybe data warehouses are going down the same route?

For the decade or so I've been working with data warehouses, they have basically all looked the same: at the core, a relational database modelled for fast load and retrieval of data batches, and an ETL tool that does all the data integration heavy lifting. This model is quite mature and there have been few surprises over the last couple of years. It does have its challenges, though. Relational theory and technology were originally designed for quite different workloads than those found in data warehousing. The technology might have changed to accommodate other types of data profiles, but the fundamentals around transactions and relational integrity remain a bottleneck.

The ETL paradigm has also been under fire for a number of years. In a typical DW project, a majority of the work goes into shuffling data around: applying transformations and, in some cases, business logic to the data. This often leads to a disconnect between the reality in the data warehouse and the real world in the line of business. Data duplication is also an issue, and it's hard to argue against the notion that data proliferation carries with it a burden in terms of governance and data quality as well as in pure costs.

On the storage side, much has happened over the last few years. Alternatives to relational storage are rapidly maturing, and a flurry of new ideas is coming out of the NoSQL "movement". Take append-only databases, for instance. They share many of the same characteristics as data warehouses but are built from the ground up for that kind of data storage. Additionally, these kinds of databases scale very nicely. Need more space? Just add a node.
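
To make the append-only idea a bit more concrete, here is a toy sketch in Python. It is not how any particular NoSQL product works; the class and method names (AppendOnlyStore, append, latest, history) are made up for illustration. The point is only that nothing is ever updated in place, so the full history stays around, much like in a data warehouse.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass(frozen=True)
class Record:
    key: str
    value: Any
    version: int  # position in the log; later versions supersede earlier ones


class AppendOnlyStore:
    """Hypothetical toy store: writes only ever add to the end of the log."""

    def __init__(self) -> None:
        self._log: List[Record] = []

    def append(self, key: str, value: Any) -> Record:
        # No update or delete: a "change" is just a newer record for the same key.
        record = Record(key, value, version=len(self._log))
        self._log.append(record)
        return record

    def latest(self, key: str) -> Optional[Any]:
        # Scan backwards so the most recent record for the key wins.
        for record in reversed(self._log):
            if record.key == key:
                return record.value
        return None

    def history(self, key: str) -> List[Any]:
        # Every earlier value is still available, oldest first.
        return [r.value for r in self._log if r.key == key]


store = AppendOnlyStore()
store.append("customer:42", {"city": "Oslo"})
store.append("customer:42", {"city": "Bergen"})
current = store.latest("customer:42")
past = store.history("customer:42")
```

Here `current` holds the newest value while `past` keeps both versions. A real system would add indexing, compaction and distribution across nodes, none of which this sketch attempts.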

There are also new thoughts on how to integrate data that depart radically from the established ETL paradigm. Data virtualization is one of these. It basically ditches the whole ETL / data warehouse concept and replaces it with modelling and real-time access to sources, with some caching thrown in. In effect, it brings the promise of top-down, rapid data integration. Seems like a dream? At least Forrester does not think so.
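
As a rough illustration of the data virtualization idea, here is a minimal Python sketch, assuming two made-up source systems (a CRM and an ERP) exposed as plain functions. VirtualView, crm_customers and erp_orders are hypothetical names, not part of any real product. The view holds no copy of the data beyond a short-lived cache and combines the sources at query time.

```python
import time
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]
Source = Callable[[], List[Row]]


class VirtualView:
    """Hypothetical virtual model: reads sources on demand, caches briefly."""

    def __init__(self, sources: Dict[str, Source], ttl_seconds: float = 60.0) -> None:
        self._sources = sources
        self._ttl = ttl_seconds
        self._cache: Dict[str, Tuple[float, List[Row]]] = {}

    def _fetch(self, name: str) -> List[Row]:
        now = time.monotonic()
        cached = self._cache.get(name)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]  # still fresh, skip the source system
        rows = self._sources[name]()
        self._cache[name] = (now, rows)
        return rows

    def customers_with_orders(self) -> List[Row]:
        # The "model": join customers to their order totals at query time.
        amounts: Dict[object, List[float]] = {}
        for order in self._fetch("orders"):
            amounts.setdefault(order["customer_id"], []).append(order["amount"])
        return [
            {"name": c["name"], "total": sum(amounts.get(c["id"], []))}
            for c in self._fetch("customers")
        ]


calls = {"crm": 0, "erp": 0}  # counts how often each source is actually hit


def crm_customers() -> List[Row]:
    calls["crm"] += 1
    return [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Grace"}]


def erp_orders() -> List[Row]:
    calls["erp"] += 1
    return [{"customer_id": 1, "amount": 100}, {"customer_id": 1, "amount": 50}]


view = VirtualView({"customers": crm_customers, "orders": erp_orders})
first = view.customers_with_orders()
second = view.customers_with_orders()  # served from the cache
```

The second query does not touch the source systems because the cached rows are still within the time-to-live. Real tools also push queries down to the sources and handle security and schema mapping, which this sketch leaves out.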

These are only two examples of new ways to think about old problems. There are many more. I, for one, have a lot of reading to do!