How I would like Microsoft to distribute .NET v.Next

I have been told that I tilt at windmills, and been compared to King Lear raging at the storm. It seems I hold forth about things over which I have no control. So be it.

The deployment model that Microsoft are using for .NET is, in my opinion, outdated and untenable. Each release of the framework is dropped on the world as a monolithic block of CLR, language updates, a new Visual Studio version and a “Base” class library which tries to be all things to all developers. In this post, I would like to call into question the efficacy, if not the sanity, of this approach, particularly regarding the BCL.

The “Base” class library?

The first couple of versions of .NET included two UI paradigms within the BCL: ASP.NET and Windows Forms. If you built an ASP.NET application, you still had Windows Forms in the framework; if you built a Windows Forms application, you still had ASP.NET.

With .NET 3.0, Microsoft added WPF into the mix; the full version of that framework now included 3 UI paradigms. It also added WCF and Workflow, and then in the oddly-numbered 3.5 came the born-moribund LINQ-to-SQL ORM. Whether you wanted them or not, you got them all. This made the framework distribution large and unwieldy, and so the Client Profile was conceived, a “lightweight” version of the BCL which excluded all the server-related classes (and quite a few classes which were very useful for client development, such as HttpUtility).

During the lifetime of .NET 3.5, Microsoft discovered the “out-of-band” release model. They created another ORM, Entity Framework, and distributed it out-of-band. Scott Guthrie got bored on a plane and made ASP.NET MVC, and it was distributed out-of-band. The out-of-band model was good, although Microsoft’s apparent desire to create the kind of framework-proliferation that Java suffered from, but comprised entirely of in-house projects, was bizarre and inexplicable.

Then it was time for .NET 4, and into that was rolled Entity Framework 4.0, and ASP.NET MVC 2, an update to WPF, and a new version of Workflow for good measure. Again, there was a Client Profile. (Again, it was missing useful classes.)

Shortly after .NET 4 was released, ASP.NET MVC 3 became the current version. A little while after that, Entity Framework 4.1 appeared, and it appeared in a way that was new (to .NET) and exciting: NuGet.

Package Management

The idea of package management has been around in various guises for a long time. The Linux distribution Debian introduced the APT package manager in 1999 to distribute everything from system libraries to office application suites. In the software development world, at RubyConf 2003, Ruby programmers were first introduced to the RubyGems system: a centrally-managed repository of libraries large and small which could be installed to your system with a three-word command (plus --no-rdoc --no-ri, of course).

It took a few years, but then suddenly several package management projects were announced for .NET within a relatively short space of time, some based on (and piggy-backing) RubyGems, and others taking a ground-up, self-hosted approach. All of this would probably have remained the bailiwick of open-source enthusiasts and the ALT.NET community, but then Microsoft suddenly snapped up one of the projects, brought it in-house, hosted the repository, changed the name to NuGet, generally made it official, and started using it to distribute their own .NET libraries.

Brave NuGet World

So we now have a situation where the current versions of ASP.NET MVC (3) and Entity Framework (4.1) are not a core part of the framework, but are distributed and maintained through NuGet. It seems likely that the new-and-improved WCF Web API will also appear through NuGet before we get another full framework. Of course, to use these packages, you still have to install .NET 4.0, with the now-deprecated versions of all these things in it. That’s annoying.

I’d like to see the next version of .NET, be it 4.5 or 5.0, ship with a BCL that has been stripped down to the absolute basics: fundamental types, I/O, LINQ, the Task Parallel Library and ADO.NET. I don’t want to be able to write anything but Console applications with it. I don’t want ASP.NET, MVC 4, WPF, Windows Forms, WCF, Workflow, LINQ-to-SQL, Entity Framework 5.0 or any of that stuff. Instead, I want the “Add Reference” dialog in Visual Studio 11 to show Packages where currently there are GAC-registered assemblies.

Ideally, within this new Packages reference tab, I would like to see equal weight given to third-party and open-source libraries, so that things like OpenRasta, FubuMVC, Nancy, NHibernate, Dapper and Simple.Data are presented on a par with ASP.NET and Entity Framework.

In conjunction with that, I’d like it if NuGet, the enabler of this new paradigm, were distributed as a core part of the framework, much like RubyGems is now a part of Ruby, so that when I deployed my application to a web server, desktop or shiny new “device”, it could automatically retrieve – and potentially NGen and GAC – those dependencies if they were not already present, thus reducing the size of my deployment.

By adopting this model, Microsoft could not only embrace and support the many excellent third-party solutions that exist; they could also optimise the lifecycle of their DevDiv output. No longer would a release of a new .NET version herald a massive set of breaking changes to existing libraries; no longer would releases have to wait until the combined efforts of countless development teams could be wrangled into a single MSI. A new framework version would mean the next generation of languages with their supporting classes, and potentially a new CLR version. Surely that would be easier to manage than having to wait until the WPF team have got text not to look like it’s been smeared in petroleum jelly?

Why JetBrains should make a Mono IDE

If you haven’t seen the news already, Miguel de Icaza announced yesterday that the Mono team, having been cut loose by Novell’s new Attachmate overlords, had started their own company, Xamarin, to continue the great work they do for the .NET community beyond Microsoft. They’ve got some angel investment and a business model based around their MonoTouch, MonoDroid and MonoXYZ stacks, plus a bit of consulting.

This is welcome news indeed following a couple of weeks of fear, uncertainty and doubt over the future of not only the Mono product line but also the personalities whom many of us “know” through Twitter, mailing lists and so on.

The day after this announcement, I caught this tweet from @agileguy:

They could partner with a certain dev. tools maker and make the bestest mono/.NET IDE evar!! MonoStorm FTW

I don’t know if Daniel was serious, but I think this is a truly awesome idea. JetBrains’ IDE offerings are the best out there (with the possible exception of Visual Studio + ReSharper). I use RubyMine, which is based on IntelliJ IDEA, and I love it.

But a Mono IDE, written in Mono, would be a smart move for JetBrains. At the moment their flagship product is written in Java and runs on the JVM. Since long before Oracle’s acquisition of Sun, Java has been drifting, and has fallen behind C# as a language and the CLR as a runtime. Now that Oracle have taken control, Java’s future seems even less certain, whether you believe that’s due to apathy, incompetence, litigiousness or downright malice.

By contrast, Mono is now under the safe stewardship of a passionate and capable team who are pushing C#, .NET and the CLR in amazing directions, often further than Microsoft seem to manage.

Porting IntelliJ to C# and Mono would result in that amazing JetBrains IDE know-how, still running cross-platform, but on a runtime that is fast and stable, and under active, enthusiastic development. It would mean that MonoTouch development could be done in exactly the same environment on OSX as Android (and other) development on Windows or Linux. It could become the de facto choice for anybody doing cross-platform mobile device development, with built-in support for sharing code between MonoTouch, MonoDroid and WP7 apps. And it would be an IDE to challenge Visual Studio for the hearts and minds of hundreds of thousands of .NET developers worldwide.

This is an opportunity for an established, respected company to work with and support a fledgling start-up, and for a small team of some of the best developers on the planet to really hit the big-time. I hope that opportunity isn’t missed. If you agree, please take a minute to add a comment below.

The Interface Segregation Principle and MEF

I use MEF in Simple.Data to support the Adapter/Provider model. People can write Adapters to add support for any data store they like or, for traditional databases with ADO.NET support, they can write a Provider to be used by the AdoAdapter.

So far, there are Providers for SQL Server, SQL Server Compact, MySQL and SQLite. Now somebody is working on a Provider for Oracle, and this has presented a challenge. Oracle doesn’t do automatically-incremented integer columns like most databases (think IDENTITY columns in SQL Server, or AUTO_INCREMENT in MySQL). Instead, it provides Sequences, which are a first-class database object that provides similar functionality, but independent of tables. SQL Server 2011 is adding support for them as an alternative to IDENTITY, in fact*.

This means that the Oracle provider might need to do something a bit different when inserting records, and also that the AdoAdapter’s default method for returning a newly-inserted row won’t work.

To cope with this, I decided to add an extension point for providers to say “I need to do insert operations a bit differently, so just give me the table name and the data and I’ll handle it.”

Currently, providers simply implement two interfaces, one for Connections and one for Schema, and there’s a method on the IConnectionProvider interface to return an ISchemaProvider. The Interface Segregation Principle (pdf) says that a custom insert method should be on a separate interface, so I initially thought of adding a method to IConnectionProvider to return an ICustomInserter. But this would break existing Providers, forcing their authors to add an implementation of that method just to return null.

Instead of doing that, I returned to MEF, which has already done a sterling job of finding the Provider in the first place and creating the IConnectionProvider instance for me. The new code creates an AssemblyCatalog using the assembly containing that instance’s type, and looks for an exported ICustomInserter:

private static T GetCustomProviderExport<T>(IConnectionProvider connectionProvider)
{
    using (var assemblyCatalog = new AssemblyCatalog(connectionProvider.GetType().Assembly))
    {
        using (var container = new CompositionContainer(assemblyCatalog))
        {
            return container.GetExportedValueOrDefault<T>();
        }
    }
}

If it finds one, it will defer to that for doing inserts; if not, it’ll just carry on using the existing method. For optional extension points, this is a much better approach, since it only requires effort when the extension point is used, and none at all when it isn’t.
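To make the provider side of this concrete, here is a minimal sketch. The post doesn't show the members of ICustomInserter, so the Insert signature below is an assumption, as are the OracleConnectionProvider and OracleCustomInserter names and the stubbed-out IConnectionProvider; only the lookup method mirrors the code above.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel.Composition;
using System.ComponentModel.Composition.Hosting;

// Hypothetical shape for the optional extension point; the real
// Simple.Data interface may declare different members.
public interface ICustomInserter
{
    IDictionary<string, object> Insert(string tableName, IDictionary<string, object> data);
}

// Stand-in for the provider's main interface (members omitted).
public interface IConnectionProvider { }

[Export(typeof(IConnectionProvider))]
public class OracleConnectionProvider : IConnectionProvider { }

// Exporting this type is all a provider has to do to opt in.
[Export(typeof(ICustomInserter))]
public class OracleCustomInserter : ICustomInserter
{
    public IDictionary<string, object> Insert(string tableName, IDictionary<string, object> data)
    {
        // A real provider would allocate a Sequence value and run the INSERT;
        // this stub just copies the data and adds a made-up key.
        var row = new Dictionary<string, object>(data);
        row["Id"] = 1;
        return row;
    }
}

public static class ProviderExports
{
    // Same lookup as above: search only the provider's own assembly.
    public static T GetCustomProviderExport<T>(IConnectionProvider connectionProvider)
    {
        using (var catalog = new AssemblyCatalog(connectionProvider.GetType().Assembly))
        using (var container = new CompositionContainer(catalog))
        {
            return container.GetExportedValueOrDefault<T>();
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var provider = new OracleConnectionProvider();
        var inserter = ProviderExports.GetCustomProviderExport<ICustomInserter>(provider);

        // null means "no custom inserter exported": fall back to the default path.
        Console.WriteLine(inserter != null ? "Using custom insert" : "Using default insert");
    }
}
```

A provider that exports nothing for ICustomInserter gets default(T), which is null for an interface, so the calling code only needs a null check to decide which path to take.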

Now, the Oracle provider can export an ICustomInserter which can handle Sequence allocation as the developer sees fit, and return the newly inserted row by (at a guess) using the built-in ROWID pseudocolumn. Meanwhile none of the other providers need to change or rebuild in order to work with the new version of Simple.Data.Ado.dll.

Incidentally, this also highlights my YAGNI approach on Simple.Data: I’m not trying to predict in advance what extension points Adapter and Provider developers are going to need; I’m just reacting to problems that they report and adding or changing whatever needs adding or changing, as-and-when.

*Thanks to @GaryMcAllister for that tip.

Simple.Data 0.5.4

I just pushed the latest version of Simple.Data to NuGet. No huge changes, but a couple of improved extensibility points in the core library, and a change to the way the packages are organised.

More flexible connection handling

Simple.Data is pretty aggressive about how it uses, and especially closes, ADO database connections. These things are expensive resources to hold onto on a web-server; you want to grab one from the pool, get what you need and then get it back in the pool so the next request can use it. So every use of a connection is wrapped in a using statement. Which is great, but then somebody created a SQLite adapter, and one of the really cool things to do with a SQLite adapter is to use an in-memory database for running your BDD tests. And when you close the connection to an in-memory SQLite database, you drop the database.

Anyway, rather than change all the connection handling in the Simple.Data.Ado code, we decided to use a Delegating Wrapper pattern. In this instance, this is a class which implements IDbConnection, and takes an instance of IDbConnection, and just delegates all the calls. But it marks its implementation methods as virtual, so you can derive a concrete class from it and override just a few methods – in this case, Open, Close and Dispose. I added a base class for doing this as it is the kind of thing that might be needed by several provider authors.
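As a rough illustration of the pattern (not the actual Simple.Data.Ado base class, which isn't shown here), a delegating wrapper over IDbConnection might look like the following. The class names DelegatingConnectionBase and KeepAliveConnection, and the ReallyDispose method, are made up for this sketch.

```csharp
using System;
using System.Data;

// Delegates every IDbConnection member to an inner connection.
// Members are virtual so a subclass can override only what it needs.
public class DelegatingConnectionBase : IDbConnection
{
    private readonly IDbConnection _target;

    public DelegatingConnectionBase(IDbConnection target)
    {
        if (target == null) throw new ArgumentNullException("target");
        _target = target;
    }

    protected IDbConnection Target { get { return _target; } }

    public virtual IDbTransaction BeginTransaction() { return _target.BeginTransaction(); }
    public virtual IDbTransaction BeginTransaction(IsolationLevel il) { return _target.BeginTransaction(il); }
    public virtual void ChangeDatabase(string databaseName) { _target.ChangeDatabase(databaseName); }
    public virtual IDbCommand CreateCommand() { return _target.CreateCommand(); }
    public virtual void Open() { _target.Open(); }
    public virtual void Close() { _target.Close(); }
    public virtual void Dispose() { _target.Dispose(); }

    public virtual string ConnectionString
    {
        get { return _target.ConnectionString; }
        set { _target.ConnectionString = value; }
    }

    public virtual int ConnectionTimeout { get { return _target.ConnectionTimeout; } }
    public virtual string Database { get { return _target.Database; } }
    public virtual ConnectionState State { get { return _target.State; } }
}

// Keeps an in-memory database alive: Close and Dispose become no-ops,
// and Open only opens the inner connection if it is currently closed.
public class KeepAliveConnection : DelegatingConnectionBase
{
    public KeepAliveConnection(IDbConnection target) : base(target) { }

    public override void Open()
    {
        if (Target.State == ConnectionState.Closed) Target.Open();
    }

    public override void Close() { }

    public override void Dispose() { }

    // Call once, when the tests have finished with the database.
    public void ReallyDispose() { Target.Dispose(); }
}
```

The idea is that a provider hands out the same wrapper every time a connection is requested, so all those using blocks in the ADO code run as before but never actually drop the in-memory database.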

Database.OpenNamedConnection

This came from a request on the mailing list, asking if the named connections in the <connectionStrings> section of web.config (or app.config) were supported. They weren’t, because I’d forgotten about them, and that was bad, so I fixed it.

This means that you can now have a section in your config like

<connectionStrings>
  <add name="Test"
        connectionString="Data Source=SQL2008;Initial Catalog=SimpleTest;User ID=SimpleUser;Password=SimplePassword"
        providerName="System.Data.SqlClient" />
</connectionStrings>

And you can call Database.OpenNamedConnection("Test") and that will open up that connection.

At the moment that’s as far as it’s gone, but this also provides a way for more than one database engine to be in use in a single application. The providerName attribute can be used as the MEF contract name in the provider libraries, and that gives us a nice, easily extensible and standardised way for people to create new providers.

So if, for example, you have created a MySQL provider, you need to add an additional Export attribute to your IConnectionProvider implementation, specifying “MySql.Data.MySqlClient” as the contract name. :)
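In code, that might look something like the sketch below. The IConnectionProvider here is an empty stand-in for the real interface, MySqlConnectionProvider is a hypothetical class name, and the TypeCatalog is used only to keep the example self-contained; how Simple.Data itself builds its catalog is not shown in this post.

```csharp
using System;
using System.ComponentModel.Composition;
using System.ComponentModel.Composition.Hosting;

// Stand-in for Simple.Data's provider interface (members omitted).
public interface IConnectionProvider { }

// The default export that an existing provider would already have...
[Export(typeof(IConnectionProvider))]
// ...plus a second export whose contract name matches providerName in config.
[Export("MySql.Data.MySqlClient", typeof(IConnectionProvider))]
public class MySqlConnectionProvider : IConnectionProvider { }

public static class Program
{
    public static void Main()
    {
        using (var catalog = new TypeCatalog(typeof(MySqlConnectionProvider)))
        using (var container = new CompositionContainer(catalog))
        {
            // Resolve by the providerName read from <connectionStrings>.
            var provider = container.GetExportedValueOrDefault<IConnectionProvider>("MySql.Data.MySqlClient");
            Console.WriteLine(provider != null ? provider.GetType().Name : "no provider found");
        }
    }
}
```

Because the contract name is just the ADO.NET provider name, a new provider needs nothing registered centrally; it only has to export itself under the right string.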

In the next release (0.5.5) this will be the standard way in which ADO providers are resolved, although it’ll fall back to the old (and deprecated) mode if necessary.

Package changes

The other important change is that the Simple.Data.Ado.dll assembly has been put into its own package, which is depended on by the provider packages. When I first pushed Simple.Data up to NuGet, it only supported ADO-based connections, so it made sense for the ADO adapter to be in the Core package. But now, there is a MongoDB adapter, and talk of adapters for CouchDB and Redis, and none of those have anything to do with ADO, so why would you want it cluttering up your packages folder?

If you are the author of an AdoAdapter provider (e.g. MySQL and SQLite) you now need to make Simple.Data.Ado a dependency of your NuGet package.

If you are the author of an Adapter, just leave the dependency on Simple.Data.Core; now, users of your package will not have the Ado assembly installed.

Supported store round-up

I’m going to finish release posts with a quick roll-call of the supported databases and other data-stores. For release 0.5.4, there is support for:

  • SQL Server 2005 and up
  • SQL Server Compact 4.0
  • MongoDB
  • MySQL 4.0 and above
  • SQLite (3?) including in-memory mode

Simple.Data 0.5

There wasn’t going to be a 0.5 release of Simple.Data, but it started picking up a head of steam and I decided to push an extra release to help some people build some adapters and providers.

Changes

No more Reactive Extensions

The main change in this release is that I took out the dependency on the Reactive Extensions. It’s a bit of a shame, but the Rx assemblies are strongly-named, which means that when I build a Simple.Data release against what’s currently on NuGet, and then they push a new build, it breaks everything the next time somebody installs the package. As Seb Lambla says in his OpenWrap presentation, strong-naming is anathema to package management, as well as just generally evil. I understand why they do it, but the actual implementation needs mending.

The only Rx functionality I was using was a trick for buffering data so that connections are closed as early as possible. I was doing this by pushing data from DataReaders through an IObservable and then using the Rx ToEnumerable method to cache the results. I’ve replaced this with a BufferedEnumerable type, which was interesting to write, and involved creating a Maybe<T> type to support it. .NET really, really needs a Maybe in the BCL.
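For anyone who hasn't met the idea, a Maybe<T> can be sketched in a few lines. This is an illustration only, not the type in Simple.Data: the member names (Some, None, HasValue, Value) and the MaybeEnumerable helper are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;

// A value that may or may not be present, without resorting to null.
public struct Maybe<T>
{
    private readonly T _value;
    private readonly bool _hasValue;

    private Maybe(T value)
    {
        _value = value;
        _hasValue = true;
    }

    // default(Maybe<T>) has _hasValue == false, so it doubles as "nothing".
    public static Maybe<T> None { get { return default(Maybe<T>); } }

    public static Maybe<T> Some(T value) { return new Maybe<T>(value); }

    public bool HasValue { get { return _hasValue; } }

    public T Value
    {
        get
        {
            if (!_hasValue) throw new InvalidOperationException("Maybe has no value.");
            return _value;
        }
    }
}

public static class MaybeEnumerable
{
    // Turns a "give me the next item, or None when finished" function
    // into an IEnumerable<T>, the shape a buffer's consumer side needs.
    public static IEnumerable<T> ToEnumerable<T>(Func<Maybe<T>> next)
    {
        for (var item = next(); item.HasValue; item = next())
        {
            yield return item.Value;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var queue = new Queue<int>(new[] { 1, 2, 3 });
        Func<Maybe<int>> next = () =>
            queue.Count > 0 ? Maybe<int>.Some(queue.Dequeue()) : Maybe<int>.None;

        foreach (var n in MaybeEnumerable.ToEnumerable(next))
        {
            Console.WriteLine(n);
        }
    }
}
```

The point of the type is that "no more items" and "an item whose value happens to be null or zero" stay distinguishable, which matters when the items are database rows.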

NoSQL compatibility

A guy called Craig Wilson is creating a MongoDB adapter, and he ran into quite a few issues with the dynamic property name resolution. The code was using a special dictionary which “homogenized” keys as their values were set; essentially all non-alphanumeric characters were removed and what was left was down-shifted. This was fine for SQL Server, where the column names for CUD operations were resolved by interrogating the schema, but completely failed when used against a data store which has no schema. So the dictionary has been replaced with normal dictionaries using a custom IEqualityComparer implementation. While I was in there, I also optimised the Homogenize method, and created a new custom Dictionary implementation which only holds one copy of the keys for an arbitrary number of values; this saves quite a lot of memory when returning lots of rows.
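The comparer approach can be sketched like this. It is a simplified illustration rather than the code in Simple.Data: HomogenizedKeyComparer is a made-up name, and this Homogenize is the obvious unoptimised version of the rule described above (strip non-alphanumerics, then lower-case).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Treats keys as equal if they match once non-alphanumeric characters
// are removed and the rest is lower-cased. The dictionary still stores
// the key exactly as it was first supplied.
public class HomogenizedKeyComparer : IEqualityComparer<string>
{
    public static string Homogenize(string key)
    {
        return new string(key
            .Where(c => char.IsLetterOrDigit(c))
            .Select(c => char.ToLowerInvariant(c))
            .ToArray());
    }

    public bool Equals(string x, string y)
    {
        if (x == null || y == null) return x == null && y == null;
        return Homogenize(x) == Homogenize(y);
    }

    public int GetHashCode(string obj)
    {
        return obj == null ? 0 : Homogenize(obj).GetHashCode();
    }
}

public static class Program
{
    public static void Main()
    {
        var row = new Dictionary<string, object>(new HomogenizedKeyComparer());
        row["First_Name"] = "Alice";

        // Lookups are forgiving about case and punctuation...
        Console.WriteLine(row["firstname"]);

        // ...but the original key survives, which is what a schema-less
        // store needs when the data is written back.
        Console.WriteLine(row.Keys.First());
    }
}
```

The difference from the old "special dictionary" is where the homogenizing happens: in the comparison rather than in the stored key, so nothing is lost for adapters that can't look the real name up in a schema.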

Fewer internal types

In previous releases, I followed the minimal public API approach, and marked as internal anything I could. In order to facilitate testing, I added InternalsVisibleTo attributes to expose some stuff specifically to the SqlServer and SqlCe40 test projects. However, another guy is building a MySQL provider, and he rightly pointed out that all these internals made it impossible for him to copy the tests to use as a start point for his project. So I’ve made those things public.

It’s made me ponder the nature of internal and private and protected and so forth, and I might even manage a blog post on it at some point.

Roadmap update

So that’s where things are at for 0.5. The next feature release, 0.6, will appear at some point in March, bringing support for lovely complex queries with explicit joins, cross-table column lists, aggregates and so on. And hopefully there’ll be even more adapters and providers coming from the community (CouchDB and Redis have been mentioned); I’ll release minor updates to the 0.5 branch as and when necessary to support those things.

Acknowledgements

Mad props to Paul Stack for setting up a Continuous Integration and NuGet-deploying project on his TeamCity server.

Simple.Data is out there

On Friday the 21st of January, 2011, I pushed Simple.Data up to NuGet for the first time. (Actually there was a push before that, but it was premature, so I’m pretending it didn’t happen.)

On Saturday the 22nd I got my first bug report, and on the 23rd I fixed it and pushed an update, version 0.4.5.

Then I started writing this blog post. In between then and now, I’ve done a presentation at DDD9 where I had Jon Skeet in the audience, and done a podcast with the guys at Herding Code, and things have generally been a bit mental and awesome in equal measure.

Anyway, I’ve just pushed another update, 0.4.6, which addresses a performance issue with creating arbitrary numbers of dictionaries to represent rows.

So I thought I’d post a quick overview of where it’s at right now, on version 0.4.6.

This is the first I’ve heard of Simple.Data. What is it?

It’s a database access library built on the foundations of the dynamic keyword and components in .NET 4.0.

It’s not an O/RM. It looks a bit like one, but it doesn’t need objects, it doesn’t need a relational database, and it doesn’t need any mapping configuration. So it’s an O/RM without the O or the R or the M. So it’s just a /.

I think it’s ideal for smaller projects on platforms like WebMatrix or Nancy, where something like Entity Framework or NHibernate would be overkill, but you still don’t want to have to type a bunch of crufty SQL and the associated ADO.NET boilerplate.

Also, it makes it impossible to create database code which is vulnerable to SQL Injection attacks. If you don’t know what that means, then you need to use Simple.Data.

Current features

At the moment, Simple.Data provides support for the most basic find operations using the FindBy* and FindAllBy* dynamic methods, which generate straight equality conditions at the database. These also support lists, which produce an IN clause, and a Range type, which produces a BETWEEN clause.

For more complex criteria, there are the Find and FindAll methods which take an expression, so you can do inequality operators, plus use a Like method on strings. They also support joins across tables where there is referential integrity defined in the DB.

Then there are the Insert, Update and Delete methods, which do those things. Insert and Update have versions which take entire dynamic or static objects, or you can use named parameters.

There’s basic support for master-detail relationships built into the dynamic records, again, as long as you have the primary and foreign keys defined.

And for those scenarios where you just can’t do something in Simple.Data itself, there’s Stored Procedure support for SQL Server databases.

Full details of all the syntax are on the wiki.

Supported Databases

The code has been tested on SQL Server 2008 and 2008 R2, and SQL Server Compact 4.0. I have decided to drop the support for SQL Server Compact 3.5 as I don’t expect there to be much demand for the legacy support and I don’t want to waste cycles that are better spent adding features.

Pretty high on my list of priorities is support for the 3 main OSS databases: MySQL, PostgreSQL, and SQLite.

Because Simple.Data has a very flexible adapter model underneath, it’s possible to get it talking to non-RDBMS data sources. I spiked something that worked against the Windows Registry, although I only did that to see if I could. It’s not something I’d recommend. But I have got the ground-work in place to get it working with Windows Azure Table Storage, which will be cool.

Getting it

Currently the best way to get Simple.Data into your project is using the NuGet package manager. I’m working on adding native OpenWrap support, but it’s lower priority since OpenWrap can use the NuGet repository.

Available resources

If you need some help with something, I’ve created a Google Group for discussion, so post on that. I get the emails straight away, so you should get a reasonably quick response.

If you think something is broken, or you’ve got a burning need for a feature, then open an issue on the GitHub project page.

Things that need doing

I’m working on 0.6 features, which are as described in this earlier post, but there are a few other things that need doing if you’re looking for something to hack on. The single most important of these is getting the project building on Mono, so if you fancy doing that, please pull the code and let me know what stops it from working. Obviously getting the additional databases supported would be a bonus for that scenario too, so if you want to try writing a provider for one of those, give me a shout and I can remote-pair with you to get you started.

A Quick Simple.Data roadmap

Version 0.4, mid-January

I’ve recently got to do some more work on Simple.Data, and this week I’ll be pushing the 0.4 release to NuGet and OpenWrap. This release adds support for stored procedures, which can be executed by calling them as methods on a Database or Transaction:

var db = Database.Open();
var procedureReturnValue = db.usp_SomethingTheDbaWrote(fromDate: DateTime.Today);

This also adds support for multiple result sets being returned from the procedure. I’m looking into how to handle Output Parameters with some kind of mutable value wrapper, because the dynamic framework doesn’t support ref/out method parameters.

Version 0.6, early February

The next release version will be 0.6, which is going to add support for constructing complex queries with explicit joins, column lists, ordering and grouping and so on. I’m intending the syntax for that to look something like:

var db = Database.Open();
var q = db.Employees.As("Employee").CreateQuery();
q.LeftJoin(db.Employees.As("Manager"), q.Employee.ManagerId == q.Manager.EmployeeId)
    .Select(q.Employee.Name.As("Employee_Name"), q.Manager.Name.As("Manager_Name"))
    .OrderBy(q.Employee.StartDate);
var rows = q.Run();

If you’ve got any ideas for improving that syntax, please leave them in the comments. I’m aiming to get 0.6 out in early February.

Version 1.0, late February?

Unless somebody points out a glaring omission, the difference between 0.6 and 1.0 will be bug fixes and performance improvements, plus I’ll also be adding support for SQL Server Compact Edition 4.0 in 1.0 Beta 1. If you don’t need that or the complex query feature, then by all means start using the 0.4 release, which would let you work around that gap with stored procedures anyway.

I’ve also had some people ask about Mono support, and that’s something I intend to support fully for Mono 2.8 and above. If you’d like to contribute towards that goal, then the main thing that will be required is Providers for the AdoAdapter. This works much like the Ruby DataMapper approach, which has an adapter for relational databases which then accepts further plugins for specific databases. Obviously the key providers for non-Windows systems are MySQL, PostgreSQL and SQLite. So if you fancy working on that, let me know.

Annual Review

That Was The Year That Was, That Was

I’ve had a pretty good year this year. It’s been my first full year at Dot Net Solutions, where we’ve had a bit of a time of it with several people moving on to new opportunities; I’m now the Principal Architect round these parts. It’s a primus inter pares type of role, so I still work as lead dev on some projects, but I have an overview of all the projects and I can make tech/process decisions. We’ve had some interesting projects, and some interesting-as-in-may-you-live-in-interesting-times projects, but I think there’s always something to take from whatever it is you find yourself working on.

This was the year I finally realised a long-time ambition and started giving talks in the UK community (and Ireland too thanks to DDD Dublin). I’ve been trying to deliver slightly more in-depth talks, and I think that’s gone over fairly well in most cases, although I’m still learning. Thanks to DeveloperDeveloperDeveloper, DevEvening, NxtGenUG, DotNetDevNet and VBUG for hosting me (and please give me a shout if you’d like more). I’ve met a whole bunch of interesting new people and learned far more than I’ve taught.

I’ve also been a lot more active with my personal/OSS projects: I’ve got a range of libraries/spikes/samples/PoCs up on Github, plus a couple over on Bitbucket. In particular, Simple.Data has gone from a pop at Microsoft to a nearly-finished, already-in-production-use library.

Oh, and I finally got to go to Microsoft in Redmond, where I made a nuisance of myself at an Azure SDR and met some people whose blogs I’ve been reading for years, which was awesome. Very poor swag, though.

So…

Assuming that civilisation as we know it doesn’t come to an end due to solar storms, I’m going to carry on this way in 2011.

Some specific things I want to achieve:

  • Software to ship:
    • Simple.Data 1.0 in January, plus more adapters during the year
    • Pocket C#/F#/VB for Windows Phone 7 in Q1
  • Things to learn:
    • F# to the point where I can use it for real projects
    • MonoTouch/MonoDroid
    • Kinect hacking with OpenNI
  • Community:
    • Lots more presenting to do, including a complete lap of the NxtGenUG circuit. I’m presenting at NxtGenUG in Hereford on the 17th January, and DDD9 at Microsoft Reading on the 29th. Looking for bookings beyond that, so if you’ve got a handful of people and a projector, please give me a shout.
    • I’m going to try for at least one blog post every week, either on here or azureblog.com.
    • Get our new CloudEve.ning.com user group established.
    • I want to be an MVP.

Bring it on.

RavenDB Azure fork discontinued

I had planned to bring my Azure worker compatible fork of RavenDB up to date after getting Simple.Data 0.2 out. However, given the upcoming enhancements to the Azure platform that were announced at PDC, the work now seems unnecessary. Specifically, the full IIS support on web roles means that RavenDB can now be run in that way without needing the complex TCP stack adapters that I implemented.

I’m still a huge fan of RavenDB, and as soon as the full IIS feature is available I’ll get it running and publish my findings, along with any code required for CloudDrive and so on.

[moved]

I’ve moved my blog to WordPress, because it’s better.
