Learn Simple.Data & Nancy at SkillsMatter

I should have blogged about this ages ago, but I’ve been unbelievably busy. Still, better late-but-not-too-late than never-or-not-soon-enough.

Following the well-received half-day Introduction to Nancy and Simple.Data that I presented with Steven Robbins (@Grumpydev) at the Progressive .NET Tutorials 2011, Wendy at SkillsMatter approached me and asked if I’d be interested in doing a full two-day course. I’ve always been very impressed with the courses they offer, so I jumped at the chance to get involved. So you can now sign up for a full-on, two-day deep dive into building web applications with minimal code.

The course will quickly cover the basics, such as View Engines, Dependency Injection, Nancy’s TDD/BDD test helpers and Simple.Data’s built-in data mocking functionality, then move on to more interesting things like: security, authentication (including OAuth) and authorisation; building RESTful web APIs (proper ones, not just HTTP web services); using different data stores with Simple.Data; and different methods for validating data.

It’ll be a heavily interactive couple of days. You can either turn up with an idea for a web application that you want to get started on, or you can work on an app I’ve designed to exercise the skills you’ll be learning. Either way, at the end of the course, that code is yours to take away and use as a starting-off point or a useful reference.

The first instance of the course is on the 12th and 13th of March, and there are still places available. Lots of places. All of them, in fact. So go sign up for it.

Simple.Data 1.0.0-beta1

At last

It’s taken me so much longer to get here than I originally expected, but I’ve released the first 1.0 beta version of Simple.Data.

If this post is the first you’ve heard of Simple.Data, then head over to the GitHub page and browse back through previous posts to find out more.

New features

I didn’t do a post for the 0.14 release, so I’ll cover the changes from that as well as what’s new in the 1.0 beta.

Eager-loading with With

Since very early on, Simple.Data has supported lazy-loading when you reference a joined property from the dynamic record type. There were two issues with that: firstly, if you assigned the record to a static type, the joined properties were not hydrated; secondly, multiple selects do not make DBAs happy, and should be avoided if possible.

Now you can use the With method to load the joined data at the same time as the main record.

var db = DatabaseHelper.Open();
Customer actual = db.Customers.FindAllByCustomerId(1).WithOrders().FirstOrDefault();

This uses a single SQL select statement to pull all the Customer rows, plus the Order detail rows, and groups the data in-memory; generally, that’s more efficient than running two select operations. When the record is converted to the Customer type, it sets the ICollection<Order> property at the same time, either creating a new instance of List<Order>, or populating an existing instance if the property is readonly.

The inverse works too:

var db = DatabaseHelper.Open();
Order actual = db.Orders.FindAllByOrderId(1).WithCustomer().FirstOrDefault();

If the property name is not the same as the database table name, you can use an alias to tweak it:

var db = DatabaseHelper.Open();
var actual = db.Orders.FindAllByOrderId(1).With(db.Orders.OrderItems.As("Items"));

And if there’s no referential integrity in the database, you can specify an explicit join separately:

dynamic manager;
var q = _db.Employees.All()
    .OuterJoin(_db.Employees.As("Manager"), out manager)
    .On(Id: _db.Employees.ManagerId)
    .With(manager);

(Oh, yeah, and check out the OuterJoin method. Finally.)

Of course, if there’s no referential integrity, it’s hard for Simple.Data to work out whether the joined property is a collection or a complex object, so you can specify WithOne or WithMany to help it out:

dynamic manager;
var q = _db.Employees.All()
    .OuterJoin(_db.Employees.As("Manager"), out manager)
    .On(Id: _db.Employees.ManagerId)
    .WithOne(manager);

And of course, you can mix and match all these. This test gives you a good idea of some of the heavy lifting that’s going on for you with this feature, just with the SQL:

public void MultipleWithClauseJustDoesEverythingYouWouldHope()
{
    const string expectedSql =
        "select [dbo].[employee].[id],[dbo].[employee].[name]," +
        "[dbo].[employee].[managerid],[dbo].[employee].[departmentid]," +
        "[manager].[id] as [__withn__manager__id]," +
        "[manager].[name] as [__withn__manager__name]," +
        "[manager].[managerid] as [__withn__manager__managerid]," +
        "[manager].[departmentid] as [__withn__manager__departmentid]," +
        "[dbo].[department].[id] as [__with1__department__id]," +
        "[dbo].[department].[name] as [__with1__department__name]" +
        " from [dbo].[employee] left join [dbo].[employee] [manager] " +
        "on ([manager].[id] = [dbo].[employee].[managerid])" +
        " left join [dbo].[department] " +
        "on ([dbo].[department].[id] = [dbo].[employee].[departmentid])";

    dynamic manager;
    var q = _db.Employees.All()
        .OuterJoin(_db.Employees.As("Manager"), out manager)
        .On(Id: _db.Employees.ManagerId)
        .With(manager)
        .WithDepartment();

    GeneratedSqlIs(expectedSql);
}

(Never mind the logic involved in turning that result set into the correct in-memory object graphs.)
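To give a feel for that last step, here is a minimal sketch of grouping a flattened join result into an object graph. It is not Simple.Data’s actual implementation: the JoinedRow type and the EagerLoadSketch.Hydrate method are hypothetical names made up for illustration, and the Customer and Order classes are simplified stand-ins for the types used in the earlier examples.

```csharp
using System.Collections.Generic;
using System.Linq;

// Simplified stand-ins for the Customer and Order types in the examples above.
public class Order
{
    public int OrderId { get; set; }
    public int CustomerId { get; set; }
}

public class Customer
{
    public int CustomerId { get; set; }
    public string Name { get; set; }
    public ICollection<Order> Orders { get; set; }
}

// Hypothetical: one row of a flattened "customer left join order" result set.
public class JoinedRow
{
    public int CustomerId { get; set; }
    public string Name { get; set; }
    public int? OrderId { get; set; } // null when the customer has no orders
}

public static class EagerLoadSketch
{
    // Group the flat rows by customer, then attach each customer's orders.
    public static List<Customer> Hydrate(IEnumerable<JoinedRow> rows)
    {
        return rows
            .GroupBy(r => new { r.CustomerId, r.Name })
            .Select(g => new Customer
            {
                CustomerId = g.Key.CustomerId,
                Name = g.Key.Name,
                Orders = g.Where(r => r.OrderId.HasValue)
                          .Select(r => new Order
                          {
                              OrderId = r.OrderId.Value,
                              CustomerId = g.Key.CustomerId
                          })
                          .ToList()
            })
            .ToList();
    }
}
```

The left join is why OrderId is nullable here: a customer with no orders still produces one row, and that row has to become a customer with an empty collection rather than a customer with one bogus order.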

Now, this works on multi-record queries, but not on single-record ones such as FindById or the key-driven Get method, and that’s more problematic since those methods don’t return a query you can modify, just a record. In the past I did actually toy with having the SimpleRecord type do lazy self-evaluation, but the fact that the NUnit Assert.IsNull test wouldn’t accept that the object was null even when it swore black-was-blue that it was put me off. (It works for Nullable<T>; no fair.)

Instead of that, and this is only in the 1.0 beta release, you can specify your With clause before the Get or FindBy:

var db = DatabaseHelper.Open();
Customer actual = db.Customers.WithOrders().Get(1);

So now you get all that goodness for single records, where it’s arguably more useful anyway.

Upsert

Inserting and updating is all very well, but sometimes you’ve got some data and you just don’t know whether it’s in the database already or not. If it is, you want to update the row with some new values; if it’s not, you want to insert it. Boring.

To save you the time and trouble, Simple.Data now provides the Upsert method. Give it a record, and it will do all the checking to see if it exists or not. And in beta2, there’ll be back-end database-specific optimizations; for example, if you’re using the SQL Server provider with SQL 2008 or later, it will use the MERGE operation.

Upsert returns the record as it is in the database following the operation, with any database-specified values intact.

var db = DatabaseHelper.Open();
var user = new User {Id = 2, Name = "Charlie", Password = "foobar", Age = 42};
var actual = db.Users.Upsert(user);

That’s one example, but there are plenty of other ways to use Upsert. Take a look at the tests to see the others.
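If you want to picture the check-then-write logic that Upsert saves you from writing, here is a minimal sketch against an in-memory dictionary. It is only an illustration of the idea, not how Simple.Data does it: UserTableSketch is a hypothetical class, and the User type is a cut-down stand-in for the one in the example above.

```csharp
using System.Collections.Generic;

// Cut-down stand-in for the User type in the example above.
public class User
{
    public int Id { get; set; }
    public string Name { get; set; }
    public int Age { get; set; }
}

// Hypothetical in-memory "table" keyed on Id.
public class UserTableSketch
{
    private readonly Dictionary<int, User> _rows = new Dictionary<int, User>();

    public int Count { get { return _rows.Count; } }

    // Update the row if the key already exists, otherwise insert it.
    // Either way, return the row as it is stored after the operation.
    public User Upsert(User user)
    {
        User existing;
        if (_rows.TryGetValue(user.Id, out existing))
        {
            existing.Name = user.Name;
            existing.Age = user.Age;
            return existing;
        }

        _rows.Add(user.Id, user);
        return user;
    }
}
```

Calling Upsert twice with the same Id leaves a single row holding the second set of values, which is the behaviour described above.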

NuGet and SemVer

I’ve pushed this release to NuGet using the Semantic Version number for pre-release (1.0.0-beta1), which NuGet added support for in 1.6. Using this form means that NuGet knows that it’s pre-release software, and you’ll have to explicitly tell it that that’s what you want. So to get the 1.0 beta releases, remember to use the -Pre flag for the Install-Package command. The great thing about this is that when we get to 1.0 RTW, I’ll start on 1.1.0-alpha1 and both package types will be available from the repository.

If you’re waiting for the Mono build, I hope to have it out as a tgz by the end of this weekend.

What’s still to do?

So with those two features, I’m drawing a temporary line in the sand and focusing on getting everything to release quality. That means implementing some optimizations around object creation and database tricks, but more importantly, working on test coverage, refactoring some messy code, and making the documentation comprehensive. Help on that last one would be much appreciated!

Come to that, if anybody wants to make a really nice website… :)

Macro-optimisations

I’ve used Simple.Data in a few production projects now (and it’s doing a great job so far). It’s not often you actually get to use the software that you write, but when you do, it’s a great opportunity to see it through users’ eyes, and I’ve made a few changes and improvements over the past year as a result.

The most recent project that we’ve used it on at Dot Net Solutions is the Met Office open data thing that was announced on Tuesday. And that forced me to bring forward an optimisation that’s been way down my to-do list, partly because I didn’t realise just how much of an optimisation it would be.

The Met Office project involves inserting something like 8 million records a day into a SQL Azure database, which isn’t a huge amount, but enough to need you to be smart about how you do it. The version of Simple.Data that was on NuGet when we started supported bulk inserts, but it wasn’t friendly to the error handling we needed and it assumed it needed to return the inserted records, doing that whole ‘select just-inserted-record’ thing, which is often completely unnecessary.

(So it turns out that when you’re handling TryInvokeMember in a DynamicObject, you can actually find out whether the return value is used by the caller, and not bother if it isn’t. But that’s another blog post.)

Anyway, I tweaked a couple of things and shaved off a fraction of the time it was taking, but it was a small fraction, and things were still far too slow. So we did what we should have done in the first place, and used SqlBulkCopy.

If you haven’t used this (SqlClient-specific) method, you should read up on it and keep it in your mental list of “things that are good that I might need some day”. It lets you prepare a big batch of rows in a DataTable (turns out they’re still good for something) and then insert them in a single operation, and man, it’s quick.

But it’s SQL Server specific, so I couldn’t support it in the generic ADO adapter code.
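For anyone who hasn’t seen it, the DataTable-plus-SqlBulkCopy pattern looks roughly like the sketch below. This is not the code that went into the Simple.Data SQL Server provider; BulkCopySketch is a hypothetical class, and the table and column names ("Target", "Guid", "Text") are assumptions borrowed from the comparison program further down.

```csharp
using System;
using System.Data;
using System.Data.SqlClient;

public static class BulkCopySketch
{
    // Build a batch of rows in memory. Table and column names are assumed.
    public static DataTable BuildBatch(int count)
    {
        var table = new DataTable("Target");
        table.Columns.Add("Guid", typeof(Guid));
        table.Columns.Add("Text", typeof(string));

        for (int i = 0; i < count; i++)
        {
            var guid = Guid.NewGuid();
            table.Rows.Add(guid, guid.ToString("N"));
        }

        return table;
    }

    // Send the whole batch to SQL Server in a single bulk operation.
    public static void Insert(string connectionString, DataTable batch)
    {
        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open();
            using (var bulkCopy = new SqlBulkCopy(connection))
            {
                bulkCopy.DestinationTableName = batch.TableName;

                // Map by name so an identity column in the target is skipped.
                foreach (DataColumn column in batch.Columns)
                {
                    bulkCopy.ColumnMappings.Add(column.ColumnName, column.ColumnName);
                }

                bulkCopy.WriteToServer(batch);
            }
        }
    }
}
```

Mapping columns by name rather than by position is a deliberate choice here: without explicit mappings SqlBulkCopy maps by ordinal, which goes wrong as soon as the target table has a column the DataTable doesn’t.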

I’ve exposed a few interfaces in the Simple.Data.Ado assembly which providers can optionally implement if they need to do something a little differently or can do something better. The first instance was ICustomInserter, which is implemented in the Oracle provider to handle fetch-backs in a world without IDENTITY columns. Since then I’ve added more as I went along, and IBulkInserter was one of them because, as I said earlier, I had half a mind to implement this. And now I have.

Anyway, I’ll stop blathering now and just post the comparison code I wrote (measures time to insert 10,000 records, five times) and the before and after results.

Code:

namespace BulkInsertComparison
{
    using System;
    using System.Collections.Generic;
    using System.Diagnostics;
    using Simple.Data;

    class Program
    {
        static void Main(string[] args)
        {
            var db = Database.OpenConnection("data source=.;initial catalog=BulkInsertTest;integrated security=true");

            for (int i = 0; i < 5; i++)
            {
                Console.WriteLine(TimeInsert(db));
            }
        }

        private static TimeSpan TimeInsert(dynamic db)
        {
            var stopwatch = Stopwatch.StartNew();
            db.Target.Insert(GenerateItems(10000));
            stopwatch.Stop();
            return stopwatch.Elapsed;
        }

        static IEnumerable<Item> GenerateItems(int number)
        {
            for (int i = 0; i < number; i++)
            {
                var guid = Guid.NewGuid();
                yield return new Item(0, guid, guid.ToString("N"));
            }
        }
    }

    class Item
    {
        private readonly int _id;
        private readonly Guid _guid;
        private readonly string _text;

        public Item(int id, Guid guid, string text)
        {
            _id = id;
            _guid = guid;
            _text = text;
        }

        public string Text { get { return _text; } }
        public Guid Guid { get { return _guid; } }
        public int Id { get { return _id; } }
    }
}

Before:

00:00:16.9799819
00:00:17.1971797
00:00:18.0744958
00:00:19.1514537
00:00:17.3798541

After:

00:00:00.4616911 (First run includes MEFing IBulkInserter)
00:00:00.2757802
00:00:00.2852119
00:00:00.2504587
00:00:00.2453277

Totally worth it.

0.12.2 on NuGet now. Mini-roadmap: eager-loading (0.14) and upserts (0.15).

Simple.Data for Mono

TL;DR

I’ve got Simple.Data running on Mono 2.10.6. Most tests pass. YMMV. Download it here.

Ritalin version

Simple.Data on Mono is something that people have been asking about, and I’ve been meaning to sort out, pretty much since the project went from a proof-of-concept to an actual OSS product. One way or another I’ve never gotten round to it, but a couple of things have made it seem more relevant. Firstly, there are providers for lots of OSS databases now (MySQL, SQLite, PostgreSQL); secondly, a certain @fekberg has been very persistent in his status update requests, so I’ve taken the time this weekend to make it work.

The challenges

For the most part, I’ve been pleasantly surprised by how easy it’s been. I’ve found what I think is a bug in the Mono implementation of either dynamic or LINQ or the combination of the two, which I’m going to file with Xamarin once I’ve created a simple repro project. I’ve come up with a workaround involving old-school class-based IEnumerable/IEnumerator implementations, and the performance doesn’t seem to be affected, so that’s fine.
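By “old-school” I mean writing the enumerator by hand instead of leaning on iterator blocks or LINQ operators. The sketch below shows the general shape with a hand-rolled projection; SelectEnumerable is a hypothetical name and this is not the actual workaround code from the Simple.Data source.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical hand-written equivalent of a LINQ Select, with no yield return.
public sealed class SelectEnumerable<TSource, TResult> : IEnumerable<TResult>
{
    private readonly IEnumerable<TSource> _source;
    private readonly Func<TSource, TResult> _selector;

    public SelectEnumerable(IEnumerable<TSource> source, Func<TSource, TResult> selector)
    {
        _source = source;
        _selector = selector;
    }

    public IEnumerator<TResult> GetEnumerator()
    {
        return new SelectEnumerator(_source.GetEnumerator(), _selector);
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }

    private sealed class SelectEnumerator : IEnumerator<TResult>
    {
        private readonly IEnumerator<TSource> _inner;
        private readonly Func<TSource, TResult> _selector;
        private TResult _current;

        public SelectEnumerator(IEnumerator<TSource> inner, Func<TSource, TResult> selector)
        {
            _inner = inner;
            _selector = selector;
        }

        public TResult Current { get { return _current; } }

        object IEnumerator.Current { get { return _current; } }

        // Advance the wrapped enumerator and project its current item.
        public bool MoveNext()
        {
            if (!_inner.MoveNext()) return false;
            _current = _selector(_inner.Current);
            return true;
        }

        public void Reset() { _inner.Reset(); }

        public void Dispose() { _inner.Dispose(); }
    }
}
```

It is more typing than the one-line LINQ version, but it stays lazy in the same way, which fits with the performance not being noticeably affected.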

The hardest thing about the whole process is that MonoDevelop just isn’t Visual Studio 2010 + ReSharper. It’s not a bad IDE by any standards – I’d still take it over Eclipse any day – but I work day in, day out with VS and jumping into any other IDE just feels like somebody moved all the cheese. Add to that the fact that I’m running it under OSX, so even the standard Windows keyboard shortcuts don’t work, and it’s a bit like running in treacle. As I understand it, the Mono Tools for Visual Studio don’t support 2.10; Miguel de Icaza tells me that they’re working on some awesome new VS tooling, so I’m really looking forward to that.

The other big challenge was testing the ADO adapter against a real database. I only own the SQL Server and SQL Compact providers, so I really wanted to test with SQL Server to keep things simple and let me step-debug if necessary. I was expecting not to have fun with this, but it turned out fine. I run my Windows development environment on my MacBook using Parallels 7, so I set up the Host-Guest networking (easy) and opened inbound port 1433 in Windows Firewall (also easy) and that was it. There were a couple of failing tests, one calling a stored procedure with a DataTable and one involving a scalar function, but the rest just passed. I’m guessing that the majority of people who want to use Simple.Data on Mono will be using one of the OSS DB providers, so hopefully this won’t be a problem.

Releases

I don’t know what the Mono NuGet situation is, so I’ll be releasing Mono builds as tgz downloads from the GitHub project page. For the time being, there are differences between the Mono build of Simple.Data.SqlServer and the Microsoft .NET build, so if you want to use that provider, don’t use the NuGet version.

Right now, there’s a 0.11.4 build in the Downloads section which I hope works. However, I haven’t really exercised it to any great degree, so if you encounter any problems please raise an issue if you think it’s a bug, or ask on the mailing list if you aren’t sure.

If you are one of the developers of an adapter or provider, you might want to test against Mono. If, for any reason, that’s not an option, then let me know and I’ll try to find time to test it for you.

Going forward, I will support (in the OSS sense of the word) both Microsoft .NET and Mono for all releases.

Simple.Data 0.11

Some API changes and enhancements

After the slow-down in development caused by all that InMemoryAdapter stuff, there were a few important things I needed to address quickly. One of these will have broken third-party adapters (but not providers) so let’s talk about that one first.

Get

var db = DatabaseHelper.Open();
var user = db.Users.Get(1);
Assert.AreEqual(1, user.Id);

I’m not really sure why Get wasn’t already there, to be honest. Part of the problem is that it requires a new abstract method internally, and until adapter authors implement that method, their users are stuck on <0.11, which I try to avoid where possible.

For the ADO adapter, Get will use the table’s primary key to construct the query (once; it’s then cached internally, no worries about performance). For the MongoDB adapter, I’d expect it to use the built-in id value that Mongo assigns to all records. Somebody is working on an OData adapter, for which, e.g., Customers.Get(1001) will resolve to the /Customers(1001) URL.

Get is supported in the InMemoryAdapter, but you’ll have to configure the key(s) for each table:

var adapter = new InMemoryAdapter();
adapter.SetKeyColumn("Test", "Id");
Database.UseMockAdapter(adapter);
var db = Database.Open();
db.Test.Insert(Id: 1, Name: "Alice");
var record = db.Test.Get(1);
Assert.IsNotNull(record);
Assert.AreEqual(1, record.Id);
Assert.AreEqual("Alice", record.Name);

Trace configurability

The ADO adapter has been writing all generated SQL to the Trace output at the point of execution for a while now. While this is often very useful, I’ve had a couple of people ask if I could make it turn-off-and-on-able, so I have. You can do this in two ways:

In code:

Database.TraceLevel = TraceLevel.Off;

In config:

<?xml version="1.0" encoding="utf-8" ?>
<configuration>
  <configSections>
    <sectionGroup name="simpleData">
      <section name="simpleDataConfiguration"
               type="Simple.Data.SimpleDataConfigurationSection, Simple.Data"/>
    </sectionGroup>
  </configSections>
  <simpleData>
    <simpleDataConfiguration traceLevel="Error"/>
  </simpleData>
</configuration>

(Gotta love XML.)

ADO SQL output will happen with the trace level set to Info, Warning or Error.

More ADO connection control

I occasionally see how Simple.Data performs compared to other ORM/micro-ORM tools, using the PerformanceTests project from Dapper. I was running this through the other day, and I realised that Simple.Data was losing a lot of time opening and closing connections, while the other test cases were mostly using an open connection for the duration of the test. I’ve had a few comments that they’d like more control over the connection, or that Simple.Data is too aggressive in closing connections, so I decided to improve my standing in the Dapper smack-down and hopefully help some real people out too.

Start using an open connection like this:

SqlConnection connection = Program.GetOpenConnection();
((AdoAdapter) db.GetAdapter()).UseSharedConnection(connection);

And stop using it again like this:

((AdoAdapter) simpleDb.GetAdapter()).StopUsingSharedConnection();

And that’s it. In my performance test project (which is in the solution on GitHub), this knocked 20-30% off the runtime, with 500 FindById operations taking ~80ms, versus ~50ms using plain ADO.

I’ve also tried to tone down the aggression a little when it comes to closing connections, doing it as soon as possible, instead of (occasionally) before.

Immediate road-map

I want to get the Azure Table Service adapter done, and help with the OData adapter, and I’ve got a cool website to build using Simple.Data and Nancy, so Core will go into maintenance while I do those things for a while. I’ll try to fix any problems in a timely fashion, as usual. If you’ve got anything still outstanding as of 0.11.1 (now on NuGet) please gently remind me on Twitter.

Just a quick note about Tests

I’ve been fixing a few bugs in Simple.Data over the last couple of days, and I’m feeling the need to post something about the benefits of a good test suite.

For most feature development on Simple.Data, I do test-driven development. The dynamic nature of my API lends itself really well to this approach; this is one of the real reasons TDD is popular with dynamic languages like Ruby and Python. When I want to add a new Query operator or method, I can write a test that uses the syntax I’m aiming for, and the Behaviour test project will compile and run, and I’ll get a failing test, either with a failed assert or an exception. I really like the latter form of failure, since I get a stack trace and I can just dive into the code at that point and start working out what’s wrong.

So that’s great, but the thing that makes me want to write this post is my “QA” process. When someone reports a bug, like this one, it means my test suite is incomplete. So again, the first thing I do is to create a test which reproduces the bug. Then I fix the code so that test passes. Then I do a Release build, and run the full set of tests (currently 560+ including integration tests) to make sure that the fix hasn’t broken anything else. And then, and this is the really awesome part: if all the tests pass, I package the build and push it to NuGet.

I can do this because I trust those tests to be verifying all the behaviour that Simple.Data users are relying on in their applications. If the tests pass, I’m not going to break anybody’s system when they update to the new version. On the (rare) occasions when this has gone wrong, it’s been because I didn’t have the right tests, and I’ve gone back and added them (would you believe the SQL Server test project didn’t test deletes until yesterday? Epic fail).

When people look at the Simple.Data repository, and say “wow, you’ve got lots of tests”, it’s as if they think that’s a discipline thing, that I’m just really conscientious about coverage. But that’s not it. I couldn’t do this without the tests. And neither can you, whatever you’re working on.

Internal for a reason

One of the things about working on an open source project is that everybody can go and hunt through the code and see all the private, protected and internal things. I occasionally get requests to make certain types or methods public, because the requester can see a way to use them to achieve some piece of functionality that they are otherwise unable to implement. Sometimes these requests have an undertone of “why are you hiding this stuff in the first place?” So I thought I’d take a moment to explain the reason why we have these visibility modifiers at all, why I use them the way I do, and also to highlight a couple of times where I was over-protective (or over-internalist) and did actually make the change.

You can’t depend on me

When a type is internal, or a method isn’t public, it means your code can’t take a dependency on it. Think about how much we try to avoid firm dependencies just within a single codebase with a single programmer maintaining it. We do this to avoid the code becoming brittle and hard to maintain; the fewer things that directly depend on something being the way it is, the fewer places we need to make changes when we change the thing they depend on. This is just good practice.

For a public API, it becomes even more important. Every publicly accessible member in an API gets an arbitrary number of dependencies the moment you publish the library. So you hide everything, just as a matter of course, and only make things public if you really have to.

As an example, in Simple.Data, I have a type called ExpressionHelper, which turns an IDictionary<string, object> into a SimpleExpression composed of straight-forward equality comparisons. I use it to generate the criteria for all the FindBy/FindAllBy methods. I got a change request asking me to make this type public, and I rejected it, because I think there might be a better approach so I want to reserve the right to completely replace ExpressionHelper and remove it from the code-base entirely. I’m not even sure there are any direct tests against ExpressionHelper; it’s probably covered through the higher-level behaviour tests, and that’s the way it should be.

Now, if you want to copy ExpressionHelper out of my code-base and include it in your own source, well, that’s fine, and more power to you. That’s one of the beauties of Open Source. But if I’ve marked something as private or internal, it means I don’t think it’s a required part of the API, and I want to reserve the right to change it, or remove it, as and when the need arises.

An example of when I changed my mind

There is a public method on the Database class called “GetAdapter”. Until a few versions ago, this method was internal. Then Bobby Johnson (@NotMyself) started having some issues with his SQLite provider when using SQLite’s in-memory mode. He needed to be able to get to the underlying ADO connection to do initial setup for tests. My initial reaction was doubt, because my immediate take on what he wanted to do was

“Get the IDbConnection from the Database object”

which would be bad, not least because the Database object doesn’t necessarily have an IDbConnection; it might be using the Mongo adapter, or the Azure Table Service adapter (I’m working on it) or any of the adapters I hope people will write in the future.

But what Bobby actually wanted to do was

“Get the Adapter from the Database, and if it’s an AdoAdapter, get the IDbConnection from that.”

which is fine, and it’s why we have the as keyword in C#. And it wouldn’t just apply to AdoAdapter, there might be other adapters with arbitrary public methods which could be used in this way. This was a case of me being over-private, and I made the change, and I think the API is much better for it.
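In code, that second request comes down to something like the sketch below. The types here are hypothetical stand-ins so the example stands alone (the real ones are Simple.Data’s adapter base class and AdoAdapter), and the ConnectionString property is an assumption for illustration rather than a description of the real API surface.

```csharp
// Hypothetical stand-ins for the real adapter types.
public abstract class AdapterStandIn { }

public class AdoAdapterStandIn : AdapterStandIn
{
    // Assumed member, purely for illustration.
    public string ConnectionString { get; set; }
}

public class MongoAdapterStandIn : AdapterStandIn { }

public static class AdapterSketch
{
    // Returns the connection string if this is an ADO adapter, otherwise null.
    public static string TryGetConnectionString(AdapterStandIn adapter)
    {
        // "as" yields null instead of throwing when the type doesn't match.
        var ado = adapter as AdoAdapterStandIn;
        return ado == null ? null : ado.ConnectionString;
    }
}
```

The calling code only takes a dependency on the ADO-specific member when the adapter really is an ADO adapter, and carries on happily with anything else.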

Tonight’s movie

will be Falling Down. An unemployed defense worker frustrated with the various flaws he sees in society, begins to psychotically and violently lash out against them.

That is all.

Simple.Data 0.8.0 and more

Status update

I’ve just pushed Simple.Data 0.8.0 to NuGet. The version hike here is mainly because of a fairly big change in the way the core communicates with adapters to run queries, although there is a new feature that I really like; more on that in a bit.

From now on, I’m going to follow semantic versioning. This 0.8 release is intended to be the last one that is going to change the “internal API” that adapter authors have to implement. It also represents the full public API that will be in the 1.0 release. The only task now is to document that API and fix any bugs or performance issues that come up.

I’ve started doing some proper profiling of the code using ANTS Profiler; so far, it hasn’t thrown up anything hideous, or any obvious paths for optimization, but there will probably be a couple of “best practice” notes that come out of it. I’ve yet to do any memory profiling, so there’s still that to look forward to.

I’ve also run NDepend across the code, and there are a few things I maybe need to clean up (and a few things it needs to shut up about), but generally speaking, it’s pretty good quality. The main issue so far is tracking down junk methods and types that aren’t used any more due to the various brutal refactorings that have gone before.

Both these tools are excellent, and I’m intending to use them as a matter of course on all my projects from now on. Watch this space for some posts on my experience with them during this phase of Simple.Data development.

The final metric I’m focused on is code coverage, which I’m measuring with a combination of dotCover and Mighty Moose. I’ve added nearly 200 new tests since I started this exercise, and quite a few of them highlighted some minor issues which I’ve fixed. It’s not just for show! For the entire project code coverage is now up around 85%, which is good, but still not quite good enough.

Aside from all the code stuff, I’m getting around to some proper documentation, too.

That new feature: “Magic Counting”

A feature I’ve been asked for a couple of times is the ability to run multiple statements in a single database call, and in both cases, the underlying requirement was the same: getting the overall count for a query in conjunction with paging. A couple of weeks ago I was reading a post on Ayende’s blog where he showed some sample RavenDB query code that included a call to a method called Statistics, which dumped out various bits of info about the query into a special class, and I really liked the way it did that, so I borrowed the idea. Now, in Simple.Data, you can do this:

Future<int> count;
var q = _db.Customers.QueryByType("Valued")
    .WithTotalCount(out count)
    .Skip(40)
    .Take(10);
// q returns 10 records
// count.Value is set to total number of records matching criteria

The Future<int> type is a very lightweight wrapper type that doesn’t have its value set when you get it back from that out call, but will have a value after the query has executed (whether by calling ToList, or just starting a foreach). The count and the paged result set are retrieved in a single database call (assuming the provider supports compound statements; SQL Server does, SQL Compact doesn’t, for other providers YMMV), which means performance is good and your DBA should be happy.
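To make the idea concrete, here is a minimal sketch of a deferred-value wrapper of this kind. It is an assumption about the general shape, not the actual Future<T> source from Simple.Data: FutureSketch<T>, its Create method and the HasValue property are names I’ve made up for the example.

```csharp
using System;

// Hypothetical deferred-value wrapper: handed out empty, filled in later.
public sealed class FutureSketch<T>
{
    private T _value;
    private bool _hasValue;

    private FutureSketch() { }

    public bool HasValue { get { return _hasValue; } }

    public T Value
    {
        get
        {
            if (!_hasValue)
            {
                throw new InvalidOperationException("Value has not been set yet.");
            }
            return _value;
        }
    }

    private void Set(T value)
    {
        _value = value;
        _hasValue = true;
    }

    // The creator keeps the setter; callers only ever see the read-only future.
    public static FutureSketch<T> Create(out Action<T> setter)
    {
        var future = new FutureSketch<T>();
        setter = future.Set;
        return future;
    }
}
```

In this sketch the query object would hold on to the setter and call it once the database call has returned, which is why reading the value before enumerating the query can’t give you a meaningful answer.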

Simple.Data.Pad

While I’m here, if you missed the brief mention on Twitter, I’ve knocked together a GUI tool – inspired by LINQPad – which lets you test out queries against your data store and provides some basic auto-complete functionality! I’m hoping that this will help to compensate for the lack of IntelliSense™ in Visual Studio; I’m actually finding it a pretty useful tool in its own right. It dumps out all the SQL that gets run in a separate tab so you can keep tabs on it, too.

It’s very basic at the moment, but I’ll be using it to help me write the documentation so I’ll polish it up as I go. If you want to play, best to build from the source from GitHub.

One last thing

I did an interview with Dot Net Rocks, about Simple.Data and my thoughts on simple, minimal development frameworks generally. It’s here if you like that sort of thing.

Simple.Data 0.7

Feature complete!

I am very, very happy to announce that, with the latest release of Simple.Data – 0.7.1 – to NuGet, it is essentially feature-complete, at least as far as my personal road-map for it is concerned. There are still a couple of things to implement which will help adapter authors out, but for end-user-developers, it covers most of the things we need from a data-access framework. That is not to say I am not open to feature requests, particularly those that are for blatantly obvious things which I have half-wittedly omitted. The 0.7 series of releases will focus on improving performance, tidying up some of the code, and improving test coverage to catch any naughty bugs that are hiding out in there. The last release in the 0.7 series will effectively be the final 1.0 Release Candidate.

Before I embark on this final stretch, I’m going to take some time to create proper documentation so that all you lovely people out there can really put Simple.Data through its paces. Because of the dynamic nature of the framework, it lacks the discoverability that most .NET libraries get through IntelliSense, so it’s important to have some good quality docs to make up for that. I’m going to try and use Github’s wiki facility to build and maintain, particularly because community contribution would be greatly appreciated, but if that comes up short in any way, I’ll look for a better solution.

Anyway, never mind all that:

What’s new in 0.7.1?

There’s a fair selection of new features in this release, and one breaking change which I’ll get out of the way first:

Count and CountBy are now GetCount and GetCountBy

In order to support the Count operator within queries, I’ve had to change the name of the Count methods on the table object, so these are now called GetCount and GetCountBy:

db.Posts.GetCount(db.Posts.Rating >= 4);
db.Posts.GetCountByTag("simpledata");

“Having” clause support

Queries now support a Having method, which represents the clause of the same name in SQL selects. You can use all the aggregate methods that are supported in column selection lists: Min, Max, Avg, Sum and Count.

// Find all posts without comments
db.Posts.All().Having(db.Posts.Comments.CommentId.Count() == 0);

Math operators

Expressions can now include simple mathematical operators: add, subtract, multiply, divide and modulo.

// Find all animals with an odd number of legs
db.Animals.FindAll(db.Animals.LegCount % 2 == 1);

Bulk Inserts and Updates

You can now pass lists (or IEnumerables) of dynamically- or statically-typed objects to the Insert, Update and UpdateBy methods. In the case of Insert, when using the ADO adapter, you get back a new list of objects with any database-assigned default values such as identity values or timestamps. The Update methods just return a count at the moment, but I might change that to return the updated records.

(One thing that definitely may be a possibility by 1.0 is an Upsert method, which would take a list of objects, insert the new ones and update the existing ones.)
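As a rough sketch of how a bulk Insert might look in practice: the Post class, its properties and the Posts table below are made up for illustration, and db is assumed to be a Simple.Data database object obtained elsewhere (for example from Database.OpenConnection). The exact shape of the returned records depends on the adapter.

```csharp
using System.Collections.Generic;

// Hypothetical POCO; its property names are assumed to match the
// columns of a Posts table.
public class Post
{
    public int Id { get; set; }
    public string Title { get; set; }
    public int Rating { get; set; }
}

public static class BulkInsertExample
{
    // db is assumed to be a Simple.Data database object, passed as dynamic.
    public static dynamic InsertPosts(dynamic db)
    {
        var posts = new List<Post>
        {
            new Post { Title = "First", Rating = 3 },
            new Post { Title = "Second", Rating = 5 }
        };

        // One call inserts the whole list. With the ADO adapter the result
        // is a new list of records that includes database-assigned values
        // such as identity columns.
        return db.Posts.Insert(posts);
    }
}
```

The caller can then loop over the returned records to read back whatever the database filled in, such as a generated Id.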

Fixes

Guids are once again supported for SQL Server’s “uniqueidentifier” type, and probably the equivalents in the other supported ADO databases.

A null-equality expression in Simple.Data criteria will become an “IS NULL” expression in the SQL generated by the ADO adapter.

Various other glitches and peccadilloes have been spotted and terminated with extreme prejudice. However, I have reason to believe that there may be more glitches, and even another peccadillo or two, at large in the code. If you encounter one, please either fork the code, kill the creature and send me a pull request, or, if it’s particularly fierce and you don’t want to get near its teeth, just open an issue on GitHub with some sample code that will draw it out into the open so I can shoot it.

Go for it

At this point, I am happy to recommend Simple.Data for use in production systems. Many people already use it that way, with no reported problems. Obviously your mileage may vary, but I am confident that any speed-bumps or potholes (or rogue metaphors) that you run across will be relatively minor and easy to fix (except the metaphors, which are tenacious and systemic).

I’ll post here again once I’ve made a start on some comprehensive documentation. In the meantime, if you need any help, the first place to try is the Google group/mailing list; the second is the GitHub site; and the third is either me on Twitter or, potentially, the creator of your particular adapter or ADO provider.

Simple.Data 0.6.8

I just pushed Simple.Data 0.6.8 to NuGet. Closing in on 1.0 now, just a couple more things to go. So what’s new in this release?

Explicit join syntax

Queries have supported implicit, “natural” joins for a while, where the relationship between two tables could be discovered from the database system catalogs. This was used by criteria expressions, column lists and so on, which would automatically add the join clause(s) to the select statement. But implicit joins have their limits. Your database may not have referential integrity implemented fully, if at all. And even if it does, there is no way to infer from it the nature of self-joins. Here are some examples:

Straightforward joins without referential integrity

There are two syntaxes available. The first uses named parameters to specify columns from the table being joined:

var q = _db.Employees.Query()
    .Join(_db.Department, Id: _db.Employees.DepartmentId)
    .Select(_db.Employees.Name, _db.Department.Name.As("Department"));

The second uses an interim operator, On, which takes a criteria expression:

var q = _db.Employees.Query()
    .Join(_db.Department).On(_db.Department.Id == _db.Employees.DepartmentId)
    .Select(_db.Employees.Name, _db.Department.Name.As("Department"));

The second form is more verbose but allows greater flexibility. In either case, the forms that are supported when using the analogous Find method (e.g. ranges and arrays, and literal values) are also supported for join criteria.

Self-joins

The above two syntax forms are also supported for self-joins, but there is an additional level of complexity caused by table aliasing.

First, the named parameter syntax:

var q = _db.Employees.Query()
    .Join(_db.Employees.As("Manager"), Id: _db.Employees.ManagerId);

q = q.Select(_db.Employees.Name, q.Manager.Name.As("Manager"));

In this case, we’ve had to split the query into two statements, because the aliased form of the table is added to the underlying SimpleQuery object; it has to go somewhere. For the Select clause to work, q must have been assigned a value so that we can use it as a parameter. I’m not wild about this, and if anybody can think of a better approach, I’d love to hear it.

That need for q to have a value arises sooner in the second form:

var q = _db.Employees.Query();
q = q.Join(_db.Employees.As("Manager")).On(q.Manager.Id == _db.Employees.ManagerId);
q = q.Select(_db.Employees.Name, q.Manager.Name.As("Manager"));

Here, we’re passing the alias contained in q into the On method, so we need to interrupt the method chain before that call. And because each method call in the chain returns a new SimpleQuery object, we need to reassign q so that it will understand the alias in the Select call.

I’ve just had a thought about out parameters that might make this a bit more elegant. Hmm.

ToScalarList and ToScalarArray

When materializing a query, you can now use these new methods to pull out the first column value in each row. There are also generic versions, ToScalarList<T> and ToScalarArray<T>, which will cast the result to the required type. (These methods were requested by @kristofclaes, who is writing a Photo Blog app using Simple.Data and Nancy.)
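A minimal sketch of the generic form, assuming a hypothetical Posts table with a Title column, and a db object that is a Simple.Data database opened elsewhere:

```csharp
using System.Collections.Generic;

public static class ScalarListExample
{
    // db is assumed to be a Simple.Data database object; the Posts table
    // and its Title column are hypothetical.
    public static IList<string> GetTitles(dynamic db)
    {
        // Select a single column, then materialize just that column's
        // value from each row, cast to string by the generic method.
        return db.Posts.Query()
                 .Select(db.Posts.Title)
                 .ToScalarList<string>();
    }
}
```

This avoids looping over full records just to pull one property out of each.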

And finally…

I made a silly mistake while optimizing the FindBy code which resulted in the query being run twice. Not very optimal. Thanks go to @korneliuk for spotting this and sending me a pull request with the fix.

Still to do

I’ve got to add support for the Having clause, for completeness, and then I think querying will be done. But I also want to implement a general-purpose system for running queries over in-memory data, so that adapters for non-SQL data sources which don’t support some of the traditional functionality such as grouping or Skip/Take can use it to fill in the missing functionality. I’ve done a spike on this and it’s not as complicated as it sounds, so it shouldn’t take too long.
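To give a feel for the idea, here is a small sketch using plain LINQ to Objects. It is not the Simple.Data implementation, just an illustration of how paging and grouping could be applied to rows an adapter has already loaded into memory. The class and method names are invented for this example, and rows are assumed to be simple column-name-to-value dictionaries.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of Simple.Data.
public static class InMemoryQuerySketch
{
    // Emulates Skip/Take for a data source that cannot page natively.
    public static IEnumerable<IDictionary<string, object>> Page(
        IEnumerable<IDictionary<string, object>> rows, int skip, int take)
    {
        return rows.Skip(skip).Take(take);
    }

    // Emulates grouping by one column with a Count aggregate.
    // Assumes the grouped column contains no null values.
    public static IDictionary<object, int> CountBy(
        IEnumerable<IDictionary<string, object>> rows, string column)
    {
        return rows.GroupBy(row => row[column])
                   .ToDictionary(group => group.Key, group => group.Count());
    }
}
```

An adapter that could only fetch whole collections might run its native query first and then hand the rows to helpers like these for the parts it can't do itself.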

After that there’ll be a minor version bump to 0.7; releases within that series will focus on optimization and code quality, so lots of profiling and NDepend japery, which might inspire some blog posts. And then… 1.0!
