Simple.Data 0.6.6

Nice new features

Hot onto NuGet now, Simple.Data 0.6.6 has a couple of nice new features to make life even easier.

Cleverer type handling in Simple.Data.Ado

The Ado adapter now tries much harder to make the data that it is given work with the underlying provider. In practice, this means that it will check to see whether the value is a “known-good type”, and if it’s not, it will try calling “ToString” to see if it returns something other than the type’s full name; if it does, it will pass that string. This means that, amongst other things, you can pass the dynamic values from Nancy’s Request objects across without casting them. So technically, you can now say


db.Comments.Insert(Request.Form)

although I wouldn’t recommend it.

Child rows returned as Query

Up to now, if you used the inferred detail property feature in Simple.Data, what you got from that property was an IEnumerable<dynamic>. Now, you get a SimpleQuery, which means you can do filtering, ordering and so on on those detail rows.

This feature means that the query comes pre-created with some criteria, so I’ve changed the way the Where method works (still in beta, I’m allowed). Now, Where will combine any existing criteria with the specified criteria. If you want to completely replace any existing criteria, use the new ReplaceWhere method.

ToScalarOrDefault

I don’t think I mentioned this last time, but there is now a SimpleQuery.ToScalar method which, as long as your query returns one row with one column, will return that single value. If there is not exactly one row, an exception will be thrown. 0.6.6 adds ToScalarOrDefault, which will return null if no row was found, and ToScalarOrDefault<T>, which returns default(T).

Detail properties on Queries

By “detail properties” I mean those properties on records returned from Simple.Data which execute a further query on the database and return detail (or master) records. These are now available on SimpleQuery objects, so you can say:


db.Customers.QueryByTown(“London”).Orders.Count();

for example. This code runs only one command against your database; the Orders pseudo-property returns a new query which incorporates the criteria from the original Customers query. Thus your DBA is happy, because you are not doing that SELECT N+1 thing he/she hates so much. (Of course, if they’d just upgrade the server to that one with a PCI-e SSD array it wouldn’t matter anyway. But no. So here we are.)

An aside on writing APIs

If you check the commit history for Simple.Data, you’ll see that I implemented that last change twice. The first time I did something super-awesome with “rebasing” expressions and object references, and it all worked as if by magic. But then on the drive home, I realised that (a) that was the hard way, and (b) it hides too much information from adapter authors, who might be implementing this kind of functionality in a totally different way. So I re-implemented it in a much more obvious way, which also included a bit of much-needed refactoring, and it all works as it should and adapter authors won’t hate me any more than they already do.

That’s it for Simple.Data coding for this week; I’ve got to polish up my demo apps for DDD South West this weekend. The same mini-app created twice: once in WebMatrix with the data access layer that started all this; and once in Nancy using Simple.Data. Guess which one’s better.

Simple.Data 0.6.5

New features

I haven’t kept up to date with posts about the new features that are in the Simple.Data 0.6.x releases. I’m going to go ahead and blame the new baby, because he can’t defend himself. :)

So let’s have a bit of a catch-up. The 0.6 changes are all about more advanced querying. Up to now, it’s been possible to use complex criteria to search a single table, but the data that comes back is limited to the full row from the “central” table. That’s all well and good for the not-really-an-ORM use cases, but I’ve always envisaged Simple.Data as an alternative to ADO.NET and WebMatrix.Data, so it needs to be able to do complex queries and give the developer more control. Let’s take a look at some of the features that are in 0.6.5, which I’ve just pushed to NuGet this evening.

SimpleQuery

The core of these new features is the new SimpleQuery type. This is a dynamic type which allows queries to be constructed in a fluent style that resembles the dotted notation form of LINQ. Since 0.6.1, the All, FindAll and FindAllBy methods have returned a SimpleQuery instead of a result set. There are also new Query and QueryBy methods, which are effectively identical to All and FindBy, but make the intent clearer in your code. The query executes when it is enumerated in any way, so all this should be transparent to you, and backwards-compatible with any existing code.

Ordering and Paging

The first thing to get implemented, because it was the thing the most people were asking for, was paging. Paging allows you to specify which subset of rows within a result set you want to return, which is very useful for web applications where you don’t want to return 100,000 rows to your JavaScript client.

Ordering

Of course, if you’re returning a subset of rows, you need to specify an order so you get consistent results. So SimpleQuery supports ordering in a couple of ways:

If you’re only selecting from a single table, you can use the magic methods approach:

db.Users.All().OrderByJoinDate();

Or for most recent first:

db.Users.All().OrderByJoinDateDescending();

As much as possible, I’m trying to keep the method names similar to LINQ operators, so for ordering on multiple columns, there’s ThenBy:

db.Users.All().OrderByJoinDate().ThenByNickname();

In 0.6.4, you’ll also be able to specify a column list within an OrderBy[Descending] or ThenBy[Descending], like:

db.Users.All().OrderBy(db.Users.JoinDate, db.Users.Nickname);

Paging

Again trying to use LINQ conventions to keep things simple, paging is handled using Skip and Take methods:

db.Users.All().OrderByNickname().Skip(10).Take(10);

Although ideally you should, you don’t have to specify an OrderBy; if you omit it the query will be arbitrarily ordered on the first column, since that is commonly the table’s key.

The important thing to note here is that the paging is applied at the database level, using syntax specific to each RDBMS, or NoSQL datastore. (In the case of SQL Server, it uses ROWNUMBER OVER ORDER syntax; for SQL Compact, it’s the alternative OFFSET x FETCH NEXT y syntax. The authors of the other providers have to implement an interface to make it work with their RDBMS; adapter authors can implement paging however it is appropriate for their underlying store.) The thing I think is really neat about this is that it’s taken something which is differently implemented in almost every different database, and abstracts it away from you, meaning that you can seamlessly work with, for example, SQL Server Compact or SQLite in your development and test environment, and then SQL Server 2008 or Oracle in production.

Explicit column selection

As I’ve already said, prior to 0.6, Simple.Data has just returned complete rows from a single table. It just did a “SELECT * FROM [table]…” under the covers. This is not ideal for some scenarios, such as very wide tables, or tables with lookup data that should be pulled back.

Now, SimpleQuery has a Select method, which works like this:

db.Users.All().Select(db.Users.Id, db.Users.Name);

You can also use the natural joining capabilities within that list:

db.Users.All().Select(db.Users.Name, db.Users.UserProfiles.BioText);

Obviously if you do that with a one-to-many join, you’ll get the master table’s data repeated as you would with a SQL statement. Which is what is getting run. So that makes sense, really.

Aliasing

This introduces the possibility of column name ambiguity into the mix; you might want to select two columns with the same name from two tables.

To cope with this, you can now assign aliases to columns within the selection using the As method:

db.Employees.All().Select(db.Employee.Name, db.Employees.Department.Name.As("DepartmentName");
Console.WriteLine("{0} works in {1}", emp.Name, emp.DepartmentName);

Aggregates

Long post! I had a good afternoon of coding last weekend. So the last thing to announce for this 0.6.5 release is support for aggregate functions. These are implemented as methods on the query object, the table object or the column objects within a selection list.

Table aggregates

At the table level, there is now support for Count and Exists, which is also exposed as Any. These methods can be called with no parameters on a Query, or as methods similar to Find and FindBy on the table object:

int adults = _db.Users.Query().Where(_db.Users.Age &gt;= 18).Count();
// … is the same as …
int adults = _db.Users.Count(_db.Users.Age &gt;= 18);
bool anyBirthdays = _db.Users.FindByDateOfBirth(DateTime.Now.Date).Any();
// … is the same as …
bool anyBirthdays = _db.Users.ExistsByDateOfBirth(DateTime.Now.Date);
// … and also …
bool anyBirthdays = _db.Users.AnyByDateOfBirth(DateTime.Now.Date);

That’s several ways of doing the same thing, but, hey, this is data access, so it’s kind of traditional. If you’re a provider author, don’t worry, it’s all handled. If you’re an adapter author, the SimpleQuery object that you get to play with will look the same regardless which of these methods are used.

Column aggregates

The most common aggregate methods at column level have been implemented: Min, Max, Sum and Average. You can specify these as methods on the column objects in the Select call:

_db.Employees.Query()
    .Select(_db.Employees.Department.Name, _db.Employees.Age.Average().As("AverageAge"));

The observant reader will have noticed that there is no GroupBy method in that code. I was trying to work out how to implement it in the simplest way, and I realised that whatever was specified in the GroupBy would have to be all the columns from the Select method which weren’t aggregates. Since that information already exists in the Query object, why make the developer write it out again? I did a little checking, and MySQL is of the same opinion, but most RDBMS systems still seem to require an explicit GROUP BY clause. So that’s another place where Simple.Data makes your life a little bit easier.

Length

I’ve also added support for the LEN function, which is represented by a Length method on columns:

_db.Users.Query.Select(_db.Users.Name.Length().As("NameLength"));

Length calls will not apply grouping to the query.

And also

There is more new functionality (for example the ToScalar<T>() method) but I want to wrap this up and get the release done. Do take a look at the code for SimpleQuery if you want to know what other methods it supports; reading the code (especially the tests) is still the best way of learning Simple.Data until I get some proper documentation worked out.

I’ll write another post this week detailing what’s left to do before I bump the version to 0.9 and call it a feature-complete beta, but in the meantime, if any of these new features are useful to you, please do get 0.6.5 and start hammering it; I’ll try to turn around fixes rapidly if you report bugs on Github. My absolute favourite form of bug report is a pull request with a fix, but pull requests with failing tests added to the test suite are a very close second.

Simple.Data dependencies

There are a couple of things that have come up recently regarding the dependencies Simple.Data has outside of the Base Class Library (BCL).

WebMatrix.Data

I got a shout-out from Rob Conery in his post about his ultra-lightweight data access library file (which is very nice, btw). He suggested that Simple.Data “embraced” WebMatrix.Data, which is not actually the case. Simple.Data has its own adapter and provider model, and the supplied ADO adapter and SQL Server and SQL CE providers are built on top of plain old ADO. It also has its own SimpleRecord dynamic type, as opposed to using ExpandoObject. The SimpleRecord type includes a couple of advanced features for lazy-loading joined data, and convention-based implicit casts to static types.

The important thing about Simple.Data’s adapter model is that, as with the Ruby DataMapper library, it’s pretty straightforward to create adapters which expose non-SQL data stores. A guy called Craig Wilson has created a MongoDB adapter, and I’m just consolidating some of the changes he made to the core code to facilitate this before releasing 0.5, so he can release his adapter at the same time. I’m hoping that people will contribute adapters for other NoSQL stores; I’ve already got one in the works for Azure Table Storage.

Reactive Extensions

I was a bit unsure whether I should add the dependency to the Reactive Extensions to the project, but I went for it because it provided a really nice way to do a buffered fetch from the database while still returning the first record to the application code as soon as it was available. Last week, though, the Rx guys pushed a new version of their packages to NuGet, and people installing Simple.Data from there found that it didn’t work. This is because the Rx library is not open-source, and they strongly-name their assemblies, which is entirely incompatible with the idea of a package manager.

So I’ve written my own code for handling the buffered fetching, and I’ll be removing the Rx dependency from Simple.Data in 0.5. This means that using Simple.Data in your project will not require any dependencies outside the BCL.

I’m not ruling out adding other dependencies to the package – I don’t want to waste time reinventing wheels if I can avoid it – but I’m going to restrict it to open-source efforts in future.

Simple.Data 0.2.0 available

Having polished some rough edges and created some initial documentation, I am pleased to announce the availability of the first stable release of Simple.Data. The release has been branched and will be supported with bug-fixes; new developments will be in the 0.3 trunk, leading up to another stable release in 0.4.

This release includes all the things I’ve posted about over the last couple of weeks, such as:

  • A variety of Find, Insert, Update and Delete methods;
  • Complex queries and joins across tables;
  • Adapters for SQL Server and SQL Server Compact Edition 3.5;
  • A Mocking assembly to make unit-testing your projects easier;
  • XmlMockAdapter, which lets you specify data for Simple.Data calls using an XML string within the test definition;

If you’re not into the whole git cloning and building thing, you can just download the binaries here.

Documentation is on the Github wiki. I’ll be extending it over the coming days.

If you encounter any problems, please consider creating an issue to report it. For any questions, you can either contact me directly through Github or Twitter, or I’ve created a Google Group for discussion, if the library gets enough users to start one.

That’s it for now, because it’s really, really late. I’ve got a couple of posts lined up about the development and a look at the Ruby gems that inspired this project, so check back for those soon.

Enjoy.

Announcing Simple.Data 0.1.4

Just confirming and announcing availability of what I proposed in this post.

So far, Simple.Data has only supported equality comparisons in queries with the FindBy* methods. Also, query operations were only possible against a single table. Obviously this is not going to get very far in real-world applications, and more complex queries across more than one table must be supported.

Complex criteria

As of the current version, the Find method on tables within a Simple.Data database will accept an expression similar to this:

db.Customers.Find(db.Customers.MoneyOwing > 0);

The (SQL Server) SQL that is generated for this looks like:

SELECT [Customers].* FROM [Customers]
WHERE [Customers].[MoneyOwing] > @p1

And the command has a parameter with the value 0. No SQL injection here!

Currently, the library supports equal, not equal, and greater/less than (or equal) operators. For strings, the LIKE operator is always used, so wildcards are supported.

Joining tables

In the SQL Server ADO adapter in version 0.1.4, joining is only possible using “natural joins” defined by referential integrity. This is a design decision, and it’s because I want the library to support NoSQL data sources where explicit joins are not generally supported. However, if you are using, or are considering using, Simple.Data and this is a blocker for you, please let me know by raising an issue at the Github project or sending me a message on Twitter.

Assuming that you have the primary and foreign keys defined in your SQL database, this Find operation:

db.Customers.Find(db.Customers.Invoices.Paid == “N”)

will produce this query:

SELECT [Customers].* FROM [Customers]
JOIN [Invoices] ON [Customers].[CustomerId] = [Invoices].[CustomerId]
WHERE [Invoices].[Paid] = @p1

Object name resolution

This release also provides better support for resolving object names. Often, database naming conventions are different from those used in application code; also, most databases support things like spaces and other special characters in object names.

Simple.Data will attempt to resolve names by removing all characters except for alpha-numerics and underscores from object names, and forcing all names to lower case. It will also try the plural or singular forms of names in its quest for a match.

So, for example, OrderItems.ItemName will match all of:

  • OrderItems.ItemName
  • OrderItem.ItemName
  • ORDER_ITEMS.ITEM_NAME
  • ORDER_ITEM.ITEM_NAME
  • [ORDER ITEMS].[ITEM NAME]

And other such variations. Obviously this causes a problem if the database contains names that will be ambiguous without the formatting, case and special characters, but I consider that [a] an edge case, and [b] the product of a bloody stupid database designer.

Inserted records returned from Insert

The ADO adapter for SQL Server will now return an record from the Insert operation, with any values set or modified by the database in it, as long as the table has an IDENTITY column.

Type specification in XmlMockAdapter

You can now specify the type of a value in the XML document supplied to XmlMockAdapter:

<data><Users Id=”System.Int32”><User Id=”1” …/></Users></data>

Coming in 0.1.5

The main aims for the 0.1.5 release are improved, more informative exception handling, and performance enhancements with caching. This will then, hopefully, become the stable 0.2 release which will be released via NuPack and OpenWrap.

Mocking with Simple.Data’s Adapter model

As promised, more details on my Tumblr post from yesterday.

The inherent difficulties in automated testing of database-dependent code are well known. For pure unit tests, which exercise minimal chunks of functionality and definitively ignore dependencies, the issue is less important. But for behaviour tests, which exercise components at a higher level, the ability to provide mock data is extremely useful. Unfortunately, where your data access code directly addresses a specific data store, substituting some form of mock or stub for that store is labour-intensive, and can significantly impact the performance of your tests if a database must be set-up and torn-down each time a test or suite of tests is run.

Simple.Data provides a level of abstraction from the underlying data store, since it works without requiring the developer to write any actual SQL code, or to generate code or construct mappings to link objects to the store. Instead, it uses a convention-over-configuration approach to translate dynamic constructs in C# code into operations against a data store, which may or may not be a relational database.

The Adapter model

I recently refactored the internals of Simple.Data so that all store-related operations are provided by an implementation of a basic interface. An implementation which uses ADO.NET classes and generates SQL is included with the library, but it is possible to write your own adapters to persist data to anything you like. I plan to create an adapter for working with Windows Azure Table Storage.

We can provide this functionality because of the way data operations are specified when using Simple.Data. For example, to find a User given an email address and password, we can use this code:

var user = db.Users.FindByEmailAndPassword(email, password);

When the ADO adapter is used against a SQL Server database, this will be translated into a parameterised SQL statement:

select * from [Users] where [Email] = @p0 and [Password] = @p1

An alternate adapter, though, could turn the exact same call into anything it wanted. For example, against Azure, we could produce an OData-style REST query:

https://odatasource.net/Users?$filter=Email eq ‘foo’ and Password eq ‘bar’

OK, so that’s cool and fun and possibly moderately useful, right? If you were writing a system to be back-end agnostic, that might be handy. But it also means we can write a very, very basic adapter which works in-memory, just to satisfy data-store dependencies in testing.

XmlMockAdapter

The system I’m working on has a feature that will send a password reminder to people if they forget theirs. There is a controller action which takes an email address, looks for a matching user, and sends an email or displays a message accordingly. The controller talks to a Data Access Layer, which is in turn using Simple.Data to talk to a SQL Server database. I wanted some way that the controller could talk to the Data Access Layer, which would talk to Simple.Data, and get served some testing data without needing SQL Server to be present.

So, I threw together an implementation of IAdapter which took an XML string in its constructor, and translated Find operations into LINQ-to-XML queries. The meat of the implementation is this:

public IEnumerable> FindAll(string tableName, IDictionary criteria)
{
    var query = GetTableElement(tableName).Elements();
    foreach (var criterion in criteria)
    {
        var column = criterion.Key;
        var value = criterion.Value;
        query = query.Where(xe => xe.TryGetAttributeValue(column).Equals(value.ToString()));
    }

    return query.Select(e => e.AttributesToDictionary());
}

I also added a Mocking library to the solution, which provides access to some internals of the main assembly, and allows you to override the database connection process and inject your own database or adapter to be used instead. The code to do that looks like this:

MockHelper.UseMockAdapter(
     new XmlMockAdapter(
         @"<data><Users><User Email=""foo"" Password=""bar""/></Users></data>"));

Which is very simple, and straightforward, and most importantly is short enough to include within the code of the test itself. That one line of code, and when my Users DAL object goes to Simple.Data and says “give me my database”, instead of the SQL Server 2008 R2 Enterprise instance which the config tells it to connect to, it will just return this little object with its one XML element which is enough to satisfy the criteria we’re using for the test.

I think that’s pretty cool. You’re welcome to disagree in the comments.

What’s next?

OK, the main issue that’s outstanding is that at present the data access of which the library is capable is a tad limited; most importantly, it can’t do joins. And with all this great stuff that I can do with the abstraction, I don’t want to have to fall back to SQL to achieve something really pretty common like that. So, here’s what I’m thinking.

var activeCustomersInLondon = db.Customers
    .Join(db.CustomerAddresses, "CA")
    .Where(Active: "Y", CA_Town: "London")
    .All();

That’s the most succinct syntax I can think of. Any good? Any better ideas?

Half-truths and sleight of hand

Tweet by Brad Wilson:

“The Simple.Data blog post is built upon half-truths and sleight of hand. Microsoft.Data supports parameters. Your stuff is vuln to concat.”

Apology by me:

I didn’t know that Microsoft.Data supported parameters. Sorry.

And yes, Simple.Data is also vulnerable to SQL injection through concatenated SQL, as is anything that lets developers execute text SQL statements against a database.

However, what I’m trying to do with this project is explore other, non-text-based ways of interacting with a database that are not as complicated as full-blown ORMs.

Simple.Data with proper types

One of the comments on my previous post asked about working with “real types”. It was something I intended to try, and I had a couple of ideas about how it might work. But one of the issues is that any value returned from a method or property on a dynamic object is also dynamic. So to get, for example, a User object back, the developer would need to use a cast or “as” conversion.

While I was working on the DynamicTable class in the Simple.Data code, I noticed one of the available overrides from the DynamicObject subclass was “TryConvert”. It turns out that this method is called when an implicit or explicit cast is attempted on the object.

ExpandoObject, which is what was being returned by the Find methods, doesn’t implement this method, so I created my own ExpandoObject called DynamicRecord and added the TryConvert override. So now, using the Simple.Data.Database class, you can do this:

var db = Database.Open(); // Connection string in config.
User user = db.Users.FindByName("Bob");

And the dynamic object will be magically implicitly cast to the User type, with compile-time checking and IntelliSense intact.

I’ve only proved this with happy-path tests so far (exact matches for all properties), and obviously it’s going to rely on the database table having column names which match the type, but it’s a very neat solution and extremely new-developer-friendly.

Updated code at http://github.com/markrendle/Simple.Data

Introducing Simple.Data

Update – 06-Aug-2010

This post was inaccurate, in that it failed to acknowledge that Microsoft.Data supports parameters in its text querying. I have made changes to address this inaccuracy.

What is it?

It’s a very lightweight data access framework that I wouldn’t dream of calling an ORM. It uses the dynamic features of Microsoft .NET 4 to provide very expressive ways of retrieving data. It provides non-SQL ways of doing things, inspired by the Ruby ActiveRecord library. It also provides SQL ways of doing things, but with features to protect against SQL injection.

Why?

Because Microsoft have identified a gap in the data access toolkit ecosystem which they think scares casual web developers who are used to environments like PHP. These developers neither want nor need a full-blown ORM like NHibernate or Entity Framework, and they don’t want to deal with the complexity of ADO.NET connections, commands and readers.

Fair enough.

Microsoft have recently released a preview of a new library, Microsoft.Data.dll, which aims to serve these developers. It provides various simple ways of opening up a database, and then lets you run SQL against it, which is how casual developers are used to working with MySQL in the PHP world. They build up their SQL strings and then they execute them.

The problem is, that’s wrong. And I believe that attempting to attract people to your stack by giving them a really easy way to carry on doing things wrong is also wrong. The right thing to do is give them a way to do things right that is as easy as, if not easier than, what they have been using.

Simple.Data is my attempt at that.

Example?

Using Microsoft.Data, you would should query a Users table like this:

var db = Database.Open(); // Connection specified in config.
string sql = "select * from users where name = @0 and password = @1";
var user = db.Query(sql, name, password).FirstOrDefault();

But you could query it like this:

var db = Database.Open(); // Connection specified in config.
string sql = "select * from users where name = '" + name
+ "' and password = '" password + "'";
var user = db.Query(sql).FirstOrDefault();

A lot of people are quite cross about this. One of the main problems they have is that building query strings in this way opens your application up to SQL injection attacks. This is a common pattern in PHP applications written by new developers.

To achieve the same task with Simple.Data, you can do this:

var db = Database.Open(); // Connection specified in config.
var user = db.Users.FindByNameAndPassword(name, password);

That’s pretty neat, right? So, did we have to generate the Database class and a bunch of table classes to make this work?

No.

In this example, the type returned by Database.Open() is dynamic. It doesn’t have a Users property, but when that property is referenced on it, it returns a new instance of a DynamicTable type, again as dynamic. That instance doesn’t actually have a method called FindByNameAndPassword, but when it’s called, it sees “FindBy” at the start of the method, so it pulls apart the rest of the method name, combines it with the arguments, and builds an ADO.NET command which safely encapsulates the name and password values inside parameters. The FindBy* methods will only return one record; there are FindAllBy* methods which return result sets. This approach is used by the Ruby/Rails ActiveRecord library; Ruby’s metaprogramming nature encourages stuff like this.

[A section which wrongly implied that Microsoft.Data did not support parameterized queries has been removed here.]

More examples…

Inserting using named parameters:

db.Users.Insert(Name: "Steve", Password: "Secret", Age: 21);

Executing an update:

db.Execute("update Stock set Current = Current - 1 where ProductId = ?", productId);

(Planned) inserting/updating using an object:

db.Stock.Insert(newStockItem);
db.Stock.UpdateByProductId(stockItem);

Roadmap

This project is more of a constructive criticism of Microsoft’s preview than anything else. I’m going to develop it a bit further (for example, the “planned” syntax support above), mainly as an exercise. One thing I’d maybe like to explore further is whether it can be used as a layer over NoSQL stores as well as RDBMS.

The project is hosted at http://github.com/markrendle/Simple.Data for anybody who wants to download and play with it.

I’d really appreciate feedback, so please do use the comments if you’ve got any criticisms, suggestions or encouragement to express.

Follow

Get every new post delivered to your Inbox.