Improving the Development and Production Time Experience with Marten V5

Marten V5 dropped last week, with significant new features for multi-tenancy scenarios and enabling users to use multiple Marten document stores in one .Net application. A big chunk of the V5 work was mostly behind the scenes trying to address user feedback from the much larger V4 release late last year. As always, the Marten documentation is here.

First, why didn’t you just…

I’d advise developers and architects to largely eliminate the word “just” and any other lullaby language from their vocabulary when talking about technical problems and solutions.

That being said:

  • Why didn’t you just use source generators instead? Most of this was done before source generators were released, and source generators are limited to information that’s available at compile time. The dynamic code generation in Marten is potentially using information that is only available at runtime.
  • Why didn’t you just use IL generation instead? Because I despise working directly with IL and I think that would have dramatically curtailed what was easily possible. It’s also possible that we end up having to go there eventually.

Setting the Stage

Consider this simplistic code to start a new Marten DocumentStore against a blank database and persist a single User document:

var store = DocumentStore.For("connection string");

await using var session = store.LightweightSession();
var user = new User
{
    UserName = "pmahomes", 
    FirstName = "Patrick", 
    LastName = "Mahomes"
};

session.Store(user);
await session.SaveChangesAsync();

Hopefully that code is simple enough for new users to follow and immediately start being productive with Marten. The major advantage of document databases over the more traditional RDBMS, with or without an ORM, is the ability to get stuff done without spending a lot of time configuring databases or object-to-database mappings, and without writing anywhere near as much underlying code to read and write data. To that end, there’s a lot of stuff going on behind the scenes of that code up above.

First off, there’s some automatic database schema management. In the default configuration used up above, Marten is quietly checking the underlying database on the first usage of the User document type to see if the database matches Marten’s configuration for the User document, and applies database migrations at runtime to change the database as necessary.

Secondly, there’s some runtime code generation happening to “bake in” the internal handling of how User documents are read from and written to the database. It’s not apparent here, but there are a lot of knobs you can twist in Marten to change how a document type is stored in and retrieved from the database (soft deletes, turning on more metadata tracking, turning off default metadata tracking to be leaner, etc.). That behavior even varies between the lightweight session I used up above and IDocumentStore.OpenSession(), which adds identity map behavior to the session. To be more efficient overall, Marten generates the tightest possible C# code to handle each document type, then in the default mode actually compiles that code in memory with Roslyn and uses the dynamically built assembly.
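
To make a couple of those knobs concrete, here is a minimal sketch of per-document configuration next to the two session flavors mentioned above. It assumes the SoftDeleted() and Metadata() options on opts.Schema.For<T>() as documented for recent Marten versions, and the Id property on User is my assumption, since this post never shows that class in full.

```csharp
using System;
using Marten;

var store = DocumentStore.For(opts =>
{
    opts.Connection("connection string");

    // Mark User documents as deleted instead of physically removing the rows
    opts.Schema.For<User>().SoftDeleted();

    // Opt into extra metadata tracking for this one document type
    opts.Schema.For<User>().Metadata(m =>
    {
        m.CorrelationId.Enabled = true;
    });
});

// No identity map, the leanest option
await using var lightweight = store.LightweightSession();

// Adds identity map behavior, so the generated storage code differs
await using var identityMapped = store.OpenSession();

// Assumed shape of the User document used throughout this post
public class User
{
    public Guid Id { get; set; }
    public string UserName { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
}
```

Each combination of options and session type is a different code path, which is why Marten generates the handling code per document type rather than branching at runtime.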

Cool, right? I’d argue that Marten can make teams be far more productive than they would be with the more typical EF Core or Dapper backed approach. Now let’s move on to the unfortunately very real downsides of Marten’s approach and what we’ve done to improve matters:

  • The dynamic Roslyn code generation can sometimes incur a major “cold start” issue on the very first usage. It’s definitely not consistent, as some people do not see any noticeable impact and other folks tell me they get a 9 second delay on the first usage. This cold start issue is especially problematic for folks using Marten in a Serverless architecture.
  • The dynamically generated code can’t be used for any kind of potentially valuable AOT optimization.
  • Roslyn usage sometimes causes a big ol’ memory leak no matter what we try. This isn’t consistent, so I don’t know why.
  • The database change tracking does do some in-memory locking, and that’s been prone to deadlock issues in some flavors of .Net (Blazor, WPF).
  • Some of you won’t want to give your application rights to modify a database at runtime.
  • In Marten V4 there were a few too many places where Marten was executing the database change detection asynchronously, but from within synchronous calls using the dreaded .GetAwaiter().GetResult() approach. Occasional deadlock issues occurred, mostly in Marten usage within Blazor.

Database Migration Improvements

Alright, let’s tackle the database migration issues first. Marten has long had some command line support so that you could detect and apply any outstanding database changes from your application itself with this call:

dotnet run -- marten-apply

If you use the command line tooling for migrations, you can now optimize Marten to just turn off all runtime database migrations like so:

using var host = await Host.CreateDefaultBuilder()
    .ConfigureServices(services =>
    {
        services
            .AddMarten(opts =>
            {
                opts.Connection("connection string");
                opts.AutoCreateSchemaObjects = AutoCreate.None;
            });
    }).StartAsync();

Other folks won’t want to use the command line tooling, so Marten V5 adds another option: apply all database migrations once at application startup, and otherwise completely eliminate all other potential locking. This time, though, I have to use the IHost integration:

using var host = await Host.CreateDefaultBuilder()
    .ConfigureServices(services =>
    {
        services
            .AddMarten(opts =>
            {
                opts.Connection("connection string");
                
                // Mild compromise, now I've got to tell
                // Marten about the User document
                opts.RegisterDocumentType<User>();
            })

            // This tells the app to do all database migrations
            // at application startup time
            .ApplyAllDatabaseChangesOnStartup();
    }).StartAsync();

In case you’re wondering, this option is safe to use even if you have multiple application nodes starting up simultaneously. The V5 version here relies on global locks in PostgreSQL itself to prevent the simultaneous database changes that previously resulted in interestingly chaotic failures :(

Pre-building the Generated Types

Now, onto dealing with the dynamic codegen aspect of things. V4 created a “build types ahead” model where you can generate all the dynamic code with this command line call:

dotnet run -- codegen write

You can now completely dodge the runtime code generation issue by this sequence of events:

  1. In your deployment scripts, run dotnet run -- codegen write first
  2. Compile your application, which will embed the newly generated code right into your application’s entry assembly
  3. Use the below setting to completely disable all dynamic codegen:

using var host = await Host.CreateDefaultBuilder()
    .ConfigureServices(services =>
    {
        services
            .AddMarten(opts =>
            {
                opts.Connection("connection string");

                // Turn off all dynamic code generation, but this
                // will blow up if the necessary type isn't compiled
                // into the application's entry assembly
                opts.GeneratedCodeMode = TypeLoadMode.Static;
            });
    }).StartAsync();

Again though, this depends on you having all document types registered with Marten instead of depending on runtime discovery as we did in the very first sample in this post, and that’s a bit of friction. Folks have found the original pre-built generation model to be clumsy, so we went back to the drawing board for Marten V5 and came up with the…

“Auto” Generated Code Mode

For V5, we have the option shown below:

using var host = await Host.CreateDefaultBuilder()
    .ConfigureServices(services =>
    {
        services
            .AddMarten(opts =>
            {
                opts.Connection("connection string");

                // use pre-built code if it exists, or
                // generate code if it doesn't and "just work"
                opts.GeneratedCodeMode = TypeLoadMode.Auto;
            });
    }).StartAsync();

My thinking here is that you’d just keep this on all the time, and as long as you’re running the application locally or through your integration test suite (you have one of those, right?), you’d have the dynamic types written to your main project’s code automatically (in an /Internal/Generated folder). Unless you purposely add those to your source control’s ignore list, that code will also be checked in. Woohoo, right?

Now, finally let’s put this all together and bundle all of what I would recommend as Marten best practices into the new…

Optimized Artifact Workflow

New in Marten V5 is what I named the “optimized artifact workflow” (I say “I” because I don’t think other folks like the name:)) as shown below:

using var host = await Host.CreateDefaultBuilder()
    .ConfigureServices(services =>
    {
        services
            .AddMarten(opts =>
            {
                opts.Connection("connection string");
            })
            // This is the call you want!
            .OptimizeArtifactWorkflow(TypeLoadMode.Static)
            .ApplyAllDatabaseChangesOnStartup();
    })
    
    // In testing harnesses, or with AWS Lambda / Azure Functions,
    // you may have to help out .Net by explicitly setting
    // the main application assembly
    .UseApplicationProject(typeof(User).Assembly)
    
    .StartAsync();

With the OptimizeArtifactWorkflow(TypeLoadMode.Static) usage above, Marten is running with automatic database management and “Auto” code generation if the host’s environment name is “Development”, as it would typically be on a local developer box. In “Production” mode, Marten is running with all automatic database management disabled at runtime besides the initial database change application at startup. In “Production” mode, Marten is also turning off all dynamic code generation with the assumption that all necessary types can be found in the entry assembly.

The goal here was to have a quick setting that optimized Marten usage in both development and production time without having to add in a bunch of nested conditional logic for IHostEnvironment.IsDevelopment() throughout the IHost configuration code.
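
For comparison, here is a rough sketch of the kind of hand-rolled conditional setup that setting is meant to replace. It only approximates the behavior described above and is not Marten’s actual implementation; the namespaces for AutoCreate and TypeLoadMode are assumed to be Weasel.Core and LamarCodeGeneration as of V5, so adjust them to whatever your Marten version uses.

```csharp
using LamarCodeGeneration;   // assumed home of TypeLoadMode in V5
using Marten;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;
using Weasel.Core;           // assumed home of AutoCreate in V5

using var host = await Host.CreateDefaultBuilder()
    .ConfigureServices((context, services) =>
    {
        services
            .AddMarten(opts =>
            {
                opts.Connection("connection string");

                if (context.HostingEnvironment.IsDevelopment())
                {
                    // Local development: let Marten migrate the database
                    // on the fly and write generated code as needed
                    opts.AutoCreateSchemaObjects = AutoCreate.CreateOrUpdate;
                    opts.GeneratedCodeMode = TypeLoadMode.Auto;
                }
                else
                {
                    // Everywhere else: no runtime migrations and
                    // no dynamic code generation
                    opts.AutoCreateSchemaObjects = AutoCreate.None;
                    opts.GeneratedCodeMode = TypeLoadMode.Static;
                }
            })

            // Still apply outstanding database changes once at startup
            .ApplyAllDatabaseChangesOnStartup();
    }).StartAsync();
```

Multiply that if/else by every Marten-related setting that differs between environments and the appeal of a single call becomes clearer.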

Exterminating Sync over Async Calls

Back to the very original sample code:

var store = DocumentStore.For("connection string");

await using var session = store.LightweightSession();
var user = new User
{
    UserName = "pmahomes", 
    FirstName = "Patrick", 
    LastName = "Mahomes"
};

session.Store(user);
await session.SaveChangesAsync();

In Marten V4, the first call to session.Store(user) would trigger the database schema detection, which behind the scenes would end up doing a .GetAwaiter().GetResult() trick to call asynchronous code within the synchronous Store() command (not gonna get into that here, but we eliminated all synchronous database schema detection functionality for unrelated reasons in V4).

In V5, we rewired a lot of the internal guts such that the database schema detection happens instead in the call to IDocumentSession.SaveChangesAsync(), which is, of course, asynchronous. That allowed us to eliminate usages of “sync over async” calls. Likewise, we made similar changes throughout other areas of Marten.
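
To illustrate the shape of that change, here is a simplified sketch using made-up stand-in types. None of these classes exist in Marten (SchemaChecker, BlockingSession, and DeferredSession are hypothetical names), and the real internals are far more involved; the point is only where the asynchronous schema check gets triggered.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for the database schema detection
public class SchemaChecker
{
    private readonly HashSet<Type> _checked = new();

    public async Task EnsureStorageExistsAsync(Type documentType)
    {
        if (_checked.Contains(documentType)) return;

        await Task.Delay(10); // pretend to query the database here
        _checked.Add(documentType);
    }
}

// V4-style shape: the synchronous Store() blocks on asynchronous work
public class BlockingSession
{
    private readonly SchemaChecker _checker;
    private readonly List<object> _pending = new();

    public BlockingSession(SchemaChecker checker) => _checker = checker;

    public void Store(object document)
    {
        // "Sync over async": this can deadlock when the awaited continuation
        // needs the very thread that this call is blocking, as can happen
        // with a UI SynchronizationContext
        _checker.EnsureStorageExistsAsync(document.GetType())
            .GetAwaiter().GetResult();

        _pending.Add(document);
    }
}

// V5-style shape: Store() only queues, the check runs inside async code
public class DeferredSession
{
    private readonly SchemaChecker _checker;
    private readonly List<object> _pending = new();

    public DeferredSession(SchemaChecker checker) => _checker = checker;

    public void Store(object document) => _pending.Add(document);

    public async Task SaveChangesAsync()
    {
        // Check each distinct document type once, awaiting all the way down
        foreach (var type in _pending.Select(x => x.GetType()).Distinct())
        {
            await _checker.EnsureStorageExistsAsync(type);
        }

        // ...the pending documents would be written to the database here...
        _pending.Clear();
    }
}
```

Deferring the check costs nothing functionally, because nothing touches the database until SaveChangesAsync() runs anyway.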

Summary

The hope here is that we can make our users more successful with Marten, and sidestep the problems our users have had specifically with using Marten with AWS Lambda, Azure Functions, Blazor, and inside of WPF applications. I’m also hoping that the OptimizeArtifactWorkflow() usage greatly simplifies the usage of Marten “best practices.”

One thought on “Improving the Development and Production Time Experience with Marten V5”

  1. This is really nice!

    When using `TypeLoadMode.Static`, is there any way to specify which assembly pre-generated types should be loaded from?

    My use case is a shared database that has multiple web apps using it – I’d really like to generate the code once in a shared project, and commit it to git, rather than having to generate it for each web app.
