How to compose and refactor Ecto queries with Queries Modules

How to compose and refactor Ecto queries with Queries Modules

For a while, I’ve been looking for a perfect way to construct Ecto Queries. At Curiosum we came through a couple of iterations and eventually ended up with a solution that you may call resource-based query modules.

Table of contents

    Background

    Here is a typical scenario for creating a resource and its logic.

    There is a resource file, for example, user.ex (for User resource) that has only two responsibilities:

    • To define User struct,
    • To define User changeset.

    Nothing else, just these two things. The business logic for this resource should be defined in its context file, for example, accounts.ex (for Accounts context).

    The most common scenario I see across Phoenix apps is to put resource queries in files that hold resource business logic – in this case in accounts.ex. This solution has at least two disadvantages:

    • The Accounts context file may grow to a very huge size. Queries tend to be pretty long, as well as the business logic itself.
    • It’s not a great idea to mix DB queries with non-DB-related logic.

    With these two things in mind, we decided to extract resource queries into a file that has only one responsibility – to construct Ecto.Query for a specific resource.

    Folder structure

    To make this concept more feasible let’s create a sample app:

    mix phx.new queries

    … and three resources:

    Accounts.User

    mix phx.gen.schema Accounts.User accounts_users email:string

    Blog.Post

    mix phx.gen.schema Blog.Post blog_posts content:text user_id:references:accounts_users

    Blog.Comment

    mix phx.gen.schema Blog.Comment blog_comments content:text post_id:references:blog_posts

    As you can see, it’s a pretty easy and simplified structure – User has many posts and posts have many comments.

    Queries module

    Let's imagine that the Accounts context looks like this:

    defmodule Queries.Accounts do
      import Ecto.Query
    
      alias Queries.Accounts.User
    
      def list_users do
        User
        |> Repo.all()
      end
    
      def list_users_having_posts do
        User
        |> join(:left, [user], _ in assoc(user, :posts))
        |> where([_, posts], not is_nil(posts.id))
        |> Repo.all()
      end
    
      def find_user_by_id(user_id) do
        User
        |> where([user], user.id == ^user_id)
        |> Repo.one()
      end
    
      def user_has_posts?(user_id) do
        User
        |> join(:left, [user], _ in assoc(user, :posts))
        |> where([user, _], user.id == ^user_id)
        |> where([_, posts], not is_nil(posts.id))
        |> Repo.exists?()
      end
    end

    Currently, the Accounts context's main responsibility is to compose Ecto.Query and pipe it to Repo. We can do better here. We can design an approach that will totally separate context from Ecto.Query and deliver a module that is composable and reusable.

    Let's create a file under this path: lib/accounts/user/user_queries.ex. Notice, that I created a user folder here so that the user_queries file is scoped to the User resource. (In Curiosum, we also put other files in this folder, like user_validations, and user_specs, but this is a topic for another blog post).

    The next step is to put our Ecto-related logic from context to the UserQueries module:

    defmodule Queries.Accounts.UserQueries do
      import Ecto.Query
    
      alias Queries.Accounts.User
    
      def all(), do: User
    
      def with_id(id) do
        User
        |> where([user], user.id == ^id)
      end
    
      def with_existing_posts() do
        User
        |> join(:left, [user], _ in assoc(user, :posts))
        |> where([_, posts], not is_nil(posts.id))
      end
    end

    ...and update Accounts context:

    defmodule Queries.Accounts do
      alias Queries.Accounts.UserQueries
      alias Queries.Repo
    
      def list_users do
        UserQueries.all()
        |> Repo.all()
      end
    
      def list_users_having_posts do
        UserQueries.with_existing_posts()
        |> Repo.all()
      end
    
      def find_user_by_id(user_id) do
        UserQueries.with_id(user_id)
        |> Repo.one()
      end
    
      def user_has_posts?(user_id) do
        UserQueries.with_existing_posts()
        # now what?
      end
    end

    So far, we've been able to reduce Accounts responsibility in terms of Ecto.Query generation process to simple and effective delegation. Sounds great! But why is the last function not complete? This is what I wish I could do in this case:

    def user_has_posts?(user_id) do
      UserQueries.with_existing_posts()
      |> UserQueries.with_id(user_id)
      |> Repo.exists?()
    end

    ... but it's not possible, as neither UserQueries.with_existing_posts/0 nor UserQueries.with_id/1 accepts query as an argument. Let's change that.

    Composable and Reusable Ecto Queries

    To make our functions composable, we need to allow them to receive a query that will be then extended with other conditions:

    defmodule Queries.Accounts.UserQueries do
      ...
    
      def all(query \\ base()), do: query
    
      def with_id(query \\ base(), id) do
        query
        |> where([user], user.id == ^id)
      end
    
      def with_existing_posts(query \\ base()) do
        query
        |> join(:left, [user], _ in assoc(user, :posts))
        |> where([_, posts], not is_nil(posts.id))
      end
    
      defp base, do: User
    end

    There is one important function added here - base. Its goal is to provide... well, the base for all queries in a file which in most cases will be the resource module. As you can see I also set the default value for the query argument in all of the functions. Thanks to that you don't have to start the construction of queries pipe with UserQueries.base().

    With these changes, we're now able to finish the implementation of the Accounts.user_has_posts? function:

    defmodule Queries.Accounts do
      ...
    
      def user_has_posts?(user_id) do
        UserQueries.with_existing_posts()
        |> UserQueries.with_id(user_id)
        |> Repo.exists?()
      end
    end

    Notice one cool thing here. We have now two functions in Accounts context file that use UserQueries.with_existing_posts/1. This makes it reusable which is a huge benefit.

    Dealing with multiple joins

    Our Queries module is still not perfect, and you'll soon find out why.

    In our app there is also a Comment resource. We might want to create a function that will return query able to filter users that have posts with comments. Here it is:

    def with_existing_comments(query \\ base()) do
      query
      |> join(:left, [user], _ in assoc(user, :posts))
      |> join(:left, [_, posts], _ in assoc(posts, :comments))
      |> where([_, _, comments], not is_nil(comments.id))
    end

    In Ecto, the order of joins matters. It means that in this case, you can do the following thing:

    |> where([_, _, comments], not is_nil(comments.id))

    but you can't do this:

    |> where([_, comments, _], not is_nil(comments.id))

    Comments were joined after posts, so you can pattern-match this resource only as the third element. Unless... you use named bindings.

    Here is a version of the with_existing_comments function that takes advantage of bindings:

    defmodule Queries.Accounts.UserQueries do
      ...
    
      def with_existing_comments(query \\ base()) do
        query
        |> join(:left, [user: user], _ in assoc(user, :posts), as: :posts)
        |> join(:left, [posts: posts], _ in assoc(posts, :comments), as: :comments)
        |> where([comments: comments], not is_nil(comments.id))
      end
    
      defp base do
        from(_ in User, as: :user)
      end
    end

    With bindings, the order of joins doesn't matter as you can simply extract resource with a pattern-matching-like syntax. Is this error proof version of joins? Not really, here is what you get when you pipe with_existing_comments to with_existing_comments:

    iex(20)> UserQueries.with_existing_comments() |> UserQueries.with_existing_comments()
    ** (Ecto.Query.CompileError) alias `:posts` already exists
        (queries 0.1.0) iex:21: Queries.Accounts.UserQueries.with_existing_comments/1

    It basically means that you can't define a binding with the same name twice - which is the case here. We can do better here as well. We can create a with_join function that will pattern match a specific resource, and join it to existing query with binding only if it doesn't exist yet:

    defmodule Queries.Accounts.UserQueries do
      ...
    
      defp with_join(query, :posts) do
        if has_named_binding?(query, :posts) do
          query
        else
          query
          |> join(:left, [user: user], _ in assoc(user, :posts), as: :posts)
        end
      end
    
      defp with_join(query, :comments) do
        if has_named_binding?(query, :comments) do
          query
        else
          query
          |> with_join(:posts)
          |> join(:left, [posts: posts], _ in assoc(posts, :comments), as: :comments)
        end
      end
    end

    The has_named_binding?/2 (documentation) function checks if query contains a given binding. Notice that we also used with_join(query, :posts) to assure that with_join(query, :comments) can access it:

    query
    |> with_join(:posts)
    |> join(:left, [posts: posts], _ in assoc(posts, :comments), as: :comments)

    It's just another great reusability thing we get here.

    The updated with_existing_comments/1 function looks like this:

    defmodule Queries.Accounts.UserQueries do
      ...
    
      def with_existing_comments(query \\ base()) do
        query
        |> with_join(:comments)
        |> where([comments: comments], not is_nil(comments.id))
      end
    end

    Clean and simple. Now when you pipe with_existing_comments/1 to with_existing_comments/1 it won't raise an error:

    iex(22)> UserQueries.with_existing_comments() |> UserQueries.with_existing_comments()     
    #Ecto.Query<from u0 in Queries.Accounts.User, as: :user,
     left_join: p1 in assoc(u0, :posts), as: :posts,
     left_join: c2 in assoc(p1, :comments), as: :comments,
     where: not(is_nil(c2.id)), where: not(is_nil(c2.id))>

    The final versions of the Accounts context module and UserQueries module look like this:

    Accounts

    defmodule Queries.Accounts do
      alias Queries.Repo
      alias Queries.Accounts.UserQueries
    
      def list_users do
        UserQueries.all()
        |> Repo.all()
      end
    
      def list_users_having_posts do
        UserQueries.with_existing_posts()
        |> Repo.all()
      end
    
      def list_users_having_posts_with_comments do
        UserQueries.with_existing_comments()
        |> Repo.all()
      end
    
      def find_user_by_id(user_id) do
        UserQueries.with_id(user_id)
        |> Repo.all()
      end
    
      def user_has_posts?(user_id) do
        UserQueries.with_id(user_id)
        |> UserQueries.with_existing_posts()
        |> Repo.exists?()
      end
    end

    UserQueries

    defmodule Queries.Accounts.UserQueries do
      import Ecto.Query
    
      alias Queries.Accounts.User
    
      def all(query \\ base()), do: query
    
      def with_id(query \\ base(), id) do
        query
        |> where([user: user], user.id == ^id)
      end
    
      def with_existing_posts(query \\ base()) do
        query
        |> with_join(:posts)
        |> where([posts: posts], not is_nil(posts.id))
      end
    
      def with_existing_comments(query \\ base()) do
        query
        |> with_join(:comments)
        |> where([comments: comments], not is_nil(comments.id))
      end
    
      defp with_join(query, :posts) do
        if has_named_binding?(query, :posts) do
          query
        else
          query
          |> join(:left, [user: user], _ in assoc(user, :posts), as: :posts)
        end
      end
    
      defp with_join(query, :comments) do
        if has_named_binding?(query, :comments) do
          query
        else
          query
          |> with_join(:posts)
          |> join(:left, [posts: posts], _ in assoc(posts, :comments), as: :comments)
        end
      end
    
      defp base do
        from(_ in User, as: :user)
      end
    end

    Summary

    I believe that resource-based query modules are yet another step towards clean and maintainable code in Elixir & Ecto. Being able to reduce context file length and delegate all queries-related logic to another module while improving readability at the same time is a huge benefit.

    One last thing that I want to add here is that I believe you should create query functions with as little responsibility as possible. This way you'll follow the path for well-defined composable functions.

    What's your approach for building queries? Do you follow a similar approach? Can you suggest yet another improvement to this process? Or maybe you find this solution wrong - I'd love to hear it! ;)

    FAQ

    Why Separate Ecto Queries from Context Files in Elixir?

    Separating Ecto queries from context files keeps the code organized, prevents files from becoming too large, and clearly distinguishes database queries from business logic.

    What is the Benefit of Using Query Modules in Elixir?

    Using query modules provides clear organization, enhances readability, and improves reusability of Ecto queries within Elixir projects.

    How Do You Create Composable Queries in Elixir?

    Create composable queries by defining functions that accept and return query structs, allowing them to be chained together for building complex queries.

    What is the Role of the Base Function in Elixir Query Modules?

    The base function establishes a starting point for queries, typically representing the main resource, to ensure queries are built with a consistent foundation.

    How Can You Handle Multiple Joins in Ecto Queries?

    Handle multiple joins by creating functions that conditionally apply joins only if they haven't been applied already, maintaining query efficiency and preventing errors.

    What are the Advantages of Named Bindings in Ecto Queries?

    Named bindings in Ecto queries allow for clearer and more flexible query construction, especially when dealing with multiple joins and associations.

    How Can Composable Queries Improve Code Maintainability in Elixir?

    Composable queries improve code maintainability by allowing developers to construct modular, clear, and reusable query components, reducing duplication and enhancing readability.

    Szymon Soppa Web Developer
    Szymon Soppa Curiosum Founder & CEO

    Read more
    on #curiosum blog