LIMITing Loaded Relationship Records While Avoiding N+1 in Laravel

13th July 2023

Read this first!

It was recently announced that a package called Eloquent Eager Limit created by Jonas Staudenmeir has been merged into Laravel 11. This means you will be able to do the following and get the expected result;

1User::query()
2 ->with([
3 'comments' => fn ($query) => $query->orderByDesc('rating')->limit(5),
4 ])
5 ->get();

So the methods described below are no longer necessary, but I'm going to leave this up as a reminder that we didn't always have things so easy!

Laravel provides an easy mechanism to eager-load related models when querying the database;

1User::query()
2 ->with('comments')
3 ->get();

This loads all users and their related comments with just two SQL queries and avoids any N+1 issues if we want to iterate over comments. For this article I'm going to assume you know what the N+1 issue is, so if you're not familiar you could do a quick Google search; it's very widely documented.

Sometimes however, we only want to load a sub-set of the related records. For example; what if we only wanted to load the 5 most highly rated comments by each user? We may try creating a new relationship on the User model;

1public function highest_rated_comments(): HasMany
2{
3 return $this->hasMany(Comment::class)
4 ->orderByDesc('rating')
5 ->limit(5);
6}

Or if we're only using it in this one place, we may just adjust how the existing relationship is loaded.

1User::query()
2 ->with([
3 'comments' => fn (HasMany $hasMany) => $hasMany->orderByDesc('rating')->limit(5),
4 ])
5 ->get();

Unfortunately, rather than limiting the number of comments loaded on a per-user basis, both these methods limit the total number of comments loaded. If we dump out the queries that are being performed then it becomes clear why;

1select * from `users`;
2 
3select * from `comments` where `comments`.`user_id` in (...) order by `rating` desc limit 5;

Depending on the relationship, if you're confident that no user will ever have very many comments, you could load all comments and then reduce the collection afterwards.

1$users = User::query()
2 ->with('comments')
3 ->get();
4 
5foreach ($users as $user) {
6 $user->setRelation('comments', $user->comments->sortByDesc('rating')->take(5));
7}

It's not ideal that we're having to iterate over all the users to do this, but at least we're minimising the number of SQL queries. But what about when that relationship gets BIG? If a user can have thousands of comments, we don't really want to load them all and then throw away all but the top 5. If you're in a hurry, you may just opt to load the comments for each user individually;

1$users = User::query()->get();
2 
3foreach($users as $user) {
4 $user->load([
5 'comments' => fn (HasMany $hasMany) => $hasMany->orderByDesc('rating')->limit(5),
6 ]);
7}

When the number of user comments gets too large this could work out being the faster way of doing things, but we've now introduced an N+1, where we're performing an extra query for each user. So how can we avoid having to load all comments and avoid introducing an N+1 issue at the same time? One possible solution is adding an extra select to our user query, which will give us the IDs of the comments we need to load;

1$limit = 5;
2 
3$users = User::query()
4 ->addSelect([
5 'highest_rated_comment_ids' => Comment::query()
6 ->selectRaw("SUBSTRING_INDEX(GROUP_CONCAT(comments.id order by rating desc, ','), ',', {$limit})")
7 ->whereColumn('comments.user_id', 'users.id')
8 ->limit($limit)
9 ])
10 ->get();

This adds a sub-select to our query that adds a property called highest_rated_comment_ids to our User model. The sub-query sorts the comments by rating in descending order, meaning the highest rated comments will come first. GROUP_CONCAT concatenates the IDs of the comments into a long string. Then SUBSTRING_INDEX returns everything in that string up until it has seen a delimiter (in our case ,) $limit times. So we end up with a comma separated string of the IDs of the five most highly-rated comments for each user. Something like this; "27,4,39,21,107". Here's how the SQL query looks;

1select
2 `users`.*,
3 (
4 select
5 SUBSTRING_INDEX(GROUP_CONCAT(comments.id order by rating desc, ','), ',', 5)
6 from
7 `comments`
8 where
9 `comments`.`user_id` = `users`.`id`
10 limit 5) as `highest_rated_comments`
11from
12 `users`

We now need to fetch the comment IDs from all the users and turn them into an array, so we can load all the necessary comments;

1$highestRatedCommentIds = $users->pluck('highest_rated_comment_ids')
2 ->map(function (string $commentIds) {
3 return explode(',', $commentIds);
4 })
5 ->flatten();
6 
7// $highestRatedCommentIds = [27, 4, 39, 21, 107, ...];
8 
9$highestRatedComments = $this->getQueryBuilder()
10 ->whereIn('id', $highestRatedCommentIds)
11 ->get();

We now have all the comments representing the 5 highest-rated for our users (i.e. 25 comments if we have 5 users). Finally, we map over the users and add the relevant comments as a relation;

1foreach ($users as $user) {
2 // Turn the highest_rated_comment_ids property from the user into an array
3 $userHighestRatedCommentIds = explode(',', $user->highest_rated_comment_ids);
4 
5 // Pull the comments that relate to this user out of the collection of all comments
6 // sortBy ensures they are added in the same order as highest_rated_comment_ids i.e. descending rating order
7 $userHighestRatedComments = $highestRatedComments
8 ->whereIn('id', $userHighestRatedCommentIds)
9 ->sortBy(fn (Comment $comment) => array_flip($userHighestRatedCommentIds)[$comment->id])
10 ->values();
11 
12 // Add the comments to the user
13 $user->setRelation('highest_rated_comments', $userHighestRatedComments);
14}

That's it! This solution still requires us to map over the users in order to set the relationship, but allows us to do it with only two queries.

Code highlighting powered by Torchlight