|
Understanding Cursor Pagination and Why It's So Fast (Deep Dive)
Read on: my website / Read time: 11 minutes
|
|
|
|
The .NET Weekly is brought to you by:
Build better with AWS, using tips and tools from the Developer Center:
- Get hands-on with DevOps, Data & ML, and Generative AI
- Use any language, level up your skills
- Connect with like-minded devs all over the world on AWS Communities
The AWS Developer Center has everything you need in one place. Don't miss out!
|
|
|
|
Pagination is crucial for efficiently handling large datasets. While offset pagination is widely used and gets the job done, cursor-based pagination offers some interesting advantages for certain scenarios.
It's particularly valuable for real-time feeds, infinite scroll interfaces, and APIs where performance at scale matters - like social media timelines, activity logs, or event streams where users frequently page through large datasets.
Let's explore both approaches using a simple UserNotes table and see how they perform with a million records.
We'll look at the implementation details, compare query performance, and discuss where each approach makes the most sense.
I've included real execution plans from PostgreSQL to demonstrate the significant performance differences between these approaches.
|
|
|
Database Schema
I created a simple table to demonstrate pagination techniques. The table is seeded with 1,000,000 records for testing purposes, which should be enough to show the performance difference between offset and cursor pagination.
We'll use the following SQL schema for the examples:
CREATE TABLE user_notes (
id uuid NOT NULL,
user_id uuid NOT NULL,
note character varying(500),
date date NOT NULL,
CONSTRAINT pk_user_notes PRIMARY KEY (id)
);
And here's the C# class representing the UserNote entity:
public class UserNote
{
public Guid Id { get; set; }
public Guid UserId { get; set; }
public string? Note { get; set; }
public DateOnly Date { get; set; }
}
I will use PostgreSQL as the database, but the concepts also apply to other databases.
|
|
|
Offset Pagination: The Traditional Approach
Offset pagination uses Skip and Take operations. We skip a certain number of rows and take a fixed number of rows. These usually translate to OFFSET and LIMIT in SQL queries.
Here's an example of offset pagination in ASP .NET Core:
Note that I'm sorting the results by Date and Id in descending order. This ensures consistent results when paginating.
Here's the generated SQL for offset pagination:
SELECT count(*)::int FROM user_notes AS u;
SELECT u.id, u.date, u.note, u.user_id
FROM user_notes AS u
ORDER BY u.date DESC, u.id DESC
LIMIT @pageSize OFFSET @offset;
Limitations of Offset Pagination:
- Performance degrades as offset increases because the database must scan and discard all rows before the offset
- Risk of missing or duplicating items when data changes between pages
- Inconsistent results with concurrent updates
|
|
|
Cursor-Based Pagination: A Faster Approach
Cursor pagination uses a reference point (cursor) to fetch the next set of results. This reference point is typically a unique identifier or a combination of fields that define the sort order.
I'll use the Date and Id fields to create a cursor for our UserNotes table. The cursor is a composite of these two fields, allowing us to paginate efficiently.
Here's an example of cursor pagination in ASP .NET Core:
The sort order is the same as in the offset pagination example. However, the sort order is critical for consistent results with cursor pagination. Because the Date isn't a unique value in our table, we use the Id field to handle ties. This ensures that we don't miss or duplicate items when paginating.
Here's the generated SQL for cursor pagination:
SELECT u.id, u.date, u.note, u.user_id
FROM user_notes AS u
WHERE u.date < @date OR (u.date = @date AND u.id <= @lastId)
ORDER BY u.date DESC, u.id DESC
LIMIT @limit;
Note that there's no OFFSET in the query. We're directly seeking the rows based on the cursor, which is more efficient than offset pagination.
The COUNT query is omitted in cursor pagination because we're not counting the total number of items. This can be a limitation if you need to display the total number of pages upfront. However, the performance benefits of cursor pagination often outweigh this limitation.
Limitations of Cursor Pagination:
- If users need to change sort fields dynamically, cursor pagination becomes significantly more complicated since the cursor must incorporate all sort conditions
- Users can't jump to a specific page number - they must traverse sequentially through the pages
- More complex to implement correctly compared to offset pagination, especially when handling ties and ensuring stable ordering
|
|
|
Examining the SQL Execution Plans
I wanted to compare the execution plans for offset and cursor pagination. I used the EXPLAIN ANALYZE command in PostgreSQL to see the query plans.
Here's the offset pagination query:
SELECT u.id, u.date, u.note, u.user_id
FROM user_notes AS u
ORDER BY u.date DESC, u.id DESC
LIMIT 1000 OFFSET 900000;
I'm intentionally skipping 900,000 rows to exaggerate the performance impact. After that, we fetch the next 1,000 rows.
Here's the query plan for offset pagination:
The total execution time is 704.217 ms for offset pagination.
Here's the query returning the same set of rows using cursor pagination. I had to hardcode the @date and @lastId values for this comparison:
SELECT u.id, u.date, u.note, u.user_id
FROM user_notes AS u
WHERE u.date < @date OR (u.date = @date AND u.id <= @lastId)
ORDER BY u.date DESC, u.id DESC
LIMIT 1000;
Finally, here's the query plan for cursor pagination:
The total execution time for cursor pagination is 40.993 ms .
A whopping 17x performance improvement with cursor pagination compared to offset pagination!
The performance with cursor pagination is consistent regardless of the page depth. This is because we're directly seeking the rows based on the cursor, which is more efficient than offset pagination. It's a huge advantage over offset pagination, especially for large datasets.
|
|
|
Adding Indexes for Cursor Pagination
I also tested the impact of indexes on cursor pagination. I created a composite index on the Date and Id fields to speed up the queries. Or so I thought...
Here's the SQL command to create the composite index:
CREATE INDEX idx_user_notes_date_id ON user_notes (date DESC, id DESC);
The index is created in descending order to match the sort order in our queries.
Let's see the query plan for cursor pagination with the composite index:
We have an Index Scan using the composite index. However, the execution time is 298.955 ms , which is slower than the previous query without the index.
This might be because the dataset is too small to benefit from the index. I have only 1,000,000 records in the table, which might not be enough to see the performance improvement with the index.
But wait, there's more to it!
What if we were to use a tuple comparison in SQL?
Finally, the index is working. The execution time is 0.668 ms , which is significantly faster than the previous queries.
The query optimizer cannot determine whether the composite index can be used for row-level comparison. However, the index is effectively used with a tuple comparison.
How do you translate this to EF Core?
The Postgres provider has EF.Functions.LessThanOrEqual , which accepts a ValueTuple as an argument. We can use it to produce a (u ․date, u ․id) <= (@date, @lastId) comparison in the query. And this will utilize the composite index.
query = query.Where(x => EF.Functions.LessThanOrEqual(
ValueTuple.Create(x.Date, x.Id),
ValueTuple.Create(cursor, lastId)));
|
|
|
Encoding the Cursor
Here's a small utility class for encoding and decoding the cursor. We'll use this to encode the cursor in the URL and decode it when fetching the next set of results.
The clients will receive the cursor as a Base64-encoded string. They don't need to know the internal structure of the cursor.
Here's an example of encoding and decoding the cursor:
|
|
|
Summary
While offset pagination is simpler to implement, it suffers from significant performance degradation at scale. My tests showed a 17x slowdown compared to cursor pagination when accessing deeper pages.
Cursor pagination maintains consistent performance regardless of page depth and works particularly well for real-time feeds and infinite scroll interfaces.
However, cursor pagination comes with tradeoffs. It requires careful implementation, especially around cursor encoding and handling sort orders. It also doesn't provide total page counts, making it unsuitable for interfaces that need to support paged navigation.
The choice between these approaches ultimately depends on your use case:
- Choose cursor pagination for performance-critical APIs, real-time feeds, infinite scroll, or any scenario where users frequently access deep pages
- Stick with offset pagination for admin interfaces, small datasets, or when you need upfront page counts
Remember to use tuple comparisons and appropriate indexes to get the best performance from cursor pagination.
That's all for today.
See you next week.
|
|
|
Whenever you're ready, there are 3 ways I can help you:
|
|
Pragmatic REST APIs: You will learn how to build production-ready REST APIs using the latest ASP .NET Core features and best practices. Join the waitlist to get a special launch discount. Coming out next in a few days! |
|
|
Pragmatic Clean Architecture: This comprehensive course will teach you the system I use to ship production-ready applications using Clean Architecture. Learn how to apply the best practices of modern software architecture. Join 3,900+ engineers |
|
|
|
You received this email because you subscribed to our list. You can unsubscribe at any time.
Update your profile | Dragiše Cvetkovića 2, Niš, - 18000
|
|
|
|
|