Back

Introduction to API Pagination

Introduction to API Pagination

image alt text

When building an app, you’ll likely have a list of elements stored in a database, and you want to show to users. for example, a list of restaurants in a food ordering app, a list of products on an e-commerce app, messages on messaging app, or even tweets on Twitter and posts on Facebook.

Your DB may have hundreds or thousands or even millions of elements, it won’t be a good idea (regarding performance and UI/UX) to show all of the elements to the user in a single view, the answer to this problem is using pagination.

What is pagination?

The goal is to return a subset of results with accessibility to all data. so the list is treated like a book, which is divided into a set of pages, and the user reads page by page, you can’t read the whole book at once, right?

Why use it?

  • User experience: Users can only consume a limited number of elements at once.
  • Performance: Returning less data is faster and cheaper.

How?

These are two famous mechanisms that are widely used:

  • Offset Pagination
  • Cursor Pagination

Offset Pagination

Back-end will perform a Database query with offset value like this one:

SELECT * FROM CARS WHERE MODEL='audi' LIMIT 5 OFFSET 100;

image alt text

The idea is to divide the list into pages, each page will have n elements, and this is achieved using limit and offset where:

  • limit: is the number of elements to be returned
  • offset: is the number of elements to skip before selecting elements to return.

we also have page number which will equal to (offset/limit) + 1

mainly we set the limit to a fixed number (5 for example) and we change offset value to map to different pages, like following:

page = 1 -> offset = 0

page = 2 -> offset = 1

page = 2 -> offset = 2

and so on.

And in the example above, limit = 5 ->we are getting 5 elements only, offset = 100 -> we’ll skip the first 100 elements, and thus page = 21 since page number = (100/5) + 1 = 21

Usually BE return to the clients (FE or Mobile App for example) a key with the response to represent next_page which is the page number of next page that the client can request (if exists), and then Backend can calculate offset from it.

limitations:

1. Performance issue

Offset pagination is the obvious straight forward solution for paginating, and it works great when building MVP and small projects, but sometimes it can have performance limitations working with large scale projects and databases.

Getting pages gets slower as your data grows, especially requesting pages at the end of your list. for example, your car app went viral and you have now thousands of cars that you want to paginate, getting the last page will be very slow.

the reason behind the performance issue is the way offset works is it skips the first n elements in the list, so in the worst case, on the last page it will perform a full table scan to skip the first few thousand cars and then start selecting the next elements.

in our above example, to get page 21 we had to loop through all the 105 elements, similarly in a table of 100 thousand cars, to get the last page offset pagination will loop through the whole table of 100K cars to get the last page.

Time Complexity of getting the last pages will be O(n).

2. Real-time data issue

on real-time applications where data are constantly added or deleted, offset pagination can result in showing duplicate data or skipping some data and never showing them.

cause offset pagination only cares about the positions of the rows and not the actual rows, the positions can point to different rows, any time someone adds or deletes data.

Cursor Pagination

Widely used in real-time applications like paginating posts on Facebook or paginating messages in slack.

it’s used only to get the next page, you can’t jump suddenly to page 42 from page 1, Back-end uses a comparison query with an indexed column (an id or timestamp).

SELECT * FROM `cars`WHERE `cars`.`id` > 1456323 LIMIT 5

image alt text

The reason why this is fast is that selecting elements where id > {cursor} query uses the index on the id column. the index helps query to jump instantly to the id satisfying the condition using the index hash function.

Using the ID column is the most basic way to use cursor pagination, sometimes we don’t want to expose the id of the element for security reasons, so we can use other indexed columns like a timestamp for example, which is close to what is used in companies like Facebook or Slack.

Usually BE return to the clients a key represent cursor which is the page number of next page that the client can request (if exists), and then Backend can calculate offset from it.

Time Complexity of getting the last pages will be O(1)

limitations:

  • You can’t jump to a specific page instantly.
  • Can’t be used on a paginating query that is ordered by columns with no indexes.
  • Cursor pagination can’t be used when you want to sort by multiple columns or non-unique indexed columns(it only works with sequential columns like timestamps or ids).

Conclusion

Cursor pagination is the most efficient method of paging and should always be used where possible, however, it doesn’t cover all use cases so offset pagination is still used.

Both Offset Pagination and Cursor Pagination are widely used and have different use cases. so you need to make sure you are using the best technique to meet your needs.

Please leave a comment if you have any thought about the topic

Thanks for reading ❤️