Paginated Search

Pagination is a crucial feature in any API that deals with large datasets, and our FHIR API is no exception. When querying resources, it's often impractical or unnecessary to return all matching results in a single response. Pagination allows clients to retrieve results in manageable chunks, improving performance and reducing network load.

Our FHIR API implements two distinct pagination methods:

  1. Offset-based pagination
  2. Cursor-based pagination

Each method has its own use cases, advantages, and limitations, which we'll explore in detail in this documentation.

Pagination Methods

Offset-based Pagination

Offset-based pagination is implemented using the _offset parameter. This method is straightforward and allows clients to skip a specified number of results.

Usage:

  • The _offset parameter accepts an integer value.
  • _offset=0 returns the first page of results.
  • Increasing the offset value skips that many rows in the result set.

Example:

GET /Patient?_offset=20

This request would return results starting from the 21st matching Patient resource.

Limitations:

  • Our server supports _offset values up to 10,000.
  • Offset-based pagination can lead to performance issues with very large datasets.

Implementation Details:

Behind the scenes, the _offset parameter translates to a SQL OFFSET clause. This is efficient for small offsets, but the database still has to read and discard every skipped row, so large offsets can cause performance and stability issues, hence the 10,000 limit.
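
As a rough illustration of how a page number maps onto _offset, here is a minimal sketch. The names offsetForPage and MAX_OFFSET are hypothetical and not part of any SDK; the limit is taken from the 10,000 figure stated above.

```typescript
// Hypothetical helper, not part of any SDK.
// Assumes the server-side _offset limit of 10,000 described above.
const MAX_OFFSET = 10000;

// Maps a 1-based page number and a page size to an _offset value.
function offsetForPage(page: number, pageSize: number): number {
  if (!Number.isInteger(page) || page < 1) {
    throw new RangeError('page must be a positive integer');
  }
  if (!Number.isInteger(pageSize) || pageSize < 1) {
    throw new RangeError('pageSize must be a positive integer');
  }
  const offset = (page - 1) * pageSize;
  if (offset > MAX_OFFSET) {
    throw new RangeError(`_offset ${offset} exceeds the limit of ${MAX_OFFSET}; consider _cursor pagination`);
  }
  return offset;
}

// Page 2 at 20 results per page skips the first 20 rows.
const secondPageQuery = `Patient?_count=20&_offset=${offsetForPage(2, 20)}`;
```

Failing early on the client keeps a request that the server would not accept from being sent at all.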

When to Use:

Offset-based pagination is best suited for:

  • Smaller datasets
  • Use cases where you need to jump to a specific page number
  • Scenarios where the total number of results is important

Cursor-based Pagination

Cursor-based pagination is implemented using the _cursor parameter. This method uses an opaque string value that represents a pointer to a specific item in the result set.

Usage:

  • The _cursor parameter accepts a string value provided by the server in previous responses.
  • The initial request doesn't include a _cursor parameter.
  • Subsequent requests use the _cursor value from the Bundle.link element with relation="next" in the previous response.

Example:

GET /Patient?_cursor=abc123xyz
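
In practice, clients rarely build the _cursor value themselves; they follow the next link from each response. The sketch below shows one way that loop might look. The Bundle types are trimmed local stand-ins, and fetchBundle is a hypothetical caller-supplied function that performs the HTTP GET.

```typescript
// Minimal local types for illustration; a real client would use its SDK's FHIR Bundle type.
interface BundleLink {
  relation: string;
  url: string;
}
interface Bundle<T> {
  link?: BundleLink[];
  entry?: { resource?: T }[];
}

// `fetchBundle` is a hypothetical function supplied by the caller.
async function collectAll<T>(
  firstUrl: string,
  fetchBundle: (url: string) => Promise<Bundle<T>>
): Promise<T[]> {
  const results: T[] = [];
  let url: string | undefined = firstUrl;
  while (url) {
    const bundle: Bundle<T> = await fetchBundle(url);
    for (const entry of bundle.entry ?? []) {
      if (entry.resource) {
        results.push(entry.resource);
      }
    }
    // The server supplies the next URL, including any _cursor value.
    // A missing "next" link means this was the last page.
    url = bundle.link?.find((l) => l.relation === 'next')?.url;
  }
  return results;
}
```

Because the loop only reads the next link, the cursor stays opaque to the client.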

Advantages:

  • Supports pagination through very large datasets (millions of resources)
  • More performant than offset-based pagination for large offsets
  • Provides consistent results even if the underlying data changes between requests

Limitations:

  • Currently only supported for searches that are sorted on _lastUpdated in ascending order.
  • The cursor values are opaque and should be treated as black boxes by clients.

Implementation Details:

Cursor-based pagination uses database indexes, making it much faster than offset-based pagination, especially for large datasets. The _cursor values are generated by the server and encode information about the last returned item's position in the result set.

When to Use:

Cursor-based pagination is ideal for:

  • Large datasets
  • Use cases like analytics or data export where you need to iterate through all resources
  • Scenarios where performance is critical

Note: While cursor-based pagination requires sorting by _lastUpdated ascending, it still works with other search filters. For example:

GET /Observation?code=xyz&_sort=_lastUpdated&_cursor=abc123xyz
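
A client may want to check up front whether a query meets the sorting requirement. The following is a hypothetical client-side check that mirrors the rule as stated above; the server remains the authority on which pagination method it actually uses.

```typescript
// Hypothetical client-side check, not part of any SDK.
// In FHIR, a leading "-" on a _sort value requests descending order,
// so only the bare "_lastUpdated" value means ascending.
// Compound sorts are conservatively treated as incompatible here (an assumption).
function isSortedForCursor(params: Record<string, string>): boolean {
  return params['_sort'] === '_lastUpdated';
}

// Other search filters, such as `code`, do not affect the check.
const observationSearch = { code: 'xyz', _sort: '_lastUpdated' };
const cursorEligible = isSortedForCursor(observationSearch);
```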

Always sort when paginating

When paginating through search results, it is essential to sort the results to ensure consistent output across pages. If you don't sort the results, you may see different resources on each page, which can lead to unexpected behavior.

See "Sorting the Results" for more info.

Setting the page size with the _count parameter

To set the number of items returned per page, use the _count query parameter. In the Medplum API, the default page size is 20, and the maximum allowed page size is 1000.

Here's an example query that sets the page size to 50:

await medplum.searchResources('Patient', { _count: '50' });

In this example, the search will return up to 50 Patient resources per page.

Paginating with Included Resources

Pagination can be difficult when you are including linked resources, as you will not know how many of each resource will be returned. It may make sense to use chained searches instead so that only resources of one type are returned.

Getting the total number of results with _total

To include the total count of matching resources in the search response, you need to use the _total parameter in your search query. This information is particularly useful for pagination and understanding the scope of the data you are dealing with.

The _total parameter can have three values: accurate, estimate, and none.

  • none (default): No total is returned.
  • estimate: Tells the Medplum server that you are willing to accept an approximate count. This is usually faster than the accurate option as it may use database statistics or other methods for estimating the total number without scanning the entire dataset. This option is particularly useful when you need a rough idea of the number of resources without requiring precision.
  • accurate: The Medplum server will perform additional processing to calculate the exact number of resources that match the search criteria. This can be more time-consuming, especially for large datasets, but you will receive a precise count. Use this option when an exact number is crucial for your use case.

Warning: Because computing counts is an expensive operation, Medplum only produces estimated counts above a certain threshold.

  • Medplum first computes an estimated count.
  • If this count is below the threshold, an accurate count is computed.
  • Otherwise, the estimated count is returned even if _total=accurate is specified.

For customers on the Medplum hosted service, this threshold is set to 1 million entries.

For self-hosted customers, this threshold is a server-level configuration setting called accurateCountThreshold.
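
The threshold rule can be modeled in a few lines. This is only an illustrative sketch of the behavior described in this section, not the server's actual code: resolveTotal, estimateCount, and exactCount are hypothetical names, and treating an estimate exactly equal to the threshold as "not below" is an assumption.

```typescript
type TotalMode = 'none' | 'estimate' | 'accurate';

// Hypothetical model of the rule described above.
// `estimateCount` and `exactCount` stand in for whatever the server does internally.
function resolveTotal(
  mode: TotalMode,
  threshold: number,
  estimateCount: () => number,
  exactCount: () => number
): number | undefined {
  if (mode === 'none') {
    return undefined; // no total is returned
  }
  const estimate = estimateCount();
  if (mode === 'estimate') {
    return estimate;
  }
  // mode === 'accurate': only pay for the exact count when the estimate is below the threshold.
  return estimate < threshold ? exactCount() : estimate;
}
```

With a threshold of 1,000,000, a search estimated at 900 matches would get an exact count, while one estimated at 2,000,000 would return the estimate even when _total=accurate is requested.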

By default, the search responses do not include totals. Choosing between accurate and estimate depends on your specific requirements. For large datasets, using estimate can significantly improve response times, but at the cost of precision.

Example Query

Here is an example of how to use the _total parameter in a search query:

await medplum.search('Patient', { name: 'Smith', _total: 'accurate' });

This query searches for patients with the name "Smith" and returns a Bundle that includes the accurate total number of matching resources:

const response: Bundle = {
  resourceType: 'Bundle',
  id: 'bundle-id',
  type: 'searchset',
  total: 15,
  entry: [
    {
      fullUrl: 'http://example.com/base/Patient/1',
      resource: {
        resourceType: 'Patient',
        // ...
      },
    },
    {
      fullUrl: 'http://example.com/base/Patient/2',
      resource: {
        resourceType: 'Patient',
        // ...
      },
    },
    // ...
  ],
  // ...
};

Note: The Medplum SDK provides the searchResources helper function. This function unwraps the response bundle of your search results and returns an array of the resources that match your parameters. The returned array also carries a .bundle property holding the original Bundle, so you can still read the count as response.bundle.total.
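
One common use of Bundle.total is working out how many pages a result set spans. A minimal sketch, assuming a hypothetical helper named pageCount:

```typescript
// Hypothetical helper, not part of any SDK.
// Returns the number of pages needed to show `total` matches at `pageSize` results per page.
function pageCount(total: number, pageSize: number): number {
  if (!Number.isInteger(pageSize) || pageSize < 1) {
    throw new RangeError('pageSize must be a positive integer');
  }
  return Math.ceil(total / pageSize);
}

// With the total of 15 from the example Bundle and a page size of 10, two pages are needed.
const pagesNeeded = pageCount(15, 10);
```

When the total comes from _total=estimate, the resulting page count is approximate as well.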

When you perform a paginated search, the response will be a Bundle resource containing a list of resources matching the query. The Bundle resource will also have a link field containing navigation links to help you traverse through the search results.

The Bundle.link field will include the following relations:

  • self: The URL of the current search results page.
  • first: The URL of the first page of search results.
  • previous: The URL of the previous page of search results (if applicable).
  • next: The URL of the next page of search results (if applicable).

Here's an example of how the Bundle.link field may look:

link: [
  {
    relation: 'self',
    url: 'https://example.com/Patient?_count=50&_offset=60',
  },
  {
    relation: 'first',
    url: 'https://example.com/Patient?_count=50&_offset=0',
  },
  {
    relation: 'previous',
    url: 'https://example.com/Patient?_count=50&_offset=10',
  },
  {
    relation: 'next',
    url: 'https://example.com/Patient?_count=50&_offset=110',
  },
],

To navigate between pages, simply follow the URLs provided in the Bundle.link field.

The URLs in the Bundle.link will opportunistically use _cursor pagination if compatible with the search query (see Cursor-based pagination limitations). If _cursor is not compatible, the URLs will use _offset pagination.

It is strongly recommended to use the Bundle.link field to navigate between pages, as it ensures that you are following the correct pagination method for the search query.
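
For logging or debugging, it can help to see which method the server chose for a given link. The sketch below inspects a link URL for the _cursor or _offset parameter; paginationMethodOf is a hypothetical name, and the function assumes the link is an absolute URL like the ones in the example above.

```typescript
type PaginationMethod = 'cursor' | 'offset' | 'none';

// Hypothetical helper: reports which pagination parameter a Bundle.link URL carries.
// Assumes an absolute URL; `new URL` throws on relative input.
function paginationMethodOf(linkUrl: string): PaginationMethod {
  const params = new URL(linkUrl).searchParams;
  if (params.has('_cursor')) {
    return 'cursor';
  }
  if (params.has('_offset')) {
    return 'offset';
  }
  return 'none';
}

const nextLinkMethod = paginationMethodOf('https://example.com/Patient?_count=50&_offset=110');
```

This is for inspection only; the link URL itself should still be followed unchanged.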

The searchResourcePages() method of the MedplumClient provides an async generator to simplify the iteration over resource pages.

for await (const patientPage of medplum.searchResourcePages('Patient', { _count: 10 })) {
  for (const patient of patientPage) {
    console.log(`Processing Patient resource with ID: ${patient.id}`);
  }
}

The array returned by searchResourcePages also includes a bundle property that contains the original Bundle resource. You can use this to access bundle metadata such as Bundle.total and Bundle.link.

The searchResourcePages method uses Bundle.link to navigate between pages, so it automatically follows whichever pagination method (_cursor or _offset) the server selected for the search query.

Setting the page offset with the _offset parameter

To set the page offset, or the starting point of the search results, use the _offset query parameter. This allows you to skip a certain number of items before starting to return results.

Here's an example query that sets the page offset to 30:

await medplum.searchResources('Patient', { _count: '50', _offset: '30' });

In this example, the search will return up to 50 Patient resources per page, starting from the 31st item in the result set.

Using _offset pagination is discouraged for large datasets, as it can lead to performance issues. For large datasets, consider using _cursor pagination instead.