Bulk FHIR API
Medplum supports the Bulk FHIR API 2.0.0. The Bulk FHIR API uses Backend Services Authorization.
Use Cases
- Population Health Reporting: Export data for entire patient populations for analytics and reporting dashboards
- Data Migration: Transfer large datasets between FHIR-compliant systems during system migrations
- Regulatory Compliance: Meet ONC, MIPS, and other regulatory requirements for bulk data access
- Research Data Extraction: Export de-identified or consented patient data for clinical research studies
- Data Warehousing: Populate analytics data warehouses with comprehensive FHIR data exports
The Bulk FHIR API lets you export data for many patients in a single operation. There are two ways to export data:
- From a Group of patients, which will export everything in each patient's compartment
- As a system level export of all FHIR resources in a Project
The export process is asynchronous, and you will need to poll a status URL returned when you start the export. After the BulkDataExport resource with the export results is available, it will contain a set of URLs where you can download the exported data in NDJSON format.
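Each downloaded file is NDJSON: one JSON-encoded FHIR resource per line. As a minimal sketch of reading such a file, the hypothetical helper `iter_ndjson` below (not part of any Medplum SDK) parses the text line by line. The sample data is made up for illustration.

```python
import json
from typing import Any, Dict, Iterator


def iter_ndjson(text: str) -> Iterator[Dict[str, Any]]:
    """Yield one parsed FHIR resource per non-empty NDJSON line."""
    # Split on '\n' only: NDJSON records are newline-delimited, and other
    # Unicode line separators may legally appear inside JSON strings.
    for line in text.split('\n'):
        line = line.strip()
        if line:  # tolerate blank lines and a trailing newline
            yield json.loads(line)


# Hypothetical two-line export file, for illustration only
sample = (
    '{"resourceType": "Patient", "id": "example-1"}\n'
    '{"resourceType": "Patient", "id": "example-2"}\n'
)
resources = list(iter_ndjson(sample))
```

Parsing line by line means a large export can be processed one resource at a time, without ever holding a single giant JSON document in memory.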
Access Policy Requirements
Because the bulk export process is asynchronous, your AccessPolicy must grant you access to the AsyncJob resourceType. This is required to poll the status of the export operation. Without access to AsyncJob, you will not be able to check the status of your export or retrieve the results.
Your AccessPolicy should include an entry like this:
{
  "resourceType": "AccessPolicy",
  "resource": [
    {
      "resourceType": "AsyncJob",
      "readonly": true
    }
  ]
}
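Before starting a long export, it can help to sanity-check a policy document locally. The sketch below uses a hypothetical helper, `grants_async_job_read`, that only looks for an explicit AsyncJob entry. It does not reproduce how Medplum evaluates access policies, and it deliberately ignores any other ways a policy might grant access.

```python
from typing import Any, Dict


def grants_async_job_read(policy: Dict[str, Any]) -> bool:
    """Return True if the policy has an explicit entry for AsyncJob."""
    for entry in policy.get("resource", []):
        if entry.get("resourceType") == "AsyncJob":
            return True
    return False


# The policy shown above, expressed as a Python dict
policy = {
    "resourceType": "AccessPolicy",
    "resource": [
        {"resourceType": "AsyncJob", "readonly": True},
    ],
}

# A policy with no AsyncJob entry, for comparison
patient_only_policy = {
    "resourceType": "AccessPolicy",
    "resource": [
        {"resourceType": "Patient"},
    ],
}
```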
Group Export
To specify which patients need to be included in the export, construct a Group resource and add specific patients as Group.member.entity.
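As an illustration, a Group for export could be assembled as a plain dictionary like this. The helper name `build_export_group`, the group name, and the patient IDs are placeholders. The `type` and `actual` fields are included because FHIR R4 requires them on every Group.

```python
from typing import Any, Dict, List


def build_export_group(patient_ids: List[str], name: str) -> Dict[str, Any]:
    """Build a FHIR R4 Group whose members are the given patients."""
    return {
        "resourceType": "Group",
        "name": name,
        "type": "person",  # required by FHIR R4
        "actual": True,    # required by FHIR R4: members are real individuals
        "member": [
            {"entity": {"reference": f"Patient/{patient_id}"}}
            for patient_id in patient_ids
        ],
    }


# Placeholder IDs, for illustration only
group = build_export_group(["patient-1", "patient-2"], "Export cohort")
```

Creating this resource on the server (for example with a standard FHIR create request) yields the ID used as `<GROUP_ID>` in the export URL.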
To start the export, make an HTTP GET request to /fhir/R4/Group/<GROUP_ID>/$export?_outputFormat=ndjson. This initiates a Bulk Data Export transaction and returns a status URL; once the export completes, the status response contains download URLs for the requested resources.
curl 'https://api.medplum.com/fhir/R4/Group/<GROUP_ID>/$export?_outputFormat=ndjson' \
-H 'Authorization: Bearer <ACCESS_TOKEN>'
| Resource in Medplum App | Usage in Bulk FHIR |
|---|---|
| Group | Every patient to be exported must be listed as a Group.member.entity |
System Level Export
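An export can also be performed for all resources in a Project by making a GET request to /fhir/R4/$export. The Python example that follows starts a system level export, polls the status URL until the export finishes, and downloads each NDJSON file to disk.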
import http.client
import time
import json
import os
from typing import Any, TypedDict, List
class ExportOutput(TypedDict):
    type: str
    url: str
class BulkExportResponse(TypedDict):
    transactionTime: str
    request: str
    requiresAccessToken: bool
    output: List[ExportOutput]
    error: List[Any]
# You'll need to go through the auth process to get a valid access token,
# see https://www.medplum.com/docs/auth/client-credentials for details
access_token = '[Requires valid access token]'
# Open the connection to the Medplum API
conn = http.client.HTTPSConnection('api.medplum.com')
# Start the bulk export by calling the GET [base]/$export operation endpoint
# This begins the export process, which runs asynchronously and may take a while
# to finish. Because of this, the response to this API call does not contain the
# actual exported data, but instead a URL that you can poll to get the status of
# the export operation
conn.request(
    'GET', '/fhir/R4/$export', None, {
        'Authorization': 'Bearer ' + access_token,
        'Content-Type': 'application/fhir+json',
    })
init = conn.getresponse()
# Any status other than 202 Accepted means the export request was not successfully started
if init.status != 202:
    raise RuntimeError('Failed to start bulk export')
# Get the status URL from the Content-Location header
status_url: str | None = init.getheader('Content-Location')
if status_url is None:
    raise RuntimeError('No status URL found')
# Read and discard the initial response body to allow reusing the connection
init.read()
# Make an initial request for the status of the export
conn.request(
    'GET', status_url, None, {
        'Authorization': 'Bearer ' + access_token,
    })
status = conn.getresponse()
# 202 Accepted status code means the export is still in progress
while status.status == 202:
    # Read and discard the response body before making the next request
    status.read()
    # Wait 1s between requests
    time.sleep(1)
    # Retry checking the status
    conn.request(
        'GET', status_url, None, {
            'Authorization': 'Bearer ' + access_token,
        })
    status = conn.getresponse()
# Any status other than 200 OK means the export failed with an error
if status.status != 200:
    raise RuntimeError('Error exporting data')
# Read the JSON body of the response
body = status.read()
export: BulkExportResponse = json.loads(body)
# The response JSON looks like this:
# {
#   "transactionTime": "2023-01-01T00:00:00Z",
#   "request": "https://app.medplum.com/fhir/R4/$export",
#   "requiresAccessToken": true,
#   "output": [{
#     "type": "Patient",
#     "url": "http://url.to.storage/patient_file_1.ndjson"
#   },{
#     "type": "Observation",
#     "url": "http://url.to.storage/observation_file_1.ndjson"
#   }],
#   "error": []
# }
def download_export_to_file(export_record: ExportOutput, access_token: str) -> None:
    from urllib.parse import urlparse
    # Parse the URL to extract the host and path
    url: str = export_record['url']
    parsed = urlparse(url)
    host = parsed.netloc
    path = parsed.path
    if parsed.query:
        path += '?' + parsed.query
    # Create a new connection to the host specified in the URL
    if parsed.scheme == 'https':
        download_conn = http.client.HTTPSConnection(host)
    else:
        download_conn = http.client.HTTPConnection(host)
    # Request the NDJSON export data
    download_conn.request(
        'GET', path, None, {
            'Authorization': 'Bearer ' + access_token,
        })
    export_data = download_conn.getresponse()
    # Read the response once
    data: bytes = export_data.read()
    # Close the download connection
    download_conn.close()
    # Append NDJSON data to file on disk in medplum_resources folder.
    # Append mode ('a') keeps earlier data when several output files share
    # a resource type; delete old files before re-running an export.
    file_path: str = os.path.join('medplum_resources', export_record['type'] + '.ndjson')
    text = data.decode('utf-8')
    with open(file_path, 'a') as f:
        f.write(text)
        # Keep records from consecutive files on separate lines
        if text and not text.endswith('\n'):
            f.write('\n')
# Create medplum_resources folder if it doesn't exist
os.makedirs('medplum_resources', exist_ok=True)
# Iterate over the output items to download the exported data
for record in export['output']:
    # record['type']: the resource type contained in the export file
    # record['url']: a URL pointing to an NDJSON file containing the exported data
    download_export_to_file(record, access_token)
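The example above polls at a fixed one-second interval. For long-running exports, a gentler schedule may be preferable. The FHIR Bulk Data Access specification says servers should send a Retry-After header with in-progress status responses; whether a given server actually does so is an assumption to verify. The sketch below uses a hypothetical helper, `next_poll_delay`, that honors a Retry-After value given in seconds and otherwise falls back to capped exponential backoff. The default `base` and `cap` values are arbitrary.

```python
from typing import Optional


def next_poll_delay(retry_after: Optional[str], attempt: int,
                    base: float = 1.0, cap: float = 60.0) -> float:
    """Return the number of seconds to wait before the next status request."""
    if retry_after is not None:
        try:
            # Retry-After given as a whole number of seconds
            return float(min(max(int(retry_after), 0), cap))
        except ValueError:
            pass  # e.g. an HTTP-date value; fall through to backoff
    # Exponential backoff: base, 2*base, 4*base, ... capped at `cap`
    return float(min(base * (2 ** attempt), cap))
```

In the polling loop, this could replace the fixed wait with something like `time.sleep(next_poll_delay(status.getheader('Retry-After'), attempt))`, incrementing `attempt` on each pass.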
Related Reading
- Reporting and Analytics overview
- ONC Certification compliance docs
- Standardized API for patient and population services on HealthIT.gov