
· 2 min read
Joshua Kelly

Claims data is a uniquely rich source of financial and clinical data important to many healthcare workflows. The EDI 837 Health Care Claim transaction is one of the oldest forms of electronic data exchange, having been defined by HIPAA as a required data transmission specification.

Today, we are showcasing Flexpa, which connects applications to claims data via direct patient consent and a modern FHIR API powered by Medplum.

How does it work?

Flexpa aggregates and standardizes Patient Access APIs created by payers as required by CMS-9115-F. First, patients authenticate and consent to a data-sharing request from an application.

Then, Flexpa extracts, transforms, and loads payer responses into a normalized FHIR dataset. Flexpa stores data in a temporary FHIR server cache during the period for which a patient has granted access.

Finally, applications receive a patient-specific authorization response which can be used to retrieve data from a FHIR API provided by Flexpa – powered by Medplum.
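
As a rough sketch of the final step, the snippet below builds a FHIR search for a patient's ExplanationOfBenefit resources using a bearer token. The base URL, token, and patient ID are placeholders and the helper name is hypothetical; the actual endpoints and authorization response are not shown in this post.

```typescript
// Hypothetical helper: base URL, token, and patient ID are placeholders,
// not real Flexpa values.
interface FhirRequest {
  url: string;
  headers: Record<string, string>;
}

function buildEobSearch(baseUrl: string, accessToken: string, patientId: string): FhirRequest {
  // Strip trailing slashes so the path joins cleanly
  const root = baseUrl.replace(/\/+$/, '');
  return {
    // `patient` is a standard FHIR search parameter on ExplanationOfBenefit
    url: `${root}/ExplanationOfBenefit?patient=${encodeURIComponent(patientId)}`,
    headers: {
      Authorization: `Bearer ${accessToken}`,
      Accept: 'application/fhir+json',
    },
  };
}

const request = buildEobSearch('https://fhir.example.com/fhir/', 'example-access-token', 'patient-123');
```

The resulting `url` and `headers` can be passed to any HTTP client.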


What problems does Flexpa solve?

Payer FHIR servers offer an extremely variable API experience, and implementing against 200+ of them is painful. Using Medplum as the data cache behind its own FHIR API lets Flexpa provide a uniform developer experience on top of the underlying network access. Flexpa allows developers to use claims data to deliver risk factor adjustment scoring to value-based care providers, help patients navigate care, join clinical trials, negotiate bills, and more.

How does Flexpa use Medplum?

Flexpa takes advantage of several important features of Medplum’s FHIR implementation:

Medplum’s open source implementation provides Flexpa with the ability to contribute back to the project when improvements or changes are required. Additionally, Medplum’s technology choices and stack align perfectly with Flexpa’s, which makes working with Medplum easy for Flexpa’s development team.

· 2 min read
Reshma Khilnani

Develo has built a full-featured EHR and customer relationship management (CRM) for pediatrics, encompassing core scheduling, clinical, and billing workflows along with family engagement capabilities.

(5 minute demo)

Outpatient pediatrics is uniquely family-centered, longitudinal care-driven, and high volume, with distinct well child check-ups and payor mix that is different from other specialties. Accordingly, the Develo product is beautifully designed with much attention to the nuances that matter to their core independent pediatric practices market.

Beautiful growth chart from develo (A beautiful pediatric growth chart)

Develo has built a full stack solution with key innovations around automating family engagement, reducing administrative tasks, and AI-assisted documentation.

Patient intake (Intuitive patient intake)

They rapidly release new capabilities and take a comprehensive, end-to-end approach to build a full operating system for pediatrics, rather than just optimizing a narrow set of provider workflows.

Scheduling order (Scheduling orders)

Develo EHR is FHIR native and built on Medplum using the following features:

  • Self-hosting: Develo hosts Medplum in their own AWS.
  • Multi-tenant: Develo customers have separate datastores using Medplum projects.

This application is an example of a software company using Medplum to build a custom EHR that delights pediatricians, patients, and families alike. Some screenshots of the application are shown below.

Billing experience (Even the billing experience shows attention to detail)

· One min read
Reshma Khilnani

2023 in Review

As we close out 2023, the Medplum team would love to thank our customers and community for joining us on this journey.

We wanted to highlight a few memorable moments and reflect on all that happened during the year. It was a lot of fun, and a huge thank you to the team who pushed so hard to make all these things happen.

✅ Added many wonderful customers, and several have written case studies about how they use Medplum.

✅ ONC Certified in March

✅ 99.999% uptime

✅ Launched integrations with many popular platforms like Labcorp and Epic

✅ Enhanced our connectivity with on premise systems with the Medplum agent

✅ Released support for FHIRcast

✅ Doubled the size of our team

✅ Added to our YouTube Channel and Discord Community

✅ Enhanced our Roadmap

Thank you, dear reader, for being part of our community. See you on Discord.

· 3 min read
Reshma Khilnani

Medplum’s Open-Source FHIRcast Hub Enables Rad AI Omni Reporting's Interactive Measurements

Radiology is a bellwether for innovations in Healthcare IT due to the time-sensitive and data-intensive workflow. Naturally, radiology applications lead the way in adopting real-time functionality like FHIRcast, a WebSockets-based protocol that enables development of highly interactive applications.

Today, we are showcasing the Rad AI Omni Reporting platform, with FHIRcast support through Medplum’s open source FHIRcast hub.

How does it work?

Let’s consider an example: a radiologist makes a tumor measurement from a PACS workstation; that measurement can be sent in real-time to the FHIRcast hub as an event. The event is then forwarded to the radiologist’s report editor, where a context-aware description is automatically filled in describing the tumor findings, all without the radiologist ever needing to touch another application or do dictation.
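
To make the example concrete, here is a minimal sketch of what such an event might look like. The envelope (`timestamp`, `id`, `event` with `hub.topic`, `hub.event`, and `context`) follows the FHIRcast event notification format; the topic, IDs, and Observation contents are illustrative assumptions, and the payload Rad AI actually sends is not shown in this post.

```typescript
// Minimal local types; a real application would use full FHIR type definitions.
interface ContextEntry {
  key: string;
  resource: Record<string, unknown>;
}

interface FhircastEvent {
  timestamp: string;
  id: string;
  event: {
    'hub.topic': string;
    'hub.event': string;
    context: ContextEntry[];
  };
}

// Builds a DiagnosticReport-update event carrying one length measurement.
// All argument values and the Observation contents are hypothetical.
function buildMeasurementEvent(eventId: string, topic: string, reportId: string, lengthMm: number): FhircastEvent {
  return {
    timestamp: new Date().toISOString(),
    id: eventId,
    event: {
      'hub.topic': topic,
      'hub.event': 'DiagnosticReport-update',
      context: [
        // The report whose content is being updated
        { key: 'report', resource: { resourceType: 'DiagnosticReport', id: reportId } },
        // The new content, wrapped in a transaction Bundle
        {
          key: 'updates',
          resource: {
            resourceType: 'Bundle',
            type: 'transaction',
            entry: [
              {
                request: { method: 'POST' },
                resource: {
                  resourceType: 'Observation',
                  status: 'preliminary',
                  code: { text: 'Tumor long-axis measurement' },
                  valueQuantity: { value: lengthMm, unit: 'mm' },
                },
              },
            ],
          },
        },
      ],
    },
  };
}

const measurementEvent = buildMeasurementEvent('event-1', 'example-topic', 'report-1', 14.2);
```

A subscriber such as a report editor would receive this object over its WebSocket connection and could read the measurement from the `updates` entry. Version tracking fields used by FHIRcast content sharing are omitted here for brevity.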

Why open source?

Proprietary notification systems are a walled garden, and make it difficult or impossible to build highly ergonomic applications. An open-source FHIRcast hub is a foundational community asset, as developers and vendors can focus on building integrations rather than the plumbing. Open source provides a lot of flexibility for prototyping, testing and integrations across organizations.


Integration is a thorny problem in healthcare overall, and the adoption of standards has been a key tool in allowing system interoperability. Specifically for FHIRcast, a reference implementation that partners can prototype against and use without restriction will increase quality and speed of integration.

Rad AI interactive reporting enabled by FHIRcast

Rad AI Omni Reporting uses the Integrated Reporting Application (IRA) spec and Medplum’s open source FHIRcast hub to enable the rich, interactive application seen in the video.

Rad AI is excited to use open source FHIRcast for context syncing and data passing with our imaging and worklist partners. Having an open-source, standards-based FHIRcast hub lowers the barrier of entry for products to work together.

John Paulett Director of Engineering, Rad AI

About Rad AI

Rad AI is the fastest-growing radiologist-led AI company. The company was recently listed on the CB Insights’ Digital Health 50 as one of the top privately-owned companies using digital technology to transform healthcare, Digital Health 150 as one of the most innovative digital health startups, and AI 100 as one of the world’s 100 most promising private AI companies. Rad AI won AuntMinnie’s “Best New Radiology Software” in 2023 for Omni Reporting and “Best New Radiology Vendor” in 2021. In 2022, Black Book ranked Rad AI #1 in Mean KPI score on its survey of 50 emerging solutions challenging the healthcare technology status quo.

Founded in 2018 by the youngest radiologist in U.S. history, Rad AI has seen rapid adoption of its AI platform and is already in use at 8 of the 10 largest private radiology practices in the U.S. Rad AI uses state-of-the-art machine learning to streamline repetitive tasks for radiologists and automate workflow for health systems, which yields substantial time savings, alleviates burnout, and creates more time to focus on patient care.

· 6 min read
Rahul Agarwal

Digital health companies are at the forefront of revolutionizing patient experience by combining quality care, at lower costs, and at national scale. Typically, they target a specific healthcare niche, concentrating on top-notch execution. Their ultimate goal? To merge an exceptional patient experience with smooth operations behind the scenes.

When operations are executed right, patients have a seamless experience - everything Just Works™. At Medplum, we've worked with many excellent digital health implementations, and there are four foundational elements that make their operations truly stand out:

  1. Well-defined Service Menu
  2. Top of License Care
  3. Fifty-state Workflow
  4. Asynchronous / Hybrid Care Models

To implement these elements, companies need IT infrastructure that can be tailored to their service niche.

Traditional EHRs weren't built for this - rather, they were built to serve a broad healthcare domain at a smaller scale, typically within the four walls of a single site.

Well-Defined Service Menu

Successful digital health operations start with a crystal-clear understanding of their clinical service menu. From an IT standpoint, this means defining your codes:

A well-defined scope not only sets the stage for streamlined integration and billing, but also paves the way for a superior clinical experience. Instead of using one-size-fits-all EHRs, developers can build dedicated interfaces for physicians, highlighting only the necessary data for that specific care context.

The result? Reduced data entry, no physician burnout, and easier clinician recruitment. That's why Medplum provides a truly headless platform - to empower developers to create purpose-built physician experiences.

Example: Summer Health

Summer Health had a clear understanding of their service menu: non-acute, pediatric care over SMS. They built a provider charting experience from the ground up, with a focus on fast, mobile-first charting.

Some of their design innovations were:

  • Using buttons, not drop-downs, to select from the most common patient complaints reduced data entry mistakes.
  • Integrating an LLM summarization of the SMS exchange between provider and patient significantly reduced typing time.
  • Implementing a paginated workflow for each encounter made efficient use of the mobile screen real estate and reduced physician frustration.

Summer Health

The cumulative effect of these changes was to convert charting from a chore to a delight. Physicians spent more time on patient care, and less time and energy on charting.

Top of License Care

Delivering high quality care at reasonable cost means that everyone is working at the "top of their license". MDs and NPs focus on diagnosing, prescribing, and designing care plans; care coordinators handle administrative inquiries.

To implement this model at scale, operations teams need to develop a clear ontology of clinical tasks and roles that mirrors their care model, and manage them in a unified task system. The challenge is ensuring tasks are automatically directed to the right professional, while still being able to escalate when needed.

Many traditional EHRs offer a fixed clinical workflow, or offer limited configurability. Most of them cater to MDs, but don't account for care coordinators, customer success representatives, fulfillment teams, and the host of other roles that make up the digital health care workforce.

Platforms like Medplum offer a layer of programmability on top of the FHIR Task model, which allows developers to implement the precise clinical workflow model. Tasks can be organized into queues and assigned based on credentials and availability. And by integrating these automations (i.e. Bots) into the software development lifecycle, operations teams can test workflow changes before deploying and release with confidence.
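
As an illustration of role-based routing, the sketch below assigns a Task to the first available staff member holding a required role. The types are pared-down stand-ins for FHIR resources, and the role codes, staff list, and `assignTask` helper are hypothetical; this is not Medplum's built-in behavior.

```typescript
// Pared-down stand-in for a FHIR Task; performerType lists the roles that may do the work.
interface TaskLike {
  resourceType: 'Task';
  id: string;
  status: 'requested' | 'accepted';
  performerType?: { coding: { code: string }[] }[];
  owner?: { reference: string };
}

// Hypothetical staff record with role codes such as 'md' or 'care-coordinator'.
interface StaffMember {
  id: string;
  roles: string[];
  available: boolean;
}

function requiredRoles(task: TaskLike): string[] {
  const roles: string[] = [];
  for (const concept of task.performerType ?? []) {
    for (const coding of concept.coding) {
      roles.push(coding.code);
    }
  }
  return roles;
}

// Returns an updated copy of the task, or the task unchanged if nobody qualifies,
// leaving it in the queue for escalation.
function assignTask(task: TaskLike, staff: StaffMember[]): TaskLike {
  const roles = requiredRoles(task);
  for (const member of staff) {
    const qualified = roles.some((role) => member.roles.indexOf(role) >= 0);
    if (member.available && qualified) {
      return { ...task, status: 'accepted', owner: { reference: `Practitioner/${member.id}` } };
    }
  }
  return task;
}

const staff: StaffMember[] = [
  { id: 'a', roles: ['care-coordinator'], available: true },
  { id: 'b', roles: ['md'], available: false },
  { id: 'c', roles: ['md'], available: true },
];

const prescribeTask: TaskLike = {
  resourceType: 'Task',
  id: 't1',
  status: 'requested',
  performerType: [{ coding: [{ code: 'md' }] }],
};

const assigned = assignTask(prescribeTask, staff);
```

Because the routing rule is plain code, it can be unit tested like any other function before a workflow change is released.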

For a deeper dive, check out our guide on Task-based workflows.

Fifty-State Workflow

One of the game-changing innovations of digital health was the ability to serve patients across all 50 states. But this evolution brought with it a significant challenge: managing physician licensure and credentials nationwide.

Operations teams need to make sure that enough physicians with the correct licenses are staffed to serve their patient population. They need to account for differing physician specialties, regulatory restrictions, and vacations across service lines.

The key to managing this complexity is having the right data model. How can you manage physician coverage if you don't even know which licenses they have? Traditional EHRs fall short here, as they presume a single-site deployment.

Leveraging the FHIR standard, platforms like Medplum offer the building blocks for tracking physician credentials, specialties, and care teams. For more insights, take a look at our guides on provider organizations, credentials, and payor networks.
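
A small sketch of the coverage question, assuming a deliberately simplified license model rather than the FHIR Practitioner shape. The `License` and `Provider` types and both helper functions are hypothetical.

```typescript
// Simplified, hypothetical license model (not the FHIR Practitioner shape).
interface License {
  state: string;
  validUntil: string; // ISO date, YYYY-MM-DD
}

interface Provider {
  name: string;
  licenses: License[];
}

// Providers holding an unexpired license in the given state on the given date.
// ISO dates in YYYY-MM-DD form compare correctly as plain strings.
function licensedProviders(providers: Provider[], state: string, onDate: string): Provider[] {
  return providers.filter((p) => p.licenses.some((l) => l.state === state && l.validUntil >= onDate));
}

// States from the service area that no provider can currently cover.
function uncoveredStates(providers: Provider[], states: string[], onDate: string): string[] {
  return states.filter((s) => licensedProviders(providers, s, onDate).length === 0);
}

const providers: Provider[] = [
  {
    name: 'Dr. A',
    licenses: [
      { state: 'CA', validUntil: '2025-12-31' },
      { state: 'NY', validUntil: '2023-06-30' },
    ],
  },
  { name: 'Dr. B', licenses: [{ state: 'TX', validUntil: '2026-01-01' }] },
];

const gaps = uncoveredStates(providers, ['CA', 'NY', 'TX', 'WA'], '2024-01-15');
```

With this sample data, `gaps` contains the states where the only license has expired or where no one is licensed at all.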

Asynchronous Care

The rise of telehealth, catalyzed by the pandemic, was the first step in unlocking the potential of digital care. But digital health is more than just video calls. Not every patient concern needs to be handled face-to-face.

Allowing physicians to deliver care asynchronously allows them to scale themselves more effectively. It also opens up a number of different interaction media for patients: SMS, video message, in-app chat, etc.

While many traditional EHRs have chat functionality, their visit-centric model makes them ill-suited to async-first care. What is considered a "visit" in the age of day-long text chains?

Platforms like Medplum present the tools to craft your care delivery, even if it adopts a partially or completely asynchronous model. Our guide on asynchronous encounters provides an example of how you can build your async care model from FHIR primitives.

The Digital Health Operations Playbook

  1. Define your Service Menu: For each clinical service line, clearly outline the relevant CPT, ICD-10, RxNorm, and LOINC codes. Focus on the essential data entry requirements from patients and physicians and prioritize data-entry efficiency.

  2. Diagram your Operations: Map out the clinical tasks required, identify the roles responsible for each task, and determine data access restrictions for each role. A visual representation, such as a flow-chart, of your end-to-end clinical operations can be immensely beneficial.

  3. Map out your Provider Organization: Have a keen understanding of the availability and qualifications of each provider to ensure coverage and access to care.

  4. Define the Encounter Model: In the digital age, redefining the patient encounter is crucial. Understand the mediums, methods, and tools at your disposal and craft a model that best serves your patients and practitioners alike.

Remember, in the rapidly evolving world of digital health, operations are more than just a behind-the-scenes affair. They are the backbone that defines patient experience and sets the stage for a future of healthcare that is efficient, effective, and truly global.

· 3 min read
Reshma Khilnani

Those who have experienced the wait and shuffle of a specialist referral will appreciate the thoughtful and futuristic approach of the team at Titan Intake.

(5 minute demo)


Continuity of care is broken because practices rely on fax and paper referral workflows to send patients to specialists. It is unrealistic to expect practices to change their systems, but patients need referrals and practices want to process them faster and capture all of the incoming clinical data without manual data entry.


Titan provides a novel solution that leverages large language models (LLMs) to normalize unstructured referral data to FHIR, and gives practitioners and staff a button to synchronize data to their EHR (Cerner and others) via FHIR API. This saves manual work by staff and helps patients track the status of their referral. To lighten provider load, the Titan Intake app automatically synchronizes FHIR data to enable faster and more complete chart prepping.

In addition, as part of the intake process, Titan’s Natural Language Processing (NLP) engine detects and predicts the presence of Hierarchical Condition Category (HCC) codes and Elixhauser Comorbidities to help both health systems and payors measure and receive reimbursement for the health of their patient populations. These are added to the FHIR Resources as CodeableConcepts.

Medplum Solutions Used

  • Enterprise Master Patient Index (EMPI) - As part of their EMPI implementation, Titan checks and deduplicates patients. This addresses a common fear of hospital IT: that an integration will introduce duplicates into their system and disturb their reporting and workflow.
  • Interoperability Service - From their web application, Titan triggers data synchronization into many downstream EHRs like Cerner, NextGen and others. This uses the Medplum integration engine, a natively multi-tenant and very scalable system that lets them serve many providers on the same technical stack.

Here is the full list of Medplum Solutions.

Challenges Faced

  • Extracting data from documents/PDFs and structuring the data as FHIR is a very difficult technical problem. The team uses LLMs and modern artificial intelligence techniques to structure and tag the data with code systems.

  • Due to the nature of referrals, with a single patient being sent to many different institutions, duplicate Patient resources immediately become an issue. The team built a FHIR native Enterprise Master Patient Index and deduplication pipeline to support this use case.

  • Synchronizing to many downstream EHRs, like Cerner and Epic, on an event-driven basis is difficult because each EHR has slightly different conventions and requirements to accept data.

Medplum Features Used

· 6 min read

One of the most frequent questions we get from our users is whether they should use Medplum's REST or GraphQL APIs. Both have a FHIR specification, but they offer different tradeoffs for different use cases.

In this post, we'll discuss these tradeoffs and provide some guidance on how you can choose which API is right for you.


GraphQL has surged in popularity in recent years. You can try out FHIR GraphQL queries on your Medplum project using our GraphiQL sandbox.

In the context of FHIR, one of GraphQL's strongest features is the ability to quickly retrieve multiple linked resources. While the REST API allows similar functionality using the _include and _revinclude search parameters, GraphQL offers a more natural syntax for querying bundles of resources that reference each other.

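# Leaf fields (given, family, text, ...) are illustrative; select whichever you need
{
  Patient(id: "patient-id") {
    name { given family }
    address { line city state postalCode }
    # Get all DiagnosticReports related to this patient
    DiagnosticReportList(_reference: patient) {
      performer { reference }
      code { text }
      # Get all Observation resources
      # referenced by DiagnosticReport.result
      result {
        resource {
          ... on Observation {
            code { text }
            valueQuantity { value unit }
          }
        }
      }
    }
  }
}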

In addition, GraphQL offers fine-grained control for developers to select the exact fields returned in a query, which can reduce your app's network traffic. Unlike the REST API, GraphQL lets you select specific fields, even in deeply nested elements, and provides additional filtering functionality through FHIRPath list filters. This is helpful in applications where bandwidth is at a premium, such as in mobile applications.

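# Leaf fields are illustrative; select whichever you need
{
  Patient(id: "patient-id") {
    name { given family }
    address { line city state postalCode }
    # Filter the `telecom` field to only contain phone numbers
    telecom(system: "phone") { value }
  }
}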

As with REST batch requests, GraphQL queries and mutations support the retrieval and modification of multiple resources in a single query, respectively.

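# Leaf fields are illustrative; select whichever you need
{
  # Retrieve all Patients
  patients: PatientList(name: "Eve", address_city: "Philadelphia") {
    name { given family }
    address { line city state postalCode }
  }
  # Retrieve all Medications
  medications: MedicationList {
    code { text }
  }
}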

However, GraphQL does have some limitations. The FHIR GraphQL specification is under active development, but some parts have not yet reached maturity. For instance, its search specification isn't as detailed as its REST counterpart, though the _filter search parameter is available in both APIs. And FHIR GraphQL does not yet have a specification for PATCH operations, which limits its ability to make field-level updates to a resource.

Lastly, because the shape of a GraphQL query's return value depends on the query itself, it's harder to use TypeScript type definitions from @medplum/fhirtypes to handle return values. Instead, users must define custom types that match the shape of their query.
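
As a sketch of what such a custom type could look like, the example below hand-writes a type for one specific query and uses it to read the result. The type name, helper, and sample data are hypothetical, and the `data` wrapper reflects the standard GraphQL response envelope.

```typescript
// Hand-written type mirroring the shape of one specific query:
// { Patient(id: "...") { name { given family } telecom(system: "phone") { value } } }
interface PatientPhoneQueryResult {
  data: {
    Patient: {
      name?: { given?: string[]; family?: string }[];
      telecom?: { value?: string }[];
    } | null;
  };
}

// Collects the phone numbers returned by the query, skipping empty entries.
function phoneNumbers(result: PatientPhoneQueryResult): string[] {
  const telecom = result.data.Patient?.telecom ?? [];
  const numbers: string[] = [];
  for (const contact of telecom) {
    if (contact.value) {
      numbers.push(contact.value);
    }
  }
  return numbers;
}

// Sample response, made up for illustration
const sampleResult: PatientPhoneQueryResult = {
  data: {
    Patient: {
      name: [{ given: ['Eve'], family: 'Smith' }],
      telecom: [{ value: '555-0100' }, {}],
    },
  },
};

const phones = phoneNumbers(sampleResult);
```

If the query changes, the type has to change with it, which is the maintenance cost described above.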


The FHIR REST API is the most common way to interact with FHIR-based systems on the market, and enjoys a broad base of support. While REST is an older technology, the FHIR REST API offers a few advantages.

First off, REST offers a relatively richer search specification out of the box, with support for search modifiers, iterated includes, and search result counts.
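
For illustration, here is a small sketch that assembles a search URL using each of these features. The `buildSearchUrl` helper is hypothetical; the parameters themselves (`:contains`, `_revinclude`, `_include:iterate`, `_total`) come from the FHIR search specification.

```typescript
// Each pair is [parameter name (with optional modifier), value].
type SearchParam = [string, string];

// Hypothetical helper: joins parameters into a relative FHIR search URL.
function buildSearchUrl(resourceType: string, params: SearchParam[]): string {
  const query = params.map(([key, value]) => `${key}=${encodeURIComponent(value)}`).join('&');
  return query ? `${resourceType}?${query}` : resourceType;
}

const searchUrl = buildSearchUrl('Patient', [
  ['name:contains', 'eve'], // search modifier
  ['_revinclude', 'Observation:subject'], // Observations that point at the matched Patients
  ['_include:iterate', 'Observation:performer'], // iterated include: performers of those Observations
  ['_total', 'accurate'], // ask the server for a result count
]);
```

The resulting relative URL can be appended to any FHIR server's base URL.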

Moreover, REST supports HTTP PATCH operations, which allow clients to perform targeted, field-level resource updates. This capability is especially useful in high-concurrency environments, where many clients could be editing different parts of the same resource.

import { createReference } from '@medplum/core';

// This call assigns the Task to the current user
// IF AND ONLY IF the task has not been modified on the server
await medplum.patchResource('Task', task.id as string, [
  { op: 'test', path: '/meta/versionId', value: task.meta?.versionId },
  { op: 'replace', path: '/status', value: 'accepted' },
  { op: 'replace', path: '/owner', value: createReference(currentUser) },
]);

And while GraphQL mutations do allow writing multiple resources at once, using FHIR batch requests via the REST API offers more advanced batch writing functionality. The ifNoneExist element can be used to perform a search before creating a resource to prevent duplicate resource creation. Additionally, you can create a collection of linked resources that reference each other using the urn:uuid syntax.

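// `patient` and `identifier` are assumed to be in scope;
// createReference is imported from '@medplum/core'
const bundle = {
  resourceType: 'Bundle',
  type: 'batch',
  entry: [
    {
      // Temporary ID that other entries in this Bundle can reference
      fullUrl: 'urn:uuid:42316ff8-2714-4680-9980-f37a6d1a71bc',
      request: {
        method: 'POST',
        url: 'Practitioner',
        // Only create the Practitioner if no match already exists
        ifNoneExist: 'identifier=|' + identifier,
      },
      resource: {
        resourceType: 'Practitioner',
        identifier: [{ system: '', value: identifier }],
      },
    },
    {
      request: { method: 'POST', url: 'ServiceRequest' },
      resource: {
        resourceType: 'ServiceRequest',
        status: 'active',
        intent: 'order',
        subject: createReference(patient),
        code: { coding: [{ system: '', code: '12345-6' }] },
        // Refers to the Practitioner entry above by its urn:uuid
        requester: { reference: 'urn:uuid:42316ff8-2714-4680-9980-f37a6d1a71bc' },
      },
    },
  ],
};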

Lastly, there are additional APIs that are only available from REST, such as the resource history API, which returns a Bundle of all historical versions of a resource.

However, while REST has more powerful write and search functionality, it has some limitations on reads. You can use the special _elements search parameter to limit which fields in a resource are returned, but this can only be used to filter top-level fields. You cannot filter out nested subfields of a complex element.

Additionally, when requesting linked resources using _include and _revinclude with a FHIR search, the REST API will return a flat Bundle of resources. You will have to implement some additional logic in your client to connect linked resources with their base resource, whereas GraphQL nests linked resources within their root resource.
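
A minimal sketch of that client-side logic, assuming a search for Patients that also returned Observations via _revinclude. The types are pared-down stand-ins for FHIR resources and the helper name is hypothetical.

```typescript
// Pared-down stand-ins for FHIR types; a real client would use full type definitions.
interface BundleResource {
  resourceType: string;
  id: string;
  subject?: { reference: string };
}

interface FlatBundle {
  resourceType: 'Bundle';
  entry?: { resource: BundleResource }[];
}

// Maps "Patient/<id>" to the Observations whose subject points at that patient.
function groupObservationsByPatient(bundle: FlatBundle): Record<string, BundleResource[]> {
  const groups: Record<string, BundleResource[]> = {};
  const entries = bundle.entry ?? [];
  // First pass: one (initially empty) group per Patient in the Bundle
  for (const entry of entries) {
    if (entry.resource.resourceType === 'Patient') {
      groups[`Patient/${entry.resource.id}`] = [];
    }
  }
  // Second pass: attach each Observation to the Patient it references
  for (const entry of entries) {
    const resource = entry.resource;
    const ref = resource.subject?.reference;
    if (resource.resourceType === 'Observation' && ref) {
      const list = groups[ref];
      if (list) {
        list.push(resource);
      }
    }
  }
  return groups;
}

// Sample flat Bundle, made up for illustration
const flatBundle: FlatBundle = {
  resourceType: 'Bundle',
  entry: [
    { resource: { resourceType: 'Patient', id: 'p1' } },
    { resource: { resourceType: 'Observation', id: 'o1', subject: { reference: 'Patient/p1' } } },
    { resource: { resourceType: 'Patient', id: 'p2' } },
    { resource: { resourceType: 'Observation', id: 'o2', subject: { reference: 'Patient/p1' } } },
    { resource: { resourceType: 'Observation', id: 'o3', subject: { reference: 'Patient/p2' } } },
  ],
};

const grouped = groupObservationsByPatient(flatBundle);
```

The two passes make the grouping independent of the order in which the server returns matched and included resources.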

Which One Should I Choose?

So which should you choose? Your choice between REST and GraphQL will largely hinge on your specific use-case. Here are three potential paths to consider:

Both (recommended): For those not committed to a specific toolset, blending the best of both worlds is our recommended strategy. GraphQL is great for reading linked resources, and REST offers advanced write, batch, and history management functionality. Using the Medplum Client makes it easy to shift between these two query modalities, and it's what we used when building the Medplum App.

REST API: Using REST is our recommendation if your tasks involve complex searches or filters. Similarly, if you are performing queries that delve into resource history or necessitate targeted updates using PATCH, REST is the way to go. Lastly, REST is the de-facto standard when interacting with multiple FHIR systems.

GraphQL Only: This route may appeal to you if you have invested in building on top of GraphQL tooling such as Apollo. Additionally, if your operations are predominantly read-heavy and bandwidth is at a premium, GraphQL can give you fine-grained control over what is sent over the network.

The decision between REST and GraphQL isn't black and white, and each API offers its own tradeoffs. Medplum aims to offer developers the widest set of options so that they can hone in on the optimal tool for their needs.

· 6 min read
Reshma Khilnani

As a long-time YC community member (10+ years) and former Visiting Group Partner, I'm always excited by the great companies that launch each Demo Day. For me, it's like the Super Bowl 🏈.

Read our coverage on YC S23 Open Source Meetup and Medplum's YC Launch

YC Alumni Demo Day was this Saturday, September 2, and I thought the presentations were particularly good. Here are some exciting nuggets we saw in the presentations in the Healthcare and Life Sciences category.

  • Mantle Bio - Snowflake for Biotech - enormous data sets, complex algorithms, and a workforce filled with PhDs who have a lot of domain expertise but little exposure to data engineering make this a fertile area. Great to see an MIT team here as well.
  • Flex - Stripe for HSA/FSA - accepting payments in healthcare, period, and HSA/FSA payments in particular, requires so many special agreements, and there is a ton of float out there. I could see specialized medical services being built around these balances. A team with strong infra chops and big tech background is great to see here too.
  • Decoda - AI medical claims creation. Just Google CPT 99205 (outpatient evaluation and management of new patient) documentation and you'll see why it would be great to have AI do this.
  • Shasta Health - AI Platform for physical therapists. Patients need PT before and after surgery and treatment in so many cases for health reasons and for payers. Having an AI guided workflow here seems so useful.
  • Ohmic Biosciences - Genetically engineering plants for an alternative to Roundup. Clear win to getting this to work! A team out of UCB, and reading their launch brought to mind the movie Interstellar.
  • Olio Labs - Therapeutics for tough diseases. There are a lot of great properties of targeting tough diseases. Details are long, but a Startup and FDA article may be of interest.
  • Flair health - Shopify for holistic primary care. It's just a matter of time before there is a huge direct-to-consumer health company the likes of the biggest companies in America, that's the opportunity.
  • Feanix Biotech - genetic testing for animal (cow) breeding. Clear win getting this to work as well! Launch highlights a team at the intersection of big tech and agriculture.
  • Health Harbor - Gen AI to "call insurance" for clinics. The phone, like fax, is a major channel for information exchange. I could see one angle where the payers also put a bot on the other side of the line.
  • Obento Health - Patient engagement for private practice. Patient engagement at the right level is a perennial challenge at all levels. If it's cracked it is a big deal.
  • Andromeda Surgical - autonomous surgical robots. Love the ambition and it would be a breakthrough as trained surgeons are so scarce. Clearly a team savvy about Startups and the FDA.
  • Medisearch - trustworthy medical search. Healthcare is all about trust! Trust is the difference between gold and garbage in this industry. A trusted search engine has immense obvious value. A very technical team working on this.
  • Sohar Health - AI driven clearinghouse for behavioral health. Access to behavioral health is a huge issue at the societal level. Godspeed.
  • Sensible Biotech - mRNA synthesis for therapeutics and vaccines. We all know how powerful mRNA is. Exciting. Related: FDA Orientation for Startups.
  • Wattson Health - Software for managing Rx and automate manual workflow. Healthcare + automation, love it. The quality monitoring benefit seems like a big deal as well. Demo gif tells a good story to those (like us at Medplum) in the industry.
  • Healthtech 1 - Automating repetitive healthcare processes. After fighting our way through interop and "standard" interfaces, we at Medplum know how powerful and compelling RPA can be for practitioners. This is a case in point in solving a burning pain point for practitioners.
  • Synaptic - AI powered training for doctors. Well organized, high quality medical knowledge is a sleeper category. UpToDate is the Craigslist of this category and I am excited about the future. Also cool to see an interdisciplinary MD + big tech team at work on this.
  • Simbie - AI powered practice in a box for NPs in women's health. I could see myself as a customer of one of their customers.
  • Cleancard - cancer screening as easy as a pregnancy test. Clearly valuable and loved their launch. We at Medplum serve several at-home test providers as customers, and at-home diagnostics resonate a ton with patients. Related: How to start a Biotech on a budget.
  • Nanograb - AI-generated binders for targeted drug delivery. I'm nothing close to an expert in this space, but am enthusiastic about the descriptive name of this company, and like that they have a neat domain. 70% of drug trials fail because the "grab" doesn't work (poor targeting).
  • MICSI - Higher resolution MRI with faster scan times. MRI (and ultrasound) are really valuable tools, and I see many wonderful applications of this technology. MICSI stands for microstructure imaging, and I could see this technology being able to read the mind.
  • Certainly Health - Book doctors and avoid surprise bills. How much (and when) are two frequent unknowns in healthcare. Great to see a technocratic approach here from a technical team.
  • Stellar Sleep - clinic for chronic insomnia. Great use case for a specialized provider, as insomnia really affects quality of life and the standard advice from a GP is to reduce coffee intake, neglecting frequent related issues like anxiety, hormonal issues and more.
  • Empirical Health - proactive primary care, scaled with AI. I love the name and the premise. Also, so exciting to see a team with a deep understanding of precision, recall and value-based care in this space.
  • Eden Care - Digital health insurance for employers in Africa. Not an expert, but we see a good amount of Medplum community activity in Africa, and am bullish about the opportunity in the region.

A huge congrats to YC S23 on your Demo Day!

· 9 min read
Rahul Agarwal

Patient record-keeping systems often have duplicate patient records, which can affect patient care and service delivery. One of the Medplum use cases is the Enterprise Master Patient Index (EMPI), a database used in healthcare settings to maintain accurate and consistent unique identifiers for patients across various departments and services. A great EMPI implementation will improve patient safety, enhance the quality of care, facilitate data sharing among disparate healthcare systems, AND speed payer contracting.

The Medplum team has had experience with EMPIs across different practices, including telehealth practices, which have especially thorny duplication and identity issues as patients may never meet providers in real life.

This video walkthrough summarizes a reference implementation that we have developed based on our experience. It can be used with any identity solutions or matching algorithms.

Our overview of Patient Deduplication Architectures describes the data model and pipelines in detail.


The following points are covered in this implementation:

  • How to trigger the deduplication pipeline by subscribing to changes on the Patient resource, which reduces the maintenance cost of implementation
  • Creating a Task for humans to review high-risk duplicates
  • Creating the Risk Assessment - how likely is this to be a duplicate?
  • Numeric scoring and qualitative scoring for calculating the probability that a record pair is duplicates
  • Workflow for merging two records driven by FHIR Questionnaires
  • Showing how duplicates are deactivated, and creation of a bi-directional link between duplicates
  • How to mark records as "Do not merge"
  • Demonstration of traceability: how to audit merges

EMPI Deduplication workflow code



Today I'll go over a simple patient deduplication workflow in Medplum. Patient deduplication is an important problem in healthcare, not just for cleaning up your data, but also for enriching your data when you're pulling patient records from multiple sources. Today's demonstration will show a human-in-the-loop deduplication pipeline that proceeds in two steps.

First, we will listen for changes to a Patient and create a set of candidate matches for that patient. Next, we'll have a human review those matches and decide whether to merge or block those matches. So let's get started. You'll see here that we have three patient records, all for people named Alex Smith.

The first two are clearly the same person, but the third one is clearly someone else. Even though they're clearly different ages, they all have the same birthdate: January 1st, 1970, a common placeholder when the birthdate is not known. For our deduplication pipeline, we're going to do a match on first name, last name, date of birth, and zip code, which can be a pretty high-fidelity matching pipeline.
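A minimal sketch of the matching rule described above, assuming only the four fields named in the walkthrough are compared. The `PatientLike` type and the `norm`, `matchKey`, and `findCandidates` helpers are hypothetical names for illustration and are not taken from the demo bots repository.

```typescript
// Hypothetical minimal shape: only the fields this matching rule needs.
interface PatientLike {
  id: string;
  name?: { given?: string[]; family?: string }[];
  birthDate?: string; // FHIR date string, e.g. "1970-01-01"
  address?: { postalCode?: string }[];
}

// Normalize for comparison: trim and lowercase; missing values become "".
const norm = (s?: string): string => (s ?? '').trim().toLowerCase();

// Build a comparison key from first name, last name, birth date, and zip code.
// Returns undefined when any field is missing, so incomplete records never
// match each other on blank values.
function matchKey(p: PatientLike): string | undefined {
  const first = norm(p.name?.[0]?.given?.[0]);
  const last = norm(p.name?.[0]?.family);
  const dob = norm(p.birthDate);
  const zip = norm(p.address?.[0]?.postalCode);
  if (!first || !last || !dob || !zip) {
    return undefined;
  }
  return [first, last, dob, zip].join('|');
}

// Return every other patient whose key equals the source patient's key.
function findCandidates(source: PatientLike, all: PatientLike[]): PatientLike[] {
  const key = matchKey(source);
  if (key === undefined) {
    return [];
  }
  return all.filter((p) => p.id !== source.id && matchKey(p) === key);
}
```

Because the rule ignores name prefixes and gender, it would flag both of the other Alex Smith records as candidates, which is why a human review step follows.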

In addition to these patient records, we have clinical data associated with each one. So for Mr. Alex Smith, you can see that we have encounters that are linked to that patient. However, we've also gotten medication records in the form of medication requests, also known as prescriptions, but they're assigned to the second Alex Smith record, the one with no "Mr."

So, to trigger our pipeline, we'll first make a change to one of the patient records. This will then kick off a search for any kind of matching records. Okay, so let's go ahead and kick off this deduplication pipeline. So I'll make a change to Mr. Alex Smith and I'll give him a phone number. So let's just add a phone number here.

We'll click OK. This will kick off one of our bots, which will look through all patients to find matches. Once it finds a match, it will create a Task resource to review the potential duplicate.
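As an illustration of the trigger, here is a rough sketch of a FHIR Subscription that fires on Patient changes and points at a bot. The `SubscriptionLike` type and `buildDedupSubscription` helper are hypothetical, the bot ID is a placeholder, and the `Bot/<id>` endpoint form is an assumption about how the subscription is wired to the bot; check the Medplum documentation before relying on it.

```typescript
// Hypothetical minimal shape covering the required FHIR R4 Subscription fields.
interface SubscriptionLike {
  resourceType: 'Subscription';
  status: 'active' | 'off';
  reason: string;
  criteria: string;
  channel: { type: 'rest-hook'; endpoint: string };
}

// Build a subscription that fires on any Patient change.
// Assumption: the bot is addressed by a "Bot/<id>" endpoint.
function buildDedupSubscription(botId: string): SubscriptionLike {
  return {
    resourceType: 'Subscription',
    status: 'active',
    reason: 'Find candidate duplicates when a Patient changes',
    criteria: 'Patient',
    channel: { type: 'rest-hook', endpoint: `Bot/${botId}` },
  };
}
```

Driving the pipeline from a subscription means no separate scheduler or polling job has to be maintained.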

So let's look here. There's not much to the task. The real heart, the resource that represents the candidate match, will be the RiskAssessment resource, which we'll talk about in a second.

But here we'll see that there's the task, which indicates whether or not this task is active and who should be performing it. This is great for incorporating the deduplication review into your existing task-based workflow. Let's look at the candidate match in the risk assessment.

So we use a RiskAssessment resource in a couple of different ways. First, we use the method field to indicate what kind of matching rule produced this candidate match. Here it was name, date of birth, and zip code, as I mentioned earlier. The subject is who is considered the source record, that is, the person who triggered the matching process, and we use the basis field for the target record, who we think they match to.

If we look at the JSON, we see that we can also have a numeric score on the probability of a match as well as a qualitative assessment. Here we're saying it's 90%: it's almost certain, but we can't be a hundred percent sure. So we'll see that after this first step, we have two candidate matches.
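To make the field usage concrete, here is a rough sketch of how a candidate match could be laid out as a RiskAssessment. The `RiskAssessmentLike` type and `buildCandidateMatch` helper are hypothetical, the `preliminary` status is an assumption, and so is encoding 90% as the number 90 rather than 0.9.

```typescript
interface Ref {
  reference: string;
}

// Hypothetical minimal shape: only the RiskAssessment fields discussed above.
interface RiskAssessmentLike {
  resourceType: 'RiskAssessment';
  status: 'preliminary';
  method: { text: string }; // which matching rule produced the candidate
  subject: Ref; // source record that triggered matching
  basis: Ref[]; // target record we think it matches
  prediction: { probabilityDecimal: number; qualitativeRisk: { text: string } }[];
}

function buildCandidateMatch(
  sourceId: string,
  targetId: string,
  rule: string,
  probability: number,
  label: string
): RiskAssessmentLike {
  return {
    resourceType: 'RiskAssessment',
    status: 'preliminary', // assumption: not final until a human reviews it
    method: { text: rule },
    subject: { reference: `Patient/${sourceId}` },
    basis: [{ reference: `Patient/${targetId}` }],
    prediction: [{ probabilityDecimal: probability, qualitativeRisk: { text: label } }],
  };
}

// Example: placeholder IDs, with the score and label from the walkthrough.
const exampleMatch = buildCandidateMatch(
  'mr-alex-smith',
  'alex-smith',
  'name-dob-zip',
  90,
  'Almost certain'
);
```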

One is Mr. Alex Smith to Alex Smith, and the other is Mr. Alex Smith to Ms. Alex Smith, the woman patient from earlier. Now we're ready for the second part of our pipeline, which is to merge these records. So we'll go to our first one. We'll click on the Apps tab, which will show a questionnaire. A Questionnaire is a type of FHIR resource, here associated with this RiskAssessment.

Again, the risk assessment being the risk of a match. And we think that Mr. Alex Smith and Alex Smith are probably a good match, so we'll decide to merge them. So we will not check this box, and then we'll have a couple of choices in terms of how we merge the data. We're gonna merge the names.

We're not gonna do anything with the address because they're the same. And for right now, we won't delete the source patient.

The reason you might wanna do this is that after you've done the deduplication, you might want to clean up the old data. However, right now we want to keep the old data around for posterity, so we'll click OK here. Now, if we go back to our patients, we'll see a couple of things within these two records.

The Alex Smith record has now become the master record. We see this because it is listed as active: true, while the original Mr. Alex Smith record is no longer active. We'll also see that there's this link field that says it replaces Mr. Alex Smith, and the other way around, Mr. Alex Smith is replaced by Alex Smith, so there's a bidirectional link there.
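The bidirectional link can be sketched with the standard FHIR `Patient.link` types `replaces` and `replaced-by`. The `PatientRec` type and `linkMerged` helper are hypothetical names for illustration, and the function works on plain objects rather than calling any server API.

```typescript
interface PatientLink {
  other: { reference: string };
  type: 'replaces' | 'replaced-by';
}

// Hypothetical minimal shape: only the fields touched by the merge.
interface PatientRec {
  resourceType: 'Patient';
  id: string;
  active?: boolean;
  link?: PatientLink[];
}

// Return updated copies of both records: the source is deactivated and marked
// "replaced-by" the target; the target stays active and "replaces" the source.
function linkMerged(
  source: PatientRec,
  target: PatientRec
): { source: PatientRec; target: PatientRec } {
  return {
    source: {
      ...source,
      active: false,
      link: [
        ...(source.link ?? []),
        { other: { reference: `Patient/${target.id}` }, type: 'replaced-by' },
      ],
    },
    target: {
      ...target,
      active: true,
      link: [
        ...(target.link ?? []),
        { other: { reference: `Patient/${source.id}` }, type: 'replaces' },
      ],
    },
  };
}
```

Returning copies instead of mutating the inputs keeps the pre-merge records available, for example for logging or a dry run.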

Additionally, we'll see that the Encounter resources we had before have now been updated to point from Mr. Alex Smith to Alex Smith, our target resource. So all the clinical data has now been merged to the target patient. Let's go back to our risk assessments, which are our candidate matches, and let's look at Alex Smith to Mrs. Alex Smith.

Now for this one, we, as a human, decide that they're not the same patient. So we're gonna say, do not merge these records. Let's talk about what this does. This will add each record to a list such that the next time we do a match, we know not to make a candidate out of them. These are called our do-not-match lists.

So in this case, we have Mr. Alex Smith, who should not be matched with Mrs. Alex Smith, and reciprocally, we have Mrs. Alex Smith, who should not be matched with Mr. Alex Smith. So every patient will have their own do-not-match list. And when we trigger the first part of our pipeline again, we will skip over anything on our do-not-match list.

There can be an arbitrary number of elements on each do-not-match list. Let's just take a quick look at the bots that perform both of these operations. So we have two bots here. We have our find matching patients bot, which is for the first step of the pipeline, generating the tasks and risk assessments, and our merge matching patients bot, which is for the second step of our pipeline.
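The do-not-match behavior can be sketched as follows, using an in-memory `Map` as a stand-in for the per-patient lists. This is an assumption for illustration: the walkthrough stores the lists as resources on the server, and the `DoNotMatch`, `blockPair`, and `filterBlocked` names are hypothetical.

```typescript
// Hypothetical stand-in: patient ID -> IDs that patient must never match.
type DoNotMatch = Map<string, Set<string>>;

function addToList(lists: DoNotMatch, from: string, to: string): void {
  const existing = lists.get(from) ?? new Set<string>();
  existing.add(to);
  lists.set(from, existing);
}

// Record the decision on both patients' lists so it holds in either direction.
function blockPair(lists: DoNotMatch, a: string, b: string): void {
  addToList(lists, a, b);
  addToList(lists, b, a);
}

// Drop any candidate that appears on the source patient's do-not-match list.
function filterBlocked(lists: DoNotMatch, sourceId: string, candidateIds: string[]): string[] {
  const blocked = lists.get(sourceId);
  return candidateIds.filter((id) => !blocked?.has(id));
}
```

Writing the pair to both lists is what makes the block reciprocal: whichever record triggers the next matching run, the other one is skipped.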

So as a final step, I'd like to talk about some of the traceability aspects of this deduplication pipeline. So first I'll show you how you can actually see who performed the merge operation. We click on Alex Smith, who was our merged patient. So first off, when we enter the patient page, we'll see in this timeline view that there was a change made to the resource to make this link to the other Alex Smith resource.

And we can see the details here. If we go to the link property, you can see that we are linked to Mr. Alex Smith and vice versa. If we go to Mr. Alex Smith, we can look at their details and see that they're linked to Alex Smith. You see that they have a reciprocal connection: Mr. Alex Smith is replaced by Alex Smith, and Alex Smith replaces Mr. Alex Smith.

Next we can look at the history tab to see all the changes that were made. We can actually see, if we look here, that I was the one who added those links. This is a key point I wanna focus on. Even though we use the bot to do it, we actually get the observability that I was the one who triggered the bot.

The way we set that up is that if we go to the bot resource itself, we click on merge matching patients and go to the details tab. You can see that it has this one flag called Run As User set to true. For these kinds of sensitive deduplication pipelines, you are gonna wanna turn that on. What that means is that when this bot runs, it keeps track of who triggered it, and the history will show that person performing the operations as opposed to the bot itself.

So, this is just to give you a quick overview of how, even though we performed this merge operation through a bot, you can audit when it was done and who it was done by. These bots are stored in our Medplum demo bots repo, and I encourage you to check out that repository to see the code.

· 6 min read
Reshma Khilnani

YC invites alumni (like us - Medplum is YC S22) to come speak for the current batch and share our experiences. We were excited when our YC Partners Diana and Nicolas invited Medplum to come and speak to Open Source and devtools companies two weeks ago.

YC Open Source Meetup

Jorge, Michel, and me at YC (Nicolas at the edge)

However, I had a moment of panic when I realized that the other panelists were Michel from Airbyte, and Jorge from MindsDB. For context, MindsDB and Airbyte have huge communities. They are frequently featured on Hacker News - which is the New York Times of open source.

Standing side-by-side with these founders, I worried, could not be a flattering comparison.

Repo | GitHub Stars | Stage ready?
mindsdb | 17,600 ⭐ | 😎
airbyte | 11,500 ⭐ | 😎
medplum | 789 ⭐ | 😬

A False Comparison

Here's one thing that we believe: 10 or 20 years from now, we will find the fact that we grouped open source startups together funny and peculiar. It's like looking at startups from the late 90's when "Internet" was a startup category - how quaint!

Looking back at our 2023 selves, we will have that same thought about "open source" startups - a grouping representative of the time, and not the startup itself.

YC Founders were quick to pick up the fact that we are not all the same. Below are the questions they asked in session. These answers are from the Medplum perspective. They certainly don't apply globally, and the answers you get from Michel and Jorge would no doubt be pretty different.

What Founders Asked

Why did you choose open source?

Healthcare app developers have what we call "the terrible choice": (a) buy an off-the-shelf health records system and fight with it or (b) invest a ton of time and resources into your medical record system to build your workflow, compliance, certification and integrations.

Minimal/simple medical app implementations are rare, and when they exist, they spiral in complexity when the app makes contact with the healthcare establishment (insurance billing, medication ordering, lab integrations, etc.).

What healthcare devs really need is a composable solution, that enables adding on functionality as needed with great abstractions. Composability is a superpower and is the key attribute of open source.

Open source has some great secondary benefits too. Healthcare devs are very jaded after repeatedly being marketed products that don't really work and/or are largely undocumented black boxes. The ability to test and audit open source in depth has appeal, and helps to build trust.

We actually "learned the open source lesson" a long time ago. The Medplum founding team founded a medical imaging SaaS startup called MedXT (YC W13). MedXT customers were constantly asking for "source code in escrow" or "source access" whenever they would sign a SaaS agreement. We should have put some careful thought as to why they were asking for it. With 20/20 hindsight, we now know that continuity of business and compliance are critical for customers, and open source is a great tool in that regard.

What were some of the most effective tactics for growing your dev mindshare early on?

We have a fundamentally different approach to growing dev mindshare than many other open source startups. We don't do launch week, memes, or advanced Tweeting. (Maybe we should, but that's not top of mind today.)

For us, getting customers to build and run in production was the first step in getting dev mindshare. Electronic Health Records have no "hobbyist" use case: if you are going to use it, it better work in production.

In that way, getting to production really early, before we had 20 stars on our repo, was our way to establish mindshare. That production usage is something that our customers value, and that in combination with an open source, certified, well documented, compliant offering is novel for the healthcare dev.

After running in production, certifications are our second most effective technique to gain developer mindshare. Medplum has common certifications, like SOC 2 and HIPAA, as well as some very industry-specific ones like ONC and CLIA/CAP. In general devs find certification tasks tedious and un-fun, so a reference implementation appeals.

Citus Data (YC S11) and Posthog (YC W20) have a similar "in production" ethos to Medplum.

YC Open Source Meetup

Umur (VGP YC, Citus Data S11), Tim (Posthog), James (Posthog), me holding the mic

What are your important metrics?

We track the number of projects and apps created, as well as the number that reach certain maturity milestones in their implementation on our hosted service. These are nothing fancy or sophisticated.

We work hard on our customer success and implementations. We track questions and support requests across all channels: GitHub, Discord, Slack and email. We track how many had documentation/content already published, how many were actual bugs, and turnaround times. Troubleshooting and helping others build is our customer obsession, and if their implementation is successful, we'll be successful too.

How did you get your first paid customer?

Our first paid customer used our hosted offering.

Medplum is somewhat unique in that we started with our hosted multi-tenant offering. This is the opposite of many open source companies, who start with a self-hosted single tenant offering.

Our initial customers liked what they saw in the repo, and liked that they could self-host, but decided they would rather have a managed solution. We believe that having a managed solution from day 1 was crucial for us, as the time to value and feedback loop are much faster.

Do you do professional services? If so, how much?

We do a small amount of professional services in the form of workshops and pilots for prospective enterprise customers. Engineering leaders often have budget for professional services, and this type of engagement helps devs align stakeholders internally as they make the case to their leadership that they should use Medplum.

How did you decide on your license?

We support customers through many complex compliance audits and certification processes. Our license, Apache 2.0, is a reflection of that: it's well understood by stakeholders. There are code scanners everywhere in an audit.

How was fundraising as an open source company?

I mentioned in the beginning how nerve-wracking it is to be compared to the GitHub star machines like MindsDB and Airbyte. We had to do extra preparation and communicate crisply to investors to tell our story, not a version of someone else's.

For Medplum, open source is our way to build trust. I'll compare and contrast that with some other approaches where open source is a form of marketing, and is primarily for growth.

When it comes to fundraising, the narrative is critical: we are open source to build trust with customers and prospective customers. And companies have trusted us with their data, and use us as their primary health datastore in production. The value customers get from having a trusted partner to manage their data is the basis of our business.

Break a leg on your demo day YC S23!