
Queries in Microservices

Writing queries in a microservice architecture is challenging. A query often needs data that is scattered across databases owned by multiple services.


For example, in a food-ordering website like Zomato, APIs such as findOrder() and findOrderHistory() return data owned by multiple services.


There are two different patterns for implementing query operations in a microservice architecture:

  • API composition pattern

  • CQRS pattern


API Composition Pattern


Implement a query that retrieves data from multiple services by calling each service's API and combining the results.


FIND ORDER OPERATION



API Composition Components

The pattern has two components:

  • API Composer: Implements the query operation by querying the provider services.

  • Provider Service: A service that owns some of the data that the query returns.
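As a rough illustration of how the two components fit together, here is a minimal sketch of an API composer for a findOrder() style query. The three in-memory dictionaries are hypothetical stand-ins for provider services (the Kitchen and Delivery services and all field names are assumptions made for this example); a real composer would call each service's remote API instead.

```python
import asyncio

# Hypothetical stand-ins for three provider services. Each one owns
# only part of the data that the query has to return.
ORDER_SERVICE = {"o1": {"order_id": "o1", "status": "DELIVERED"}}
KITCHEN_SERVICE = {"o1": {"ticket_state": "READY"}}
DELIVERY_SERVICE = {"o1": {"courier": "c7"}}


async def fetch(store: dict, order_id: str) -> dict:
    """Simulate a remote call to one provider service."""
    await asyncio.sleep(0)  # placeholder for network latency
    return store.get(order_id, {})


async def find_order(order_id: str) -> dict:
    """API composer: query every provider concurrently, then merge."""
    order, ticket, delivery = await asyncio.gather(
        fetch(ORDER_SERVICE, order_id),
        fetch(KITCHEN_SERVICE, order_id),
        fetch(DELIVERY_SERVICE, order_id),
    )
    return {**order, **ticket, **delivery}


result = asyncio.run(find_order("o1"))
print(result)
```

Issuing the calls concurrently keeps the response time close to that of the slowest provider rather than the sum of all of them.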



Factors impacting Queries

  • How data is partitioned

  • Capabilities of API exposed by the service

  • Capabilities of the database used by the service


API Composition Design Issues

  • Deciding which component in your architecture is the query operation’s API composer (options: the client, the API gateway, or a dedicated service).

  • Writing efficient aggregation logic.

  • The API composer should use a reactive programming model so that it can call provider services in parallel where possible.


Drawbacks of API composition pattern

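  • Increased overhead: Multiple requests and database queries instead of one.

  • Reduced availability: The query works only when the composer and every provider service it calls are available.

  • Lack of transactional data consistency: The data is read from several databases at different moments, so the combined result can be inconsistent.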
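The availability drawback can be made concrete with a little arithmetic. The sketch below assumes that components fail independently and uses an illustrative availability of 99.5% per component; both are assumptions for the example, not measurements.

```python
def composed_availability(availabilities: list[float]) -> float:
    """Availability of a request that needs every component to be up,
    assuming the components fail independently of each other."""
    result = 1.0
    for availability in availabilities:
        result *= availability
    return result


# One API composer plus three provider services, each 99.5% available.
overall = composed_availability([0.995] * 4)
print(f"{overall:.4f}")  # prints 0.9801
```

Under these assumptions, a query that touches four components is available about 98% of the time, noticeably less than any single component.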


CQRS Pattern


Implement a query that needs data from multiple services by using events to maintain a read-only view that replicates data from those services.


Context

A common architecture pairs an RDBMS, used as the system of record, with a text search database used for text queries. Applications with this architecture leverage the strengths of multiple databases: the transactional properties of the RDBMS and the querying capabilities of the text database.

CQRS is a generalization of this kind of architecture. It maintains one or more view databases that implement one or more of the application's queries.


Example (Composition pattern)

PROBLEM STATEMENT

Implement an order history view. At first glance this seems simple: the API composer fetches data from each service, combines it, and returns the result to the client.

It is not that simple, because not all services store the attributes that are used for filtering and sorting.

There are two ways to solve this problem:

  • The API composer does an in-memory join. The drawback of this approach is that the composer may need to fetch and join large datasets, which is inefficient.

  • First fetch data from the services that support the required field, then request the matching records from the other services. This is possible only if the remaining services support a bulk fetch based on the required field.
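The second approach can be sketched as follows. All names and data here are hypothetical: the example assumes that only the order service stores the filter field (called keyword) and that the delivery service exposes a bulk lookup by order ID.

```python
# Hypothetical provider data. Only the order service stores the
# "keyword" and "created" fields used for filtering and sorting.
ORDER_SERVICE = [
    {"order_id": "o1", "consumer_id": "u1", "keyword": "pizza", "created": 3},
    {"order_id": "o2", "consumer_id": "u1", "keyword": "sushi", "created": 2},
    {"order_id": "o3", "consumer_id": "u1", "keyword": "pizza", "created": 1},
]
DELIVERY_SERVICE = {"o1": "DELIVERED", "o2": "DELIVERED", "o3": "CANCELLED"}


def find_orders(consumer_id: str, keyword: str) -> list[dict]:
    """Step 1: the service that owns the filter field filters and sorts."""
    matches = [
        order for order in ORDER_SERVICE
        if order["consumer_id"] == consumer_id and order["keyword"] == keyword
    ]
    return sorted(matches, key=lambda order: order["created"], reverse=True)


def bulk_get_delivery_status(order_ids: list[str]) -> dict:
    """Step 2: the other service answers one bulk request keyed by ID."""
    return {oid: DELIVERY_SERVICE[oid] for oid in order_ids if oid in DELIVERY_SERVICE}


def find_order_history(consumer_id: str, keyword: str) -> list[dict]:
    """API composer: filter first, then enrich only the matching orders."""
    orders = find_orders(consumer_id, keyword)
    statuses = bulk_get_delivery_status([order["order_id"] for order in orders])
    return [
        {**order, "delivery_status": statuses.get(order["order_id"])}
        for order in orders
    ]


history = find_order_history("u1", "pizza")
```

The composer never loads the full delivery dataset; it asks only for the IDs that survived the filter, which is what makes the bulk-fetch requirement essential.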


CQRS Overview

Implementing queries in a microservice architecture runs into the following problems:

  • With API composition, queries may require expensive and inefficient in-memory joins.

  • The service that owns the data may store it in a form or database that does not efficiently support the required queries.

  • Separation of concerns: The service that owns the data is not necessarily the one that should implement a highly scalable query over it (e.g. Restaurant Service).


CQRS (Command Query Responsibility Segregation)

A service is split into two components:

  • Command: Supports insert, update, and delete operations.

  • Query: Supports query operations.

Whenever a command operation changes the database, it raises an event that is used to bring the query database back in sync.
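The split between the command side and the query side can be sketched in a few lines. Everything below is a simplified assumption: the EventBus class is an in-process stand-in for a message broker, and the class names, event types, and fields are invented for the example.

```python
from collections import defaultdict


class EventBus:
    """Tiny in-process stand-in for a message broker."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, handler):
        self._subscribers.append(handler)

    def publish(self, event):
        for handler in self._subscribers:
            handler(event)


class OrderCommandService:
    """Command side: owns the write model and emits an event per change."""

    def __init__(self, bus):
        self._orders = {}
        self._bus = bus

    def create_order(self, order_id, consumer_id, total):
        self._orders[order_id] = {"consumer_id": consumer_id, "total": total}
        self._bus.publish({"type": "OrderCreated", "order_id": order_id,
                           "consumer_id": consumer_id, "total": total})

    def cancel_order(self, order_id):
        consumer_id = self._orders[order_id]["consumer_id"]
        self._bus.publish({"type": "OrderCancelled", "order_id": order_id,
                           "consumer_id": consumer_id})


class OrderHistoryView:
    """Query side: a read-only view kept in sync by handling events."""

    def __init__(self, bus):
        self._by_consumer = defaultdict(dict)
        bus.subscribe(self.handle)

    def handle(self, event):
        orders = self._by_consumer[event["consumer_id"]]
        if event["type"] == "OrderCreated":
            orders[event["order_id"]] = {"total": event["total"], "status": "CREATED"}
        elif event["type"] == "OrderCancelled":
            orders[event["order_id"]]["status"] = "CANCELLED"

    def find_order_history(self, consumer_id):
        return dict(self._by_consumer[consumer_id])


bus = EventBus()
view = OrderHistoryView(bus)
commands = OrderCommandService(bus)
commands.create_order("o1", "u1", 25)
commands.cancel_order("o1")
history = view.find_order_history("u1")
```

Because this bus delivers events synchronously, the view is always up to date here. With a real broker the view lags behind the command side, and clients have to tolerate that delay.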


Benefits of CQRS

  • Enables efficient implementation of queries in a microservice architecture.

  • Enables efficient implementation of diverse queries.

  • Makes querying possible in an event sourcing-based application.

  • Improves separation of concerns.


Drawbacks of CQRS

  • More complex architecture

  • Dealing with replication lag


View Datastore

Limitations of NoSQL

  • A limited form of transactions

  • Less general querying capability


Advantages of NoSQL

  • A more flexible data model

  • Better performance and scalability


Common Solutions

Components for syncing data between the write and read datastores:

  • Transaction logs: Logical replication logs (for example, the MySQL binlog read with Debezium) or database streams.

  • An ETL pipeline to sync the read datastores: either an off-the-shelf solution such as Apache Storm, or a custom solution built with Kafka (for streaming events) and services written in Go (to handle high concurrency).


To handle reliability:

  • Duplicate events: Use a dedupe mechanism or idempotent event handlers.

  • Failures: Use retries, a dead-letter queue (DLQ), and alerts.

  • Use the message broker's manual commit so that an event is acknowledged only after it has been handled.
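The reliability measures above can be combined in one event handler. The following is a minimal sketch under stated assumptions: events carry a unique event_id, the in-memory set and list stand in for a durable dedupe store and a real DLQ, and the returned offset stands in for the broker's manual commit. None of these names come from a specific broker API.

```python
class ViewUpdater:
    """Applies events to a read view with dedupe, retries, and a DLQ."""

    def __init__(self, max_attempts=3):
        self.view = {}
        self.processed_ids = set()  # stand-in for a durable dedupe store
        self.dead_letters = []      # stand-in for a dead-letter queue
        self.max_attempts = max_attempts

    def apply(self, event):
        """Update the view; raises KeyError on a malformed event."""
        self.view[event["order_id"]] = event["status"]

    def handle(self, event):
        if event["event_id"] in self.processed_ids:
            return  # duplicate delivery: skip it
        for _attempt in range(self.max_attempts):
            try:
                self.apply(event)
                self.processed_ids.add(event["event_id"])
                return
            except Exception as exc:  # retry on any failure
                last_error = exc
        # Retries exhausted: park the event so it can be inspected and alerted on.
        self.dead_letters.append({"event": event, "error": repr(last_error)})


def consume(events, updater):
    """Commit an offset only after its event was applied or parked."""
    committed = 0
    for offset, event in enumerate(events):
        updater.handle(event)
        committed = offset + 1  # manual commit, never ahead of processing
    return committed


events = [
    {"event_id": "e1", "order_id": "o1", "status": "CREATED"},
    {"event_id": "e2", "order_id": "o1", "status": "CANCELLED"},
    {"event_id": "e1", "order_id": "o1", "status": "CREATED"},  # redelivered
    {"event_id": "e3", "status": "CREATED"},                    # malformed
]
updater = ViewUpdater()
committed = consume(events, updater)
```

Without the dedupe check, the redelivered e1 event would overwrite the newer CANCELLED status. In a production system the dedupe record and the view update would normally be written in the same transaction.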



Source: Medium


The Tech Platform
