Queries in Microservices

Writing queries in a microservice architecture is challenging. A query often needs data that is scattered across the databases owned by multiple services.

For example, in a food-ordering website like Zomato, APIs such as findOrder() and findOrderHistory() return data owned by multiple services.

There are two different patterns for implementing query operations in a microservice architecture:

API composition pattern
CQRS pattern

API Composition Pattern

Implement a query that retrieves data from multiple services by querying each service via its API and combining the results.

FIND ORDER OPERATION

API Composition Components

The pattern has two components:

API Composer: Implements the query operation by querying the provider services.
Provider Service: Service that owns some of the data that the query returns.
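The two roles can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the three provider classes, their method names, and the returned fields are hypothetical stand-ins for remote service APIs.

```python
# Hypothetical provider services; each owns one slice of the order data.
class OrderService:
    def get_order(self, order_id):
        return {"order_id": order_id, "status": "PREPARING", "total": 450}


class DeliveryService:
    def get_delivery(self, order_id):
        return {"order_id": order_id, "eta_minutes": 25}


class BillService:
    def get_bill(self, order_id):
        return {"order_id": order_id, "paid": True}


class FindOrderComposer:
    """API composer: calls each provider's API and combines the results."""

    def __init__(self, orders, deliveries, bills):
        self.orders = orders
        self.deliveries = deliveries
        self.bills = bills

    def find_order(self, order_id):
        # One call per provider service (in-process here, remote in practice).
        order = self.orders.get_order(order_id)
        delivery = self.deliveries.get_delivery(order_id)
        bill = self.bills.get_bill(order_id)
        # Combine the partial results into a single response.
        return {
            "order_id": order_id,
            "status": order["status"],
            "total": order["total"],
            "eta_minutes": delivery["eta_minutes"],
            "paid": bill["paid"],
        }
```

In a real system each provider call would be an HTTP or gRPC request, which is where the extra overhead of this pattern comes from.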

Factors impacting Queries

How data is partitioned
Capabilities of API exposed by the service
Capabilities of the database used by the service

API Composition Design Issues

Deciding which component in your architecture acts as the query operation’s API composer (options: the client, the API gateway, or a dedicated service).
Writing efficient aggregation logic.
Using a reactive programming model in the API composer, so that independent provider calls run in parallel rather than one after another.
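As a rough sketch of the last point, the composer below issues its provider calls concurrently. Python's asyncio stands in for a reactive library here, and the three fetch functions (with their simulated 50 ms latency) are hypothetical placeholders for real network calls.

```python
import asyncio


async def fetch_order(order_id):
    await asyncio.sleep(0.05)  # simulated network latency
    return {"status": "PREPARING"}


async def fetch_delivery(order_id):
    await asyncio.sleep(0.05)
    return {"eta_minutes": 25}


async def fetch_bill(order_id):
    await asyncio.sleep(0.05)
    return {"paid": True}


async def find_order(order_id):
    # The three calls do not depend on each other, so they are started
    # together; total latency is close to the slowest call, not the sum.
    order, delivery, bill = await asyncio.gather(
        fetch_order(order_id),
        fetch_delivery(order_id),
        fetch_bill(order_id),
    )
    return {"order_id": order_id, **order, **delivery, **bill}
```

It can be run with asyncio.run(find_order("o-1")). When one call needs the result of another, only the independent calls can be grouped this way.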

Drawbacks of API composition pattern

Increased overhead: Multiple requests and queries.
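Reduced availability: The query depends on every provider service being available.
Lack of data consistency: The data is read from several databases without a shared transaction, so the combined result can be inconsistent.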

CQRS Pattern

Implement a query that needs data from multiple services by using events to maintain a read-only view that replicates data from those services.

Context

Applications with this architecture leverage the strengths of multiple databases: the transactional properties of the RDBMS and the querying capabilities of the text database.

CQRS is a generalization of this kind of architecture. It maintains one or more view databases that implement one or more of the application’s queries.

Example (Composition pattern)

PROBLEM STATEMENT

Implementing the order history view seems simple at first glance: the API composer fetches data from the individual services, combines it, and returns the result to the client.

It is not that simple, because not every service stores the attributes used for filtering and sorting.

There are two ways to solve this problem:

The API composer does an in-memory join. The drawback of this approach is that the API composer must fetch large datasets and join them, which is inefficient.
The API composer first fetches data from the services that support the required field, then requests the matching records from the other services. This is possible only if the remaining services support a bulk fetch based on the required field.
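The second approach might look like the sketch below. The data, the function names, and the assumption that only the Order Service can filter by consumer while the Delivery Service offers a bulk fetch by order ID are all made up for illustration.

```python
# Hypothetical data owned by two different services.
ORDERS = [
    {"order_id": "o-1", "consumer_id": "c-1", "created": "2023-01-05"},
    {"order_id": "o-2", "consumer_id": "c-2", "created": "2023-01-06"},
    {"order_id": "o-3", "consumer_id": "c-1", "created": "2023-01-07"},
]
DELIVERIES = {"o-1": "DELIVERED", "o-2": "DELIVERED", "o-3": "PICKED_UP"}


def find_orders_by_consumer(consumer_id, limit):
    """Order Service API: supports the filter and sort fields."""
    matches = [o for o in ORDERS if o["consumer_id"] == consumer_id]
    return sorted(matches, key=lambda o: o["created"], reverse=True)[:limit]


def get_deliveries_bulk(order_ids):
    """Delivery Service API: bulk fetch keyed by order ID."""
    return {oid: DELIVERIES[oid] for oid in order_ids if oid in DELIVERIES}


def find_order_history(consumer_id, limit=10):
    # Step 1: filter and sort in the service that stores those attributes.
    orders = find_orders_by_consumer(consumer_id, limit)
    # Step 2: one bulk request for just the matching IDs.
    statuses = get_deliveries_bulk([o["order_id"] for o in orders])
    return [{**o, "delivery_status": statuses.get(o["order_id"])} for o in orders]
```

Only the rows that survive the filter cross the network in step 2, which is what avoids the large in-memory join of the first approach.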

CQRS Overview

Implementing queries in a microservice architecture raises the following problems:

API composition can lead to expensive, inefficient in-memory joins.
The service that owns the data must support the required queries, which its database may not do efficiently.
Separation of concerns: The service that owns the data is not necessarily the one that should implement a highly scalable query over it (e.g. Restaurant Service).

CQRS (Command Query Responsibility Segregation)

A service has two components:

Command: Supports insert, update, and delete operations.
Query: Supports query operations.

Whenever a command operation changes the database, it raises an event that brings the query database back in sync.
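A minimal sketch of this split, assuming a hypothetical in-process EventBus and an OrderCreated event (neither comes from a specific framework):

```python
from collections import defaultdict


class EventBus:
    """Tiny in-process stand-in for a message broker."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._subscribers[event_type].append(handler)

    def publish(self, event_type, payload):
        for handler in self._subscribers[event_type]:
            handler(payload)


class OrderCommandService:
    """Command side: owns the write model and publishes events."""

    def __init__(self, bus):
        self._orders = {}
        self._bus = bus

    def create_order(self, order_id, consumer_id, total):
        self._orders[order_id] = {"consumer_id": consumer_id, "total": total}
        self._bus.publish(
            "OrderCreated",
            {"order_id": order_id, "consumer_id": consumer_id, "total": total},
        )


class OrderHistoryView:
    """Query side: a read-only view kept in sync by events."""

    def __init__(self, bus):
        self._by_consumer = {}
        bus.subscribe("OrderCreated", self._on_order_created)

    def _on_order_created(self, event):
        row = {"order_id": event["order_id"], "total": event["total"]}
        self._by_consumer.setdefault(event["consumer_id"], []).append(row)

    def find_order_history(self, consumer_id):
        return list(self._by_consumer.get(consumer_id, []))
```

Here the view is updated synchronously inside publish(). With a real broker the update is asynchronous, so a query issued right after a command may not yet see the change.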

Benefits of CQRS

Enables efficient implementation of queries in a microservice architecture
Enables efficient implementation of diverse queries
Makes querying possible in an event sourcing-based application
Improves separation of concerns

Drawbacks of CQRS

More complex architecture
Dealing with replication lag

View Datastore

NoSQL limitations

A limited form of transactions
Less general querying capability

Advantages of NoSQL

A more flexible data model
Better performance and scalability

Common Solutions

Components to build data sync across the write and read datastores

Transaction logs: Logical replication logs (e.g. the MySQL binlog with Debezium) or database streams.
ETL pipeline to sync the read datastores: An off-the-shelf solution such as Apache Storm, or a custom solution using Kafka (for event streaming) and services written in Go (to handle high concurrency).

To handle reliability:

Duplicate events: Use a dedupe mechanism or idempotent handlers.
Failures: Use retries, a dead-letter queue (DLQ), and alerts.
Use the message broker’s manual commit so that an event is acknowledged only after it has been handled.
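These reliability measures can be combined in one event handler. The sketch below is illustrative only: the ViewUpdater class, the event fields, and the in-memory sets and lists are assumptions standing in for a real view database, dedupe store, and DLQ topic.

```python
class ViewUpdater:
    """Applies events to a read view with dedupe, retries, and a DLQ."""

    def __init__(self, max_attempts=3):
        self.view = {}
        self.processed_ids = set()  # dedupe store (a database table in practice)
        self.dead_letters = []      # stand-in for a dead-letter queue
        self.max_attempts = max_attempts

    def apply(self, event):
        # Raises KeyError when a required field is missing.
        self.view[event["order_id"]] = event["status"]

    def handle(self, event):
        """Process one event; the caller commits the offset after this returns."""
        if event["event_id"] in self.processed_ids:
            return "duplicate"  # already applied, safe to skip
        for _ in range(self.max_attempts):
            try:
                self.apply(event)
            except Exception:
                continue  # retry; a real consumer would back off here
            self.processed_ids.add(event["event_id"])
            return "applied"
        # Retries exhausted: park the event so the stream is not blocked.
        self.dead_letters.append(event)
        return "dead-lettered"
```

With manual commit, the consumer would commit the broker offset only after handle() returns, so a crash mid-processing causes a redelivery that the dedupe check then absorbs.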

Source: Medium

The Tech Platform