Understanding Message Based Architectures in AWS

TechOps Examples

Hey — It's Govardhana MK 👋

Welcome to another technical edition.

Every Tuesday – You’ll receive a free edition with a byte-size use case, remote job opportunities, top news, tools, and articles.

Every Thursday and Saturday – You’ll receive a special edition with a deep dive use case, remote job opportunities, and articles.

Kedify is offering a live demo of Kubernetes scaling for modern workloads.

You’ll see:

  • How to automate scaling for HTTP, gRPC, queue, and inference workloads

  • A walkthrough of the Kedify platform and observability dashboard

  • Your projected cost savings using the ROI calculator

  • Live Q&A with the product or engineering leads

Kedify is backed by the founder of GitLab - you're in great hands.

👀 Remote Jobs

📚️ Resources

🧠 DEEP DIVE USE CASE

Understanding Message Based Architectures in AWS

Message based architectures sit behind many of the systems people use daily, powering platforms like Netflix, Uber, and Coinbase that process millions of messages every second.

  • Netflix uses message brokers to stream viewing events between services without slowing user experience.

  • Uber moves location and pricing data through queues to match drivers and riders in real time.

  • Coinbase separates transaction creation from blockchain processing using queues to stay reliable during traffic spikes.

The goal is to decouple producers from consumers so each service can scale, recover, and run independently. Let us walk through the patterns.

Single Queue

In this model, all messages are sent into one queue and ordered by priority.

The producer application sends messages, and the queue internally reorders them so high priority items move to the front. Consumers pick messages from the same queue and process them as they become available. This model is simple to build and works well for small systems or predictable workloads.

The challenge appears when message volume grows, as all consumers compete for messages from the same queue, creating a potential bottleneck or uneven processing rate.

Smaller e-commerce platforms and logistics systems often start with this model.
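Here is a minimal in-process sketch of the idea using Python's standard library. It only illustrates the ordering behavior; a real deployment would use a broker with native priority support (standard SQS queues, for instance, do not reorder messages).

```python
import queue

# Minimal sketch of the single-queue model: one queue, priority-ordered.
# Lower numbers dequeue first, so 0 = high priority, 1 = low priority.
pq = queue.PriorityQueue()

# Producer: everything lands in the same queue, tagged with a priority.
pq.put((1, "send promotional email"))   # low priority, arrives first
pq.put((0, "confirm payment"))          # high priority, arrives second
pq.put((1, "update recommendations"))   # low priority

# Consumer: always receives the highest-priority message available.
while not pq.empty():
    priority, task = pq.get()
    print(f"processing (priority={priority}): {task}")
# "confirm payment" is processed first even though it arrived second.
```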

Multiple Queues with Multiple Consumer Pools

Each priority level or message type has its own dedicated queue and a matching pool of consumers.

The producer routes messages based on their priority, sending them directly to the right queue. High priority messages go to a high priority queue processed by fast, high capacity consumers, while low priority messages are handled by a different pool.

This setup isolates workloads and ensures that time sensitive operations are never delayed by lower priority ones. It also provides flexibility to scale each queue independently depending on its traffic pattern.

Ride sharing platforms like Uber and Ola use this approach when processing trip related events.

Critical actions such as ride acceptance and payment confirmation go to high priority queues with faster consumers, while background events like location pings or promotional updates are handled in lower priority queues.
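In AWS terms, the producer-side routing could look like the boto3 sketch below; the queue URLs and event fields here are made up for illustration.

```python
import json
import boto3

sqs = boto3.client("sqs")

# Hypothetical queue URLs -- substitute your own.
QUEUE_URLS = {
    "high": "https://sqs.us-east-1.amazonaws.com/123456789012/trips-high-priority",
    "low": "https://sqs.us-east-1.amazonaws.com/123456789012/trips-low-priority",
}

def route_event(event: dict) -> None:
    """Send an event to the dedicated queue that matches its priority tag."""
    priority = event.get("priority", "low")  # default untagged events to low
    sqs.send_message(
        QueueUrl=QUEUE_URLS[priority],
        MessageBody=json.dumps(event),
        MessageAttributes={
            "priority": {"DataType": "String", "StringValue": priority}
        },
    )

route_event({"type": "ride_acceptance", "priority": "high"})
route_event({"type": "location_ping", "priority": "low"})
```

Each queue then gets its own consumer pool, scaled independently of the other.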

Multiple Queues with a Single Consumer Pool

In this variation, messages are still split across multiple queues, but all queues share a single consumer pool.

Each consumer is capable of reading from multiple queues and usually polls them in priority order. High priority queues are checked first, ensuring important work is processed ahead of others, but the same consumer fleet handles all workloads.

This design simplifies scaling and resource management since all consumers belong to one group, but it requires careful tuning to prevent low priority tasks from being starved when high priority traffic spikes.

Payment gateways like Stripe and Razorpay use this model to balance transaction events against reporting tasks. The same set of consumers handles both queues but always processes transaction related events first to maintain payment reliability.
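A consumer from the shared pool might poll its queues in priority order along these lines; the queue URLs and the process handler are placeholders.

```python
import boto3

sqs = boto3.client("sqs")

# Hypothetical queue URLs, ordered highest priority first.
QUEUES_BY_PRIORITY = [
    "https://sqs.us-east-1.amazonaws.com/123456789012/transactions",
    "https://sqs.us-east-1.amazonaws.com/123456789012/reporting",
]

def process(body: str) -> None:
    # Placeholder for real business logic.
    print("processing:", body)

def poll_once() -> None:
    """Check queues highest priority first; stop at the first batch found."""
    for url in QUEUES_BY_PRIORITY:
        resp = sqs.receive_message(
            QueueUrl=url,
            MaxNumberOfMessages=10,
            WaitTimeSeconds=1,  # short waits so lower queues still get polled
        )
        messages = resp.get("Messages", [])
        for msg in messages:
            process(msg["Body"])
            sqs.delete_message(QueueUrl=url, ReceiptHandle=msg["ReceiptHandle"])
        if messages:
            return  # work found: restart from the top so high priority wins

while True:
    poll_once()
```

The `return` after a non-empty batch is the tuning knob the text mentions: it keeps high priority work first, but if high priority traffic never pauses, low priority queues starve.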

This is what the Priority Queue pattern looks like in the AWS world.

The entire flow is built around message segregation, routing logic, and isolated consumption.

  • Each message is tagged with a priority level (for example, High or Low) before it leaves the application.

  • Lambda evaluates message metadata and publishes it to the respective SQS queue, one for high priority and another for low priority traffic (see the sketch after this list).

    • High priority queue → shorter visibility timeout, faster retry policy, smaller batch size.

    • Low priority queue → longer retention, relaxed delivery rate, cost-optimized consumers.

  • Separate Lambda or ECS services poll each queue. Scaling is controlled by CloudWatch metrics such as queue depth or message age.

  • High priority consumers scale aggressively to drain bursts quickly. Low priority consumers maintain steady concurrency to save cost.

  • Each queue can have its own DLQ, alarm thresholds, and error visibility settings for faster troubleshooting.
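The routing step, where a Lambda function inspects the priority tag and fans messages out to the matching queue, might look like the following sketch. The event shape assumes records arriving from an upstream SQS trigger, and the environment variable names are hypothetical.

```python
import json
import os
import boto3

sqs = boto3.client("sqs")

# Hypothetical queue URLs, injected as environment variables.
HIGH_URL = os.environ["HIGH_PRIORITY_QUEUE_URL"]
LOW_URL = os.environ["LOW_PRIORITY_QUEUE_URL"]

def handler(event, context):
    """Router Lambda: read each record's priority tag and republish."""
    for record in event["Records"]:  # SQS trigger event shape
        body = json.loads(record["body"])
        target = HIGH_URL if body.get("priority") == "High" else LOW_URL
        sqs.send_message(QueueUrl=target, MessageBody=record["body"])
```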

Result: Critical workloads stay responsive, background jobs never block, and the system maintains predictable latency under uneven load.
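To make the per-queue settings above concrete, here is a hedged provisioning sketch with boto3; the queue names, timeouts, and retention values are illustrative choices, not recommendations.

```python
import json
import boto3

sqs = boto3.client("sqs")

# Dead-letter queue for the high priority path (one DLQ per queue in practice).
dlq_url = sqs.create_queue(QueueName="orders-high-dlq")["QueueUrl"]
dlq_arn = sqs.get_queue_attributes(
    QueueUrl=dlq_url, AttributeNames=["QueueArn"]
)["Attributes"]["QueueArn"]

# High priority: short visibility timeout, aggressive redrive to the DLQ.
sqs.create_queue(
    QueueName="orders-high-priority",
    Attributes={
        "VisibilityTimeout": "30",         # seconds; retry failed work quickly
        "MessageRetentionPeriod": "3600",  # 1 hour; bursts should drain fast
        "RedrivePolicy": json.dumps(
            {"deadLetterTargetArn": dlq_arn, "maxReceiveCount": "3"}
        ),
    },
)

# Low priority: longer retention, relaxed visibility timeout.
sqs.create_queue(
    QueueName="orders-low-priority",
    Attributes={
        "VisibilityTimeout": "300",           # give cheap consumers time to finish
        "MessageRetentionPeriod": "1209600",  # 14 days, the SQS maximum
    },
)
```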

With this basic understanding, let us explore Machine Learning architectures in practice.

TOGETHER WITH THE CODE

Free, private email that puts your privacy first

Proton Mail’s free plan keeps your inbox private and secure—no ads, no data mining. Built by privacy experts, it gives you real protection with no strings attached.


Paid subscriptions get you:

  • Access to archive of 200+ use cases
  • Deep Dive use case editions (Thursdays and Saturdays)
  • Access to Private Discord Community
  • Invitations to monthly Zoom calls for use case discussions and industry leaders meetups
  • Quarterly 1:1 'Ask Me Anything' power session