Monitoring Azure Service Bus Topic subscriptions

Azure Monitor does not provide subscription-level metrics for Azure Service Bus Topics to visualize, alert or autoscale on; in this post, I'll provide some workarounds to make it possible.

Almost 6 years ago, I wrote my first article on this blog on how you can automatically dead-letter expired messages, what they are and how you can process them.

Today, we will talk about how you can monitor dead-lettered messages on Azure Service Bus Topics, what the challenges are and how to work around them.

But first… Azure Service Bus Topics?

What are Azure Service Bus Topics & Subscriptions?

Azure Service Bus is one of my favorite services in Microsoft Azure. It allows you to do asynchronous processing by using queues or topics, both of which are referred to as “entities” in your namespace. While queues are very common, topics are used a lot less, but they are super powerful when used well.

You can think of topics as queues on steroids: message producers publish messages to a topic with the same look and feel as a queue, but for the message processors, it’s different.

Subscribers do not process the topic directly; instead, each of them uses a subscription, which receives its own copy of every message. Next to that, you can create message rules on your subscriptions, allowing you to reduce the messages to only what you are interested in.

Message processors that are consuming different subscriptions will not compete for messages since they have their own copy, allowing you to use pub/sub and fan the processing out; 1 published message can go to n different processors by using n subscriptions.

When putting all of it together, it looks like this under the hood:

For example, if you have a topic that contains events of what is going on in your platform, you can create a subscription to filter out only the OrderV1 messages to process.
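To make the fan-out and filtering a bit more concrete, here is a minimal in-memory sketch in Python. It is not the Service Bus SDK: the `Topic`, `Subscription` and `Message` classes, the subscription names and the `MessageType` property are all hypothetical and only illustrate the idea that every matching subscription gets its own copy.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Message:
    body: str
    properties: Dict[str, str] = field(default_factory=dict)


def accept_all(message: Message) -> bool:
    """Default rule: the subscription receives every published message."""
    return True


@dataclass
class Subscription:
    name: str
    rule: Callable[[Message], bool] = accept_all
    messages: List[Message] = field(default_factory=list)


@dataclass
class Topic:
    subscriptions: List[Subscription] = field(default_factory=list)

    def publish(self, message: Message) -> None:
        # Each subscription whose rule matches gets its own copy of the message.
        for subscription in self.subscriptions:
            if subscription.rule(message):
                copy = Message(message.body, dict(message.properties))
                subscription.messages.append(copy)


all_events = Subscription("all-events")
orders_v1 = Subscription(
    "orders-v1",
    # Hypothetical rule: only keep messages flagged as OrderV1.
    rule=lambda message: message.properties.get("MessageType") == "OrderV1",
)
platform_events = Topic([all_events, orders_v1])

platform_events.publish(Message("order created", {"MessageType": "OrderV1"}))
platform_events.publish(Message("parcel shipped", {"MessageType": "ShipmentV1"}))

print(len(all_events.messages), len(orders_v1.messages))  # prints: 2 1
```

Because both subscriptions hold separate copies, a processor failing on `orders-v1` has no effect on `all-events`, which is exactly why dead-lettering happens per subscription.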

What is the challenge?

When using topics, messages are not dead-lettered on the topic itself but on each individual subscription, since every subscription has its own copy of every message and processes it separately.

It is highly recommended to make sure that you gain insights when the processing of messages is failing; you can do this by using dashboards and automated alerts. Next to that, when building scalable solutions you’d like to automatically scale in/out based on the work that is waiting to be processed. But in order to do all of that, you need to use metrics and this is where the problem lies…

Azure Service Bus provides a variety of out-of-the-box metrics in Azure Monitor, such as “DeadletteredMessages”, which gives you the total dead-lettered message count for all your queues and topics (also called entities). Luckily, this is a multi-dimensional metric, so you can split or filter it to get more detailed information.

However, when looking into the details of the metrics, in our case “DeadletteredMessages”, you can see that the only dimension that is available is “EntityName”, which is either a queue or a topic; there are no subscription-level metrics.

Learn more on how to use these metrics in their documentation here.

Those who have worked with Azure Service Bus for a while might still remember the old old SDK which allowed you to get queue metadata which included the message counts for queues, topics & subscriptions (sample); but obviously, we should not rely on that SDK anymore since it’s no longer maintained and only supports .NET Framework.

Read more about Service Bus SDKs in Sean’s great blog post here.

After digging around, it turns out that there is no real way to gain insights on message count for active, dead-lettered, and other messages on subscriptions.

Frankly, I was pretty amazed to find that out since we have a lot of customers who need this and Promitor users are actively asking for them as well. So what can we do?

So… what are our options?

Let’s take a look at a few options, but I want to emphasize that the majority of them apply mainly to dead-lettered messages and less to active messages because they require moving messages around.

However, these are just patterns so it’s up to you to decide what works best for your scenario!

Sit back and wait

We can wait and vote on the features in UserVoice here and here, but they have been open for a very long time, so I’m not getting my hopes up and decided to look at some alternatives.

Automatically forwarding dead-letter messages to a dedicated queue

The cleanest and simplest approach is to use the auto-forwarding of dead-letter messages to a dedicated queue.

This allows you to rely on a built-in feature and simply use the built-in Azure Monitor metrics of that queue to visualize and alert on. Super easy! 🎉

Here is what it looks like:

That way there is a very low impact on your application infrastructure and no additional things to worry about.

However, if your topic subscription contains different message types, you will not have granular insights on how many dead-lettered messages you have per message type.
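As a rough sketch of how this could be configured from Python, assuming the `azure-servicebus` package and its `ServiceBusAdministrationClient`: the subscription is created with `forward_dead_lettered_messages_to` pointing at the dedicated queue. The two function names and all entity names are hypothetical placeholders; adapt them to your own namespace.

```python
from typing import Any, Dict


def dead_letter_forwarding_options(dead_letter_queue: str) -> Dict[str, Any]:
    """Subscription options that auto-forward dead-lettered messages to a queue."""
    if not dead_letter_queue:
        raise ValueError("dead_letter_queue must not be empty")
    return {"forward_dead_lettered_messages_to": dead_letter_queue}


def create_subscription_with_forwarding(
    connection_string: str, topic: str, subscription: str, dead_letter_queue: str
) -> None:
    # Imported here so the helper above works without the SDK installed.
    # Assumes: pip install azure-servicebus
    from azure.servicebus.management import ServiceBusAdministrationClient

    client = ServiceBusAdministrationClient.from_connection_string(connection_string)
    with client:
        # The target queue has to exist before a subscription can forward to it;
        # this call fails if the queue was already created.
        client.create_queue(dead_letter_queue)
        client.create_subscription(
            topic, subscription, **dead_letter_forwarding_options(dead_letter_queue)
        )
```

With one dedicated queue per subscription, the existing “EntityName” dimension is enough to tell the subscriptions apart in Azure Monitor.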

Enriching & forwarding dead-letter messages to a dedicated queue

Do you need more granular insights? Time to do some coding!

Instead of automatically forwarding dead-letter messages, you can write an Azure Function that processes every dead-lettered message to measure metrics with all the dimensions you need and forward it to a dedicated queue.

This approach is super powerful because you gain deep insights on what messages are failing so that you can create dedicated alerts & visualization:

The downside, however, is that we need to deploy, run and operate an Azure Function that can go down and is burning money. If this one fails, so does our monitoring; up to you to decide if you can justify the risk for it.

Next to that, you'll have to pay for the message operations to forward and process them again.
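To give an idea of the measuring step, here is a small Python sketch. It is not the Arcus Observability implementation: `metric_dimensions` and `DeadLetterMetric` are hypothetical names, `MessageType` is an assumed application property, and the in-memory counter stands in for whatever metric sink you actually use. Forwarding the message to the dedicated queue is left out.

```python
from collections import Counter
from typing import Dict, Mapping, Optional


def metric_dimensions(
    topic: str,
    subscription: str,
    application_properties: Mapping[str, str],
    dead_letter_reason: Optional[str],
) -> Dict[str, str]:
    """Derive the dimensions to report for one dead-lettered message."""
    return {
        "Topic": topic,
        "Subscription": subscription,
        # "MessageType" is an assumed application property set by the publisher.
        "MessageType": application_properties.get("MessageType", "unknown"),
        "DeadLetterReason": dead_letter_reason or "unknown",
    }


class DeadLetterMetric:
    """In-memory stand-in for a multi-dimensional metric sink."""

    def __init__(self) -> None:
        self._counts: Counter = Counter()

    def track(self, dimensions: Mapping[str, str]) -> None:
        self._counts[frozenset(dimensions.items())] += 1

    def count(self, **filters: str) -> int:
        # Sum every series whose dimensions contain all requested filters.
        wanted = set(filters.items())
        return sum(n for key, n in self._counts.items() if wanted <= key)


metric = DeadLetterMetric()
metric.track(metric_dimensions(
    "platform-events", "orders", {"MessageType": "OrderV1"}, "MaxDeliveryCountExceeded"))
metric.track(metric_dimensions(
    "platform-events", "orders", {"MessageType": "OrderV2"}, "TTLExpiredException"))
metric.track(metric_dimensions(
    "platform-events", "orders", {"MessageType": "OrderV1"}, None))
```

Here `metric.count(MessageType="OrderV1")` gives the per-message-type view that plain auto-forwarding cannot provide, while `metric.count(Subscription="orders")` still gives the subscription total.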

Sounds interesting? Just deploy this Azure Function which is powered by Arcus Observability and you are good to go!

Automatically forwarding all messages to a dedicated queue

Want to go even further? You can forward all messages to a dedicated queue and process everything there.

That way, the topic subscription still filters the messages, while the dedicated queue gives you all the built-in metrics, including the active message count.

Conclusion

With these workarounds, you can now monitor your Azure Service Bus Topics & Subscriptions, but each of them has its trade-offs.

If you want to deploy this yourself and give it a try, you can go to this GitHub repo and use the message simulators to generate some traffic to see it in action.

Don’t forget to vote on these UserVoice items and let’s hope we will have subscription-level metrics in 2021!

Thanks for reading,

Tom.

Cover photo by Willian Justen de Vasconcellos