Sitecore 9 in Azure PaaS offers a robust ecosystem for monitoring the performance of your application and infrastructure. Metrics come from two places: the built-in metrics of the PaaS environment and the metrics logged to Application Insights. Sitecore also captures custom metrics relevant to Sitecore operations and pushes them to App Insights. By default a lot of data is captured, so much that it’s worth trimming any metrics you don’t require. Check out Sitecore’s post-deployment recommendations for App Insights.
Application Insights is amazingly powerful, allowing you to build queries (in the Kusto query language), but with great power comes great responsibility. Querying to trawl logs is one thing, but I want to see the health of my Sitecore instance at a glance. The Azure portal offers a customisable dashboard system, so you can be greeted with graphs, metrics and labels, then quickly switch between dashboards (possibly for other projects or environments). Graphs and charts can easily be customised in the metrics blade, then added to your dashboard by selecting “Pin to dashboard”. Dashboards can also be shared with other users in your Azure subscription. Microsoft have provided some solid documentation on creating, customising and sharing dashboards.
Usually I’ll try to get as much relevant data as possible into two sections: one for public-facing metrics (e.g. CD metrics, uptime, response times) and one for Sitecore “behind the scenes” (CM, processing and reporting metrics). There are some key metrics I like to keep an eye on. These are usually indicators that something might be wrong and that it’s time to investigate the cause by drilling down or hitting the logs.
Azure SQL databases
- DTU Percentage
App Services & App Service plans
- CPU Percentage
- Memory Percentage
- Avg response time
- Exceptions (sum, split by role)
- Sitecore.Caching / Cache Misses /s
- Failed requests (exposes all HTTP responses >= 400)
- Availability statistics
- Live Stream (just an overview and link)
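Two of the Application Insights items above, exceptions split by role and failed requests, can also be written as Kusto queries. The sketch below is a minimal Python helper that only builds the query text and does not call Azure. The function names and the one-hour default window are illustrative assumptions. It also assumes the standard Application Insights `exceptions` and `requests` tables with their `timestamp`, `cloud_RoleName` and `resultCode` columns.

```python
def exceptions_by_role_query(window: str = "1h") -> str:
    """Build a Kusto query that counts exceptions per role.

    Hypothetical helper: the name and the default window are illustrative.
    """
    return (
        "exceptions\n"
        f"| where timestamp > ago({window})\n"
        "| summarize exceptionCount = count() by cloud_RoleName\n"
        "| order by exceptionCount desc"
    )


def failed_requests_query(window: str = "1h") -> str:
    """Build a Kusto query that counts HTTP responses >= 400 per role and status code.

    Hypothetical helper: treats any result code of 400 or above as a failure.
    """
    return (
        "requests\n"
        f"| where timestamp > ago({window})\n"
        "| where toint(resultCode) >= 400\n"
        "| summarize failedCount = count() by cloud_RoleName, resultCode\n"
        "| order by failedCount desc"
    )


if __name__ == "__main__":
    # Print the query text so it can be pasted into the Analytics query window.
    print(exceptions_by_role_query())
    print()
    print(failed_requests_query("24h"))
```

The printed text can be pasted into the Application Insights query window and adjusted from there. If sampling is enabled, the raw counts will understate the real volume.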
Adding all of these pretty much fills a decent-sized dashboard, but there are some other metrics that are good indicators of health too. As they may not be front and centre, it may be worth baselining them and adding (email or webhook) alerts for when they exceed your thresholds (you can do this for almost all metrics!).
- Sitecore.Analytics/ Aggregation Live Interactions Processed /s
- Sitecore.Analytics/ Aggregation Contacts Processed /s
- Search latency
- Search queries /s
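As a rough illustration of the baselining idea above, here is a small Python sketch that turns a set of baseline readings into a suggested alert threshold. The rule used (mean plus three sample standard deviations) is just one common rule of thumb and an assumption on my part, not something Sitecore or Azure prescribes. The function names and the sample latency values are made up for the example.

```python
from statistics import mean, stdev


def alert_threshold(baseline_samples, sigmas=3.0):
    """Suggest a threshold as mean + sigmas * sample standard deviation.

    Hypothetical helper: the three-sigma default is an assumed rule of thumb.
    """
    if len(baseline_samples) < 2:
        raise ValueError("need at least two baseline samples")
    return mean(baseline_samples) + sigmas * stdev(baseline_samples)


def breaches(samples, threshold):
    """Return the readings that sit above the threshold."""
    return [value for value in samples if value > threshold]


if __name__ == "__main__":
    # Made-up search latency readings in milliseconds, for illustration only.
    baseline = [40.0, 42.0, 38.0, 41.0, 39.0]
    threshold = alert_threshold(baseline)
    print(f"suggested threshold: {threshold:.1f} ms")
    print("breaches:", breaches([41.0, 55.0, 39.5], threshold))
```

The resulting number is only a starting point for the threshold you enter when creating the alert rule. A metric with strong daily or weekly patterns will need a baseline taken over a representative period.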
The list of metrics Sitecore logs by default can be found in App_Config/Include/zzz/Sitecore.Cloud.ApplicationInsights.config. Take a look and see what may be relevant to your solution for each role.
From here you can continue to add to and tweak your dashboards to best suit your solution.