IoT Devices Not Reporting

In the previous chapter we discussed bucket_script aggregations, which allow for per-bucket computations. Combined with a scripted_metric aggregation, they open up other practical applications.
Let me illustrate.
I have a fleet of devices, each of which posts a message to ES every 10 minutes in the form of:
{
  "deviceId": "unique-device-id",
  "timestamp": "2021-01-19 06:54:00",
  "message": "morning ping at 06:54 AM"
}
I'm trying to get a sense of the health of this fleet by finding devices that haven't reported anything in a given period of time. What I dream of is getting:
- the total count of distinct deviceIds seen in the last 7 days
- the total count of deviceIds NOT seen in the last hour
- the IDs of the devices that stopped reporting (→ reported in the last 2hrs but not the last 1h)
PUT fleet_messages
{
  "mappings": {
    "properties": {
      "timestamp": {
        "type": "date",
        "format": "yyyy-MM-dd HH:mm:ss"
      },
      "message": {
        "type": "text"
      },
      "deviceId": {
        "type": "keyword"
      }
    }
  }
}
POST fleet_messages/_doc
{
  "deviceId": "device#1",
  "timestamp": "2021-01-14 10:00:00",
  "message": "device#1 in the last week"
}
POST fleet_messages/_doc
{
  "deviceId": "device#1",
  "timestamp": "2021-01-20 15:40:00",
  "message": "device#1 in the last 2 hours"
}
POST fleet_messages/_doc
{
  "deviceId": "device#1",
  "timestamp": "2021-01-20 16:52:00",
  "message": "device#1 in the last hour"
}
POST fleet_messages/_doc
{
  "deviceId": "device#2",
  "timestamp": "2021-01-15 09:00:00",
  "message": "device#2 in the last week"
}
POST fleet_messages/_doc
{
  "deviceId": "device#2",
  "timestamp": "2021-01-20 15:58:00",
  "message": "device#2 in the last 2hrs"
}
With these documents indexed, let's assume it's exactly 5 PM on Jan 20, 2021.
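Before building any query, it can help to pin down what answers the sample data should produce. The following is a minimal Python sketch, not an Elasticsearch request: it reduces the five documents to (deviceId, timestamp) pairs and uses plain set arithmetic. It assumes rolling windows measured back from 5 PM, and it assumes that "NOT seen in the last hour" means devices seen in the last 7 days minus those seen in the last hour. The helper name seen_since is made up for illustration.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"  # mirrors the mapping's "yyyy-MM-dd HH:mm:ss"

# The five sample documents, reduced to (deviceId, timestamp).
docs = [
    ("device#1", "2021-01-14 10:00:00"),
    ("device#1", "2021-01-20 15:40:00"),
    ("device#1", "2021-01-20 16:52:00"),
    ("device#2", "2021-01-15 09:00:00"),
    ("device#2", "2021-01-20 15:58:00"),
]

now = datetime(2021, 1, 20, 17, 0, 0)  # "exactly 5 PM on Jan 20, 2021"


def seen_since(delta):
    """Return the deviceIds with at least one message in [now - delta, now]."""
    start = now - delta
    return {dev for dev, ts in docs if start <= datetime.strptime(ts, FMT) <= now}


last_7d = seen_since(timedelta(days=7))
last_2h = seen_since(timedelta(hours=2))
last_1h = seen_since(timedelta(hours=1))

seen_last_7d = len(last_7d)                   # distinct devices in the last 7 days
not_seen_last_1h = len(last_7d - last_1h)     # assumption: relative to the 7-day fleet
stopped_reporting = sorted(last_2h - last_1h) # in the last 2hrs but not the last 1h

print(seen_last_7d, not_seen_last_1h, stopped_reporting)
```

On this data both devices were seen in the last 7 days, and device#2 is the only one that went quiet in the last hour.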
For the first count, we're going to use a range filter to restrict the timestamp, plus a cardinality aggregation to obtain the unique device count. In pseudo-code:
"last7d": {
"filter":
"range": "2021-01-13 <= timestamp <= 2021-01-20"
"aggs":
"cardinality": "on the field deviceId"
}
"last7d": {
  "filter": {
    "range": {
      "timestamp": {
        "gte": "2021-01-13 00:00:00",
        "lte": "2021-01-20 17:00:00"
      }
    }
  },
  "aggs": {
    "uniq_device_count": {
      "cardinality": {
        "field": "deviceId"
      }
    }
  }
}
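The last7d aggregation above is only a fragment; to run it, it has to sit under an "aggs" key in a search body. Here is a small Python sketch that assembles such a body for `POST fleet_messages/_search` and prints it as JSON. The variable name last7d_body and the "size": 0 setting (which skips the hits and returns only aggregations) are additions of this sketch, not something the text above prescribes.

```python
import json

# Full request body wrapping the last7d aggregation from the text.
last7d_body = {
    "size": 0,  # assumption: we only care about the aggregation results
    "aggs": {
        "last7d": {
            "filter": {
                "range": {
                    "timestamp": {
                        "gte": "2021-01-13 00:00:00",
                        "lte": "2021-01-20 17:00:00",
                    }
                }
            },
            "aggs": {
                "uniq_device_count": {
                    "cardinality": {"field": "deviceId"}
                }
            },
        }
    },
}

# Paste the printed JSON into the console, or send it with any HTTP client.
print(json.dumps(last7d_body, indent=2))
```

In the response, the count should appear under aggregations.last7d.uniq_device_count.value; for the five sample documents that would be 2.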
Similarly to the first count, we're going to use filter aggregations. This time, though, we'll need two of them.
On top of that, we'll wrap them in a filters aggregation, because a bucket_script aggregation can only run inside a multi-bucket aggregation and filters is one of the aggregations that produces multi-bucket results.
This bucket script will then produce our final number: the count of devices that were reporting between 3:00 and 3:59 PM but haven't reported anything since 4 PM.
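One way this could be put together is sketched below in Python, again just building the request body as a dict. Because the last hour is contained in the last two hours, subtracting the two distinct counts leaves the devices that reported between 3 PM and 4 PM but have been silent since. The names all_devices, all, last2h, last1h, uniq_device_count_diff and the helper uniq_devices_in are hypothetical; only uniq_device_count comes from the earlier query.

```python
import json


def uniq_devices_in(gte, lte):
    """Filter aggregation on a timestamp range with a cardinality sub-aggregation."""
    return {
        "filter": {"range": {"timestamp": {"gte": gte, "lte": lte}}},
        "aggs": {"uniq_device_count": {"cardinality": {"field": "deviceId"}}},
    }


stopped_body = {
    "size": 0,
    "aggs": {
        # A single catch-all bucket; it exists only so that bucket_script
        # has a multi-bucket parent to run in.
        "all_devices": {
            "filters": {"filters": {"all": {"match_all": {}}}},
            "aggs": {
                "last2h": uniq_devices_in("2021-01-20 15:00:00", "2021-01-20 17:00:00"),
                "last1h": uniq_devices_in("2021-01-20 16:00:00", "2021-01-20 17:00:00"),
                "uniq_device_count_diff": {
                    "bucket_script": {
                        # ">" walks from each filter into its cardinality metric
                        "buckets_path": {
                            "last2h": "last2h>uniq_device_count",
                            "last1h": "last1h>uniq_device_count",
                        },
                        "script": "params.last2h - params.last1h",
                    }
                },
            },
        }
    },
}

print(json.dumps(stopped_body, indent=2))
```

With the sample documents, the value under aggregations.all_devices.buckets.all.uniq_device_count_diff should be 1 (device#2). Keep in mind that cardinality is an approximate count on large data sets, so the difference is an estimate there rather than an exact figure.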