$densify
$densify generates synthetic documents to close gaps in numeric or time-based series. Use it after sorting (and optionally partitioning) to ensure downstream stages operate on regular intervals.
📌 Syntax
{
  "$densify": {
    "field": "<path>",
    "partitionByFields": [ "<field>", ... ], // optional
    "range": {
      "step": <number>,
      "unit": "day" | "hour" | ..., // optional (dates)
      "bounds": "full" | [ <lower>, <upper> ]
    }
  }
}
- field: Numeric or date field to densify. For date fields, also supply a unit (millisecond, second, minute, hour, or day).
- partitionByFields: Fields whose values group documents into partitions; densification runs independently within each partition.
- bounds: "full" spans from the smallest to the largest existing value of field; an array [ <lower>, <upper> ] sets the range explicitly.
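To make the array form of bounds concrete, here is a minimal sketch of a stage with an explicit window, written as a JavaScript object as it might be passed from a driver or shell. The field names ts and sensorId and the collection name readings are hypothetical. In MongoDB's implementation the upper bound is exclusive; verify this against the server you are targeting.

```javascript
// Hypothetical field names: "ts" (date field) and "sensorId" (partition key).
const densifyStage = {
  $densify: {
    field: "ts",
    partitionByFields: ["sensorId"],
    range: {
      step: 1,
      unit: "hour",
      // Explicit bounds: generate hourly documents inside this window,
      // regardless of where the existing data starts or ends.
      bounds: [
        new Date("2024-01-01T00:00:00Z"),
        new Date("2024-01-01T06:00:00Z"),
      ],
    },
  },
};

// Possible usage (collection name is an assumption):
// db.readings.aggregate([densifyStage])
```

Explicit bounds are useful when every partition should cover the same reporting window, even if some partitions have no data near its edges.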
🛒 Example – Fill Missing Daily Observations
{
  "$densify": {
    "field": "eventDate",
    "partitionByFields": ["sku"],
    "range": {
      "step": 1,
      "unit": "day",
      "bounds": "full"
    }
  }
}
If only 2024-01-01 and 2024-01-03 exist for a SKU, $densify inserts a synthetic document for 2024-01-02 that carries only the partition field (sku) and eventDate; all other fields are absent.
📥 Input
[
  { "sku": "A", "eventDate": { "$date": "2024-01-01" }, "quantity": 5 },
  { "sku": "A", "eventDate": { "$date": "2024-01-03" }, "quantity": 7 }
]
📤 Output
[
  { "sku": "A", "eventDate": { "$date": "2024-01-01" }, "quantity": 5 },
  { "sku": "A", "eventDate": { "$date": "2024-01-02" } },
  { "sku": "A", "eventDate": { "$date": "2024-01-03" }, "quantity": 7 }
]
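The gap-filling behavior in this example can be modeled in plain JavaScript, which may help when reasoning about which rows the stage will generate. This is a simplified sketch, not the server's implementation: the helper densifyDaily and its parameters are hypothetical, it handles only a one-day step, and it assumes "full" bounds span the minimum and maximum date across all input documents.

```javascript
// Simplified model of $densify with step 1, unit "day", bounds "full".
// densifyDaily is a hypothetical helper written for illustration only.
const DAY_MS = 24 * 60 * 60 * 1000;

function densifyDaily(docs, field, partitionKey) {
  if (docs.length === 0) return [];

  // Assumption: "full" uses one range shared by every partition.
  const times = docs.map((d) => d[field].getTime());
  const lo = Math.min(...times);
  const hi = Math.max(...times);

  // Group documents by partition key, preserving first-seen order.
  const partitions = new Map();
  for (const doc of docs) {
    const key = doc[partitionKey];
    if (!partitions.has(key)) partitions.set(key, []);
    partitions.get(key).push(doc);
  }

  const out = [];
  for (const [key, group] of partitions) {
    const seen = new Set(group.map((d) => d[field].getTime()));
    const rows = [...group]; // original documents are always kept
    for (let t = lo; t <= hi; t += DAY_MS) {
      // Emit a gap row holding only the partition key and the densified field.
      if (!seen.has(t)) rows.push({ [partitionKey]: key, [field]: new Date(t) });
    }
    rows.sort((a, b) => a[field].getTime() - b[field].getTime());
    out.push(...rows);
  }
  return out;
}

const input = [
  { sku: "A", eventDate: new Date("2024-01-01T00:00:00Z"), quantity: 5 },
  { sku: "A", eventDate: new Date("2024-01-03T00:00:00Z"), quantity: 7 },
];
const result = densifyDaily(input, "eventDate", "sku");
console.log(result.map((d) => d.eventDate.toISOString().slice(0, 10)));
// prints [ '2024-01-01', '2024-01-02', '2024-01-03' ]
```

The generated row for 2024-01-02 has no quantity property, matching the output shown above.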
💡 Tips
- Run $sort beforehand so documents arrive ordered by partition and densify field.
- Combine with $fill to backfill values (linear interpolation, LOCF, or constants) after generating gap rows.
- Bounds array values accept literals, numeric epochs, or ISO date strings.
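The first two tips can be combined into one pipeline. The following is a sketch only, reusing the sku, eventDate, and quantity fields from the example above; the collection name inventory is hypothetical, and $fill availability depends on the server version, so confirm it is supported before relying on it.

```javascript
// Sketch: sort, densify daily gaps per SKU, then carry the last
// observed quantity forward into the generated rows (LOCF).
const pipeline = [
  { $sort: { sku: 1, eventDate: 1 } },
  {
    $densify: {
      field: "eventDate",
      partitionByFields: ["sku"],
      range: { step: 1, unit: "day", bounds: "full" },
    },
  },
  {
    $fill: {
      partitionByFields: ["sku"],
      sortBy: { eventDate: 1 },
      // Alternatives: { method: "linear" } or a constant such as { value: 0 }
      output: { quantity: { method: "locf" } },
    },
  },
];

// Possible usage (collection name is an assumption):
// db.inventory.aggregate(pipeline)
```

With the sample input above, LOCF would be expected to give the 2024-01-02 row a quantity of 5, while linear interpolation would be expected to give 6.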