
$substrBytes

The $substrBytes operator returns a substring of a string, measured in UTF-8 bytes. The substring starts at a specified zero-based byte index and spans a specified number of bytes.


📌 Syntax

{ "$substrBytes": [ <string>, <startByte>, <byteLength> ] }
  • <string>: The input string
  • <startByte>: Byte offset (starting from 0)
  • <byteLength>: Number of bytes to return
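
To make the byte-based semantics concrete, here is a minimal Python sketch that emulates the operator on the UTF-8 encoding of a string. The function name substr_bytes is hypothetical, and this is an illustration of the idea rather than the server's implementation. It assumes non-negative arguments and raises an error when a boundary falls inside a multi-byte character.

```python
def substr_bytes(s: str, start: int, length: int) -> str:
    """Emulate a byte-based substring over the UTF-8 encoding of s."""
    if start < 0 or length < 0:
        # Assumption of this sketch: only non-negative arguments are accepted.
        raise ValueError("start and length must be non-negative")
    data = s.encode("utf-8")
    chunk = data[start:start + length]
    # Decoding raises UnicodeDecodeError if either boundary splits
    # a multi-byte character.
    return chunk.decode("utf-8")


print(substr_bytes("ABC123", 0, 3))
print(substr_bytes("product.csv", 0, 7))
```

For pure ASCII input, byte offsets and character offsets coincide, so the two calls reproduce the base examples on this page.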

✅ Base Example 1 – Extract First 3 Bytes (ASCII safe)

📥 Input Document

{ "code": "ABC123" }

📌 Expression

{ "$substrBytes": ["$code", 0, 3] }

📤 Output

"ABC"

✅ Base Example 2 – Byte Truncation of File Name

📥 Input Document

{ "filename": "product.csv" }

📌 Expression

{ "$substrBytes": ["$filename", 0, 7] }

📤 Output

"product"

🧱 Ecommerce Example – Shorten Product Codes for Export

📌 Pipeline

[
  { "$unwind": "$items" },
  {
    "$project": {
      "_id": 0,
      "shortCode": {
        "$substrBytes": ["$items.sku", 0, 5]
      },
      "product": "$items.name"
    }
  }
]

📥 Input Document

{
  "items": [
    { "sku": "CODE99999", "name": "Bag" },
    { "sku": "TOOL88888", "name": "Drill" }
  ]
}

📤 Output

[
  { "shortCode": "CODE9", "product": "Bag" },
  { "shortCode": "TOOL8", "product": "Drill" }
]
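
The pipeline can be traced outside the database with a short Python sketch. It is an emulation under stated assumptions, not driver code: the helper short_code is hypothetical, the list comprehension stands in for $unwind followed by $project, and the SKUs are assumed to be plain ASCII.

```python
doc = {
    "items": [
        {"sku": "CODE99999", "name": "Bag"},
        {"sku": "TOOL88888", "name": "Drill"},
    ]
}


def short_code(sku: str, n: int = 5) -> str:
    # Take the first n UTF-8 bytes; safe here because the SKUs are ASCII.
    return sku.encode("utf-8")[:n].decode("utf-8")


# One output record per array element (like $unwind),
# keeping only the two projected fields (like $project).
result = [
    {"shortCode": short_code(item["sku"]), "product": item["name"]}
    for item in doc["items"]
]
print(result)
```

The printed list matches the output shown above.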

🔧 Common Use Cases

  • Exporting strings that must fit a byte limit
  • Handling legacy single-byte encodings (e.g., ASCII)
  • Slicing fields for fixed-byte-width systems
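
For the byte-limit use case, an export step on the application side may need to cap a value at a byte budget without splitting a character. The following Python sketch shows one way to do that. The helper truncate_to_bytes is hypothetical and is not part of any MongoDB API.

```python
def truncate_to_bytes(s: str, max_bytes: int) -> str:
    """Return the longest prefix of s that fits in max_bytes of UTF-8."""
    data = s.encode("utf-8")[:max_bytes]
    # The input was valid UTF-8 and the slice starts at byte 0, so only a
    # trailing partial character can be invalid; errors="ignore" drops it.
    return data.decode("utf-8", errors="ignore")


print(truncate_to_bytes("product.csv", 7))
print(truncate_to_bytes("caf\u00e9 au lait", 4))
```

The second call returns a 3-byte result, because the accented character needs 2 bytes and only 1 byte of budget remains.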

  • Related operators: $substr, $substrCP, $slice, $split, $indexOfBytes

🧠 Notes

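  • Safest with ASCII text, where every character occupies exactly one byte.
  • In MongoDB, a start index or byte length that lands in the middle of a multi-byte UTF-8 character produces an error rather than a partial character. For slicing by Unicode code point, use $substrCP.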
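
The difference between counting bytes and counting code points can be seen in a few lines of Python. This is a sketch for illustration only: it uses Python's own string and bytes types, and the sample string is an assumption chosen because its last character takes two bytes in UTF-8.

```python
s = "caf\u00e9"  # four code points; the last one is two bytes in UTF-8

code_points = len(s)
byte_len = len(s.encode("utf-8"))
print(code_points, byte_len)

# Slicing by code point (the $substrCP view) keeps characters whole.
cp_slice = s[:4]

# Slicing by byte (the $substrBytes view) can stop inside a character.
byte_slice = s.encode("utf-8")[:4]
try:
    byte_slice.decode("utf-8")
    splits_character = False
except UnicodeDecodeError:
    splits_character = True
print(splits_character)
```

A byte count of 4 ends between the two bytes of the accented character, which is the situation the notes above warn about.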