
$substrCP

The $substrCP operator returns a substring of a string, measured in UTF-8 code points rather than bytes. The substring starts at a zero-based code point index and spans the specified number of code points.


📌 Syntax

{ "$substrCP": [ <string>, <startChar>, <charLength> ] }
  • <string>: The input string
  • <startChar>: Character index to start from (0-based)
  • <charLength>: Number of characters (code points) to return
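
The indexing rule can be modeled outside the database. The Python sketch below is an illustration only, not how the server implements the operator: the helper name `substr_cp` is hypothetical, and it relies on Python strings being indexed by code point.

```python
def substr_cp(text: str, start: int, length: int) -> str:
    """Model of $substrCP: `length` code points from 0-based index `start`."""
    if start < 0 or length < 0:
        # Assumption for this sketch: reject negative arguments outright.
        raise ValueError("start and length must be non-negative")
    # Python str slicing works on code points, the same unit the operator uses.
    return text[start:start + length]


print(substr_cp("Notebook", 0, 4))  # Note
print(substr_cp("Notebook", 4, 4))  # book
```

The three arguments map one-to-one onto `<string>`, `<startChar>`, and `<charLength>` in the syntax above.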

✅ Base Example 1 – Extract First 4 Characters

📥 Input Document

{ "title": "Notebook" }

📌 Expression

{ "$substrCP": ["$title", 0, 4] }

📤 Output

"Note"

✅ Base Example 2 – Multilingual-Safe Slicing

📥 Input Document

{ "label": "नमस्ते" }

📌 Expression

{ "$substrCP": ["$label", 0, 3] }

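📤 Output

"नमस"

The input has six code points (न, म, स, ्, त, े), so a length of 3 returns the three letters and stops before the virama (्) that follows स.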
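
To see which units the operator counts for this input, the string can be broken into code points. The following is a minimal Python sketch that uses only the standard `unicodedata` module; the variable name `label` simply mirrors the field in the example document.

```python
import unicodedata

label = "नमस्ते"

# List each code point with its Unicode name to show what is being counted.
for index, ch in enumerate(label):
    print(index, f"U+{ord(ch):04X}", unicodedata.name(ch))

print(len(label))  # number of code points in the string
print(label[:3])   # the first three code points
```

The listing shows that the word is made of letters plus combining signs, which is why a code point count does not always match the number of characters a reader perceives.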

🧱 Ecommerce Example – Shorten Product Names

📌 Pipeline

[
  { "$unwind": "$items" },
  {
    "$project": {
      "_id": 0,
      "shortName": {
        "$substrCP": ["$items.name", 0, 6]
      }
    }
  }
]

📥 Input Document

{
  "items": [
    { "name": "Bluetooth Speaker" },
    { "name": "Wireless Keyboard" }
  ]
}

📤 Output

[
  { "shortName": "Blueto" },
  { "shortName": "Wirele" }
]
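
The two stages can be traced by hand in plain Python. This is a rough emulation under stated assumptions, not driver code: the helper names `unwind_items` and `project_short_name` are hypothetical, and the sketch handles only the fields used in this example.

```python
doc = {
    "items": [
        {"name": "Bluetooth Speaker"},
        {"name": "Wireless Keyboard"},
    ]
}


def unwind_items(document):
    """Emit one copy of the document per element of `items`, like $unwind."""
    for item in document["items"]:
        yield {**document, "items": item}


def project_short_name(document, limit=6):
    """Keep only the first `limit` code points of the item name."""
    return {"shortName": document["items"]["name"][:limit]}


result = [project_short_name(d) for d in unwind_items(doc)]
print(result)  # [{'shortName': 'Blueto'}, {'shortName': 'Wirele'}]
```

After the unwind step each document carries a single item, which is why `$items.name` resolves to one string rather than an array by the time it reaches $substrCP.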

🔧 Common Use Cases

  • Multilingual-safe substring extraction
  • Character-length-limited output
  • Text formatting for display

🔗 Related Operators

  • $substr, $substrBytes, $split, $slice, $strLenCP

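🧠 Notes

  • Prefer $substrCP over $substrBytes (and its deprecated alias $substr) for strings that may contain non-ASCII text.
  • $substrBytes counts bytes, so a start index or length that falls inside a multibyte character such as an emoji would split it; MongoDB rejects such a slice with an error.
  • $substrCP counts code points, not user-perceived characters, so it can still separate a base letter from a combining mark that follows it.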
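
The difference between the two units is easy to reproduce in Python, where `str` is indexed by code point and `bytes` by byte. This is a sketch of the general UTF-8 behavior, not of MongoDB itself, and the sample string is made up for illustration.

```python
text = "café ☕"
encoded = text.encode("utf-8")

# The same string has different lengths depending on the unit.
print(len(text), len(encoded))  # 6 9

# Code point slice: always yields a valid string.
print(text[:4])  # café

# Byte slice: the fourth byte is only the first half of the two-byte "é".
try:
    encoded[:4].decode("utf-8")
except UnicodeDecodeError as exc:
    print("byte slice split a character:", exc.reason)
```

A byte offset is only safe when it lands on a character boundary, and counting in code points removes the need to check for that.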