Perplexity has open-sourced a rebuilt Unigram tokeniser designed to cut CPU usage by a factor of five to six and improve inference efficiency for smaller AI models. The tool targets XLM-RoBERTa's 250,000-token vocabulary, which is widely used in ranking and retrieval tasks. It produces the same output as the reference implementation while reducing processing overhead by avoiding costly string rebuilding and hash maps.
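
For readers unfamiliar with Unigram tokenisation, the following is a minimal sketch of the general technique, not Perplexity's code. It runs a Viterbi search for the highest-scoring segmentation and walks a trie by character index rather than slicing a new substring for every candidate token. The vocabulary, the log-probabilities, and the names `VOCAB`, `build_trie` and `unigram_segment` are all hypothetical. Python dicts are themselves hash maps, so this only illustrates the access pattern and says nothing about how the released tokeniser avoids them or about its performance.

```python
import math

# Hypothetical toy vocabulary: token -> log-probability (values are made up).
VOCAB = {
    "u": -4.0, "n": -4.0, "i": -3.0, "g": -4.0,
    "un": -2.0, "uni": -2.2, "gram": -2.5, "ram": -3.0,
}


def build_trie(vocab):
    """Build a nested-dict trie; the "" key marks the end of a token."""
    root = {}
    for token, log_prob in vocab.items():
        node = root
        for ch in token:
            node = node.setdefault(ch, {})
        node[""] = (token, log_prob)
    return root


def unigram_segment(text, trie):
    """Return the highest-scoring segmentation of text (Viterbi search)."""
    n = len(text)
    best = [-math.inf] * (n + 1)  # best[i]: best score for text[:i]
    back = [None] * (n + 1)       # back[i]: (start, token) of the last token
    best[0] = 0.0

    for start in range(n):
        if best[start] == -math.inf:
            continue  # no valid segmentation reaches this position
        node = trie
        # Walk the trie character by character; no substrings are created.
        for end in range(start, n):
            node = node.get(text[end])
            if node is None:
                break
            entry = node.get("")
            if entry is not None:
                token, log_prob = entry
                score = best[start] + log_prob
                if score > best[end + 1]:
                    best[end + 1] = score
                    back[end + 1] = (start, token)

    if best[n] == -math.inf:
        raise ValueError("text cannot be segmented with this vocabulary")

    # Follow the back-pointers from the end to recover the tokens.
    tokens = []
    pos = n
    while pos > 0:
        start, token = back[pos]
        tokens.append(token)
        pos = start
    tokens.reverse()
    return tokens


print(unigram_segment("unigram", build_trie(VOCAB)))  # ['uni', 'gram']
```

With this toy vocabulary, "uni" + "gram" scores -4.7 and beats alternatives such as "un" + "i" + "gram" at -7.5. A production tokeniser would also need to handle normalisation and unknown characters, which this sketch leaves out.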