TiktokenSwift

1.0.0

Swift bindings for OpenAI's tiktoken tokenizer using UniFFI. Count tokens, estimate costs, and manage context windows in your iOS and macOS apps.
narner/TiktokenSwift

What's New

1.0.0

2025-08-12T22:57:42Z

Initial Release of TiktokenSwift

Full Changelog: https://github.com/narner/TiktokenSwift/commits/1.0.0

TiktokenSwift

Native Swift wrapper for OpenAI's tiktoken library, providing fast BPE tokenization for OpenAI models.

TiktokenSwift brings the official tiktoken tokenizer to Swift applications through a lightweight FFI bridge, maintaining the same performance and accuracy as the original Python implementation. It supports all standard OpenAI encodings including cl100k_base (used by GPT-3.5-turbo and GPT-4), r50k_base, p50k_base, o200k_base (used by GPT-4o), and o200k_harmony (used by gpt-oss models).

📱 Check out the example SwiftUI app to see TiktokenSwift in action!

Installation

Swift Package Manager

Add TiktokenSwift to your project:

dependencies: [
    .package(url: "https://github.com/narner/TiktokenSwift.git", from: "1.0.0")
]
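
For context, a complete manifest that consumes the package might look like the sketch below. The package and target name `MyApp` are hypothetical, and the product name `TiktokenSwift` is an assumption based on the repository name. The version is pinned to the 1.0.0 release listed under What's New, and the platforms come from the Requirements section.

```swift
// swift-tools-version:5.9
import PackageDescription

let package = Package(
    name: "MyApp", // hypothetical package name
    platforms: [.iOS(.v13), .macOS(.v10_15)],
    dependencies: [
        .package(url: "https://github.com/narner/TiktokenSwift.git", from: "1.0.0")
    ],
    targets: [
        .target(
            name: "MyApp", // hypothetical target name
            dependencies: [
                // Assumption: the library product is named after the repository.
                .product(name: "TiktokenSwift", package: "TiktokenSwift")
            ]
        )
    ]
)
```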

Quick Start

import TiktokenSwift

// Load OpenAI's cl100k_base encoding
let encoder = try await CoreBpe.cl100kBase()

// Encode text
let text = "Hello, world!"
let tokens = encoder.encode(text: text, allowedSpecial: [])
print("Tokens: \(tokens)")

// Decode tokens (returns String? directly)
if let decoded = try encoder.decode(tokens: tokens) {
    print("Decoded: \(decoded)")
}

Available Encodings

// cl100k_base - Used by GPT-3.5-turbo and GPT-4
let cl100k = try await CoreBpe.cl100kBase()

// o200k_base - Used by GPT-4o and o3-mini
let o200k = try await CoreBpe.o200kBase()

// o200k_harmony - Used by gpt-oss models (structured output support)
let o200kHarmony = try await CoreBpe.o200kHarmony()

// Other encodings
let r50k = try await CoreBpe.r50kBase()    // GPT-2 and older models
let p50k = try await CoreBpe.p50kBase()    // Codex models

// Load by name
let encoder = try await CoreBpe.loadEncoding(named: "cl100k_base")
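
When the model is only known at runtime, a small lookup can pick the encoding name to pass to `loadEncoding(named:)`. The sketch below is not part of TiktokenSwift. `encodingName(forModel:)` is a hypothetical helper, and the model identifier strings are assumptions chosen to match the pairings listed above.

```swift
/// Maps a model identifier to an encoding name, following the pairings listed above.
/// Hypothetical helper; the identifier strings are assumptions for illustration.
func encodingName(forModel model: String) -> String? {
    let id = model.lowercased()
    // "gpt-oss models" covers a family, so match it by prefix.
    if id.hasPrefix("gpt-oss") { return "o200k_harmony" }
    // Everything else is matched exactly to avoid guessing for unlisted models.
    let table: [String: String] = [
        "gpt-3.5-turbo": "cl100k_base",
        "gpt-4": "cl100k_base",
        "gpt-4o": "o200k_base",
        "o3-mini": "o200k_base",
    ]
    return table[id]
}
```

The returned name can then be passed on, for example `try await CoreBpe.loadEncoding(named: name)`. Unlisted models return nil rather than falling back to a guess.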

Advanced Usage

Encoding with Special Tokens

let textWithSpecial = "Hello <|endoftext|> World"
let tokensWithSpecial = encoder.encode(
    text: textWithSpecial, 
    allowedSpecial: ["<|endoftext|>"]
)

// Or encode ordinary text (without special tokens)
let tokensOrdinary = encoder.encodeOrdinary(text: "Hello <|endoftext|> World")

// o200k_harmony has special tokens for structured output
let harmony = try await CoreBpe.o200kHarmony()
let structuredText = "Analyze <|constrain|> only positive <|return|> result"
let structuredTokens = harmony.encode(
    text: structuredText,
    allowedSpecial: ["<|constrain|>", "<|return|>"]
)
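
When text comes from users, it can help to know whether it contains special-token strings before choosing between `encode(text:allowedSpecial:)` and `encodeOrdinary(text:)`. The sketch below is a hypothetical helper, not a library API. The caller supplies the list of known special tokens.

```swift
import Foundation

/// Returns the entries of `known` that occur literally in `text`, in the order given.
/// Hypothetical helper for illustration only.
func specialTokens(in text: String, known: [String]) -> [String] {
    known.filter { text.contains($0) }
}
```

An empty result means the text contains none of the listed special-token strings.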

Working with Token Counts

// Get token count for text
let text = "The quick brown fox jumps over the lazy dog"
let tokens = encoder.encode(text: text, allowedSpecial: [])
print("Token count: \(tokens.count)")

// Useful for staying within a model's token limit
let maxTokens = 4096
if tokens.count > maxTokens {
    print("Text exceeds token limit")
}
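
Once you have a token array, trimming it to a budget and estimating cost are plain Swift. The sketch below assumes nothing about the library beyond `encode` returning an array. Both function names are hypothetical, and the price is a caller-supplied placeholder, not a real rate.

```swift
/// Keeps at most `limit - reservedForOutput` tokens from the front of `tokens`.
/// Generic over the element type, so it works with whatever `encode` returns.
func truncate<Token>(_ tokens: [Token], toLimit limit: Int, reservedForOutput: Int = 0) -> [Token] {
    let budget = max(0, limit - reservedForOutput)
    return Array(tokens.prefix(budget))
}

/// Estimates a cost in USD from a token count and a caller-supplied price.
/// The price is an input; no real pricing is assumed here.
func estimatedCostUSD(tokenCount: Int, usdPerMillionTokens: Double) -> Double {
    Double(tokenCount) / 1_000_000 * usdPerMillionTokens
}
```

Note that decoding a truncated sequence may not round-trip cleanly if the cut falls inside a multi-byte character.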

Model Token Limits

Common token limits for OpenAI models:

  • GPT-4: 8,192 tokens (standard), 32,768 tokens (32k), 128,000 tokens (turbo)
  • GPT-3.5-turbo: 4,096 tokens (standard), 16,385 tokens (16k)
  • GPT-4o: 128,000 tokens
  • o3-mini: 128,000 tokens
  • gpt-oss models: 128,000 tokens
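
The limits above can be kept in a lookup table so token counts are checked against the right ceiling. This is a minimal sketch. The dictionary keys are hypothetical labels, the values are copied from the list above, and `remainingTokens(model:used:)` is not a library function.

```swift
/// Context limits copied from the list above. Keys are hypothetical labels.
let contextLimits: [String: Int] = [
    "gpt-4": 8_192,
    "gpt-4-32k": 32_768,
    "gpt-4-turbo": 128_000,
    "gpt-3.5-turbo": 4_096,
    "gpt-3.5-turbo-16k": 16_385,
    "gpt-4o": 128_000,
    "o3-mini": 128_000,
    "gpt-oss": 128_000,
]

/// Returns how many tokens are left for a model, or nil for an unknown label.
/// A negative result means the limit has been exceeded.
func remainingTokens(model: String, used: Int) -> Int? {
    guard let limit = contextLimits[model] else { return nil }
    return limit - used
}
```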

Requirements

  • iOS 13.0+ / macOS 10.15+ / tvOS 13.0+ / watchOS 6.0+
  • Xcode 15.0+
  • Swift 5.9+

Architecture Support

  • iOS: arm64
  • iOS Simulator: arm64, x86_64
  • macOS: arm64, x86_64

License

MIT License - See LICENSE file for details.

Performance

TiktokenSwift uses the same Rust-based core as the official Python tiktoken library, providing:

  • Fast BPE tokenization optimized in Rust
  • Thread-safe encoding/decoding operations
  • Efficient memory usage with lazy vocabulary loading
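
Because encoding is described as thread-safe, token counting for many texts can be spread across cores. The sketch below is one possible approach, not the library's own API. `tokenCounts(for:using:)` is a hypothetical helper, and the `countTokens` closure stands in for a call such as `encoder.encode(text: $0, allowedSpecial: []).count`.

```swift
import Foundation

/// Counts tokens for each text in parallel and returns the counts in input order.
/// `countTokens` must be safe to call from multiple threads at once.
func tokenCounts(for texts: [String], using countTokens: (String) -> Int) -> [Int] {
    var counts = [Int](repeating: 0, count: texts.count)
    let lock = NSLock()
    DispatchQueue.concurrentPerform(iterations: texts.count) { index in
        let n = countTokens(texts[index]) // runs concurrently across iterations
        lock.lock()
        counts[index] = n                 // the shared array is written under the lock
        lock.unlock()
    }
    return counts
}
```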

Troubleshooting

Vocabulary Download Issues

The first time you use an encoding, the library downloads its vocabulary file (~1-2 MB) from OpenAI's servers. Downloaded files are cached in ~/Library/Caches/tiktoken/ for subsequent use.

If you encounter download issues:

  1. Check your internet connection
  2. Verify the cache directory has write permissions
  3. Try clearing the cache and re-downloading
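
Clearing the cache (step 3) can be done with `FileManager`. This is a sketch under the assumption that the cache lives in a `tiktoken` folder inside the user's caches directory, as described above. Both function names are hypothetical and not part of TiktokenSwift.

```swift
import Foundation

/// Removes a cache directory if it exists. Returns true if something was deleted.
@discardableResult
func clearCache(at directory: URL, fileManager: FileManager = .default) throws -> Bool {
    guard fileManager.fileExists(atPath: directory.path) else { return false }
    try fileManager.removeItem(at: directory)
    return true
}

/// Assumed default location: the "tiktoken" folder inside the user's caches directory.
func defaultTiktokenCacheDirectory() -> URL? {
    let caches = FileManager.default.urls(for: .cachesDirectory, in: .userDomainMask).first
    return caches?.appendingPathComponent("tiktoken", isDirectory: true)
}
```

The next use of an encoding should then download its vocabulary file again.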

Acknowledgments

This project provides Swift bindings for tiktoken, originally developed by OpenAI.

Description

  • Swift Tools 5.9.0

Dependencies

  • None
Last updated: Thu Apr 09 2026 07:26:33 GMT-0900 (Hawaii-Aleutian Daylight Time)