whisperkit

0.6.0

Swift native on-device speech recognition with Whisper for Apple Silicon
argmaxinc/WhisperKit

What's New

v0.6.0

2024-04-18T06:22:52Z

Highlights

  • Async batch transcription is here 🎉 contributed by @jkrukowski
    • With this release, you can transcribe multiple audio files concurrently, fully utilizing the new async prediction APIs released with iOS 17/macOS 14 (see the accompanying WWDC video).
    • New interface with audioPaths input:

          let audioPaths = [
              "/path/to/file1.wav",
              "/path/to/file2.wav"
          ]
          let whisperKit = try await WhisperKit()
          let transcriptionResults: [[TranscriptionResult]?] = await whisperKit.transcribe(audioPaths: audioPaths)
    • You can also use it from the CLI via the new argument --audio-folder "path/to/folder/"
    • Future work includes chunking single files to significantly speed up long-form transcription
    • Note that this entails breaking changes and deprecations; see below for the full upgrade guide.
  • Several bug fixes, accuracy improvements, and quality of life upgrades by @hewigovens @shawiz and @jkrukowski
    • Every issue raised and PR merged from the community helps make WhisperKit better every release, thank you and keep them coming! 🙏
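The new CLI batch mode can be invoked as follows (a sketch run from a clone of the repo; the model path is illustrative and should point at wherever your model was downloaded):

```
# Transcribe every audio file in a folder concurrently
swift run whisperkit-cli transcribe \
  --model-path "Models/whisperkit-coreml/openai_whisper-base" \
  --audio-folder "path/to/folder/"
```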

⚠️ Upgrade Guide

We aim to minimize breaking changes, so this update adds deprecation flags for changed interfaces; the deprecated methods will be removed in a later release, but for now they remain usable and will not cause build errors. There are, however, breaking changes to some lower-level and newer methods, so if you do see build errors, expand the dropdown below for the full guide.

Full Upgrade Guide

API changes

Deprecations

WhisperKit

Deprecated

public func transcribe(
    audioPath: String,
    decodeOptions: DecodingOptions? = nil,
    callback: TranscriptionCallback = nil
) async throws -> TranscriptionResult?

use instead

public func transcribe(
    audioPath: String,
    decodeOptions: DecodingOptions? = nil,
    callback: TranscriptionCallback = nil
) async throws -> [TranscriptionResult]

Deprecated

public func transcribe(
    audioArray: [Float],
    decodeOptions: DecodingOptions? = nil,
    callback: TranscriptionCallback = nil
) async throws -> TranscriptionResult?

use instead

public func transcribe(
    audioArray: [Float],
    decodeOptions: DecodingOptions? = nil,
    callback: TranscriptionCallback = nil
) async throws -> [TranscriptionResult]
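A minimal migration sketch for the deprecated single-file overload, showing how a call site changes from an optional single result to an array of results (the audio path is illustrative):

```swift
import WhisperKit

let whisperKit = try await WhisperKit()

// Before (deprecated): returned TranscriptionResult?
// let text = try await whisperKit.transcribe(audioPath: "audio.wav")?.text

// After: returns [TranscriptionResult]; join the per-result text
let results: [TranscriptionResult] = try await whisperKit.transcribe(audioPath: "audio.wav")
let text = results.map(\.text).joined(separator: " ")
print(text)
```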

TextDecoding

Deprecated

func decodeText(
    from encoderOutput: MLMultiArray,
    using decoderInputs: DecodingInputs,
    sampler tokenSampler: TokenSampling,
    options decoderOptions: DecodingOptions,
    callback: ((TranscriptionProgress) -> Bool?)?
) async throws -> [DecodingResult]

use instead

func decodeText(
    from encoderOutput: MLMultiArray,
    using decoderInputs: DecodingInputs,
    sampler tokenSampler: TokenSampling,
    options decoderOptions: DecodingOptions,
    callback: ((TranscriptionProgress) -> Bool?)?
) async throws -> DecodingResult

Deprecated

func detectLanguage(
    from encoderOutput: MLMultiArray,
    using decoderInputs: DecodingInputs,
    sampler tokenSampler: TokenSampling,
    options: DecodingOptions,
    temperature: FloatType
) async throws -> [DecodingResult]

use instead

func detectLanguage(
    from encoderOutput: MLMultiArray,
    using decoderInputs: DecodingInputs,
    sampler tokenSampler: TokenSampling,
    options: DecodingOptions,
    temperature: FloatType
) async throws -> DecodingResult

Breaking changes

  • Removed the Transcriber protocol

AudioProcessing

static func loadAudio(fromPath audioFilePath: String) -> AVAudioPCMBuffer?

becomes

static func loadAudio(fromPath audioFilePath: String) throws -> AVAudioPCMBuffer
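Since loadAudio now throws instead of returning an optional, call sites change from optional binding to do/catch, which also surfaces the underlying failure reason (a sketch; the path is illustrative):

```swift
import AVFoundation
import WhisperKit

// Before: guard let buffer = AudioProcessor.loadAudio(fromPath: "audio.wav") else { return }

// After: the throwing API reports why loading failed
do {
    let buffer: AVAudioPCMBuffer = try AudioProcessor.loadAudio(fromPath: "audio.wav")
    print("Loaded \(buffer.frameLength) frames")
} catch {
    print("Failed to load audio: \(error)")
}
```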

AudioStreamTranscriber

public init(
    audioProcessor: any AudioProcessing, 
    transcriber: any Transcriber, 
    decodingOptions: DecodingOptions, 
    requiredSegmentsForConfirmation: Int = 2, 
    silenceThreshold: Float = 0.3, 
    compressionCheckWindow: Int = 20, 
    useVAD: Bool = true, 
    stateChangeCallback: AudioStreamTranscriberCallback?
)

becomes

public init(
    audioEncoder: any AudioEncoding,
    featureExtractor: any FeatureExtracting,
    segmentSeeker: any SegmentSeeking,
    textDecoder: any TextDecoding,
    tokenizer: any WhisperTokenizer,
    audioProcessor: any AudioProcessing,
    decodingOptions: DecodingOptions,
    requiredSegmentsForConfirmation: Int = 2,
    silenceThreshold: Float = 0.3,
    compressionCheckWindow: Int = 20,
    useVAD: Bool = true,
    stateChangeCallback: AudioStreamTranscriberCallback?
)
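One way to satisfy the new initializer is to wire it up from the components of an existing WhisperKit instance. This is a hypothetical sketch: the property names and the startStreamTranscription() call are assumptions, so check them against your WhisperKit version.

```swift
import WhisperKit

let whisperKit = try await WhisperKit()
// Assumption: the tokenizer is optional until models are loaded
guard let tokenizer = whisperKit.tokenizer else { fatalError("Tokenizer not loaded") }

// Assumption: WhisperKit exposes these components as public properties
let streamTranscriber = AudioStreamTranscriber(
    audioEncoder: whisperKit.audioEncoder,
    featureExtractor: whisperKit.featureExtractor,
    segmentSeeker: whisperKit.segmentSeeker,
    textDecoder: whisperKit.textDecoder,
    tokenizer: tokenizer,
    audioProcessor: whisperKit.audioProcessor,
    decodingOptions: DecodingOptions(),
    stateChangeCallback: { _, newState in
        print("Transcription so far: \(newState.currentText)")
    }
)
try await streamTranscriber.startStreamTranscription()
```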

TextDecoding

func prepareDecoderInputs(withPrompt initialPrompt: [Int]) -> DecodingInputs?

becomes

func prepareDecoderInputs(withPrompt initialPrompt: [Int]) throws -> DecodingInputs

What's Changed

New Contributors

Full Changelog: v0.5.0...v0.6.0

WhisperKit

WhisperKit is a Swift package that integrates OpenAI's popular Whisper speech recognition model with Apple's CoreML framework for efficient, local inference on Apple devices.

Check out the demo app on TestFlight.

[Blog Post] [Python Tools Repo]

Installation

Swift Package Manager

WhisperKit can be integrated into your Swift project using the Swift Package Manager.

Prerequisites

  • macOS 14.0 or later.
  • Xcode 15.0 or later.

Steps

  1. Open your Swift project in Xcode.
  2. Navigate to File > Add Package Dependencies....
  3. Enter the package repository URL: https://github.com/argmaxinc/whisperkit.
  4. Choose the version range or specific version.
  5. Click Finish to add WhisperKit to your project.

Homebrew

You can install the WhisperKit command line app using Homebrew by running the following command:

brew install whisperkit-cli

Getting Started

To get started with WhisperKit, you need to initialize it in your project.

Quick Example

This example demonstrates how to transcribe a local audio file:

import WhisperKit

// Initialize WhisperKit with default settings
Task {
    let pipe = try? await WhisperKit()
    let transcription = try? await pipe?.transcribe(audioPath: "path/to/your/audio.{wav,mp3,m4a,flac}")?.text
    print(transcription)
}

Model Selection

WhisperKit automatically downloads the recommended model for the device if not specified. You can also select a specific model by passing in the model name:

let pipe = try? await WhisperKit(model: "large-v3")

This method also supports glob search, so you can use wildcards to select a model:

let pipe = try? await WhisperKit(model: "distil*large-v3")

Note that the model search must return a single model from the source repo, otherwise an error will be thrown.

For a list of available models, see our HuggingFace repo.

Generating Models

WhisperKit also comes with the supporting repo whisperkittools, which lets you create your own fine-tuned versions of Whisper in CoreML format and deploy them to HuggingFace. Once generated, they can be loaded by simply changing the repo name to the one used to upload the model:

let pipe = try? await WhisperKit(model: "large-v3", modelRepo: "username/your-model-repo")

Swift CLI

The Swift CLI allows for quick testing and debugging outside of an Xcode project. To install it, run the following:

git clone https://github.com/argmaxinc/whisperkit.git
cd whisperkit

Then, set up the environment and download your desired model:

make setup
make download-model MODEL=large-v3

Note:

  1. This will download only the model specified by MODEL (see what's available in our HuggingFace repo, where we use the prefix openai_whisper-{MODEL})
  2. Before running download-model, make sure git-lfs is installed

If you would like to download all available models to your local folder, use this command instead:

make download-models

You can then run them via the CLI with:

swift run whisperkit-cli transcribe --model-path "Models/whisperkit-coreml/openai_whisper-large-v3" --audio-path "path/to/your/audio.{wav,mp3,m4a,flac}" 

This should print a transcription of the audio file. If you would like to stream the audio directly from a microphone, use:

swift run whisperkit-cli transcribe --model-path "Models/whisperkit-coreml/openai_whisper-large-v3" --stream

Contributing & Roadmap

Our goal is to make WhisperKit better and better over time, and we'd love your help! Search the code for "TODO" to find a variety of features that are yet to be built. Please refer to our contribution guidelines for submitting issues, pull requests, and coding standards; they also include a public roadmap of features we are looking forward to building in the future.

License

WhisperKit is released under the MIT License. See LICENSE for more details.

Citation

If you use WhisperKit for something cool or just find it useful, please drop us a note at info@takeargmax.com!

If you use WhisperKit for academic work, here is the BibTeX:

@misc{whisperkit-argmax,
   title = {WhisperKit},
   author = {Argmax, Inc.},
   year = {2024},
   URL = {https://github.com/argmaxinc/WhisperKit}
}

Description

  • Swift Tools 5.9.0

Last updated: Sun May 05 2024 13:04:54 GMT-0900 (Hawaii-Aleutian Daylight Time)