SwiftXGBoost

Swift wrapper for XGBoost gradient boosting machine learning framework with Numpy and TensorFlow support.
kongzii/SwiftXGBoost

XGBoost for Swift

Bindings for the XGBoost system library. The aim of this package is to mimic the XGBoost Python bindings while taking advantage of Swift and its C interoperability. Some things therefore behave differently than in Python, but they should give you maximum flexibility over XGBoost.

Installation

System library dependency

Linux

Install XGBoost from source:

git clone https://github.com/dmlc/xgboost
cd xgboost
git checkout tags/v1.1.1
git submodule update --init --recursive
mkdir build
cd build
cmake ..
make
make install
ldconfig

Or use the provided installation script:

./install.sh

macOS

You can build and install the same way as on Linux, or just use Homebrew:

brew install xgboost

Note

Before version 1.1.1, XGBoost did not install a pkg-config file. This was fixed by the PR "Add pkgconfig to cmake" (#5744).

If for some reason you are using an older version, you may need to specify the path to the XGBoost libraries when building, e.g.:

swift build -Xcc -I/usr/local/include -Xlinker -L/usr/local/lib

or create the pkg-config file manually. An example for macOS 10.15 and XGBoost 1.1.0:

prefix=/usr/local/Cellar/xgboost/1.1.0
exec_prefix=${prefix}/bin
libdir=${prefix}/lib
includedir=${prefix}/include

Name: xgboost
Description: XGBoost machine learning library.
Version: 1.1.0
Cflags: -I${includedir}
Libs: -L${libdir} -lxgboost

This file needs to be placed at /usr/local/lib/pkgconfig/xgboost.pc

Package

Add a dependency to your Package.swift:

.package(url: "https://github.com/kongzii/SwiftXGBoost.git", from: "0.0.0"),
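
For context, a complete manifest might look like the sketch below. The package and target name MyApp are hypothetical, and the product name "XGBoost" is an assumption inferred from the `import XGBoost` statement used in this README; check the package's own manifest if dependency resolution fails.

```swift
// swift-tools-version:5.1
import PackageDescription

let package = Package(
    name: "MyApp", // hypothetical package name
    dependencies: [
        .package(url: "https://github.com/kongzii/SwiftXGBoost.git", from: "0.0.0"),
    ],
    targets: [
        .target(
            name: "MyApp", // hypothetical target name
            // Assumption: the library product is called "XGBoost", matching `import XGBoost`.
            dependencies: ["XGBoost"]
        ),
    ]
)
```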

Import Swifty XGBoost

import XGBoost

or the C library directly

import CXGBoost

Both the Booster and DMatrix classes expose pointers to the underlying C objects, so you can use the C API directly for more advanced usage.
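
As a small illustration of calling the C API directly, the sketch below asks the linked XGBoost library for its version. It assumes that the CXGBoost module exposes the `XGBoostVersion` function from the XGBoost C API header unchanged, and that the installed XGBoost is recent enough to provide it.

```swift
import CXGBoost

// Out-parameters that the C function fills in.
var major: Int32 = 0
var minor: Int32 = 0
var patch: Int32 = 0

// Assumption: CXGBoost re-exports `XGBoostVersion(int*, int*, int*)` from xgboost/c_api.h.
XGBoostVersion(&major, &minor, &patch)

print("Linked XGBoost version: \(major).\(minor).\(patch)")
```

This call does not need a Booster or DMatrix, so it can be a quick way to check that the system library is found and linked correctly.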

As the library is still evolving, there may be incompatible changes between updates; releases before version 1.0.0 do not follow Semantic Versioning. Pin an exact version if you do not want to worry about updating your packages.

.package(url: "https://github.com/kongzii/SwiftXGBoost.git", .exact("0.1.0")),

Python compatibility

A DMatrix can be created from a NumPy array, just like in Python

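import PythonKit
import XGBoost

// Load the data with pandas and pass the underlying NumPy array to DMatrix
let pandas = Python.import("pandas")
let dataFrame = pandas.read_csv("data.csv")
let data = try DMatrix(
    name: "training",
    from: dataFrame.values
)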

and a Swift array can be converted back to NumPy

let predicted = try booster.predict(
    from: validationData
)

let compare = pandas.DataFrame([
    "Label lower bound": yLowerBound[validIndex],
    "Label upper bound": yUpperBound[validIndex],
    "Predicted": predicted.makeNumpyArray(),
])

print(compare)

This is possible thanks to PythonKit. For more detailed usage and workarounds for known issues, check out the examples.

TensorFlow compatibility

Swift4TensorFlow is a great project from Google. If you are using one of the S4TF swift toolchains, you can combine its power directly with XGBoost.

let tensor = Tensor<Float>(shape: TensorShape([2, 3]), scalars: [1, 2, 3, 4, 5, 6])
let data = try DMatrix(name: "training", from: tensor)
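
The six scalars above fill the 2 x 3 tensor in row-major order: the first three values form row 0 and the next three form row 1. The plain-Swift sketch below shows that layout; `flattenRowMajor` is a hypothetical helper, not part of SwiftXGBoost, and treating the flat-array DMatrix initializer from the basic example as row-major too is an assumption based on XGBoost's dense matrix format.

```swift
// Flattens a rectangular 2-D array into row-major order and reports its shape.
// `flattenRowMajor` is a hypothetical helper, not part of SwiftXGBoost.
func flattenRowMajor(_ rows: [[Float]]) -> (values: [Float], rowCount: Int, columnCount: Int) {
    let columnCount = rows.first?.count ?? 0
    precondition(rows.allSatisfy { $0.count == columnCount }, "All rows must have the same length")
    return (rows.flatMap { $0 }, rows.count, columnCount)
}

let (values, rowCount, columnCount) = flattenRowMajor([[1, 2, 3], [4, 5, 6]])

// Element (row, column) lives at index row * columnCount + column.
let row = 1, column = 2
print(values)                              // [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(rowCount, columnCount)               // 2 3
print(values[row * columnCount + column])  // 6.0
```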

Note

Swift4TensorFlow toolchains ship with PythonKit preinstalled, and you may run into problems when using a package with an extra PythonKit dependency. If so, use the package version with the -tensorflow suffix, in which the PythonKit dependency is removed.

.package(url: "https://github.com/kongzii/SwiftXGBoost.git", .exact("0.7.0-tensorflow")),

This bug is known and hopefully will be resolved soon.

Examples

More examples can be found in the Examples directory and can be run inside Docker

docker-compose run swiftxgboost swift run exampleName

or on the host

swift run exampleName

Basic functionality

import XGBoost

// Register your own callback function for log(info) messages
try XGBoost.registerLogCallback {
    print("Swifty log:", String(cString: $0!))
}

// Create some random features and labels
let randomArray = (0 ..< 1000).map { _ in Float.random(in: 0 ..< 2) }
let labels = (0 ..< 100).map { _ in Float([0, 1].randomElement()!) }

// Initialize data; a DMatrixHandle is created in the background
let data = try DMatrix(
    name: "data",
    from: randomArray,
    shape: Shape(100, 10),
    label: labels,
    threads: 1
)

// Slice array into train and test
let train = try data.slice(indexes: 0 ..< 90, newName: "train")
let test = try data.slice(indexes: 90 ..< 100, newName: "test")

// Parameters for Booster, check https://xgboost.readthedocs.io/en/latest/parameter.html
let parameters = [
    Parameter("verbosity", "2"),
    Parameter("seed", "0"),
]

// Create Booster model, `with` data will be cached
let booster = try Booster(
    with: [train, test],
    parameters: parameters
)

// Train booster, optionally provide callback functions called before and after each iteration
try booster.train(
    iterations: 10,
    trainingData: train,
    evaluationData: [train, test]
)

// Predict from test data
let predictions = try booster.predict(from: test)

// Save
try booster.save(to: "model.xgboost")
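
To judge the predictions, you can compare them with the held-out labels. The sketch below computes the root-mean-square error in plain Swift. `rootMeanSquareError` is a hypothetical helper, not part of SwiftXGBoost, and it assumes you have already converted the predictions and the matching labels into flat `[Float]` arrays; the numbers used here are illustrative, not real model output.

```swift
// Root-mean-square error between predictions and labels.
// Hypothetical helper, not part of SwiftXGBoost.
func rootMeanSquareError(_ predictions: [Float], _ labels: [Float]) -> Float {
    precondition(predictions.count == labels.count && !labels.isEmpty,
                 "Inputs must be non-empty and of equal length")
    var squaredErrorSum: Float = 0
    for (prediction, label) in zip(predictions, labels) {
        let error = prediction - label
        squaredErrorSum += error * error
    }
    return (squaredErrorSum / Float(labels.count)).squareRoot()
}

// Illustrative values only, not real model output.
let examplePredictions: [Float] = [0.5, 0.5, 1.0, 0.0]
let exampleLabels: [Float] = [1, 0, 1, 0]
let rmse = rootMeanSquareError(examplePredictions, exampleLabels)
print("RMSE:", rmse)
```

Because the labels in the example above are random, a real run should show an error close to what a constant guess would give; the helper is more useful once real data is loaded.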

Development

Documentation

Jazzy is used for the generation of documentation.

You can generate documentation locally using

make documentation

GitHub Pages will be updated automatically when changes are merged into master.

Tests

Where possible, the Swift implementation is tested against the reference Python implementation via PythonKit. For example, the test of the score method in scoreEmptyFeatureMapTest:

let pyFMap = [String: Int](pyXgboost.get_score(
    fmap: "", importance_type: "weight"))!
let (fMap, _) = try booster.score(featureMap: "", importance: .weight)

XCTAssertEqual(fMap, pyFMap)

Run locally

On Ubuntu using Docker

docker-compose run test

On host

swift test

Code format

SwiftFormat is used for code formatting.

make format

Description

  • Swift Tools 5.1.0

Last updated: Tue Nov 12 2024 09:10:44 GMT-1000 (Hawaii-Aleutian Standard Time)