NDArray is a multidimensional array library written in Swift that aims to become the equivalent of NumPy
in Swift's emerging data science ecosystem. This project is in a very early stage and has a long but exciting road ahead!
- Have an efficient multidimensional array interface with common things like indexing, slicing, broadcasting, etc.
- Make `NDArray` and its operations differentiable so it's usable along with Swift for TensorFlow.
- Create specialized implementations of linear algebra operations for NDArrays containing numeric types using BLAS, LAPACK, Accelerate, or MLIR depending on the environment.
Tutorial | Last Updated |
---|---|
Basic API | August 13 2019 |
You can install it using SwiftPM:
.package(url: "https://github.com/cgarciae/NDArray", from: "0.0.20")
It might work with other compatible package managers. This package is only tested with Swift 5.1; compatibility with previous versions is not guaranteed.
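For context, here is a minimal sketch of where that dependency line might sit in a `Package.swift` manifest. The package and target name `MyApp` is hypothetical, and the product name `NDArray` is an assumption based on the `import NDArray` statement used in the examples.

```swift
// swift-tools-version:5.1
import PackageDescription

let package = Package(
    name: "MyApp", // hypothetical package name
    dependencies: [
        // Dependency line from the installation instructions.
        .package(url: "https://github.com/cgarciae/NDArray", from: "0.0.20"),
    ],
    targets: [
        .target(
            name: "MyApp", // hypothetical target name
            // Assumes the library product is named "NDArray".
            dependencies: ["NDArray"]
        ),
    ]
)
```

Newer tools versions (5.2 and later) generally expect the `.product(name:package:)` form for target dependencies, so the target entry may need adjusting there.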
`NDArray` is a generic container type just like `Array`, with the difference that it's multidimensional. If its elements conform to certain protocols, then methods and operators like `+`, `-`, `*`, etc. can be used to efficiently perform computations over the whole collection.
import NDArray
let a = NDArray<Int>([
[1, 2, 3],
[4, 5, 6],
])
let b = NDArray<Int>([
[7, 8, 9],
[10, 11, 12],
])
print((a + b) * a)
/*
NDArray<Int>[2, 3]([
[8, 20, 36],
[56, 80, 108],
])
*/
Here we see that the outcome of `(a + b) * a` is also an `NDArray` of `Int` with shape `[2, 3]`. To use operators like `+` and `*` with NDArrays containing your custom types, you just have to make them conform to the proper protocols. For example:
import NDArray
struct Point: AdditiveArithmetic {
let x: Float
let y: Float
...
}
let a = NDArray<Point>([Point(x: 1, y: 2), Point(x: 2, y: 3)])
let b = NDArray<Point>([Point(x: 4, y: 5), Point(x: 6, y: 7)])
print(a + b)
/*
NDArray<Point>[2]([Point(x: 5.0, y: 7.0), Point(x: 8.0, y: 10.0)])
*/
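The body of `Point` is elided in the example. As a hedged illustration only, one way a type like it could satisfy `AdditiveArithmetic` from the standard library is sketched below; the member implementations are assumptions, not the original code.

```swift
// A sketch of an AdditiveArithmetic conformance for a 2D point.
// The implementations below are assumed for illustration.
struct Point: AdditiveArithmetic {
    let x: Float
    let y: Float

    // The additive identity required by the protocol.
    static var zero: Point { Point(x: 0, y: 0) }

    // Component-wise addition.
    static func + (lhs: Point, rhs: Point) -> Point {
        Point(x: lhs.x + rhs.x, y: lhs.y + rhs.y)
    }

    // Component-wise subtraction.
    static func - (lhs: Point, rhs: Point) -> Point {
        Point(x: lhs.x - rhs.x, y: lhs.y - rhs.y)
    }
}

let sum = Point(x: 1, y: 2) + Point(x: 4, y: 5)
print(sum) // Point(x: 5.0, y: 7.0)
```

`Equatable`, which `AdditiveArithmetic` refines, is synthesized automatically here because both stored properties are `Float`; `+=` and `-=` come from the protocol's default implementations.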
You can also apply generic transformations over the data. The previous `a + b` could have been written as:
elementwise(a, b, apply: +)
// or
elementwise(a, b) { $0 + $1 }
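To make the idea concrete without depending on the library, here is a minimal sketch of an elementwise combinator over plain Swift arrays. The name `elementwiseSketch` is hypothetical, and this is not how `NDArray` implements `elementwise`; it only illustrates the pattern of passing either an operator or a closure.

```swift
// Hypothetical stand-in for an elementwise combinator, using flat arrays.
func elementwiseSketch<T>(_ lhs: [T], _ rhs: [T], apply f: (T, T) -> T) -> [T] {
    precondition(lhs.count == rhs.count, "inputs must have the same length")
    // Pair up the elements and combine each pair with f.
    return zip(lhs, rhs).map { f($0.0, $0.1) }
}

let xs = [1, 2, 3]
let ys = [4, 5, 6]

let sums = elementwiseSketch(xs, ys, apply: +)        // [5, 7, 9]
let products = elementwiseSketch(xs, ys) { $0 * $1 }  // [4, 10, 18]
```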
For heavy computation you can use the parallelized version:
elementwiseInParallel(a, b) {
// code
return c
}
In the future, `NDArray` should be able to estimate the best strategy (serial or parallelized) based on the type and size of the data.
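As a rough illustration of what a parallelized elementwise operation can look like, the sketch below splits the work into chunks and runs them with `DispatchQueue.concurrentPerform`. The function name `parallelElementwiseSketch`, the `chunkSize` parameter, and the restriction to `Double` are all assumptions made for this example; it is not the library's implementation.

```swift
import Dispatch

// Hypothetical sketch: chunked, parallel elementwise combination of two arrays.
func parallelElementwiseSketch(
    _ lhs: [Double],
    _ rhs: [Double],
    chunkSize: Int = 1024,
    apply f: (Double, Double) -> Double
) -> [Double] {
    precondition(lhs.count == rhs.count, "inputs must have the same length")
    precondition(chunkSize > 0, "chunkSize must be positive")

    let count = lhs.count
    var result = [Double](repeating: 0, count: count)
    // Number of chunks, rounding up so the last partial chunk is included.
    let chunks = (count + chunkSize - 1) / chunkSize

    result.withUnsafeMutableBufferPointer { buffer in
        // Each iteration handles one disjoint index range, so writes never overlap.
        DispatchQueue.concurrentPerform(iterations: chunks) { chunk in
            let start = chunk * chunkSize
            let end = min(start + chunkSize, count)
            for i in start..<end {
                buffer[i] = f(lhs[i], rhs[i])
            }
        }
    }
    return result
}

let left = (0..<5000).map { Double($0) }
let right = [Double](repeating: 2, count: 5000)
let scaled = parallelElementwiseSketch(left, right) { $0 * $1 }
```

For small inputs the scheduling overhead can outweigh the gain, which is why choosing between serial and parallel execution based on data size is worthwhile.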
Except for the Basic API, NDArray's automatic differentiation and linear algebra optimization capabilities should be opt-in so that all users can use the library regardless of their environment. For example, iOS developers should be able to use it even if they don't have access to TensorFlow's compiler or the linear algebra infrastructure.
The first goal is the definition of the library's basic API using pure Swift with no extra optimization or differentiable capabilities. iOS/macOS developers should be able to use the basic API without additional setup. It will also be important to keep NDArray's API in close coordination with Swift for TensorFlow's Tensor API to promote knowledge reuse and, if possible, benefit from its existing documentation.
The second goal is an obvious must-have: Swift for TensorFlow's compiler with automatic differentiation is arguably the future of ML, and we should use it.
The third goal is what you would expect from any HPC numeric library: the strategy is to specialize functions and operations for numeric types by using BLAS, LAPACK, Accelerate, or MLIR to speed up computation. On the other hand, if NDArray is successfully integrated with MLIR, BLAS and LAPACK might not be necessary, and NDArray could become one of the most performant numeric libraries out there.
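One way such specialization could be expressed in Swift is through overloading: a generic fallback for any numeric type, plus a concrete overload that calls into an optimized backend when it is available. The sketch below is hypothetical (the name `dotSketch` and the use of plain arrays are assumptions) and uses `cblas_ddot` from Accelerate only where that framework can be imported.

```swift
#if canImport(Accelerate)
import Accelerate
#endif

// Generic fallback: a pure Swift dot product for any numeric element type.
func dotSketch<T: Numeric>(_ lhs: [T], _ rhs: [T]) -> T {
    precondition(lhs.count == rhs.count, "vectors must have the same length")
    var total: T = 0
    for (x, y) in zip(lhs, rhs) {
        total += x * y
    }
    return total
}

// Specialized overload for Double: uses BLAS through Accelerate when available.
func dotSketch(_ lhs: [Double], _ rhs: [Double]) -> Double {
    precondition(lhs.count == rhs.count, "vectors must have the same length")
    #if canImport(Accelerate)
    return cblas_ddot(Int32(lhs.count), lhs, 1, rhs, 1)
    #else
    // Same pure Swift loop as the generic version when Accelerate is unavailable.
    var total = 0.0
    for (x, y) in zip(lhs, rhs) {
        total += x * y
    }
    return total
    #endif
}

let ints: [Int] = [1, 2, 3]
let doubles: [Double] = [1, 2, 3]
let intDot = dotSketch(ints, [4, 5, 6])          // generic path
let doubleDot = dotSketch(doubles, [4, 5, 6])    // Double overload
```

At a call site with statically known `[Double]` arguments, Swift prefers the non-generic overload, so callers get the optimized path without changing their code.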
- Indexing
- Dimension Slicing
- Dimension Filtering by Indexes
- Dimension Masking
- SqueezeAxis
- NewAxis
- Assignment
- Broadcasting
- Pretty Print
- Elementwise Operations
- Basic Operators: `+`, `-`, `*`, `/`
- Reduction Operations
- Subscript Bound Checks
- Fancy Indexing
- > 95% Coverage
- Documentation
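Broadcasting appears in the list above. Assuming NDArray follows NumPy-style rules (an assumption, since the rules are not spelled out here), the resulting shape of two operands can be computed by aligning shapes from the right and allowing a dimension of size 1 to stretch. The helper name `broadcastShapeSketch` is hypothetical.

```swift
// Hypothetical helper: NumPy-style broadcast of two shapes.
// Returns nil when the shapes are incompatible.
func broadcastShapeSketch(_ a: [Int], _ b: [Int]) -> [Int]? {
    let rank = max(a.count, b.count)
    // Left-pad the shorter shape with 1s so both shapes have the same rank.
    let paddedA = [Int](repeating: 1, count: rank - a.count) + a
    let paddedB = [Int](repeating: 1, count: rank - b.count) + b

    var result: [Int] = []
    for (x, y) in zip(paddedA, paddedB) {
        if x == y || y == 1 {
            result.append(x)
        } else if x == 1 {
            result.append(y)
        } else {
            return nil // sizes differ and neither is 1
        }
    }
    return result
}

let rowBroadcast = broadcastShapeSketch([2, 3], [3])       // [2, 3]
let outerBroadcast = broadcastShapeSketch([2, 1], [1, 3])  // [2, 3]
let incompatible = broadcastShapeSketch([2, 3], [4])       // nil
```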
This can actually be started at any point, although it won't be that useful until various operations like `dot` or reductions like `sum` or `mean` are implemented.
- Conform `NDArray` to `Differentiable`
- Make `NDArray`'s operations differentiable.
- Link BLAS and LAPACK
- Specialize operations using BLAS, LAPACK, Accelerate, or MLIR
- `dot`
- ...
Cristian Garcia – cgarcia.e88@gmail.com
Distributed under the MIT license. See LICENSE for more information.