This is a Swift language implementation of TableSchema for defining schemas to work with tabular data.
A schema on tabular data defines types, imposes constraints, and creates foreign key relationships on fields as data values move from some physical representation to a logical one and vice versa. For instance, a stored CSV file (physical) can be loaded in-memory along with a corresponding schema descriptor to be transformed from string values to Swift Standard Library types like Date
or Int
(logical).
- Source compatibility with Swift 4.2
- Target platforms
- Apple platforms, specifically iOS and macOS
- Full functionality in iOS >= 10 and macOS >= 10.12
- Linux, limited by features available in
swift-corelibs-foundation
- Apple platforms, specifically iOS and macOS
- Apple's
Foundation
framework is the only dependency - Independent from any one particular physical representation
Although incomplete this library is being used in at least one shipping product over a subset of the available features. Field types that are unsupported will not be transformed resulting in potential data loss on those unsupported types. Please consult the following tables. There is a testing suite to keep what should be working in check. Contributions are welcome.
Feature | Status |
---|---|
Streaming and cast on iteration | Available |
Casting field types and formats | Partial |
[De]serialization | Available in Tabular Data Package |
Schema inference | Missing (Unlikely to implement) |
Strict mode | Missing |
Constraint validation | Missing |
Foreign key validation | Missing |
Rich (RDF) Types | Missing (Unlikely to implement) |
Type | Formats | Additional Properties | Forward Status (Physical to Logical) | Reverse Status (Logical to Physical) |
---|---|---|---|---|
string | default, uri, binary, uuid | N/A | Available | Available |
string | N/A | Unavailable | Unavailable | |
number | N/A | Any | Unavailable | Unavailable |
integer | N/A | bareNumber = false | Available* | Available |
integer | N/A | bareNumber = true | Available | Available |
boolean | N/A | trueValues, falseValues | Available | Available |
object | N/A | N/A | Available | Unavailable |
array | N/A | N/A | Available | Available |
date | N/A | default | Available* | Unavailable |
date | N/A | any, pattern | Unavailable | Unavailable |
time | N/A | default | Available* | Unavailable |
time | N/A | any, pattern | Unavailable | Unavailable |
datetime | N/A | default | Available* | Available* |
datetime | N/A | any, pattern | Unavailable | Unavailable |
year | N/A | N/A | Available* | Unavailable |
yearmonth | N/A | N/A | Available* | Unavailable |
duration | N/A | N/A | Available* | Unavailable |
geopoint | default, array, object | N/A | Available | Unavailable |
geojson | default, topo | N/A | Unavailable | Unavailable |
any | N/A | N/A | Unavailable | Unavailable |
* Only available on Apple products (namely iOS and macOS) due to an incomplete implementation in swift-corelibs-foundation
This project is set up using Swift Package Manager. Ideally add it to your project's SPM dependencies or use Xcode's integrated Swift Package Manager. Alternatively, generate your own Xcode .xcodeproj
to integrate with your build system using:
swift package generate-xcodeproj --xcconfig-overrides ./Configuration.xcconfig
Deserializing of data (from, say, a CSV file) can be accomplished by setting up a Table
with an iterator that provides row information using a TableProvider
data source. This allows for the data source to stream data rather than necessarily loading everything in-memory. Table
is agnostic from the specific data source but expects the data source to convert to String
representations.
let sourcePath = "import.csv"
let sourceDialect = DialectalCSV.Dialect()
let fields = [Field("name", type: .string), Field("birthday", type: .date)]
let schema = Schema(fields)
guard let provider = MyTableProvider(atPath: sourcePath, dialect: sourceDialect) else {
fatalError("Oops")
}
let table = Table(provider: AnyTableProvider(provider), schema: schema)
let objects = table.map { $0 }
And defining MyTableProvider
together with a CSV parsing library like DialectalCSV
:
class MyTableProvider: TableProvider {
private let handler: DialectalCSV.InputHandler
private let streamIterator: DialectalCSV.InputIterator
init?(atPath path: String, dialect: DialectalCSV.Dialect) {
guard let handler = DialectalCSV.InputHandler(atPath: path, dialect: dialect) else {
return nil
}
self.handler = handler
self.streamIterator = handler.makeIterator()
}
// MARK: - TableProvider
var header: Header? {
return self.streamIterator.header
}
// MARK: - Sequence
func makeIterator() -> AnyIterator<[String?]> {
return AnyIterator {
return self.streamIterator.next()
}
}
}
Cast the entire data set in-memory:
let objects = [[Any?]]()
let rows = objects.map { schema.reverseCast(row: $0) }
Or streaming output using a CSV parsing library like DialectalCSV
:
let objects: [[Any?]] = [["River Tam", Date(timeIntervalSince1970: 16725225600)],["Simon Tam", nil]]
let destinationPath = "export.csv"
var destinationDialect = DialectalCSV.Dialect()
destinationDialect.nullSequence = "null"
FileManager.default.createFile(atPath: destinationPath, contents: nil)
guard let outputHandler = DialectalCSV.OutputHandler(atPath: destinationPath, dialect: destinationDialect) else {
fatalError("Oops")
}
let header = schema.fields.map { $0.name }
try? outputHandler.open(header: header)
for object in objects {
let row = schema.reverseCast(row: object).map { $0 }
try? outputHandler.append(records: [row])
}
try? outputHandler.close()