SparkConnect

0.3.0

Apache Spark Connect Client for Swift
apache/spark-connect-swift

What's New

0.3.0

2025-06-04T15:21:09Z

Apache Spark™ Connect Client for Swift is a subproject of Apache Spark that aims to provide a Swift implementation of Spark Connect. 0.3.0 is the third release of the Apache Spark Connect client for Swift. It is still experimental.

Website

https://apache.github.io/spark-connect-swift/

Swift Package Index

https://swiftpackageindex.com/apache/spark-connect-swift

Documentation

https://swiftpackageindex.com/apache/spark-connect-swift/0.3.0/documentation/sparkconnect

Full Changelog

0.2.0...0.3.0

Resolved Issues

  • [SPARK-52220] Update README.md and integration test with Apache Spark 4.0.0 RC7
  • [SPARK-52247] Upgrade gRPC Swift Protobuf to 1.3.0
  • [SPARK-52248] Use exact versions of dependency
  • [SPARK-52268] Add variant SQL test and answer file
  • [SPARK-52269] Add cast SQL test and answer file
  • [SPARK-52271] Upgrade Spark to 4.0.0 in CIs and docs
  • [SPARK-52274] Update ArrowReader/Writer with GH-44910
  • [SPARK-52277] Upgrade Docker tags to 4.0.0 instead of 4.0.0-preview2
  • [SPARK-52289] Enable jsonToDdl test in Linux environment
  • [SPARK-52293] Use super-linter for markdown files
  • [SPARK-52298] Publish apache/spark-connect-swift:pi docker image
  • [SPARK-52301] Support Decimal type (see the example after this list)
  • [SPARK-52302] Improve stop to use ReleaseSessionRequest
  • [SPARK-52317] Identify InvalidTypeException in SparkConnectClient
  • [SPARK-52318] Refactor SparkConnectError to simplify case names
  • [SPARK-52319] Add (Catalog|Schema|TableOrView)NotFound to SparkConnectError
  • [SPARK-52320] Add ColumnNotFound/InvalidViewName/TableOrViewAlreadyExists to SparkConnectError
  • [SPARK-52321] Add SessionClosed, SqlConfNotFound, ParseSyntaxError to SparkConnectError
  • [SPARK-52322] Add publish_image GitHub Action job
  • [SPARK-52340] Update ArrowWriter(Helper)? and ProtoUtil with GH-43170
  • [SPARK-52341] Upgrade Spark to 3.5.6 from 3.5.5 in Spark 3 and Iceberg integration tests
  • [SPARK-52343] Download Apache Spark distributions via ASF Mirrors site
  • [SPARK-52359] Upgrade gRPC Swift NIO Transport to 1.2.2
  • [SPARK-52360] Upgrade gRPC Swift to 2.2.2
  • [SPARK-52361] Support executeCommand in SparkSession
  • [SPARK-52369] Fix Session ID to be lowercased always
  • [SPARK-52370] Update Requirement section to point the official Apache Arrow Swift repository
  • [SPARK-52371] Update Example projects to use the latest main branch always
  • [SPARK-52373] Add CRC32 struct
  • [SPARK-52374] Publish apache/spark-connect-swift:web docker image
  • [SPARK-52376] Support addArtifact(s)? in SparkSession
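
As a quick illustration of the new Decimal support from [SPARK-52301], decimal results returned by SQL queries can now be handled by the Swift client. The following is only a minimal sketch, assuming a reachable Spark Connect server, and uses nothing beyond the SparkSession APIs shown in the usage example below.

import SparkConnect

// Connect to a running Spark Connect server (see the usage example below).
let spark = try await SparkSession.builder.getOrCreate()

// DECIMAL values can now be collected and displayed by the client.
try await spark.sql("SELECT CAST(3.14 AS DECIMAL(10, 2)) AS pi").show()

await spark.stop()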

Apache Spark Connect Client for Swift


This is an experimental Swift library to show how to connect to a remote Apache Spark Connect Server and run SQL statements to manipulate remote data.

So far, this project tracks the upstream changes of the Apache Arrow project's Swift support.

Resources

Requirement

How to use in your apps

Create a Swift project.

mkdir SparkConnectSwiftApp
cd SparkConnectSwiftApp
swift package init --name SparkConnectSwiftApp --type executable

Add the SparkConnect package as a dependency, like the following:

$ cat Package.swift
// swift-tools-version:6.0
import PackageDescription

let package = Package(
  name: "SparkConnectSwiftApp",
  platforms: [
    .macOS(.v15)
  ],
  dependencies: [
    .package(url: "https://github.com/apache/spark-connect-swift.git", branch: "main")
  ],
  targets: [
    .executableTarget(
      name: "SparkConnectSwiftApp",
      dependencies: [.product(name: "SparkConnect", package: "spark-connect-swift")]
    )
  ]
)

Use the SparkSession of the SparkConnect module in Swift.

$ cat Sources/main.swift

import SparkConnect

let spark = try await SparkSession.builder.getOrCreate()
print("Connected to Apache Spark \(await spark.version) Server")

let statements = [
  "DROP TABLE IF EXISTS t",
  "CREATE TABLE IF NOT EXISTS t(a INT) USING ORC",
  "INSERT INTO t VALUES (1), (2), (3)",
]

for s in statements {
  print("EXECUTE: \(s)")
  _ = try await spark.sql(s).count()
}
print("SELECT * FROM t")
try await spark.sql("SELECT * FROM t").cache().show()

try await spark.range(10).filter("id % 2 == 0").write.mode("overwrite").orc("/tmp/orc")
try await spark.read.orc("/tmp/orc").show()

await spark.stop()

Run your Swift application.

$ swift run
...
Connected to Apache Spark 4.0.0 Server
EXECUTE: DROP TABLE IF EXISTS t
EXECUTE: CREATE TABLE IF NOT EXISTS t(a INT) USING ORC
EXECUTE: INSERT INTO t VALUES (1), (2), (3)
SELECT * FROM t
+---+
| a |
+---+
| 2 |
| 1 |
| 3 |
+---+
+----+
| id |
+----+
| 2  |
| 6  |
| 0  |
| 8  |
| 4  |
+----+

You can find more complete examples, including Spark SQL REPL, web server, and streaming applications, in the Examples directory.
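
For instance, a bare-bones SQL REPL needs nothing beyond the spark.sql(...).show() call demonstrated above; the sketch below is only an illustration of the idea, not the REPL project from the Examples directory.

import SparkConnect

// A toy SQL REPL: read one SQL statement per line and show the result.
let spark = try await SparkSession.builder.getOrCreate()
print("Connected to Apache Spark \(await spark.version) Server")

while let sql = readLine(), sql != "exit" {
  if sql.isEmpty { continue }
  do {
    try await spark.sql(sql).show()
  } catch {
    print("ERROR: \(error)")
  }
}

await spark.stop()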

This library also supports the SPARK_REMOTE environment variable for specifying the Spark Connect connection string, which provides more connection options.
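
For example, assuming a Spark Connect server listening on the default port 15002 on the local machine:

$ SPARK_REMOTE="sc://localhost:15002" swift run

The connection string uses the sc:// scheme defined by Spark Connect.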

Description

  • Swift Tools 6.0.0
