How to use a CoreML model in an iOS app
In the previous blog post, we finished training our own CoreML model. Now let's build an iOS app to use it. But first, we need to export the model.
In Create ML, go to the Output tab, tap Get, and save the .mlmodel file.
Project setup
Next, create a new iOS app project in Xcode. We'll set up a simple app with a button to pick a photo from the Photo Library and another button to run the inference.
Here's ContentView.swift:
import SwiftUI
import PhotosUI
import Photos

struct ContentView: View {
    @State private var selectedImage: UIImage?
    @State private var imageSelection: PhotosPickerItem?
    @State private var showPermissionAlert: Bool = false

    init() {
        checkPhotoLibraryPermission()
    }

    private func checkPhotoLibraryPermission() {
        switch PHPhotoLibrary.authorizationStatus(for: .readWrite) {
        case .notDetermined:
            PHPhotoLibrary.requestAuthorization(for: .readWrite) { status in
                if status != .authorized {
                    DispatchQueue.main.async {
                        self.showPermissionAlert = true
                    }
                }
            }
        case .restricted, .denied:
            showPermissionAlert = true
        case .authorized, .limited:
            break
        @unknown default:
            break
        }
    }

    var body: some View {
        VStack(spacing: 20) {
            // Image display area
            if let image = selectedImage {
                Image(uiImage: image)
                    .resizable()
                    .scaledToFit()
                    .frame(maxHeight: 400)
            } else {
                VStack(spacing: 12) {
                    Image(systemName: "hand.raised.fill")
                        .resizable()
                        .scaledToFit()
                        .frame(width: 100, height: 100)
                        .foregroundColor(.gray)
                        .opacity(0.7)
                    Text("Select a photo to detect palm")
                        .font(.headline)
                        .foregroundColor(.gray)
                        .multilineTextAlignment(.center)
                }
                .frame(width: 300, height: 400)
                .background(
                    RoundedRectangle(cornerRadius: 12)
                        .stroke(Color.gray.opacity(0.3), lineWidth: 2)
                        .background(Color.gray.opacity(0.1))
                        .cornerRadius(12)
                )
            }

            // Photo picker button
            PhotosPicker(
                selection: $imageSelection,
                matching: .images
            ) {
                Text("Select Photo")
                    .frame(maxWidth: .infinity)
                    .padding(.vertical, 4)
                    .foregroundColor(Color.blue)
                    .cornerRadius(10)
            }
            .padding(.horizontal)
            .padding(.top, 40)

            // Detect palm button
            Button(action: {
                detectPalm()
            }) {
                Text("Detect Palm")
                    .frame(maxWidth: .infinity)
                    .padding()
                    .background(selectedImage != nil ? Color.green : Color.gray)
                    .foregroundColor(.white)
                    .cornerRadius(10)
            }
            .disabled(selectedImage == nil)
            .padding(.horizontal)
        }
        .alert("Photos Access Required", isPresented: $showPermissionAlert) {
            Button("Open Settings", action: openSettings)
            Button("Cancel", role: .cancel) { }
        } message: {
            Text("Please allow access to your photo library to select photos.")
        }
        .onChange(of: self.imageSelection) { oldValue, newValue in
            guard let item = newValue else { return }
            self.loadTransferable(from: item)
        }
    }

    private func detectPalm() {
    }

    private func openSettings() {
        if let settingsUrl = URL(string: UIApplication.openSettingsURLString) {
            UIApplication.shared.open(settingsUrl)
        }
    }

    private func loadTransferable(from imageSelection: PhotosPickerItem) {
        Task {
            do {
                if let data = try await imageSelection.loadTransferable(type: Data.self),
                   let uiImage = UIImage(data: data) {
                    selectedImage = uiImage
                }
            } catch {
                print("Error loading image: \(error)")
            }
        }
    }
}
Don't forget to add Privacy - Photo Library Usage Description (NSPhotoLibraryUsageDescription) to the Info.plist, since the app accesses the photo library.
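If you edit the Info.plist as source code, the entry is just a key/string pair; the description text below is only an example, so word it however fits your app:

    <key>NSPhotoLibraryUsageDescription</key>
    <string>This app needs access to your photo library so you can pick a photo for palm detection.</string>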
Once the project builds successfully, let's bring in our CoreML model. Drag the .mlmodel file into Xcode; when you add it, Xcode automatically generates a Swift class for the model. This makes running CoreML inference much easier.
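For example, assuming the model file is named HandPalm.mlmodel (the class name we'll use in the next section), the generated class can be loaded and wrapped for Vision in just a few lines. A minimal sketch:

    import CoreML
    import Vision

    // Minimal sketch: load the Xcode-generated HandPalm class and wrap it for Vision.
    // Adjust the class name to whatever Xcode generated for your .mlmodel file.
    func makePalmModel() throws -> VNCoreMLModel {
        let config = MLModelConfiguration()
        config.computeUnits = .all // let CoreML choose CPU, GPU, or Neural Engine
        let palm = try HandPalm(configuration: config)
        return try VNCoreMLModel(for: palm.model)
    }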
Run CoreML prediction
Now let's implement our detectPalm function.
// Requires `import CoreML` and `import Vision` at the top of the file
private func detectPalm() {
    // Load the Xcode-generated model and wrap it for Vision
    let modelConfig = MLModelConfiguration()
    modelConfig.computeUnits = .all

    guard let palmROI = try? HandPalm(configuration: modelConfig),
          let palmROIModel = try? VNCoreMLModel(for: palmROI.model),
          let input = self.selectedImage,
          let ciImage = CIImage(image: input.fixOrientation())
    else {
        NSLog("Model got nil")
        return
    }

    // Build the Vision request; the completion closure receives the results
    let request = VNCoreMLRequest(model: palmROIModel) { request, err in
        if let error = err {
            NSLog("%@", error.localizedDescription)
        }
        guard let results = request.results as? [VNRecognizedObjectObservation] else {
            NSLog("Failed to get results")
            return
        }

        // Collect the normalized bounding boxes of every detected object
        var boundingBox: [CGRect] = []
        for observation in results {
            let coordinates = observation.boundingBox
            print("Coordinates: \(coordinates)")
            print("Coordinates: \(observation.labels)")
            boundingBox.append(coordinates)
        }
        self.selectedImage = drawBoundingBox(input: input, coordinates: boundingBox)
    }
    request.imageCropAndScaleOption = .scaleFill

    // Run the request against the selected image
    let handler = VNImageRequestHandler(
        ciImage: ciImage,
        orientation: .up
    )
    do {
        try handler.perform([request])
    } catch {
        print(error.localizedDescription)
    }
}
We create a request using VNCoreMLRequest, passing in the CoreML model. The request hands its results to a completion closure, along with an error if anything went wrong. We cast the results into [VNRecognizedObjectObservation]; from there, we can extract the bounding box of each detected object.
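Each VNRecognizedObjectObservation also carries an array of labels sorted by confidence, so inside the completion closure we could, for example, log the top label like this (a small sketch):

    // Sketch: read the top label and its confidence from each observation
    for observation in results {
        if let topLabel = observation.labels.first {
            print("Detected \(topLabel.identifier) with confidence \(topLabel.confidence)")
        }
    }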
Once we have the CoreML request, we pass it to a VNImageRequestHandler, which runs the request against the image.
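One thing to note: detectPalm calls input.fixOrientation(), which is not a built-in UIImage method, so the project needs a small extension for it. A minimal sketch of one common implementation (assumed here, since the original helper isn't shown) simply redraws the image so its pixel data matches the .up orientation:

    import UIKit

    extension UIImage {
        // Redraws the image so that its orientation metadata becomes .up,
        // keeping the pixel data consistent with the coordinates we draw later.
        func fixOrientation() -> UIImage {
            // Nothing to do if the image is already upright
            guard imageOrientation != .up else { return self }

            let renderer = UIGraphicsImageRenderer(size: size)
            return renderer.image { _ in
                // draw(in:) applies the orientation transform for us
                draw(in: CGRect(origin: .zero, size: size))
            }
        }
    }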
Draw bounding box
Let's draw these bounding boxes on the image. We'll define a new function for it, using UIGraphicsImageRenderer with the original image size.
func drawBoundingBox(input: UIImage, coordinates: [CGRect]) -> UIImage {
    let renderer = UIGraphicsImageRenderer(size: input.size)
    return renderer.image { context in
        // Draw the original image first
        input.draw(in: CGRect(origin: .zero, size: input.size))

        // Get the graphics context
        let cgContext = context.cgContext

        // Set up the visual properties for the bounding box
        cgContext.setStrokeColor(UIColor.green.cgColor) // Green color for the box
        cgContext.setLineWidth(2.0)                     // Line width of 2 points

        // Convert normalized coordinates to pixel coordinates
        for box in coordinates {
            // Convert normalized coordinates to actual image coordinates
            let x = box.origin.x * input.size.width
            let y = box.origin.y * input.size.height
            let width = box.width * input.size.width
            let height = box.height * input.size.height
            let rect = CGRect(x: x, y: y, width: width, height: height)

            // Draw the rectangle
            cgContext.stroke(rect)

            // Optionally, add a semi-transparent fill
            cgContext.setFillColor(UIColor(red: 0, green: 1, blue: 0, alpha: 0.2).cgColor)
            cgContext.fill(rect)
        }
    }
}
The function is quite straightforward. It draws a new image from the input image, then takes the x, y, width, and height from the coordinates input, scales those normalized values to the image size, and draws rectangles from them.
However, when we run the project, the bounding box is not drawn precisely over our palm. The reason is that Vision and UIKit use different coordinate systems: Vision places the (0, 0) origin at the bottom left, while UIKit places it at the top left.
So, we need to flip the y value:
func drawBoundingBox(input: UIImage, coordinates: [CGRect]) -> UIImage {
    print("coordinates", coordinates)
    let renderer = UIGraphicsImageRenderer(size: input.size)
    return renderer.image { context in
        // Draw the original image first
        input.draw(in: CGRect(origin: .zero, size: input.size))

        // Get the graphics context
        let cgContext = context.cgContext

        // Set up the visual properties for the bounding box
        cgContext.setStrokeColor(UIColor.green.cgColor) // Green color for the box
        cgContext.setLineWidth(2.0)                     // Line width of 2 points

        // Convert normalized coordinates to pixel coordinates
        for box in coordinates {
            // Convert normalized coordinates to actual image coordinates
            let x = box.origin.x * input.size.width
            let y = (1 - box.origin.y - box.height) * input.size.height // Flip Y coordinate
            let width = box.width * input.size.width
            let height = box.height * input.size.height
            let rect = CGRect(x: x, y: y, width: width, height: height)

            // Draw the rectangle
            cgContext.stroke(rect)

            // Optionally, add a semi-transparent fill
            cgContext.setFillColor(UIColor(red: 0, green: 1, blue: 0, alpha: 0.2).cgColor)
            cgContext.fill(rect)
        }
    }
}
If we run the app again, the bounding box is now drawn correctly on the image.
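As an aside, Vision also ships a helper, VNImageRectForNormalizedRect, that does the scaling part for us; we still flip the y value ourselves for UIKit's top-left origin. A small sketch of the same conversion using it:

    import Vision
    import UIKit

    // Sketch: convert a Vision bounding box (normalized, bottom-left origin)
    // into a UIKit drawing rect (image points, top-left origin).
    func drawingRect(for normalizedBox: CGRect, in imageSize: CGSize) -> CGRect {
        // Scale the normalized rect up to image coordinates (origin still bottom-left)
        let scaled = VNImageRectForNormalizedRect(normalizedBox,
                                                  Int(imageSize.width),
                                                  Int(imageSize.height))
        // Flip the y axis for UIKit's top-left origin
        let flippedY = imageSize.height - scaled.origin.y - scaled.height
        return CGRect(x: scaled.origin.x, y: flippedY,
                      width: scaled.width, height: scaled.height)
    }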
In this post, we successfully integrated a CoreML model into an iOS app to detect palms in images. We started by setting up an iOS project in Xcode, importing the trained .mlmodel from Create ML, and implementing photo selection using PhotosPicker. Then, we configured and executed a CoreML inference using VNCoreMLRequest and VNImageRequestHandler, processing the model’s output to extract bounding boxes around detected palms. Finally, we implemented a method to draw bounding boxes on the detected regions, adjusting for the coordinate system differences between CoreML and iOS.
From here, you can improve things further by refining the model, handling multiple detections more effectively, or even adding real-time camera-based inference.