When you think about tech companies and their role in advancing machine learning (ML) and artificial intelligence, is Apple on your list? Let’s be honest, the biggest and splashiest research of the last eighteen months has come from companies like OpenAI (the GPT family of models, DALL-E 2), the Google Brain team (Bard, Imagen), and Meta (SAM, LLaMA). But after watching Apple’s WWDC 2023 keynote and thinking about how Apple applies ML in their software, I’ve decided they deserve a spot on the innovators list as well. Let’s go over some of the announcements from the keynote and I’ll explain my reasoning. FYI, there’s a lot more to it than the reveal of the Vision Pro.
New Mac Hardware
Apple has always had strong ties to the liberal arts, and their new lineup of hardware reflects that continued relationship. New models of the Mac Studio and Mac Pro were announced that can be configured with Apple’s latest M2 Max and M2 Ultra chips. Clearly these machines are being marketed primarily to companies in the media industry. Apple said as much while going over performance enhancements in video editing software like Adobe After Effects and dropping the names of high-profile customers like NBC’s Saturday Night Live.
But there was also a brief moment where Apple boasted about the new M2 Ultra chip and its applications in ML.
For those not entrenched in the latest GPU specs, NVIDIA’s top-of-the-line consumer GPU, the RTX 4090, has 24GB of VRAM. Put alongside a maxed-out M2 Ultra with 192GB of unified memory (shared across all compute tasks), it’s clear you can train larger models on these new machines. But was Apple trying to dangle a carrot in front of ML practitioners?
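To put those memory numbers in perspective, here’s a rough back-of-the-envelope sketch (my own, not from the keynote) of how much memory model weights alone consume at half precision. The 7B/13B/70B sizes are just common reference points, and real training needs several times more memory for gradients, optimizer state, and activations.

```python
# Rough, weights-only memory estimate: training needs several times more
# (gradients, optimizer state, activations), so treat these as lower bounds.

def weights_gb(params_billion: float, bytes_per_param: int = 2) -> float:
    """Memory for model weights alone, in GB (fp16 = 2 bytes per parameter)."""
    return params_billion * 1e9 * bytes_per_param / 1e9

for size in (7, 13, 70):  # illustrative LLM sizes, in billions of parameters
    print(f"{size}B params @ fp16: ~{weights_gb(size):.0f} GB of weights")

# ~14 GB, ~26 GB, ~140 GB -- the first barely fits on a 24 GB RTX 4090,
# while all three fit (weights-wise) in 192 GB of unified memory.
```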
There is a lot to unpack here. Just three years ago Apple was still relying on Intel for their CPUs and AMD for their GPUs. I had contemplated using my iMac and an external GPU for ML at the time, but with little software support I ended up building a custom Ubuntu rig instead. The technological landscape has changed dramatically since then. Metal, Apple’s graphics framework, has been incorporated into PyTorch and TensorFlow, the two most popular deep learning frameworks, bringing GPU-accelerated training to the Mac. Apple has also completely transitioned to their own silicon, and the latest M2 Ultra chip can be configured with a 24-core CPU, 76-core GPU, and 32-core Neural Engine. So the real questions are: how does the M2 Ultra stack up against NVIDIA’s GPUs, and can I get real ML work done on a Mac?
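If you’re curious what that looks like in practice, here’s a minimal sketch of using PyTorch’s MPS (Metal) backend. It assumes a recent PyTorch build on Apple silicon and is just the generic device-selection pattern, nothing Apple-specific beyond that.

```python
import torch

# Pick the Metal (MPS) backend when it's available, otherwise fall back to CPU.
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

# Any ordinary PyTorch model and tensors can be moved to the MPS device.
model = torch.nn.Linear(512, 10).to(device)
x = torch.randn(32, 512, device=device)
print(model(x).shape)  # torch.Size([32, 10]), computed on the Apple GPU
```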
To answer these questions what we really need are benchmarks, and it turns out we have some! The folks over at Weights & Biases have been tracking the performance of Apple silicon for a while now. Here is a chart comparing training speeds for a ResNet-50 model (23.5M parameters) on the Oxford-IIIT Pet dataset.
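I don’t have the Weights & Biases harness itself, but a sketch of the kind of measurement looks roughly like this: fine-tune a ResNet-50 on the Oxford-IIIT Pet dataset for a few steps and report images per second. The batch size, optimizer, and step count here are arbitrary choices of mine, not the benchmark’s settings.

```python
import time
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

# Oxford-IIIT Pet, resized to the usual 224x224 ImageNet input.
tfm = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])
data = datasets.OxfordIIITPet("data", download=True, transform=tfm)
loader = DataLoader(data, batch_size=32, shuffle=True, num_workers=2)

model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
model.fc = torch.nn.Linear(model.fc.in_features, 37)  # 37 pet breeds
model = model.to(device)

opt = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
loss_fn = torch.nn.CrossEntropyLoss()

# Time a handful of training steps and report throughput in images/sec.
model.train()
seen, start = 0, time.time()
for step, (x, y) in enumerate(loader):
    x, y = x.to(device), y.to(device)
    opt.zero_grad()
    loss_fn(model(x), y).backward()
    opt.step()
    seen += x.size(0)
    if step == 20:
        break
print(f"{seen / (time.time() - start):.1f} images/sec")
```

Run the same script with "cuda" in place of "mps" on an NVIDIA machine and you get a crude but informative throughput comparison.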
The M1 Ultra with 64 GPU cores, Apple’s fastest chip in this benchmark, is about 3x slower than NVIDIA’s RTX 4090 and approaches the speed of an RTX 3050 Ti laptop GPU. Before you get too disappointed, let me point out a few things. The Ultra chips are actually two Max dies fused together: the M1 Ultra is two M1 Max chips, and the new M2 Ultra is two M2 Max chips. Using the data from the benchmark I calculated a 1.7x speed-up going from the M1 Max to the M1 Ultra. If we see the same improvement going from the M2 Max to the new M2 Ultra, then I think we can expect the M2 Ultra to sit nicely between the Tesla T4 and the GTX 1080 Ti. Now we are getting somewhere. In addition, Apple’s silicon is designed to be extremely efficient and draws only a fraction of the wattage of a typical NVIDIA GPU.
To finish the hardware part of the discussion, let’s circle back to the quote from the keynote. Unified memory will indeed allow you to train large models on Apple silicon, as the keynote suggests, but the benchmark data shows we should be realistic about how long that might take on a Mac. I think most folks are going to stick with NVIDIA for ML workloads, but fine-tuning and tinkering with smaller models are definitely things you can do on this new hardware. I believe Apple’s true interest in ML is less about fast hardware and more about applications in software, which we will go over next.
New Software
The keynote this year was chock-full of technology that uses machine learning. I had trouble keeping track, so I made a list:
realtime transcription of incoming voicemail in the Phone App
transcription of audio messages in the Messages App
pet recognition added to Photos App
keyboard autocorrection is now powered by a transformer model…what the duck (a sketch of the idea follows this list)
dictation also uses a new transformer model
ML powered inspiration for writing entries in the new Journal App
automatic text field identification in PDFs
presenter overlay in video conferencing
adaptive noise canceling and conversation awareness on AirPods
AirPlay learns your preferences and makes suggestions about device pairing
the Smart Stack on watchOS uses ML to surface relevant info when you need it
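About that transformer-powered autocorrection: Apple hasn’t published its model, so here is an illustration of the underlying idea using an open-source stand-in (distilgpt2 via the Hugging Face transformers library). A transformer language model scores which words are likely to come next, which is exactly the signal autocorrect and predictive text need.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# distilgpt2 is an open-source stand-in; Apple's on-device model is not public.
tok = AutoTokenizer.from_pretrained("distilgpt2")
lm = AutoModelForCausalLM.from_pretrained("distilgpt2")

text = "I'll meet you at the"
inputs = tok(text, return_tensors="pt")

with torch.no_grad():
    logits = lm(**inputs).logits[0, -1]  # scores for every possible next token

# The five most likely next tokens -- the raw signal behind word suggestions.
top = torch.topk(logits, k=5).indices
print([tok.decode(int(t)).strip() for t in top])
```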
As you can see, the list is fairly long, and it only covers features announced this year. ML has become deeply integrated across Apple’s software; it’s not just sprinkled into a few applications anymore. What’s more, although Apple is much less prolific in terms of academic contributions, they have been busy innovating in their own way. Apple’s ML models run locally on-device, where energy efficiency, size, and speed are paramount. And of course we, as end users, also want accuracy. It’s really hard to get all of these things at once, and there are always trade-offs.
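To make the on-device constraint concrete, here’s a minimal sketch of the kind of packaging step involved: converting a small PyTorch model to Core ML with coremltools and fp16 compute, trading a little precision for a smaller, faster model. This is the public developer tooling, used here purely as an illustration, not a claim about how Apple builds its own features.

```python
import torch
import coremltools as ct
from torchvision import models

# Trace a small off-the-shelf model so it can be converted to Core ML.
weights = models.MobileNet_V3_Small_Weights.DEFAULT
model = models.mobilenet_v3_small(weights=weights).eval()
example = torch.rand(1, 3, 224, 224)
traced = torch.jit.trace(model, example)

# Convert to a Core ML "ML program" with fp16 compute: a smaller, faster model
# that can be scheduled on the CPU, GPU, or Neural Engine.
mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(shape=(1, 3, 224, 224))],
    convert_to="mlprogram",
    compute_precision=ct.precision.FLOAT16,
)
mlmodel.save("MobileNetV3Small.mlpackage")
```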
Direct evidence of innovation, aside from a phone that can do cool things, is hard to come by. Apple tends to keep a lot of their tech and code private, but for those who are interested there is an area on their website highlighting publications and advancements in ML. One of their posts, titled On-device Panoptic Segmentation for Camera Using Transformers, is a perfect example.
Here, they developed a model that can separate elements of a scene (people, sky, etc.) along with subcomponents such as skin and hair. To paraphrase the article, their technique is fast enough to run in realtime, compact enough to run on mobile, and has minimal impact on battery life. Work like this isn’t drawing headlines, but it’s running on millions of devices.
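Apple’s segmentation network isn’t something you can download, but if you want to see what panoptic segmentation output looks like, here’s a sketch using an open-source Mask2Former model from Hugging Face as a stand-in; photo.jpg is a placeholder for any image on disk.

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, Mask2FormerForUniversalSegmentation

# An open-source panoptic model as a stand-in; Apple's network is not public.
name = "facebook/mask2former-swin-tiny-coco-panoptic"
processor = AutoImageProcessor.from_pretrained(name)
model = Mask2FormerForUniversalSegmentation.from_pretrained(name)

image = Image.open("photo.jpg")  # placeholder: any photo on disk
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Post-process into one label per pixel plus a list of detected segments.
result = processor.post_process_panoptic_segmentation(
    outputs, target_sizes=[image.size[::-1]]
)[0]
for seg in result["segments_info"]:
    print(model.config.id2label[seg["label_id"]], f"{seg['score']:.2f}")
```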
Apple Vision Pro
And now we get to the elephant wearing VR goggles in the room. If you watched the keynote, I think you will agree that the Vision Pro looks like something out of a sci-fi movie. The interface uses your voice, eyes, and hands to control applications that appear to hover in front of your furniture. It looks amazing. And ML is clearly being used everywhere: gestures, pose, and voice in the UI are powered by ML; detecting people entering your field of vision requires ML; and reconstructing your face so you don’t look like one of the guys from Daft Punk during FaceTime calls requires ML.
Are the underlying ML models innovative? In terms of the basic building blocks, probably not. But as I mentioned earlier, getting models that run in realtime on a device with resource and power constraints is.
Summary
It wasn’t the notes of ‘machine learning’ and ‘neural networks’ sprinkled throughout the announcements, or the reveal of the Vision Pro, that put Apple on the ML innovators list for me today. It was the realization that, after years of quietly investing, developing, and experimenting, Apple has become a leader in deploying ML across their software and hardware ecosystem.