We recently upgraded from PHP to Go for our Face Recognition API architecture. In this article we share why we made the move, how it’s radically improved performance, and some lessons learned along the way.
As we strive to be the premier Face Recognition partner to businesses across the world, we constantly improve our platform to upgrade our products and enhance customer experience. And while undertaking a big rewrite such as this was not a small decision, ultimately, it was an easy one to make. Thankfully the impact has been tangible and real, in some cases even greater than we envisioned.
Before we dig in, I will say, this article gets quite technical. If anything here doesn’t make sense and you’re interested to learn more, feel free to send us a message-- we’d love to help!
Ready, Set, Go!
After our 2017 #DiversityRecognition demo became an Internet phenomenon, we saw massive, sustained growth in traffic-- alerting us to hitherto unknown performance issues, and to the cost of supporting an ultra-high usage app. Change was certainly on our agenda, and this event catapulted that change right to the top of our list.
After researching various options, including moving to PHP 7, Java and other languages, we made the decision to migrate to a pure Go and C solution-- it made the most sense for our objectives. Considering our original face recognition API architecture was written in a combination of PHP, Python and C, streamlining our code base was a major objective. No more band aids!
So, in line with our agile philosophy, we started small and rewrote our PHP demo applications in Go, rather than start with the full API. This approach meant we could deal with any unexpected challenges as we went along, keep our code efficient, and gather rapid feedback from users. In my experience, choosing a programming language is one thing; ultimately how you write your program is equally (if not more) important.
"With the increased efficiencies of Go between the multi-threading, reduced deployment size, smaller memory footprint, and running overall fewer Docker containers, we were able to reduce the count of hosts in our Kubernetes cluster by over 50%.”
- Cole Calistra, CTO, Kairos.
Benefits of upgrading to Go
As I touched on above, there were other options on the table. However, Go represented the greatest net win with regard to the following key aspects:
With our PHP build, customers experienced API transactions that took 1-2 seconds. In moving to Go, the same transactions (or ‘calls’) are now 300ms or less. Remarkably, some lower-level functions are coming in under 100ms. Overall we’re averaging an 8X speed increase since moving to Go.
Ease of deployment
With PHP we had to deploy containers that carried a full Linux distribution, an Nginx install, and PHP-FPM. The containers also had to include a number of libraries and drivers to connect PHP to other services such as MySQL and Redis.
With Go, we can deploy a slimmed down Linux install, Debian-Slim, and only need to distribute the single Go executable that gets generated. One 528MB PHP Docker image shrank to 267MB once rewritten in Go, and another 366MB image shrank to 170MB. In effect, we cut our image footprint roughly in half.
This may not seem like a huge change for a single container, as many machines can handle running multi-gigabyte containers without issue. However, at scale, when you are trying to pack as many containers as possible onto each host, size becomes a factor both for overall capacity and for the speed of spinning up new containers.
We leveraged many of Go’s built-in features, including channels and goroutines, to make better use of the CPU and memory capacity of each container.
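To illustrate the pattern, here is a minimal sketch of a worker pool built from a channel and a handful of goroutines. It is not the production code: `processAll`, its parameters, and the squaring task are hypothetical stand-ins for whatever per-request work an API needs to spread across cores.

```go
package main

import (
	"fmt"
	"sync"
)

// processAll fans the inputs out to a fixed number of worker goroutines over
// a channel and returns the results in input order.
// work is a hypothetical stand-in for any per-item task.
func processAll(inputs []int, workers int, work func(int) int) []int {
	if workers < 1 {
		workers = 1 // at least one worker, otherwise sending on jobs would block forever
	}
	results := make([]int, len(inputs))
	jobs := make(chan int) // carries indexes into inputs

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				// Each index is handed to exactly one worker, so no lock is needed.
				results[i] = work(inputs[i])
			}
		}()
	}

	for i := range inputs {
		jobs <- i
	}
	close(jobs) // lets the workers' range loops finish
	wg.Wait()
	return results
}

func main() {
	square := func(n int) int { return n * n }
	fmt.Println(processAll([]int{1, 2, 3, 4, 5}, 3, square)) // prints [1 4 9 16 25]
}
```

Capping the number of workers keeps concurrency bounded, so a burst of requests does not translate into an unbounded number of goroutines competing for the same container’s CPU.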
With the increased efficiencies of Go between the multi-threading, reduced deployment size, smaller memory footprint, and running overall fewer Docker containers, we were able to reduce the count of hosts in our Kubernetes cluster by over 50%.
We previously ran over 200 containers of the PHP API to handle our baseline traffic, topping out at around 600-700 containers at peak load. With our new Go deployment, we run fewer than 20 containers at baseline and peak at around 40-50, under higher load than we ever saw on our PHP API.
- None of our engineers had production Go experience, but they were experts in a number of other languages including PHP, Python, Ruby, Java, C++, and C. This provided an opportunity for us to learn how to develop in Go on a real world project with a big impact for the company and our customers. It also helped to have a familiar and important task to drive our learning. This was preferable to creating a simple “Hello, World!” type of application. We needed to get our hands dirty in production.
- Continuous learning is a core tenet of the culture here at Kairos. Making our engineers more marketable to the outside world puts the onus on Kairos to be a better place to work-- it doesn’t help anyone if we root our team in outdated skills. Not to mention, Go is an extremely marketable skill-- according to this study by PayScale it’s the #2 most in-demand language.
- Golang’s mass appeal has spawned many popular open-source projects like Docker and Kubernetes. Having implemented such tools in our infrastructure and learned from their coding best practices, we continue to improve our skills and strategies by studying the work of other top engineers.
Things we learned along the way
With any new software development assignment, the expectation is that you are learning throughout the lifecycle of the project. This was no different. Below are some key takeaways from our experience with leaving PHP for Go:
Memory leaks & garbage collection with Cgo and Go
Do not assume that memory is leaking, or guess which objects may be leaking it. Profiling your code will give you the answers. The garbage collector is responsible for removing any unused objects that are hanging around, but only objects that nothing references any more. If references to an object do exist, the garbage collector can’t touch it. This includes Cgo objects that are being referenced in other parts of your code.
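The point about references can be seen in a small, pure-Go sketch (no Cgo involved). The `cache` variable and the `measure` and `heapInUse` helpers are hypothetical names chosen for illustration; the sketch assumes only the standard `runtime` package. It forces a collection at three points and shows that memory reachable from a live reference survives, while the same memory is reclaimed once the reference is dropped.

```go
package main

import (
	"fmt"
	"runtime"
)

// cache is a hypothetical package-level reference. Anything reachable from it
// is considered live and will not be collected.
var cache [][]byte

// heapInUse forces a garbage collection and reports the bytes of heap
// objects that are still allocated afterwards.
func heapInUse() uint64 {
	runtime.GC()
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.HeapAlloc
}

// measure returns heap usage before allocating, while the buffers are still
// referenced from cache, and after the reference has been dropped.
func measure() (before, held, released uint64) {
	before = heapInUse()
	for i := 0; i < 64; i++ {
		cache = append(cache, make([]byte, 1<<20)) // 64 buffers of 1 MiB each
	}
	held = heapInUse() // the GC ran, but cache still points at the buffers
	cache = nil        // drop the only reference
	released = heapInUse()
	return before, held, released
}

func main() {
	before, held, released := measure()
	fmt.Printf("before: %d MiB, held: %d MiB, released: %d MiB\n",
		before>>20, held>>20, released>>20)
}
```

What looks like a leak is often exactly this: a long-lived map, slice, or global that still points at data nobody needs any more.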
Issues with marshalling JSON objects from Redis
The challenge here was to find a way to capture a very large Redis result and conveniently format that result into a data structure. Our main goal was efficiency (speed) and, as we discovered, the standard JSON package isn’t feasible for our particular use case. The next step in squashing this issue is to process the raw data that Redis sends us, avoiding unnecessary mangling of the JSON.
Gather data before optimizing your code: Go’s profiling tools will save you time
As mentioned, optimizing without the proper context is time wasted. You have to find the bottlenecks in your code that may be causing your issue. This is where profiling comes into play-- it shows you, or at least gives you a good idea of, where the issue is. This process saves time.
If you are a new Go developer, it is recommended to profile your code often. The more you do, the better you will get at it and your future Go projects will improve.
How Go manages memory internally
Go is a garbage collected language, so it has a runtime that manages memory allocation for you. But be careful: the garbage collector is not a substitute for well-written and efficient code.
Escape Analysis (the way Go determines where memory is allocated)
The Go compiler performs a step called ‘escape analysis’. It determines, at compile time, whether a value can live on the stack or has to be allocated on the heap. It’s easy to brush over this, especially in larger apps that perform vast numbers of allocations. Concurrent apps, like REST-based APIs, are examples of where you will feel this the most.
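A small sketch makes the stack-versus-heap distinction visible. The `point` type and the two functions are hypothetical; the sketch uses `testing.AllocsPerRun` from the standard library to count heap allocations, and `//go:noinline` to stop the compiler from optimizing the difference away. Building with `go build -gcflags=-m` prints the compiler’s own escape decisions for comparison.

```go
package main

import (
	"fmt"
	"testing"
)

type point struct{ x, y int }

// Package-level sinks keep the compiler from discarding the results.
var (
	ptrSink *point
	intSink int
)

// newPoint returns a pointer to a local value. The value outlives the call,
// so it escapes and has to be allocated on the heap.
//
//go:noinline
func newPoint(x, y int) *point {
	p := point{x, y}
	return &p
}

// sumPoint uses a local value that never leaves the function, so it can stay
// on the stack.
//
//go:noinline
func sumPoint(x, y int) int {
	p := point{x, y}
	return p.x + p.y
}

// allocCounts reports the average heap allocations per call for each function.
func allocCounts() (escaping, local float64) {
	escaping = testing.AllocsPerRun(1000, func() { ptrSink = newPoint(1, 2) })
	local = testing.AllocsPerRun(1000, func() { intSink = sumPoint(1, 2) })
	return escaping, local
}

func main() {
	escaping, local := allocCounts()
	fmt.Printf("newPoint: %.0f allocs/call, sumPoint: %.0f allocs/call\n", escaping, local)
}
```

Every value that escapes is one more object for the garbage collector to track, which is why this matters in a busy API handler.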
Re-using objects instead of re-allocating objects that are used over and over
For efficiency, look for opportunities to re-use previously allocated data structures. Otherwise, those data structures or objects will get re-allocated and the garbage collector will have more work to do.
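One common way to do this in Go is `sync.Pool` from the standard library. The sketch below is an illustration under that assumption, not necessarily the approach used in our API; `bufPool` and `renderGreeting` are hypothetical names.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool hands out reusable buffers instead of allocating a new one per call.
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

// renderGreeting builds a string using a pooled buffer.
func renderGreeting(name string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset()            // a pooled buffer may still hold data from a previous use
	defer bufPool.Put(buf) // return it so the next caller can re-use it

	buf.WriteString("hello, ")
	buf.WriteString(name)
	return buf.String() // String copies the bytes, so the result is safe after Put
}

func main() {
	fmt.Println(renderGreeting("kairos")) // prints hello, kairos
	fmt.Println(renderGreeting("gopher")) // prints hello, gopher
}
```

The pool may drop idle objects during garbage collection, so it suits short-lived scratch space such as buffers rather than anything that must persist.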
Go towards the future
As you can see, Kairos’ experience moving from PHP to Go wasn’t without its challenges. Yet our collective learnings, combined with the successful outcome of the project, mean we get to pass on all the benefits to our customers.
With this new foundation for growth we’re also opening up new engineering roles -- it’s a great time to join our family. Check out our latest Developer Jobs at Kairos.
Cole is the CTO at Kairos, a Human Analytics startup that radically changes how companies understand people. He loves all things cloud and making great products come to life.