Creating the Biclomap Technical Foundation

Looking back at the half year since the Biclomap project started, a lot has happened. Things could have advanced quicker, but hey, we are doing this in our free time, when not riding our bikes ;-) Add to this that the pandemic accelerated our professional activity, so some of our resources got diverted to our “real world” customers. Biclomap is a long-term vision, and this foundation is only one small, but important, step towards getting this application suite onto your bikes!

Technical Foundation Vision

Biclomap is intended to be the “Biker Enthusiasts Free Application Suite”. To keep it that way, the technical foundation must be chosen carefully. Running costs should be kept as low as possible without hindering our ability to serve bikers. No technical limitation should get in the way of providing the mobile application to bikers. With this in mind, we came up with this initial set of requirements:

  • Run the infrastructure in the cloud
  • Have the simplest possible architecture
  • Keep architecture cloud-provider agnostic
  • Minimize production server maintenance
  • Mobile application should run on any user device
  • Keep cloud running costs to the minimum possible
  • Use state-of-the-art security measures
  • Try and keep the barrier to entry as low as possible for new Biclomap developers

Laying the technical foundation means working in several directions more or less simultaneously. It is not a linear process; the following topics had to be explored at the same time:

  • Choosing the source-code repository location and structure
  • Choosing the CI infrastructure
  • Choosing the technology bricks, starting with the programming languages

The source-code repository

When thinking about a source-code repository, most people would automatically choose GitHub. This was also our initial thought, but then we heard about the GitLab open source program. We contacted them and had the pleasant surprise of getting several Gold subscriptions for our project. This settled both the source-code hosting and the CI infrastructure choice. The next step was to structure the code, and we came up with this project group. As this is an open-source project, it goes without saying that anyone is welcome to inspect it and contribute to it.

The Biclomap group on GitLab has a wiki page containing useful information, and it is structured in several repositories:

  • be is the source-code repository for the back-end
  • mobile-app is the source-code repository for the front-end mobile application
  • www contains the source-code of the main website
  • media is dedicated to some media files

GitLab also offers issue tracking, which allows us to organize and document the development. Issues can be tracked at the group level as well as in each repository. You will find a wealth of information there about the project’s evolution, but for the moment keep reading this article before going through those issues. Here you’ll get the overview you need to dive deeper into the project’s structure.

Defining the back-end architecture

Cloud architecture has been around for more than a decade, but it has gained momentum lately. The traditional player in this field is Amazon Web Services (AWS). Their free-tier offering looks very convenient for a free and open-source project like Biclomap, so we initially settled on it.

AWS came up with a serverless cloud architecture, where one simply uploads the application’s code to their infrastructure and lets them run it. This looked like a perfect answer to the production server maintenance issue. Here is why. In a traditional set-up, we would have had to configure Linux servers. Later on, these servers would have to be maintained, patched with security updates, upgraded to the latest distribution packages, backed up, and fixed in case of unforeseen situations. This is easily handled in a professional set-up, where a sysadmin team is hired specifically for these tasks, but we simply cannot afford it when offering a free application suite.

So off we go with serverless for the Biclomap back-end. AWS calls this Lambda.

Enter Infrastructure as Code with Terraform

AWS offers a pretty usable AWS Console that lets one create all of the needed artefacts with the mouse, but we quickly understood that relying on it alone would lead us into a maintenance trap. GUI applications appear simple to use, but they never give a complete overview of, and control over, the resources they manage. It’s very easy to overlook things in a GUI and, moreover, you don’t get version tracking with it, so you can never reconstruct how changes occurred. Terraform has addressed this problem for a long time, so we decided to create the AWS resources with Terraform from the very beginning. Using Terraform rather than AWS CloudFormation also serves the cloud-agnostic principle we laid down when starting this project.

Choosing the back-end programming language

AWS Lambda can be programmed in several languages. According to the principles we laid down, the resulting functions should be portable to another cloud infrastructure in order to be future-proof. We are a free and open-source project, so we cannot afford to get trapped by some costly feature that may come up.

Another important point to keep in mind while creating lambda functions is the billing mode. AWS bills the execution time, and initialization time is part of this execution time. Imagine you just deployed a new lambda version. For the time being, it sits as a binary in some AWS storage resource. Then a first call to the API Gateway comes in, triggering a lambda call and starting a billing counter. Before serving that first call, the lambda needs to be initialized. Once it is initialized, the actual lambda call takes place, then control returns to the API Gateway, stopping the billing counter. Subsequent calls find the lambda already initialized, so the billed time is lower. If client requests pause, the AWS Lambda infrastructure will eventually shut our lambda down in order to free up server resources. From that point on, the next API Gateway call triggers another initialization sequence. The initialization sequence is therefore something important to keep in mind when creating our lambda function.

Another important point to keep in mind when selecting the programming language is community support. The Java programming language is quite widespread and has a large community, but one of its main penalties is initialization time. So we initially chose to go with the Python programming language.

Looks like Python also became bloated!

Discussions with our Python expert colleagues led us to consider the Flask framework. So off we went, trying to set up our lambda function using Flask. The very first API call would be a ping function, allowing one to check the server state. You can check the code here.

The code state in the above commit shows the last step we reached. The previous steps were:

  • we first tried to package the application binary in a zip directly deployable with Terraform. This was not possible, as the zip grew too large for direct deployment via the AWS API (this API is used internally by Terraform)
  • we then modified the Terraform scripts so that the zip was first uploaded to an AWS S3 bucket, prior to deploying the lambda. This worked, but when we invoked the lambda we hit another limit: unzipped lambda code cannot exceed 160 MB! So, our very trivial ping function needed more than 160 MB of bloat in order to function.
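A check like the one below could catch this kind of problem before deployment. It is a minimal sketch using only the Go standard library: the function names unzippedSize, fitsLimit and buildZip are hypothetical, the file names are made up, and the limit is passed in as a parameter (here set to the 160 MB figure mentioned above) rather than taken from any AWS API.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
)

// unzippedSize sums the uncompressed size of every entry in the archive.
func unzippedSize(r *zip.Reader) uint64 {
	var total uint64
	for _, f := range r.File {
		total += f.UncompressedSize64
	}
	return total
}

// fitsLimit reports whether the archive unpacks to at most limit bytes.
func fitsLimit(r *zip.Reader, limit uint64) bool {
	return unzippedSize(r) <= limit
}

// buildZip creates an in-memory archive, standing in for a real
// deployment package so the example stays self-contained.
func buildZip(files map[string][]byte) (*zip.Reader, error) {
	var buf bytes.Buffer
	w := zip.NewWriter(&buf)
	for name, data := range files {
		f, err := w.Create(name)
		if err != nil {
			return nil, err
		}
		if _, err := f.Write(data); err != nil {
			return nil, err
		}
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	return zip.NewReader(bytes.NewReader(buf.Bytes()), int64(buf.Len()))
}

func main() {
	// Hypothetical package content: a tiny handler plus a larger dependency.
	r, err := buildZip(map[string][]byte{
		"handler.py":  make([]byte, 1024),
		"deps/lib.so": make([]byte, 4096),
	})
	if err != nil {
		panic(err)
	}

	const limit = 160 << 20 // 160 MB, the figure quoted in the text above
	fmt.Println("unzipped bytes:", unzippedSize(r), "fits:", fitsLimit(r, limit))
}
```

For a real package, zip.OpenReader on the file produced by the build would replace buildZip, and the check could run as a CI step before Terraform is invoked.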

Hitting those size limits while implementing a very simple ping function made us consider abandoning the Python ecosystem. It looks like the popularity of a language works against it: too many people adding too much bloat leads to oversized functions. This would clearly also have an impact on the initialization time, with a billing penalty.

Let’s use Go

The Go language was created by Google with the cloud and the modern parallel computing paradigm in mind from the very beginning. Its traction keeps going up, helped by the fact that it allows one to write serverless code in a cloud-provider-agnostic way. So, we decided to give it a go! (pun intended)

The next commits in the repository, after the Python attempt mentioned above, show it in action. The code gets compiled into a single executable file with a “reasonable” size of 6 MB. This allowed for easy direct deployment, and execution was a breeze. Also, AWS CloudWatch logs show quite a speedy initialization time. From that point on we settled on the Go language. It later turned out to be quite a pragmatic programming language, preventing us from coming up with the usual theoretical garbage some other languages allow (did you ever see how J2EE was architected?).

Find it interesting?

This is the first post about Biclomap’s technical infrastructure. If you find this interesting, or want to know more, just drop us a line. Or simply stay tuned: we’ll follow up with more technical blog posts.