Convenience functions in Python for saving Keras models directly to S3

Keras is a very popular framework, developed at Google, for building and training machine learning models, and it has become somewhat ubiquitous within the domain. Now I’m no data scientist or machine learning expert, but in my work I am presented with problems related to building things that make machine learning and its related applications easy for data scientists to use.

Serialized machine learning models are almost always binary files, which makes them a poor fit for conventional version control systems such as git. The solution for this is to put them in an object store such as AWS S3, where they can be stored, updated and used by different data scientists on the same team. However, Keras by default saves its models as a folder structure. Take the simple model:

import numpy as np
from tensorflow import keras
inputs = keras.Input(shape=(32,))
outputs = keras.layers.Dense(1)(inputs)
model = keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="mean_squared_error")"my_model")

which generates the following folder structure:

Folder structure generated by")

which is only a very simple example of the various folders that Keras could generate depending on the type of model you are creating. By giving the top-level folder name in the fashion of:

model = keras.models.load_model("my_model")

you would be able to load the model for use later.

Problem: Enable easy export of Keras models to S3 without needing to traverse through the generated folder structure in code, and enable easy fetching of a model exported in such a manner so that it can be immediately loaded by Keras.

Solution: Zip up the folder structure generated by Keras in a temporary folder. Upload the zipped file to S3. When loading a model, download the corresponding zip file from S3 in to a temporary folder, unzip it, and load it from there.

Gist for the complete code.

Let’s say we have a simple Keras model like what was outlined above:

inputs = keras.Input(shape=(32,))
outputs = keras.layers.Dense(1)(inputs)
model = keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="mean_squared_error")

We’re going to use Python’s tempfile library to save this model in a temporary location:

import tempfile

with tempfile.TemporaryDirectory() as tempdir:"{tempdir}/{model_name}")

By using the temporary directory context manager, with tempfile.TemporaryDirectory(), we ensure that the temporary directory and everything in it is deleted as soon as we leave that context block.
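A quick way to convince yourself of this cleanup behaviour, using only the standard library:

```python
import os
import tempfile

# Create a temporary directory, note its path, and check it exists inside the block
with tempfile.TemporaryDirectory() as tempdir:
    saved_path = tempdir
    exists_inside = os.path.isdir(tempdir)

# Once the context block exits, the directory (and anything in it) is gone
exists_after = os.path.isdir(saved_path)
print(exists_inside, exists_after)
```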

Next, we zip it up:

import zipfile

zipf = zipfile.ZipFile(f"{tempdir}/{model_name}.zip", "w", zipfile.ZIP_STORED)
zipdir(f"{tempdir}/{model_name}", zipf)
zipf.close()

This uses a zipdir function which traverses the folder with the Keras model in it, and adds it to the given zip file:

def zipdir(path, ziph):
  # Zipfile hook to zip up model folders
  length = len(path)
  for root, dirs, files in os.walk(path):
    folder = root[length:] # Stop zipping parent folders
    for file in files:
      ziph.write(os.path.join(root, file), os.path.join(folder, file))

Now, we can use an s3fs object to write the zipped file to the S3 bucket we need:

import s3fs

fs = s3fs.S3FileSystem(key=AWS_ACCESS_KEY, secret=AWS_SECRET_KEY)
fs.put(f"{tempdir}/{model_name}.zip", f"{BUCKET_NAME}/{model_name}.zip")

To get this file back and use it in Keras, we have a simple function that uses all the above libraries to reverse the process:

def s3_get_keras_model(model_name: str) -> keras.Model:
  with tempfile.TemporaryDirectory() as tempdir:
    s3fs = get_s3fs()
    # Fetch and save the zip file to the temporary directory
    s3fs.get(f"{BUCKET_NAME}/{model_name}.zip", f"{tempdir}/{model_name}.zip")
    # Extract the model zip file within the temporary directory
    with zipfile.ZipFile(f"{tempdir}/{model_name}.zip") as zip_ref:
      zip_ref.extractall(f"{tempdir}/{model_name}")
    # Load the keras model from the temporary directory
    return keras.models.load_model(f"{tempdir}/{model_name}")

Put everything together, and we have a simple implementation of saving Keras models in their entirety to S3 and getting them back without having to think about traversing nested folder structures created when saving Keras models.

Generate and track metrics for Flask API applications using Prometheus and Grafana

The code for this entire implementation can be found here:

Flask is a very popular lightweight framework for writing web and web service applications in Python. In this blog post, I’m going to talk about how to monitor metrics on a Flask RESTful web service API application using Prometheus and Grafana. We’ll be tying it all together using docker-compose so that we can run everything using a single command, in an isolated Docker network.

Prometheus is a time-series database that is extremely popular for metrics and monitoring, especially with Kubernetes. Prometheus is really cool because it is designed to scrape metrics from your application, instead of your application having to actively push metrics to it. Coupled with Grafana, this stack turns into a powerful metrics tracking/monitoring tool, which is used in applications the world over.

To couple Flask with Prometheus and Grafana, we’re going to use the invaluable prometheus_flask_exporter library. This library allows us to create a /metrics endpoint for Prometheus to scrape with useful metrics regarding endpoint access, such as time taken to generate each response, CPU metrics, and so on.

The first thing we need to do in order to set up is to create our Flask app. Here’s a really simple one with the exporter library included:

import logging

from flask import Flask
from flask import jsonify
from prometheus_flask_exporter import PrometheusMetrics

logging.basicConfig(level=logging.INFO)"Setting LOGLEVEL to INFO")

api = Flask(__name__)
metrics = PrometheusMetrics(api)"app_info", "App Info, this can be anything you want", version="1.0.0")

@api.route("/flask-prometheus-grafana-example/")
def hello():
    return jsonify(say_hello())

def say_hello():
    return {"message": "hello"}

This code just returns a “hello” message when you access the /flask-prometheus-grafana-example/ endpoint. The important part here is the integration of the prometheus_flask_exporter library. All you have to do is initialize a metrics object using metrics = PrometheusMetrics(yourappname) to get it working. After that, it will automatically start exporting metrics for each endpoint to the /metrics endpoint of your application. If you go to your app’s /metrics endpoint after running it, you’ll be greeted with something like this:

Now to set up Prometheus and Grafana. For Prometheus, you need a prometheus.yml file, which would look something like this:

# my global config
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
      - targets: ['example-prometheus:9090']

  - job_name: 'flask-api'
    scrape_interval: 5s
      - targets: ['flask-api:5000']

In this example, we see that Prometheus is watching two endpoints: itself (example-prometheus:9090) and the Flask API (flask-api:5000). These names are arbitrarily set inside the docker-compose config file, which we will get to later.

For Grafana, we need a datasource.yml file:

# config file version
apiVersion: 1

# list of datasources that should be deleted from the database
  - name: Prometheus
    orgId: 1

# list of datasources to insert/update depending on
# what's available in the database
  # <string, required> name of the datasource. Required
- name: Prometheus
  # <string, required> datasource type. Required
  type: prometheus
  # <string, required> access mode. direct or proxy. Required
  access: proxy
  # <int> org id. will default to orgId 1 if not specified
  orgId: 1
  # <string> url
  url: http://example-prometheus:9090
  # <string> database password, if used
  # <string> database user, if used
  # <string> database name, if used
  # <bool> enable/disable basic auth
  basicAuth: false
  # <string> basic auth username, if used
  # <string> basic auth password, if used
  # <bool> enable/disable with credentials headers
  # <bool> mark as default datasource. Max one per org
  isDefault: true
  # <map> fields that will be converted to json and stored in json_data
     graphiteVersion: "1.1"
     tlsAuth: false
     tlsAuthWithCACert: false
  # <string> json object of data that will be encrypted.
    tlsCACert: "..."
    tlsClientCert: "..."
    tlsClientKey: "..."
  version: 1
  # <bool> allow users to edit datasources from the UI.
  editable: true

In this file, we are defining datasources.url, which is also derived from the name of the prometheus container on the Docker network via the docker-compose file.

Finally, we have a config.monitoring file:

GF_SECURITY_ADMIN_PASSWORD=pass@123
This basically means we’ll be logging in to our Grafana dashboard using username: admin and password: pass@123.

Next, we’re going to load this all up using a Docker Compose file:

version: "3.5"

      context: ./api
    restart: unless-stopped
    container_name: flask-api
    image: example-flask-api
      - "5000:5000"

    image: prom/prometheus:latest
    restart: unless-stopped
    container_name: example-prometheus
      - 9090:9090
      - ./monitoring/prometheus.yml:/etc/prometheus/prometheus.yml
      - '--config.file=/etc/prometheus/prometheus.yml'

    image: grafana/grafana:latest
    restart: unless-stopped
    user: "472"
    container_name: example-grafana
      - example-prometheus
      - 3000:3000
      - ./monitoring/datasource.yml:/etc/grafana/provisioning/datasources/datasource.yml
      - ./monitoring/config.monitoring

    name: example-network
    driver: bridge
      driver: default
        - subnet:

Note that we’re creating our own Docker network and putting all our applications on it, which allows them to talk to each other. However, this would be the same if you didn’t specify a network at all. Also important to note is that am using wsgi to run the Flask application.

Once Grafana is up, you should be able to log in and configure Prometheus as a datasource:

Add Prometheus as a datasource for Grafana

Once that’s done, you can use the example dashboard from the creator of the prometheus_flask_exporter library (use Import->JSON) which can be found here:

This gives you a cool dashboard like this:

Grafana dashboard for Flask metrics

As you can see, this gives us a killer implementation of Prometheus + Grafana to monitor a Flask web service application with minimum effort.

I wish everything in software development was this easy.

Using a Redis feeder with Gatling

Gatling is a popular open source load testing tool that allows you to write load tests as code. It is based on Scala, which means you get more out of it while writing less code, and it allows a great deal of flexibility in how you design your load test scenarios. It can be used to send millions of requests to an application within a small amount of time, emulating different users working with different use cases.

In some cases, it is more practical to send actual data using Gatling without having to randomly generate test data. When sending millions of actual records, Gatling provides the capability of integrating one of many different data feeders, which can be wired up to provide Gatling with a constant stream of data to send to your application.

I was presented with such a need: sending millions of records pulled out of a PostgreSQL database (after some transformation) using Gatling to a set of Lambda-based web services hosted on AWS as part of a load test. Since there was a transformation step, and the same dataset would be needed for different scenarios across multiple test runs, I decided to store the data items in Redis after transformation for speed and quick access from Gatling (as Gatling would be running on the same box).

A quick docker-compose setup for Redis and I had my database:

version: "3.7"
    image: redis:latest
    container_name: data_cache
      - '/etc/redis/data/:/data/'
      - 6379:6379
      - 6379

Now we can move on to initializing and using the Redis feeder within our Simulation class in Gatling. The following imports are needed in the Scala class:

import com.redis._
import io.gatling.redis.Predef._
import io.gatling.redis.feeder.RedisFeederBuilder

Assuming that the items that need to be sent to the application by Gatling are stored in a list on Redis, the feeder can be initialized and integrated as follows:

val dataFeederPool = new RedisClientPool("localhost", 6379)
val myDataFeeder: RedisFeederBuilder = 
  redisFeeder(dataFeederPool, "mydatalist").LPOP

Here we have used the Redis LPOP function; the Gatling Redis feeder provides the capability to use SPOP and SRANDMEMBER as well. (LPOP means items will be popped out of the head of a list in Redis.)
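To make the LPOP semantics concrete, here is a plain-Python illustration (a list standing in for Redis, not the Gatling or Redis API): each pop hands out the next record from the head of the list, so no two virtual users receive the same item.

```python
# Illustration only: a Python list standing in for a Redis list
redis_list = ["record-1", "record-2", "record-3"]

def lpop(lst):
    # Pop from the head of the list, like Redis LPOP; None when exhausted
    return lst.pop(0) if lst else None

first = lpop(redis_list)
second = lpop(redis_list)
print(first, second, redis_list)
```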

Integrating the feeder into a scenario is just as simple:

val myScn: ScenarioBuilder = scenario("Post an item")
  .feed(myDataFeeder)

After integrating the feeder, you can implement whatever checks, extra parameters or conditional statements that you need.

GSoC 2019 is here!

I’ve always been a huge fan of the Google Summer of Code program, having been involved with it in many capacities over the years.

The 2019 iteration of GSoC has been formally announced and the student application period starts soon.

Here’s the official word from Google:

Google Summer of Code (GSoC) will begin accepting student applications on March 25, 2019. The global, online program is designed to engage university students with open source software communities.  

GSoC is a highly competitive and unique program – in the past 14 years we have had over 14,000 students from 109 countries accepted into the program. Participants are paired with a mentor to help introduce them to an open source community and guide them with their project. GSoC gives students invaluable real world experience and an opportunity to receive a stipend to contribute to open source software.

For the 2019 program 206 open source organizations have been accepted into the program. Now is a great time for students to check out project ideas and reach out to the organizations to learn more!

We would appreciate your help to spread the word about Google Summer of Code to local university students. Check out the resources below:

  • Website:
  • Student application window: March 25 – April 9, 2019
  • Flyers, Slide Decks, Logos:
  • Project Topics Include: Cloud, Operating Systems, Machine Learning, Graphics, Medicine, Programming Languages, Robotics, Physics, Science, Security and many more.
  • Sample Twitter: University students receive mentorship and a stipend to work on open source software through Google Summer of Code. Applications are open March 25 – April 9!

Details of the program are available on the program site and be sure to check out the Advice for Students for quick tips and the Student Guide for more details on the whole program.

Please contact us at if you have any questions.

Commit notifications for Amazon CodeCommit using a Lambda Function and a Telegram Bot

Telegram is one of my favorite chat applications – it provides security, super speed, and a myriad of other features that you don’t find in almost any other chat service. Another such service is their Bot framework, which allows the creation of chat bots for Telegram that can do so many different things. Some of these are bots for services such as Gitlab and Bitbucket, which work off of webhooks to send commit details for repositories to a given chat on Telegram.

Amazon Web Services’ CodeCommit is a managed version control service provided by AWS, which can be configured to use Git as its underlying platform. When using CodeCommit, I wanted to send notifications on new commits to a Telegram group similarly to what the aforementioned bots for Gitlab and Bitbucket do. So I set out to achieve this using triggers from CodeCommit, a Lambda function and a very simple Telegram bot.

On a push to a given repository, CodeCommit will invoke a trigger which calls an AWS Lambda function, which in turn makes a RESTful web service call to the Telegram Bot API.

As seen above, I created a Lambda function which is triggered by specific events on the CodeCommit repository, which contains the code needed to send a RESTful GET request to the Telegram Bot API to send a message to a specific group using the Telegram Bot.

So, let’s get down to brass tacks:

  1. Create the Telegram Bot: All you need to do is chat with @BotFather on Telegram. This is a Telegram Bot that helps you create new bots. Through some simple commands, you can get a new bot created for yourself and receive an auth token generated as well:

    Chat with BotFather to get your bot created

    NOTE: While @BotFather insists that it may come back some day to me with a request of its own, it has not done so yet, so I’m hoping I won’t be finding any severed horse heads on my bed anytime soon.

  2. Assuming you already have an AWS CodeCommit repository, create your Lambda function: You can do this the other way around as well, but I prefer to create the Lambda function and then bind the CodeCommit repository to it, as AWS lets you do this very easily. You can specify which repository you want to work with, what to name the trigger etc on the create trigger workflow on the Lambda function:

    Drag and drop a CodeCommit trigger from the left

    Then choose your repository, name your trigger etc below

    Once you’ve configured this stuff, you can go ahead and…

  3. Code your Lambda function: Here’s the sample code:
    var http = require('https');
    var AWS = require('aws-sdk');
    exports.handler = (event, context) => { 
        var codecommit = new AWS.CodeCommit({ apiVersion: '2015-04-13' });
        // Build Telegram Bot URL (redacted here; of the form
        //<token>/sendMessage?chat_id=<chat_id>&text=)
        var baseUrl = "";
        // Get the commit IDs from the event
        var commits = event.Records[0]
            function(reference) {
                return reference.commit;
            });
        console.log('CommitId:', commits);
        // Get the repository from the event and use it to get details of the commit
        var repository = event.Records[0].eventSourceARN.split(":")[5];
        codecommit.getCommit({
            commitId: commits[0],
            repositoryName: repository
        }, function(err, data) {
            if(err) {
                console.log(err);
            } else {
                var commit = data.commit;
                var commitDetails = 'New commit to my repo: \nRef: ' + event.Records[0].codecommit.references[0].ref 
                    + '\n' + 'Message: ' + commit.message + '\nAuthor: ' + + ' <' + + '>';
                var url = baseUrl + encodeURIComponent(commitDetails);
                http.get(url, function(res) {
                    console.log('Telegram API response: ' + res.statusCode);
                }).on('error', function(e) {
                    console.log(e);
                });
            }
        });
    };

    I wanted to send my messages to a specific Telegram group so I used @RawDataBot to get the group ID of that group, which is basically the group where the team members who work on this particular code repository are. @RawDataBot will give you a massive JSON string as soon as it joins the group, in which the chat ID will be included.

    In the code above, I’ve used the aws-sdk npm package to extract the CodeCommit JS API, which can be used to extract things such as commit details from the minimal information that is provided to you by the CodeCommit trigger. Actually, all you can get out of the CodeCommit trigger (the event object that is passed to the Lambda function) is the Ref of the repository that the commit occurred on and the commit ID.

    Then some mediocre object manipulation later, I’m making a very simple http call to the Telegram Bot API endpoint, which in turn invokes my bot to send a message to the group I have specified. In the end, it looks something like this:

    There’s a lot going on here. We have the bot sending messages about commits as well as builds (through a CodePipeline trigger, not covered in this post), and there’s also some gloating from me on some other conversation we were having on the group 😀


    And well, that’s how it’s done.

A boilerplate project for NodeJS + ExpressJS on ECMAScript 6 with MongoDB

With the advent of JavaScript ECMAScript 6/ECMAScript 2015/ES6, a whole bunch of new features were introduced, most of them being game-changers for anyone who wanted to switch over. My colleague Ragavan and I took it on ourselves to convert one of our existing ES5 NodeJS projects to ES6, and I thought it would be good to put together a base project that anyone could use to bootstrap a typical NodeJS + ExpressJS + MongoDB + REST project using the tools that we used.

We have used the following tools to make this work:

The full code can be found on Github. A huge shout out to Ashantha Lahiru, who worked really hard to make the code presentable and more generic, as well as adding the MongoDB integration.


Troubleshooting Blazemeter incompatibility issues

Blazemeter is an awesome wrapper for JMeter that allows you to run JMeter load tests from various locations and generate awesome reports.

One of the coolest features of Blazemeter is that it allows you to upload existing JMeter Test Suites in the form of JMX files to execute those tests with any load that you specify.

However, I found that startup itself was failing on Blazemeter when I uploaded a JMX file generated by JMeter 3.1. Upon consulting the helpful support team, I was told that their compatibility with JMeter 3.1 is experimental, and I should remove a bunch of listeners I had added for reporting from the JMeter file.

This solved the problem, and I was able to run my tests.

Pokémon Go is here!

Now this isn’t necessarily a post on programming, but I’m so excited about this new game that I just had to post somewhere.

As an ardent fan of Ingress, I was very excited when they announced that Niantic Labs who built Ingress would be taking one of my favorite franchises, Pokémon, and building an AR game based on the same principles.

Managed to download and install the Beta testing version of the game today, and try it out. So here are my first impressions:

  • The game follows the same premise as the beloved games from GameFreak; there’s a Professor who gives you a starter Pokémon (Squirtle, Charmander or Bulbasaur) and sends you on your way to catch Pokémon all around the world. That’s where the similarities end.
  • You find yourself in an AR world based on Google Maps and very heavily on the locations tagged as portals in Ingress. Some of them are Gyms (you can’t do anything at a Gym until you’re Level 5 as a trainer; I’m still Level 4 so I don’t know what’ll happen there), and the others are Pokestops, which follow the hacking mechanism in Ingress and give you items when you’re in range; although the item spawn doesn’t seem to be randomized like in Ingress: the Pokestop next to me right now just gives 3 Poké Balls. I really like the idea of this AR game, and the fact that it gets you out of the house to play it. You get some exercise in the meantime!

Some Ingress portals are Gyms, others are Pokestops

  • So I chose Charmander as my starter Pokémon (ah, the nostalgia!). Pokémon show up on the Pokédex and the Pokémon menu, and seem to have a few stats:


  • So there are two options, Power Up and Evolve. Each Pokémon gives you candy for that type of Pokémon when you catch it (and presumably, when you battle with it). I managed to catch enough Pidgeys to evolve it to Pidgeotto (needed 12 Pidgey candies):

Pidgey evolves!

  • Evolving my Pidgey seemed to give it new moves but did not improve its stats, which was rather disappointing. CP, or Combat Points, seem to be needed for battles as explained here. I can’t battle until I’m Level 5 though, so I don’t know yet.
  • Catching a Pokémon is pretty exciting, it turns on the camera and shows the Pokémon on screen where you have to throw Poké Balls at it by flicking balls towards them. You can actually miss: I ended up throwing about 10 Poké Balls at a pesky Zubat.

Catching Pokémon in bed

Catching Pokémon gives you trainer experience too! And I managed to level up to Level 4:

Level 4!

  • Like in the console games, you can use Incense to attract Pokémon to where you are; this didn’t seem as effective as the incense in the console games though, I only got two extra Pokémon and the spawn rate doesn’t seem to increase much.
  • There is also an Egg mechanic where you get eggs to carry around for a certain distance and then they will hatch:

Eggs (screenshot credit:

  • There’s also the good ol’ Trainer Card (which tracks Trainer Stats) and Journal (which tracks activity):


    Trainer Card
  • There is also a store where you can buy items: Monetization strategy! 😀


Pokémon Store

And that wraps up my first impressions of the game. Will post more once I can actually get into a battle with someone. Until then, Gotta Catch ’em All!!!

Running Cloudant Queries through the Cloudant NPM Package

Cloudant is a recently-popular solution for cloud-based NoSQL databases. It is based heavily on CouchDB, and provides a very easy-to-use HTTP verb-based web service interface to carry out database operations.

When using Cloudant with Node.js or Express.js, the Cloudant NPM Package, which is basically a wrapper for the CouchDB NPM package known as nano, comes in handy. But while their documentation states how to execute various operations such as getting a document by its ID, doing bulk operations etc, it is quite obscure on how to execute the extremely useful operations based on Cloudant Query, which allows you to write complex selectors like the following:


{
  "selector": {
    "_id": "myid",
    "$or": [
      {
        "$and": [
          { "endDate": { "$gt": "2015-11-05" } },
          { "endDate": { "$lte": "2015-11-30" } }
        ]
      },
      {
        "$and": [
          { "startDate": { "$gte": "2015-11-05" } },
          { "startDate": { "$lt": "2015-11-30" } }
        ]
      }
    ]
  }
}
So how do you execute a query like that through the provided functions in the npm package for Cloudant? The secret lies in the find function provided in the package. The above could be executed as:


db.find({
  "selector": {
    "_id": "myid",
    "$or": [
      {
        "$and": [
          { "endDate": { "$gt": "2015-11-05" } },
          { "endDate": { "$lte": "2015-11-30" } }
        ]
      },
      {
        "$and": [
          { "startDate": { "$gte": "2015-11-05" } },
          { "startDate": { "$lt": "2015-11-30" } }
        ]
      }
    ]
  }
}, function(error, result) {});

And simple as that, you can execute any complex query that works on Cloudant Query using the Cloudant npm package.

GSoC 2015 – Moorsp Plugin for Moodle – Wrap Up

It has been a hectic few months as a Google Summer of Code student for 2015 for Moodle, and it has come to a successful conclusion.

It was my greatest pleasure to work on Moodle throughout this period, to get to know and respect my awesome mentor, Dan Marsden, to learn about how Moodle and its community functions, and to eventually be able to help that community with a successfully completed GSoC project.

I set out to develop a skeleton plugin known as Moorsp for Moodle’s Plagiarism Framework, to incorporate the latest and greatest of Moodle’s framework goals within a testable plugin that wouldn’t need commercial logins to run automated tests. At the end of the project, all my code has been successfully integrated into the Moorsp base code, and I have been awarded a pass by the Moodle community, both in terms of my contribution in code and community engagement. This makes me extremely happy.

I hope to continue my work on Moodle and have already started to help Dan with integrating some of the newer concepts in the Moodle framework in to some of the older plagiarism plugins such as the Urkund Plagiarism Plugin which is maintained by Dan himself. I believe that the most important part of me doing a GSoC project is gaining the ability to integrate and work closely with the Moodle community, an opportunity which I absolutely will not let go to waste.

Finally, I will leave you with the lovely Moodle GSoC 2015 Student Badge awarded to me. Thank you, Moodle, for this lovely token of appreciation. I shall always cherish it.