covid-api - a free and open source API service for COVID-19 data

Posted on Fri 10 April 2020 in Development • Tagged with covid, api, rest, data, covid-19, service, free, open, source

Introduction

In this period of COVID-19 emergency, many countries are publishing COVID-related data that is being used by many existing projects and researchers.

The main problem with this data is that it is mostly released in CSV format in GitHub repositories. While we fully appreciate the openness of this format, unfortunately it introduces additional work (downloading the data, cleaning it, importing it into a database, keeping it updated, etc.) before someone can consume and analyse the data.

covid-api

The covid-api project is a free and open source API service which automatically imports data from various sources (at the moment we support the Johns Hopkins CSSE data source) and makes it available as a REST API.

The service is still under development, but an initial version (with regularly updated data) is already available at https://api.covid19data.cloud.

How to use the data

To consume the API you don't need an account, nor do you need to authenticate in any way. You just need to request the right endpoint using the supported parameters.

Here is an example in Python:

In [1]: import requests

In [2]: response = requests.get('https://api.covid19data.cloud/v1/jh/daily-reports?last_update_from=2020-04-01&last_update_to=2020-04-03&country=Italy')

In [3]: response.json()
Out[3]:
[{'id': 35343,
  'country_region': 'Italy',
  'province_state': None,
  'fips': None,
  'admin2': None,
  'last_update': '2020-04-01T21:58:34',
  'confirmed': 110574,
  'deaths': 13155,
  'recovered': 16847},
 {'id': 37895,
  'country_region': 'Italy',
  'province_state': None,
  'fips': None,
  'admin2': None,
  'last_update': '2020-04-02T23:25:14',
  'confirmed': 115242,
  'deaths': 13915,
  'recovered': 18278}]
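
As a small follow-up, here is a sketch of how the query string could be built from a dictionary and how the returned records could be turned into day-over-day increases. The helper names build_url and daily_increase are made up for this illustration and are not part of the service; the sample records are the two shown above, trimmed to a few fields, so the snippet runs without calling the API.

```python
from urllib.parse import urlencode

# Endpoint taken from the example above.
BASE_URL = "https://api.covid19data.cloud/v1/jh/daily-reports"


def build_url(country, date_from, date_to):
    """Build the daily-reports URL with the three query parameters used above."""
    params = {
        "last_update_from": date_from,
        "last_update_to": date_to,
        "country": country,
    }
    return BASE_URL + "?" + urlencode(params)


def daily_increase(reports, field="confirmed"):
    """Return (date, increase) pairs between consecutive reports.

    Hypothetical helper: it only assumes each record has a 'last_update'
    ISO timestamp and a numeric value under `field`.
    """
    ordered = sorted(reports, key=lambda report: report["last_update"])
    return [
        (current["last_update"][:10], current[field] - previous[field])
        for previous, current in zip(ordered, ordered[1:])
    ]


# The two records returned in the example above, trimmed to the fields we need.
sample = [
    {"last_update": "2020-04-01T21:58:34", "confirmed": 110574, "deaths": 13155},
    {"last_update": "2020-04-02T23:25:14", "confirmed": 115242, "deaths": 13915},
]

print(build_url("Italy", "2020-04-01", "2020-04-03"))
print(daily_increase(sample))            # [('2020-04-02', 4668)]
print(daily_increase(sample, "deaths"))  # [('2020-04-02', 760)]
```

With requests, the same dictionary can be passed through the params argument of requests.get instead of building the URL by hand.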

Further API documentation is available at https://api.covid19data.cloud/docs

Next steps

While we keep polishing the code and improving the existing data import procedure, we are planning to support additional data sources. The next one we are going to support is the Italian Protezione Civile.

If you are aware of an additional data source that you would like to see covered, please let us know by creating a new Issue on GitHub, or send us a pull request.

Contribute to the project

If you are a Python developer and would like to contribute to the project, my advice is to first have a look at the main documentation available in the README.

Then I suggest having a look at the existing Issues to see where help is needed. Alternatively, you can open a new Issue or send a pull request with fixes and improvements.

I also recommend becoming familiar with our Code of Conduct before sending any contribution.

Sponsors and Thanks

I want to thank Heroku for agreeing to sponsor the hosting of this service.

I also want to thank all the volunteers involved in the project for their help and contributions.

Disclaimer

We are doing our best to keep the available data updated, clean (removing duplicates), and to provide a reliable service, but we are not in any way responsible for the accuracy of the data nor for the availability of the service itself. Please use it at your own risk.

Abuse notice: we are currently not requiring any registration or authentication to use this service because we would like to keep it as simple as possible. Please do not abuse the service or you will force us to require a registration (subject to approval) to continue using it.


Machine Learning: Pima Indians Diabetes

Posted on Sat 14 April 2018 in Development • Tagged with Machine Learning, Python, scikit-learn, tutorial

Solving the Pima Indians Diabetes problem with Machine Learning using Python and scikit-learn



Getting latest Ubuntu AMI with Terraform

Posted on Fri 25 August 2017 in Development • Tagged with AWS, Terraform, Ubuntu, devops

When we need to create an EC2 resource on AWS using Terraform, we need to specify the AMI id to get the correct image. The id is not easy to memorise and it changes depending on the region we are working in. On every new release the id changes again. So, how can we be sure to get the correct id, for our region, of the latest image available for a given Linux distribution?

Getting latest Ubuntu AMI id

In this example I will show how to get the ID for the latest version of Ubuntu 16.04 server, for the London region and create an EC2 instance using that ID.

variable "aws_region" { default = "eu-west-2" } # London

provider "aws" {
    region = "${var.aws_region}"
    access_key = "youraccesskey"
    secret_key = "yoursecretkey"
}

data "aws_ami" "ubuntu" {
    most_recent = true

    filter {
        name   = "name"
        values = ["ubuntu/images/hvm-ssd/ubuntu-xenial-16.04-amd64-server-*"]
    }

    filter {
        name   = "virtualization-type"
        values = ["hvm"]
    }

    owners = ["099720109477"] # Canonical
}

resource "aws_instance" "web" {
    ami           = "${data.aws_ami.ubuntu.id}"
    instance_type = "t2.micro"

    tags {
        Name = "HelloUbuntu"
    }
}

output "image_id" {
    value = "${data.aws_ami.ubuntu.id}"
}

After initialising the working directory with terraform init, running terraform apply retrieves the AMI id and creates the instance:

➜  example1$: terraform apply
data.aws_ami.ubuntu: Refreshing state...
aws_instance.web: Creating...
    ami:                          "" => "ami-03998867"
    associate_public_ip_address:  "" => "<computed>"
    availability_zone:            "" => "<computed>"
    ebs_block_device.#:           "" => "<computed>"
    ephemeral_block_device.#:     "" => "<computed>"
    instance_state:               "" => "<computed>"
    instance_type:                "" => "t2.micro"
    ipv6_address_count:           "" => "<computed>"
    ipv6_addresses.#:             "" => "<computed>"
    key_name:                     "" => "<computed>"
    network_interface.#:          "" => "<computed>"
    network_interface_id:         "" => "<computed>"
    placement_group:              "" => "<computed>"
    primary_network_interface_id: "" => "<computed>"
    private_dns:                  "" => "<computed>"
    private_ip:                   "" => "<computed>"
    public_dns:                   "" => "<computed>"
    public_ip:                    "" => "<computed>"
    root_block_device.#:          "" => "<computed>"
    security_groups.#:            "" => "<computed>"
    source_dest_check:            "" => "true"
    subnet_id:                    "" => "<computed>"
    tags.%:                       "" => "1"
    tags.Name:                    "" => "HelloUbuntu"
    tenancy:                      "" => "<computed>"
    volume_tags.%:                "" => "<computed>"
    vpc_security_group_ids.#:     "" => "<computed>"
aws_instance.web: Still creating... (10s elapsed)
aws_instance.web: Still creating... (20s elapsed)
aws_instance.web: Still creating... (30s elapsed)
aws_instance.web: Creation complete (ID: i-0f58f8bd55b3a7e38)

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

Outputs:

image_id = ami-03998867

That's all we need to spin up an EC2 instance on AWS using the latest Ubuntu image available.
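
To make the selection logic more concrete, here is a rough Python sketch of what the data source's filters plus most_recent = true amount to: keep the images that match the owner, the virtualization type and the name pattern, then take the one with the newest creation date. The image records, their ids and dates, and the latest_image function are all invented for illustration; the real lookup is performed by Terraform against the EC2 API.

```python
from fnmatch import fnmatchcase

# Hypothetical image records: ids, dates and the second owner are made up.
IMAGES = [
    {
        "id": "ami-00000001",
        "name": "ubuntu/images/hvm-ssd/ubuntu-xenial-16.04-amd64-server-20170619",
        "virtualization_type": "hvm",
        "owner": "099720109477",
        "creation_date": "2017-06-20T10:00:00.000Z",
    },
    {
        "id": "ami-00000002",
        "name": "ubuntu/images/hvm-ssd/ubuntu-xenial-16.04-amd64-server-20170811",
        "virtualization_type": "hvm",
        "owner": "099720109477",
        "creation_date": "2017-08-12T10:00:00.000Z",
    },
    {
        # Newer, but a different release: excluded by the name pattern.
        "id": "ami-00000003",
        "name": "ubuntu/images/hvm-ssd/ubuntu-trusty-14.04-amd64-server-20170820",
        "virtualization_type": "hvm",
        "owner": "099720109477",
        "creation_date": "2017-08-21T10:00:00.000Z",
    },
    {
        # Newer and matching by name, but published by another account.
        "id": "ami-00000004",
        "name": "ubuntu/images/hvm-ssd/ubuntu-xenial-16.04-amd64-server-20170901",
        "virtualization_type": "hvm",
        "owner": "000000000000",
        "creation_date": "2017-09-02T10:00:00.000Z",
    },
]


def latest_image(images, name_pattern, virtualization_type, owner):
    """Return the newest image that satisfies all three filters."""
    candidates = [
        image
        for image in images
        if image["owner"] == owner
        and image["virtualization_type"] == virtualization_type
        and fnmatchcase(image["name"], name_pattern)
    ]
    if not candidates:
        raise LookupError("no image matches the given filters")
    # ISO 8601 timestamps in the same format sort correctly as plain strings.
    return max(candidates, key=lambda image: image["creation_date"])


selected = latest_image(
    IMAGES,
    name_pattern="ubuntu/images/hvm-ssd/ubuntu-xenial-16.04-amd64-server-*",
    virtualization_type="hvm",
    owner="099720109477",  # Canonical, as in the Terraform example
)
print(selected["id"])
```

The owner filter matters as much as the name pattern: without it, an image published by any other account under a matching name could be selected.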


Creating a production ready API with Python and Django Rest Framework – part 4

Posted on Thu 17 August 2017 in Development • Tagged with API, Django, framework, Python, rest, tutorial

In the previous part of the tutorial we implemented details management, relations between models, nested APIs and a different level of permissions. Our API is basically complete, but is it working properly? Is the source code free of bugs? Would you feel confident refactoring the code without breaking something? The answer to all these questions is probably no. I can't be sure that the code behaves properly, nor would I feel confident refactoring anything without some test coverage.

As I mentioned previously, we should have written tests from the beginning, but I really didn't want to mix too many concepts together and I wanted to let the reader concentrate on the Rest Framework instead.

Test structure and configuration

Before beginning the fourth part of this tutorial, make sure you have grabbed the latest source code from https://github.com/andreagrandi/drf-tutorial and you have checked out the previous git tag:

git checkout tutorial-1.14

Django has an integrated test runner, but my personal choice is to use pytest, so first of all let's install the libraries we need:

pip install pytest pytest-django

As long as we respect a minimum of conventions (test file names must start with the test_ prefix), tests can be placed anywhere in the code. My advice is to put them all together in a separate folder and divide them by app name. In our case we are going to create a folder named "tests" at the same level as the manage.py file. Inside this folder we need to create an __init__.py file and another folder called catalog with an additional __init__.py inside. Now, still at the same level as manage.py, create a file called pytest.ini with this content:

[pytest]
DJANGO_SETTINGS_MODULE=drftutorial.settings
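
If it helps to see the resulting layout in one place, the steps above can be sketched as a short Python script. This is only an illustration: the create_test_layout function is a made-up helper, and it is run against a temporary directory here rather than the real project folder.

```python
import tempfile
from pathlib import Path

PYTEST_INI = "[pytest]\nDJANGO_SETTINGS_MODULE=drftutorial.settings\n"


def create_test_layout(project_dir):
    """Create tests/, tests/catalog/ and pytest.ini next to manage.py.

    Hypothetical helper: project_dir stands for the folder containing manage.py.
    Returns the created files as sorted relative paths.
    """
    project_dir = Path(project_dir)
    catalog_tests = project_dir / "tests" / "catalog"
    catalog_tests.mkdir(parents=True, exist_ok=True)

    # Both folders need an __init__.py file.
    (project_dir / "tests" / "__init__.py").touch()
    (catalog_tests / "__init__.py").touch()

    # pytest-django reads the settings module from this file.
    (project_dir / "pytest.ini").write_text(PYTEST_INI)

    return sorted(
        path.relative_to(project_dir).as_posix()
        for path in project_dir.rglob("*")
        if path.is_file()
    )


with tempfile.TemporaryDirectory() as tmp:
    created = create_test_layout(tmp)

print(created)
# ['pytest.ini', 'tests/__init__.py', 'tests/catalog/__init__.py']
```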

Are you feeling confused? No problem. You can check out the source code containing these changes:

git checkout tutorial-1.15

You can check that you have done everything correctly by going into the drftutorial folder (the one containing manage.py) and launching pytest. If you see something like this, your changes are correct:

(drf-tutorial) ➜  drftutorial git:(master) pytest
============================================================================================================================= test session starts ==============================================================================================================================
platform darwin -- Python 2.7.13, pytest-3.0.6, py-1.4.32, pluggy-0.4.0
Django settings: drftutorial.settings (from ini file)
rootdir: /Users/andrea/Projects/drf-tutorial/drftutorial, inifile: pytest.ini
plugins: django-3.1.2
collected 0 items

========================================================================================================================= no tests ran in 0.01 seconds =========================================================================================================================
(drf-tutorial) ➜  drftutorial git:(master)

Writing the first test

To begin with, I will show you how to write a simple test that verifies that the API can return the product list. If you remember, we implemented this API in the first part of the tutorial. First of all, create a file called test_views.py under the folder drftutorial/tests/catalog/ and add this code:

import pytest
from django.urls import reverse
from rest_framework import status
from rest_framework.test import APITestCase


class TestProductList(APITestCase):
    @pytest.mark.django_db
    def test_can_get_product_list(self):
        url = reverse('product-list')
        response = self.client.get(url)
        self.assertEqual(response.status_code, status.HTTP_200_OK)
        self.assertEqual(len(response.json()), 8)

Before we can run this test we need to change one small thing in the catalog/urls.py file, something we should have done from the beginning. Please change the first url as follows, adding the name parameter:

urlpatterns = [
    url(r'^products/$', views.ProductList.as_view(), name='product-list'),
    ...

At this point we can run our test suite again and verify that the test passes:

(drf-tutorial) ➜  drftutorial git:(test-productlist) ✗ pytest -v
============================================================================================================================= test session starts ==============================================================================================================================
platform darwin -- Python 2.7.13, pytest-3.0.6, py-1.4.32, pluggy-0.4.0 -- /Users/andrea/.virtualenvs/drf-tutorial/bin/python2.7
cachedir: .cache
Django settings: drftutorial.settings (from ini file)
rootdir: /Users/andrea/Projects/drf-tutorial/drftutorial, inifile: pytest.ini
plugins: django-3.1.2
collected 1 items

tests/catalog/test_views.py::TestProductList::test_can_get_product_list PASSED

=========================================================================================================================== 1 passed in 0.98 seconds ===========================================================================================================================

To check out the source code at this point:

git checkout tutorial-1.16

Explaining the test code

When we implement a test, the first thing to do is to create a test_* file and import the minimum necessary to write a test class and method. Each test class must inherit from APITestCase and have a name that starts with Test, like TestProductList. Since we use pytest, we mark our method with the @pytest.mark.django_db decorator to tell the test suite that our code will access the database. Strictly speaking, classes that inherit from Django's TestCase (as APITestCase does) already get database access under pytest-django, so the marker is harmless here but not required. We are going to use the client object that is integrated in APITestCase to perform the request. Before doing that, we first get the local url using Django's reverse function. At this point we make the call using the client:

response = self.client.get(url)

and then we assert a couple of things that we expect to be true:

self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(len(response.json()), 8)

We check that our API returns the 200 status code and that the returned JSON contains 8 elements.

It is normally good practice to create test data inside the tests, but in our case we previously wrote a data migration that creates test data. Migrations are run every time we run the tests, so when we call our API the data will already be there.

Wrapping up

I've written a few tests for all the views we have implemented so far, and they are available if you check out this version of the code:

git checkout tutorial-1.17

I've only tested the views, but it would be nice to also test the permission class, for example. Please remember to write your tests first, if possible: implementing the code will be much more natural once the tests are already in place.