Alibek Jakupov

Microsoft Computer Vision Recipes: Tips and Tricks


There's an excellent repository on the internet providing examples and best practice guidelines for building computer vision (CV) systems. It offers a comprehensive set of tools and examples that leverage the latest advances in CV algorithms and neural architectures, with additional utilities for loading image data, optimizing and evaluating models, and scaling up to the cloud. Rather than creating implementations from scratch, the authors draw from existing state-of-the-art libraries, answer common questions, point out frequently observed pitfalls, and show how to use the cloud for training and deployment. The repository covers a range of scenarios, including image classification and image similarity.


I've recently implemented the Similarity scenario, and faced some challenges while training the model with my own data, exporting the models and running the inference pipeline. This is why I decided to share some useful tips and tricks that will probably save you a couple of hours. Up we go.



Tip 1: Don't install utils_cv


If you have a look at SETUP.md, you will see that it is recommended to install utils_cv locally:

pip install git+https://github.com/microsoft/ComputerVision.git@master#egg=utils_cv

I would not suggest doing this, for several reasons. First, in my experience it only works on an Azure ML compute instance and nowhere else: I tried on both Windows and Linux environments without success, whereas on Azure ML it takes only a couple of seconds. Moreover, if you want to serialise your model, having the package installed may make it problematic to create an inference pipeline in certain cases.


Let's consider a concrete example. Suppose we want to implement the image similarity scenario demonstrated in this notebook. Here, for Fast Image Retrieval, they suggest using the fastai library. So, if we want to serialise the model, we would use the Learner.export() method.


The Learner is saved in self.path/fname, using pickle_protocol. Note that serialization in Python saves the names of functions, not the code itself. Therefore, any custom code you have for models, data transformations, loss functions, etc. should be put in a module that you import in your training environment before exporting, and in your deployment environment before loading. Consequently, if you trained your model using the utils_cv module, you will need to have the same library installed on your inference compute.
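To make this rule concrete, here is a minimal sketch (the module name my_cv_utils and the loss function below are hypothetical, not part of utils_cv):


# my_cv_utils.py -- keep this module in both the training and the inference environment
def my_custom_loss(pred, target):
    # pickle stores only the name "my_cv_utils.my_custom_loss", not this code,
    # so the module must be importable wherever the learner is loaded
    return ((pred - target) ** 2).mean()


If you train with loss_func=my_custom_loss and then export, load_learner() will fail with a ModuleNotFoundError on any machine where my_cv_utils cannot be imported.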


To overcome this limitation, instead of importing utils_cv you should simply go to the implementation and copy the code you need. Replace these import lines:


from utils_cv.classification.data import Urls
from utils_cv.common.data import unzip_url
from utils_cv.common.gpu import which_processor, db_num_workers
from utils_cv.similarity.metrics import compute_distances
from utils_cv.similarity.model import compute_features_learner
from utils_cv.similarity.plot import plot_distances, plot_ranks_distribution

with their corresponding source code. For instance, for the first line, go to utils_cv/classification and copy the code from the data.py file:


import os
import requests
import pandas as pd
from pathlib import Path
from typing import List, Union
from urllib.parse import urljoin

from fastai.vision import ItemList
from PIL import Image
from tqdm import tqdm


class Urls:
    # for now hardcoding base url into Urls class
    base = "https://cvbp-secondary.z19.web.core.windows.net/datasets/image_classification/"

    # ImageNet labels Keras is using
    imagenet_labels_json = "https://s3.amazonaws.com/deep-learning-models/image-models/imagenet_class_index.json"

    # traditional datasets
    fridge_objects_path = urljoin(base, "fridgeObjects.zip")
    fridge_objects_watermark_path = urljoin(base, "fridgeObjectsWatermark.zip")
    fridge_objects_tiny_path = urljoin(base, "fridgeObjectsTiny.zip")
    fridge_objects_watermark_tiny_path = urljoin(
        base, "fridgeObjectsWatermarkTiny.zip"
    )
    fridge_objects_negatives_path = urljoin(base, "fridgeObjectsNegative.zip")
    fridge_objects_negatives_tiny_path = urljoin(
        base, "fridgeObjectsNegativeTiny.zip"
    )

    # multilabel datasets
    multilabel_fridge_objects_path = urljoin(
        base, "multilabelFridgeObjects.zip"
    )
    multilabel_fridge_objects_watermark_path = urljoin(
        base, "multilabelFridgeObjectsWatermark.zip"
    )
    multilabel_fridge_objects_tiny_path = urljoin(
        base, "multilabelFridgeObjectsTiny.zip"
    )
    multilabel_fridge_objects_watermark_tiny_path = urljoin(
        base, "multilabelFridgeObjectsWatermarkTiny.zip"
    )

    @classmethod
    def all(cls) -> List[str]:
        return [v for k, v in cls.__dict__.items() if k.endswith("_path")]


def imagenet_labels() -> list:
    """List of ImageNet labels with the original index.
    Returns:
         list: ImageNet labels
    """
    labels = requests.get(Urls.imagenet_labels_json).json()
    return [labels[str(k)][1] for k in range(len(labels))]


def downsize_imagelist(
    im_list: ItemList, out_dir: Union[Path, str], dim: int = 500
):
    """Aspect-ratio preserving down-sizing of each image in the ImageList {im_list}
    so that min(width,height) is at most {dim} pixels.
    Writes each image to the directory {out_dir} while preserving the original
    subdirectory structure.
    Args:
        im_list: Fastai ItemList object containing image paths.
        out_dir: Output root location.
        dim: maximum image dimension (width/height) after resize
    """
    assert (
        len(im_list.items) > 0
    ), "Input ImageList does not contain any images."

    # Find parent directory which all images have in common
    im_paths = [str(s) for s in im_list.items]
    src_root_dir = os.path.commonprefix(im_paths)

    # Loop over all images
    for src_path in tqdm(im_list.items):
        # Load and optionally down-size image
        im = Image.open(src_path).convert("RGB")
        scale = float(dim) / min(im.size)
        if scale < 1.0:
            new_size = [int(round(f * scale)) for f in im.size]
            im = im.resize(new_size, resample=Image.LANCZOS)

        # Write image
        src_rel_path = os.path.relpath(src_path, src_root_dir)
        dst_path = os.path.join(out_dir, src_rel_path)
        assert os.path.normpath(src_rel_path) != os.path.normpath(
            dst_path
        ), f"Image source and destination path should not be the same: {src_rel_path}"
        os.makedirs(os.path.dirname(dst_path), exist_ok=True)
        im.save(dst_path)


class LabelCsvNotFound(Exception):
    """ Exception if no csv named 'label.csv' is found in the path. """

    pass


class LabelColumnNotFound(Exception):
    """ Exception if label column not found in the CSV file. """

    pass


def is_data_multilabel(path: Union[Path, str]) -> bool:
    """ Checks if dataset is a multilabel dataset.
    A dataset is considered multilabel if it meets the following conditions:
        - a csv titled 'labels.csv' is located in the path
        - the column of the labels is titled 'labels'
        - the labels are delimited by spaces or commas
        - there exists at least one image that maps to 2 or more labels
    Args:
        path: path to the dataset
    Raises:
        LabelCsvNotFound if no 'labels.csv' file is present
        LabelColumnNotFound if the CSV has no 'labels' column
    Returns:
        Whether or not the dataset is multilabel.
    """
    files = Path(path).glob("*.csv")

    if len([f for f in files]) == 0:
        return False

    csv_file_path = Path(path) / "labels.csv"

    if not csv_file_path.is_file():
        raise LabelCsvNotFound

    df = pd.read_csv(csv_file_path)

    if "labels" not in df.columns:
        raise LabelColumnNotFound

    labels = df["labels"].str.split(" ", n=1, expand=True)

    if len(labels.columns) <= 1:
        return False

    return True

And so on and so forth.


Please note that for the other imports you don't necessarily have to copy the entire contents of a source file. For instance, for the second line:


from utils_cv.common.data import unzip_url

you simply need to copy the imports, the function itself, and its dependencies, such as _raise_file_exists_error(), data_path(), etc., as in the sketch below.
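For reference, here is a minimal, self-contained version of what you end up with (a simplified sketch, not the verbatim utils_cv source):


import requests
from pathlib import Path
from typing import Union
from zipfile import ZipFile


def data_path() -> Path:
    """Default download folder (simplified here to a local 'data' directory)."""
    path = Path("data")
    path.mkdir(exist_ok=True)
    return path


def _raise_file_exists_error(path: Union[Path, str]) -> None:
    raise FileExistsError(f"{path} already exists. Set exist_ok=True to reuse it.")


def unzip_url(url: str, dest: Union[Path, str] = None, exist_ok: bool = False) -> str:
    """Download the zip at {url}, extract it under {dest}, return the extracted path."""
    dest = Path(dest) if dest else data_path()
    zip_path = dest / url.split("/")[-1]
    out_dir = dest / zip_path.stem

    if out_dir.exists() and not exist_ok:
        _raise_file_exists_error(out_dir)

    # download the archive to disk
    response = requests.get(url)
    response.raise_for_status()
    zip_path.write_bytes(response.content)

    # extract next to the downloaded zip
    with ZipFile(zip_path) as zip_file:
        zip_file.extractall(dest)

    return str(out_dir)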



Tip 2: Export and load the trained model


Let's get back to the previous notebook. Here's how you can export your model after training.


learn = cnn_learner(data, models.resnet18, ps=0)
learn.export('/your/local/path/cnn_learner.pkl')

This will save your trained model as a pickle file. Now, if you have configured the imports correctly (see Tip 1), you will easily be able to load the model for inference.



learner = load_learner(path='/your/local/path/', fname='cnn_learner.pkl')

Attention: besides saving your CNN learner, you will need to do some additional exports. First, you will need to export your KNN model, which is in a different format (scikit-learn), and load it back at inference time:


import joblib

joblib.dump(nn, '/your/local/path/nn_model.pkl')  # save after training
nn = joblib.load('/your/local/path/nn_model.pkl')  # load at inference time

joblib ships with scikit-learn, so there's no need to pip install it separately.
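For context, the model being saved here is a scikit-learn NearestNeighbors instance fitted on the image embeddings, roughly like this (a sketch: the n_neighbors value and the features variable are assumptions based on the notebook):


from sklearn.neighbors import NearestNeighbors

nn = NearestNeighbors(n_neighbors=8)  # hypothetical neighbour count
nn.fit(features)  # features: (n_images, n_dims) array of embeddings from the CNN learner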


Second, you will need to export the data object, as it contains the file names that you will use after running the inference pipeline.


import numpy as np
np.save('/your/local/path/data', data.items)  # written to data.npy

NumPy will save the array of file names that you will use for inference: nn.kneighbors returns a list of indices that you map onto the data items to get the corresponding file names.


So, here is how you can run the inference after saving all the artefacts.



import joblib
import numpy as np
from fastai.vision import load_learner, open_image
# compute_feature and plot_distances come from the utils_cv source you copied in Tip 1

learner = load_learner(path="your/local/path/", fname="cnn_learner.pkl")
embedding_layer = learner.model[1][-2]  # penultimate layer used as the embedding

test_image = open_image("test.jpg")

# embedding of the query image
query_feature = compute_feature(test_image, learner, embedding_layer)

# L2-normalise and reshape to a (1, n_dims) array, as expected by scikit-learn
query_feature /= np.linalg.norm(query_feature, 2)
query_feature = np.reshape(query_feature, (-1, len(query_feature)))

nn = joblib.load('your/local/path/nn_model.pkl')
approx_distances, approx_im_indices = nn.kneighbors(query_feature)

# map the returned indices back to the file names saved earlier
data_items = np.load('your/local/path/data.npy', allow_pickle=True)
approx_im_paths = [str(data_items[i]) for i in approx_im_indices[0]]

plot_distances(list(zip(approx_im_paths, approx_distances[0])), num_rows=1, num_cols=8, figsize=(17, 5))


Tip 3: Pay attention to package versions


The fastai version used in the notebook is


Fast.ai: 1.0.48

The current version is 2.7.8, and you may have noticed that a lot of the functions are now deprecated. But simply switching to the latest version will result in errors, as you would have to rewrite all the functions. I would suggest upgrading to 1.0.58 instead, which requires only slight modifications to your code, yet remains more up-to-date than 1.0.48. For instance, you will need to update this line of code


learner = load_learner(path='your/local/path/', fname='cnn_learner.pkl')

by replacing fname with file


learner = load_learner(path='your/local/path/', file='cnn_learner.pkl')

and nothing else. Moreover, you can keep using 1.0.48 for training and 1.0.58 for inference. Works like a charm.
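If you want to make this two-environment setup reproducible, you can pin the versions explicitly (standard pip syntax, nothing specific to the recipes repository):


pip install fastai==1.0.48  # training environment
pip install fastai==1.0.58  # inference environment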



Tip 4: Train with your own data


In Microsoft's tutorial they use the images from Urls.fridge_objects_path. If you want to replace them with your own data, there may be some extra steps to perform. If you are working on your local machine and all the data is stored there, there's no problem: you simply provide the absolute path and run your code. However, this approach has several limitations. First, the compute on my machine was not sufficient to run the training due to the relatively large number of images (>120k). Thus, I created a compute instance on Azure ML with the following configuration:


Virtual machine size: Standard_NC24 (24 cores, 224 GB RAM, 1440 GB disk)
Processing unit: GPU - 4 x NVIDIA Tesla K80

Second, my images were stored in the cloud and not on the local machine. I didn't succeed in replacing the image path with the blob URL, so I used the Azure CLI to download the images from blob storage to my compute instance:


az login

az storage fs directory download -f imagedata --account-name <account-name> -s "<folder-name>" -d "destination-folder" --recursive

This will download all the images into the root folder where your notebooks are stored. Then you either copy the images to the fastai package folder, or provide the absolute path using the os.getcwd() command. In my case, it looked like this:


/mnt/batch/tasks/shared/LS_root/mounts/clusters/mycomputename/code/Users/my-user-name/images/

So you assign this value to the DATA_PATH variable and execute the rest of the code as explained in the notebook.
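For example (assuming the images were downloaded into an images folder next to your notebook, as above):


import os

DATA_PATH = os.path.join(os.getcwd(), "images")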



random.seed(642)
data = (
    ImageList.from_folder(DATA_PATH)
    .split_none()
    .label_from_folder()
    .transform(size=IM_SIZE)
    .databunch(bs=BATCH_SIZE, num_workers=db_num_workers())
    .normalize(imagenet_stats)
)

I still think it should be possible to run the same code without copying the data to the compute instance, by mounting Azure Data Lake storage to your compute, but I've not succeeded yet.


 

In this article we've seen some useful tips and tricks that will allow you to reuse the code provided in Microsoft's repository and adapt it to your needs.




Hope this was useful.
