Roadmap for online image and video annotation software

New and improved online and in-field platforms for exploration, management and annotation of georeferenced images & video

Overview of the current annotation workflow:

Background/Motivation

Currently, only a tiny proportion of the images and video collected in marine surveys is analysed. Transforming underwater imagery into quantitative information for science and policy decisions requires substantial effort by human experts. Different groups tend to handle the collected data in different ways, using different sampling techniques and different annotation tools, and even referring to the same taxa by different names. The lack of established channels for data processing often results in significant lags between data collection and scientific discovery. The decentralised, unstandardised nature of existing methods makes comparison across disparate sites difficult, resulting in little data re-use and limited collaboration. A global repository of data annotated in a consistent format opens up the possibility of answering ‘big picture’ questions and provides access to large data sets for training machine learning algorithms. Combined with input from ecologists, biologists, other experts and citizen scientists, these algorithms ultimately have the potential to provide a scalable, cost-effective and collaborative environment for dealing with huge volumes of seafloor imagery and video.

Development & Support

SQUIDLE+ is currently being developed by Ariell Friedman (Greybits Engineering), with support from the Schmidt Ocean Institute, the Integrated Marine Observing System and the Nectar science cloud.

About SQUIDLE+

The information below summarises the core features of the base platform:

  • Flexible data storage

    Requiring that all data be uploaded to a centralised web server poses a barrier to adding new data from other sources and duplicates data that is often already available elsewhere online. Many national marine observing programs (for example IMOS through the Australian Ocean Data Network (AODN), or the Marine Geoscience Data System (MGDS) in the USA) are mandated to put data online in an openly accessible location. These distributed data storage facilities should be leveraged to reduce data duplication and inconsistencies, which also means that data can be made available much more quickly. Using a framework for interpreting flexible metadata formats, it takes minutes (instead of days) to import datasets and get them ready for detailed annotation.

  • Flexible annotation schemes

    There is no one-size-fits-all annotation scheme. Many have tried and failed to create a scheme that suits the needs of all potential users, and there are now a variety of competing standardised schemes. Our approach is to provide a flexible annotation system that allows users to annotate data using their scheme of choice (whether selected from existing standard schemes or a new custom one). Instead of enforcing a single scheme, we will provide the capability to translate between different annotation schemes, so that data labeled under one scheme can be viewed under another and all annotated data is maintained in a consistent format. In addition, the system will allow multiple labels per point, as well as additional tags and free-form comments per annotation.

  • Collaborative / automated labeling

    Much of our scientific understanding of benthic environments ultimately depends on human interpretation of seafloor imagery. Traditional approaches, largely reliant on manual annotation of small subsets by human experts, will not scale to the increasing demand for quantitative understanding of marine habitats for science and regulatory compliance, nor to the increasing volume of seafloor images. This platform seeks to enable and manage collaborative human-machine labeling, involving human experts, citizen scientists and machine learning algorithms, to achieve validated accuracy within predictable time frames.

  • "Media object" annotation

    The new system will enable the same consistent labels to be applied to different media objects (images, video and large-scale mosaics). It will also offer the capability for defining validation sets based on other annotation sets for assessing annotation quality. There is a widespread global need for a flexible web-based video annotation tool. Using pre-existing video hosting technology, this is a relatively straightforward extension to the current platform.

  • In-field data annotation

    Unannotated data from the field can be considered a liability, in the sense that it often results in huge repositories of images and video that need to be assessed at a later time. Real-time annotation of video and stills, using the same annotation interface backed by a local caching database that can be easily synchronised with the online system post-cruise, would make it possible to better leverage time and resources in the field and help to reduce the “post-processing debt”.

  • Education & outreach

    With the advances in high-bandwidth communications and social media, education & outreach activities have become commonplace on ocean-bound research cruises. It is possible to leverage the development effort in creating science tools to facilitate outreach goals, opening up the potential to acquire large volumes of crowd-sourced data that can complement science objectives and engage the general public. We have already had some successes in this area.
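
To make the idea of translating between annotation schemes (see "Flexible annotation schemes" above) more concrete, the following is a minimal Python sketch, not the SQUIDLE+ implementation. It assumes a translation can be represented as a simple lookup table from labels in one scheme to labels in another. The `Annotation` class, its field names, the `A_TO_B` mapping, the label names and the `translate` function are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Annotation:
    """One labeled point on a media object (all field names are hypothetical)."""
    point_id: int
    labels: list[str]                      # multiple labels per point are allowed
    tags: list[str] = field(default_factory=list)
    comment: str = ""


# Hypothetical mapping from labels in "scheme A" to labels in "scheme B".
# Several source labels may collapse onto one target label.
A_TO_B = {
    "Ecklonia radiata": "Macroalgae: canopy-forming brown",
    "Kelp (unidentified)": "Macroalgae: canopy-forming brown",
    "Sand": "Substrate: unconsolidated",
}


def translate(annotation, mapping, unmapped="Unmapped"):
    """Return a copy of `annotation` expressed in the target scheme.

    Labels without a mapping entry become `unmapped` rather than being
    dropped, so gaps in the mapping stay visible. Tags and comments are
    carried over unchanged, and the original annotation is not modified.
    """
    translated = []
    for label in annotation.labels:
        target = mapping.get(label, unmapped)
        if target not in translated:       # avoid duplicates after many-to-one mapping
            translated.append(target)
    return Annotation(annotation.point_id, translated,
                      list(annotation.tags), annotation.comment)


# Usage: view a point labeled under scheme A as it would appear under scheme B.
original = Annotation(7, ["Ecklonia radiata", "Kelp (unidentified)", "Sponge"],
                      tags=["uncertain"])
viewed_in_b = translate(original, A_TO_B)
print(viewed_in_b.labels)  # ['Macroalgae: canopy-forming brown', 'Unmapped']
```

Because the translation produces a new view and leaves the stored annotation untouched, the data stays in the scheme it was labeled under. A real mapping between schemes would likely need more than a flat table (for example, hierarchical labels or one-to-many mappings), which this sketch does not attempt to cover.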

More information about Software API ENDPOINTS & DOCUMENTATION