Midokura Technology Radar

Foundational Model for Data Auto-Annotation

ai data team:mido/aiad
This item has not been updated in the last three versions of the Radar. If it appeared in one of the more recent editions, there is a good chance it remains pertinent. If it dates back further, its relevance may have diminished and our current evaluation could differ. Regrettably, our capacity to consistently revisit items from past Radar editions is limited.
Backlog

Why?

High-quality training data is essential yet challenging to procure affordably and efficiently. By automating data annotation with foundational models, we aim to:

  • Lower data collection and annotation costs.
  • Enhance dataset creation flexibility and speed.
  • Improve data quality and annotation accuracy.
  • Improve model quality.

What?

Implementing large foundational models for auto-annotation streamlines the preparation of diverse datasets. This approach:

  • Accelerates dataset readiness.
  • Adapts easily to different data types, from text to images.

Action items

  • Evaluate publicly available models.
  • Create an interface that is easy to extend to new models: abstract the foundation model's network inference so that different models, or even a call to Neusphere, can be supported.
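The second action item can be illustrated with a minimal sketch. It assumes a small abstract interface that hides how inference is run, plus a registry so new backends can be added without touching the annotation loop. Every name here (`AnnotationBackend`, `Annotation`, `KeywordBackend`, `register_backend`, `create_backend`, `auto_annotate`) is hypothetical and chosen for illustration only. The keyword matcher is a toy stand-in for a real foundation model; a Neusphere-backed implementation would be one more subclass, and its actual API is not shown because it is not described here.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Annotation:
    """One label proposed for a sample (hypothetical structure)."""
    label: str
    confidence: float


class AnnotationBackend(ABC):
    """Hypothetical interface: callers never see how inference is run."""

    @abstractmethod
    def annotate(self, sample: Any) -> List[Annotation]:
        """Return the annotations proposed for a single sample."""


class KeywordBackend(AnnotationBackend):
    """Toy stand-in for a real model: labels text by keyword match."""

    def __init__(self, keywords: Dict[str, str]) -> None:
        self.keywords = keywords

    def annotate(self, sample: Any) -> List[Annotation]:
        text = str(sample).lower()
        return [
            Annotation(label=label, confidence=1.0)
            for word, label in self.keywords.items()
            if word in text
        ]


# Registry mapping a backend name to a factory, so that adding a model
# (local network, hosted API, ...) means registering one more entry.
_BACKENDS: Dict[str, Callable[..., AnnotationBackend]] = {}


def register_backend(name: str, factory: Callable[..., AnnotationBackend]) -> None:
    _BACKENDS[name] = factory


def create_backend(name: str, **kwargs: Any) -> AnnotationBackend:
    if name not in _BACKENDS:
        raise KeyError(f"unknown annotation backend: {name}")
    return _BACKENDS[name](**kwargs)


def auto_annotate(backend: AnnotationBackend, samples: List[Any]) -> List[List[Annotation]]:
    """Annotate every sample using whichever backend was supplied."""
    return [backend.annotate(sample) for sample in samples]


register_backend("keyword", KeywordBackend)
```

With this shape, calling `create_backend("keyword", keywords={"cat": "animal"})` and passing the result to `auto_annotate` is all a dataset pipeline needs to know. Swapping in a different model would only change the name and arguments given to `create_backend`.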