This is a high-level view of the proposed LINEMOD API, mostly stripped of implementation details.
To train templates for one or more objects:
    using namespace linemod;

    // LINE-MOD instance using color gradients and depth normals with default
    // (VGA-suitable) parameters and two pyramid levels.
    cv::Ptr<Detector> detector = getDefaultLINEMOD();

    // For each (color, depth) view with object in mask, compute a template
    int class_id = 1; // One object class in this example
    for (...)
    {
      // sources contains one cv::Mat source image per modality, in order.
      // Default LINE-MOD uses modalities (ColorGradient, DepthNormal).
      std::vector<cv::Mat> sources;
      sources.push_back(color);
      sources.push_back(depth);

      // Train a new set of templates
      int template_id = detector->addTemplate(sources, class_id);
    }

    // Can then use for detection or write to disk
    cv::FileStorage fs("objects.yml.gz", cv::FileStorage::WRITE);
    detector->write(fs);
To detect one or more objects:
    using namespace linemod;

    // Use previously trained templates
    cv::FileStorage fs("objects.yml.gz", cv::FileStorage::READ);
    cv::Ptr<Detector> detector = new Detector;
    detector->read(fs.root());

    // Require one source image per modality, in order.
    std::vector<cv::Mat> sources;
    sources.push_back(color);
    sources.push_back(depth);

    // Optionally search for a subset of the objects
    std::vector<int> class_ids;
    class_ids.push_back(1);

    // Perform matching
    std::vector<Match> matches;
    detector->match(sources, threshold, matches, class_ids);

    // Each Match contains (x,y) location, similarity, class ID, and template ID.
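Since each Match carries a location, similarity, class ID, and template ID, a common follow-up step is to reduce the raw match list to one best hit per object. The following is a minimal sketch of that idea, not part of the proposed API: the `Match` struct here is a hypothetical stand-in containing only the fields listed above (the real linemod::Match may differ), and `bestMatchPerClass` is an illustrative helper name.

```cpp
#include <map>
#include <vector>

// Hypothetical stand-in for linemod::Match, holding only the fields the
// text lists; the real struct may use different names or types.
struct Match {
    int x;
    int y;
    float similarity;
    int class_id;
    int template_id;
};

// Keep only the highest-similarity match for each class ID.
// (Illustrative helper, not part of the proposed LINEMOD API.)
std::vector<Match> bestMatchPerClass(const std::vector<Match>& matches) {
    std::map<int, Match> best;  // class_id -> best match seen so far
    for (const Match& m : matches) {
        std::map<int, Match>::iterator it = best.find(m.class_id);
        if (it == best.end()) {
            best.insert({m.class_id, m});
        } else if (m.similarity > it->second.similarity) {
            it->second = m;
        }
    }

    std::vector<Match> result;
    for (const auto& entry : best) {
        result.push_back(entry.second);
    }
    return result;  // ordered by ascending class_id (std::map iteration order)
}
```

Calling `bestMatchPerClass(matches)` after `detector->match(...)` would then leave at most one candidate per class, which is often enough when each object is expected to appear once in the scene.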
Detector instances can be written to/read from disk. This saves out the modalities (and their parameters) used, similar to OpenCV's features2d. It also saves out the templates for each modality and pyramid level.
Color gradients and depth normals are both represented through the generic linemod::Modality and linemod::QuantizedPyramid interfaces. A linemod::Detector can be created using any combination of modalities. Adding a new modality amounts to implementing two subclasses:
    class ColorComparison : public Modality
    {
    public:
      // Read parameters from disk
      virtual void read(const cv::FileNode& fn);

      // Write parameters to disk
      virtual void write(cv::FileStorage& fs) const;

    protected:
      // Returns a handle for computing templates over multiple pyramid levels
      virtual cv::Ptr<QuantizedPyramid> processImpl(const cv::Mat& src,
                                                    const cv::Mat& template_mask) const;
    };

    // Public inheritance, so the pyramid can be used through a
    // cv::Ptr<QuantizedPyramid> handle
    class ColorComparisonPyramid : public QuantizedPyramid
    {
    public:
      // Return the quantized image at the current pyramid level
      virtual void quantize(cv::Mat& dst) const;

      // Extract a template at the current pyramid level
      virtual bool extractTemplate(Template& templ) const;

      // Go down one pyramid level. This involves subsampling the source or quantized
      // image and updating any resolution-dependent parameters.
      virtual void pyrDown();
    };
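To illustrate how the pyramid interface might be driven, here is a minimal sketch of a loop that extracts one template per pyramid level, calling pyrDown() between levels. It is an assumption about the calling pattern, not the detector's actual implementation. `Template` and `QuantizedPyramid` below are simplified stand-ins (quantize() is omitted because it needs cv::Mat), and `ToyPyramid` and `extractAllLevels` are hypothetical names introduced only for this example.

```cpp
#include <vector>

// Simplified stand-in for linemod::Template; only the call pattern matters.
struct Template {
    int width = 0;
    int height = 0;
    int pyramid_level = 0;
};

// Stand-in mirroring two of the QuantizedPyramid operations named above.
class QuantizedPyramid {
public:
    virtual ~QuantizedPyramid() {}
    virtual bool extractTemplate(Template& templ) const = 0;
    virtual void pyrDown() = 0;
};

// Toy pyramid that tracks only the image size at the current level.
class ToyPyramid : public QuantizedPyramid {
public:
    ToyPyramid(int width, int height)
        : width_(width), height_(height), level_(0) {}

    bool extractTemplate(Template& templ) const override {
        if (width_ < 1 || height_ < 1) return false;  // nothing left to extract
        templ.width = width_;
        templ.height = height_;
        templ.pyramid_level = level_;
        return true;
    }

    // Halve the resolution and advance to the next pyramid level.
    void pyrDown() override {
        width_ /= 2;
        height_ /= 2;
        ++level_;
    }

private:
    int width_;
    int height_;
    int level_;
};

// Extract one template per level; returns an empty vector if any level fails.
std::vector<Template> extractAllLevels(QuantizedPyramid& pyramid, int num_levels) {
    std::vector<Template> templates;
    for (int level = 0; level < num_levels; ++level) {
        if (level > 0) pyramid.pyrDown();  // level 0 uses the full resolution
        Template templ;
        if (!pyramid.extractTemplate(templ)) return std::vector<Template>();
        templates.push_back(templ);
    }
    return templates;
}
```

With a 640x480 input and two levels, this sketch yields a full-resolution template followed by a 320x240 one. Failing the whole extraction when any level fails is a design choice of the sketch; a real detector could handle that case differently.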