In addition to video streams, Axis devices can also deliver metadata streams. While the purpose of video streams is self-explanatory, the purpose of metadata might not be.
Metadata is the foundation for gathering intelligence from video. It assigns digital meaning to each video frame by describing the key details in the scene. Using metadata, you can quickly find, evaluate, and act on what is important in large amounts of video. This is why metadata has increasingly become an essential part of efficient security, safety, and business operations.
Visualization of analytics metadata generated by Axis cameras available on AXIS OS 11.11 or later.
The analytics metadata stream describes the events, content, and characteristics of a scene. It includes event data and scene metadata:
Event data: logical rules based on what has occurred in the scene, for example when an object crossing a line is counted or an object is dwelling in an area.
Scene metadata: describes which objects are in the scene (such as humans and vehicles) and their attributes.
AXIS Scene Metadata enhances scene understanding, providing critical details such as object classes (human or vehicle), clothing and vehicle colors, license plates, and speed data. This in turn enables rapid decision-making, automated actions, and simplified search. Seamlessly integrated with third-party solutions through standardized methods, and delivered directly from Axis cameras, AXIS Scene Metadata helps reduce system and operational costs while ensuring efficiency and precision.
AXIS Scene Metadata includes information about the object types as well as other specific attributes such as clothing and vehicle color, license plate information, speed, location, and timestamp. AI-based analytics, featuring object detection and classification, lets you filter the attained metadata, trigger events, and examine specific elements of the scene.
A complete list of supported object classes and attributes is available below, but please note that the detection and classification capabilities are camera dependent. For detailed information about which object classes and attributes a specific camera model supports, please refer to the camera's data sheet.
In addition to the listed object attributes, the analytics metadata stream may also provide information about other properties of the detected objects such as likelihood, duration, shape, images, speed*, geocoordinates* and position.
*Requires radar or radar-video fusion camera integration.
It is also worth noting that AXIS Scene Metadata generates information such as shape and position about non-classified moving objects in the video scene as well.
Analytics metadata not only provides details about objects in a scene. It also provides context to events and allows large amounts of footage to be quickly sorted and searched. This enables functions that can be broadly categorized into the areas of post-event forensic search, real-time use, and identification of trends, patterns, and insights.
The main consumers of analytics metadata can be summarized as follows:
1) Edge applications
2) Real-time alarm notification and post-event forensic searching within a Video Management System
3) Statistical analysis and reporting leveraging IoT and business intelligence platforms, such as data visualization dashboards
4) Advanced analytics requiring additional processing power
Axis devices generate analytics metadata that conforms to ONVIF Profile M, streamed over RTSP, to support use cases related to post-event forensic search. However, this metadata is also accessible through alternative communication protocols and file formats, enabling straightforward integration with a wide variety of systems covering a large array of use cases. This guide provides suggested architecture, development examples, and recommended considerations for designing a solution that consumes this analytics metadata.
This guide also assumes that you are familiar with the concept of video metadata, and more specifically with the capabilities that Axis cameras possess for detecting classified objects in a scene. To better understand the added value of video metadata, and for an introduction to the capabilities of Axis cameras in this context, consider visiting the AXIS Scene Metadata product page and reading our white paper if you haven’t done so already.
Axis devices with AXIS OS 11.11 or later support two different methods of generating analytics metadata, each being designed and optimized to support specific use-cases, such as real-time alarm notifications or post-event forensic search and identification of trends, patterns, and insights. To retrieve information regarding the metadata analytics producers on the Axis camera and the kind of metadata they can produce, please refer to the Analytics Metadata Producer Configuration API available in the VAPIX Library.
Consider whether your application requires the device to deliver the information in real-time when selecting which method is most suitable to fulfill your specific use-case. In the following section, we will take a deeper dive into each of the available methods.
The Analytics Scene Description producer generates metadata on a frame-by-frame basis at a frequency of 10 times a second.
Each metadata frame contains information about the moving objects in the scene at that specific point in time, as exemplified below:
Frame 1 contains Object A and Object B as detected in the scene. Object A is classified as a Human wearing red clothing, while Object B is classified as a Human wearing blue clothing.
In Frame 2, the camera now determines that Object A is actually wearing blue clothing and that Object B is now wearing yellow clothing. They are the same objects as in Frame 1, but with different color attributes, which is also reflected in the metadata output.
In Frame 3, Object B is no longer present in the scene and the camera can only track Object A, which is still a Human wearing blue clothing.
The Analytics Object Description producer generates metadata based on the track of a detected object, meaning that it can deliver a metadata frame once an object has entered and left the video scene. The information gathered during the lifetime of the object is combined and included in each metadata frame that is delivered by the camera.
Frame 1 contains only information about Object A: when the object was first and last detected, a summary of its trajectory, and the attributes that the camera was able to detect during the course of the track. Object A had a 33% likelihood of wearing red clothing and a 67% likelihood of wearing blue clothing, as the camera detected both colors during its tracking.
Similarly, Frame 2 contains all known information about Object B.
Even though the data is generated per detected object instead of on a frame-by-frame basis, it can contain the same information with regard to object trajectory and detected object classes and attributes.
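The relationship between the two outputs can be illustrated with a small sketch. It assumes a simplified, hypothetical representation of the three frames from the example (a dictionary of object ID to observed clothing color) and uses plain frequency counting to summarize each track. How the camera actually computes its likelihood values is not described here, so treat this only as a way to see where figures such as 33% and 67% could come from.

```python
from collections import Counter, defaultdict

# Hypothetical, simplified frames: {object_id: clothing color seen in that frame}.
# This is NOT the camera's output format, only a stand-in for the example above.
FRAMES = [
    {"A": "red", "B": "blue"},     # Frame 1
    {"A": "blue", "B": "yellow"},  # Frame 2
    {"A": "blue"},                 # Frame 3 (Object B has left the scene)
]


def consolidate(frames):
    """Summarize each object's track from per-frame observations."""
    observations = defaultdict(list)
    for index, frame in enumerate(frames, start=1):
        for object_id, color in frame.items():
            observations[object_id].append((index, color))

    summary = {}
    for object_id, seen in observations.items():
        counts = Counter(color for _, color in seen)
        total = sum(counts.values())
        summary[object_id] = {
            "first_frame": seen[0][0],
            "last_frame": seen[-1][0],
            # Share of observations per color (assumed aggregation rule).
            "color_likelihood": {c: n / total for c, n in counts.items()},
        }
    return summary


if __name__ == "__main__":
    for object_id, track in consolidate(FRAMES).items():
        print(object_id, track)
```

A consumer of consolidated metadata receives one such summary per object, rather than having to perform this bookkeeping over every frame itself.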
When to use frame-by-frame metadata
This method could, for example, be suitable for an edge application running on the Axis camera to trigger real-time events based on the content of the metadata, e.g. when a yellow vehicle is detected and an access gate needs to be opened.
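As a rough illustration of such a rule, the sketch below checks a single frame for a yellow vehicle. The frame layout and the field names (`objects`, `class`, `color`, `name`, `likelihood`) are hypothetical and chosen only for this example, and the threshold value is an assumption; consult the sample data files for the real structure.

```python
def should_open_gate(frame, min_likelihood=0.6):
    """Return True if the frame contains a vehicle that is likely yellow.

    `frame` is a hypothetical dict, e.g.
    {"objects": [{"class": "vehicle",
                  "color": {"name": "yellow", "likelihood": 0.8}}]}
    """
    for obj in frame.get("objects", []):
        if obj.get("class") != "vehicle":
            continue
        color = obj.get("color", {})
        if color.get("name") == "yellow" and color.get("likelihood", 0.0) >= min_likelihood:
            return True
    return False


if __name__ == "__main__":
    frame = {"objects": [{"class": "vehicle", "color": {"name": "yellow", "likelihood": 0.8}}]}
    print(should_open_gate(frame))
```

Because each frame is evaluated as it arrives, the decision can be made with low latency, which is the main reason to prefer frame-by-frame output for this kind of use case.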
When to use consolidated metadata
This method of generating metadata is best suited for non-real-time applications, such as post-incident forensic search or statistical analysis, because it eliminates the need to process and store irrelevant information and greatly reduces the amount of logic necessary to develop powerful applications based on object classification metadata.
The analytics metadata streams can also be configured to include cropped images of detected classified objects using the Best Snapshot feature.
The image is represented as a base-64 encoded string within the metadata output. For examples, please refer to sample data frames which include images in the Ways of accessing the metadata section. The Best Snapshot feature must be enabled manually by issuing the following request to the camera:
http://<servername>/config/rest/best-snapshot/v1/enabled
Method | PUT, PATCH |
JSON input parameters | { "data": true } |
Return value | { "status": "success" } |
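A minimal sketch of issuing this request from Python, using only the standard library, might look as follows. The endpoint, method, and body come from the request described above; the use of HTTP digest authentication, the `Content-Type` header, and the function names are assumptions for illustration. The last helper shows how a base-64 encoded image string taken from the metadata could be turned back into raw image bytes.

```python
import base64
import json
import urllib.request


def build_best_snapshot_request(host, enabled=True):
    """Build (but do not send) the PUT request that toggles Best Snapshot."""
    url = f"http://{host}/config/rest/best-snapshot/v1/enabled"
    body = json.dumps({"data": enabled}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},  # assumed header
    )


def send_with_digest_auth(request, username, password):
    """Send the request; digest authentication is an assumption here."""
    password_manager = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    password_manager.add_password(None, request.full_url, username, password)
    opener = urllib.request.build_opener(
        urllib.request.HTTPDigestAuthHandler(password_manager)
    )
    with opener.open(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def decode_snapshot(image_b64):
    """Decode a base-64 image string from the metadata into raw bytes."""
    return base64.b64decode(image_b64)
```

For example, `send_with_digest_auth(build_best_snapshot_request("192.168.0.90"), "user", "pass")` would send the request to a device at that (hypothetical) address.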
In this section, we will provide examples of how different consumer types are capable of accessing and consuming AXIS Scene Metadata.
AXIS Camera Application Platform (ACAP) is an open application platform from Axis. It provides a development platform for software-based solutions and systems built around Axis devices. ACAP is available for various types of Axis products such as cameras, speakers and intercoms.
The ACAP Native SDK is targeted towards users that want to develop plug-in style, event generating applications that fit well into a Video Management System centric environment. From AXIS OS 11.9, ACAP applications can consume AXIS Scene Metadata leveraging the Message Broker to further apply logical filters and rules to the information about the object in the scene in order to, for example, trigger actions based on defined thresholds or specific behaviors.
Example
This example showcases how an ACAP application can consume frame-by-frame or consolidated analytics metadata using the Message Broker API.
The available topics to subscribe to are:
Available | Output | Topic | Sample Data |
AXIS OS 11.9 or later | Frame by frame | com.axis.analytics_scene_description.v0.beta | examples_json_frame_based.json |
AXIS OS 11.11 or later | Consolidated | com.axis.consolidated_track.v1.beta | examples_json_consolidated.json |
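When deciding which topic to subscribe to, an application may want to check the device's AXIS OS version against the table above. The sketch below is one hedged way to do that in Python; the helper names are made up for this example. It compares versions numerically, because a plain string comparison would wrongly rank "11.9" above "11.11".

```python
# Minimum AXIS OS version, output type, and topic, copied from the table above.
MESSAGE_BROKER_TOPICS = [
    ("11.9", "Frame by frame", "com.axis.analytics_scene_description.v0.beta"),
    ("11.11", "Consolidated", "com.axis.consolidated_track.v1.beta"),
]


def parse_version(text):
    """Turn '11.11' into (11, 11) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def available_topics(axis_os_version):
    """Return the topics available on the given AXIS OS version."""
    current = parse_version(axis_os_version)
    return [
        topic
        for minimum, _output, topic in MESSAGE_BROKER_TOPICS
        if current >= parse_version(minimum)
    ]


if __name__ == "__main__":
    print(available_topics("11.10"))
```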
Please refer to the following resource for more detailed information on available topics and terminology.
Axis devices send both the analytics metadata and video streams to the video management system (VMS) to enable forensic search integrations. Two examples of these integrations are AXIS Optimizer forensic search for Milestone plugin and AXIS Forensic Search for Genetec.
The Analytics Scene Description metadata stream can be retrieved from an Axis device by opening an RTSP stream that uses the TCP transport protocol according to the following example:
RTSP request | Description | Sample frame |
rtsp://ip-address/axis-media/media.amp?analytics=polygon | Video analytics metadata excluding video stream | detections.xml |
1) Information about the type of metadata stream generated by the device.
2) Frame timestamp, which is crucial for syncing metadata with video (or audio) during playback or querying, and the source field.
3) Information for bounding boxes and polygons, represented in the ONVIF coordinate system, which ranges from -1 to 1 on the X and Y axes.
4) Bounding boxes and polygons are currently the same if you use Analytics Scene Description as a source.
5) Represents the color (of the car) and the probability value of the color classification. The color is presented before the object class due to the ONVIF Profile M format. The object class can be found in item 6.
6) Object class and probability value of the object classification, such as vehicle in this example.
7) In addition to the main category, it presents a sub-category, such as a car in this example.
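Since bounding boxes are expressed in the normalized ONVIF coordinate system, a consumer usually needs to map them to pixels before drawing overlays. The following sketch shows one way to do that. It assumes that (-1, -1) is the lower-left corner and (1, 1) the upper-right corner of the image, so the Y axis has to be flipped for typical image coordinates where the origin is at the top left; verify this against the ONVIF specification and your own stream before relying on it.

```python
def onvif_to_pixel(x, y, width, height):
    """Map normalized ONVIF coordinates (-1..1) to pixel coordinates.

    Assumes ONVIF Y points up, while pixel Y points down from the top-left.
    """
    pixel_x = (x + 1.0) / 2.0 * width
    pixel_y = (1.0 - y) / 2.0 * height
    return pixel_x, pixel_y


if __name__ == "__main__":
    # Center of a 1920x1080 image.
    print(onvif_to_pixel(0.0, 0.0, 1920, 1080))
```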
Additional parameters such as "camera=2" can be added to the above request, for example to receive metadata from a different video channel. This is useful when the Axis device supports more than one video source.
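The request URL can be assembled programmatically, for instance as in this short sketch. The path and the `analytics` and `camera` parameters are taken from the text above; the function name and the example address are illustrative only.

```python
from urllib.parse import urlencode


def build_metadata_rtsp_url(host, camera=None):
    """Build the RTSP URL for the analytics metadata stream."""
    params = {"analytics": "polygon"}
    if camera is not None:
        # Select a specific video channel on multi-source devices.
        params["camera"] = camera
    return f"rtsp://{host}/axis-media/media.amp?{urlencode(params)}"


if __name__ == "__main__":
    print(build_metadata_rtsp_url("192.168.0.90", camera=2))
```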
Please visit the AXIS OS knowledge base for additional information.
Certain applications benefit from combining edge-based and server-based processing for advanced analyses. Initial pre-processing occurs on the camera, followed by additional processing on a server. This hybrid system enables cost-efficient scalability of analytics by streaming only relevant video or images, along with metadata, to the server.
MQTT is a standard messaging protocol that efficiently and reliably exchanges data between IoT devices and cloud applications. Devices publish messages to an MQTT broker, which then forwards them to subscribing clients based on specified topics. In traditional VMS setups, Axis event notifications are typically streamed via RTSP using VAPIX/ONVIF APIs to a single destination. However, Axis devices running AXIS OS 9.80 or later versions can also distribute notifications using MQTT through their built-in MQTT client. This capability extends beyond VMS ecosystems and is particularly useful over the internet.
This guide does not cover an overview of the MQTT protocol or the specific configuration of the MQTT client on Axis cameras. For more details on these topics, please visit resources on Device integration with MQTT and the VAPIX MQTT Client API.
Axis devices running AXIS OS 12.2 or later can publish frame-by-frame and consolidated metadata through one of the available data source keys using the VAPIX Analytics MQTT API.
Output | Protocol | Data Source Key | Sample Data |
Frame-by-Frame | MQTT | com.axis.analytics_scene_description.v0.beta#1 | examples_json_frame_based.json |
Consolidated | MQTT | com.axis.consolidated_track.v1.beta#1 | examples_json_consolidated.json |
Create a new analytics MQTT data publisher by issuing the following request to the camera:
Create an analytics data publisher:
http://<servername>/config/rest/analytics-mqtt/v1beta/publishers
Method | POST |
JSON input parameters | { "data":{ "id":"my_publisher", "data_source_key":"com.axis.consolidated_track.v1.beta#1", "mqtt_topic":"my_mqtt_topic" } } |
{ "status":"success", |
To retrieve the frame-by-frame metadata instead, use "data_source_key":"com.axis.analytics_scene_description.v0.beta#1". The number after #, in this case 1, represents the origin of the data, that is, the camera head that the data originates from. If your camera supports it, #2, #3 and #4 might also be available.
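The publisher request and the data source key format described above can be put together as in the following sketch. The endpoint, method, and JSON fields are taken from the request shown earlier; the helper names, the `Content-Type` header, and the example host are assumptions, and the request is only built here, not sent. Sending it would typically require authenticating against the device.

```python
import json
import urllib.request


def data_source_key(topic, source=1):
    """Combine a producer topic with the source (camera head) number."""
    return f"{topic}#{source}"


def build_publisher_request(host, publisher_id, key, mqtt_topic):
    """Build (but do not send) the POST request that creates a publisher."""
    url = f"http://{host}/config/rest/analytics-mqtt/v1beta/publishers"
    payload = {
        "data": {
            "id": publisher_id,
            "data_source_key": key,
            "mqtt_topic": mqtt_topic,
        }
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},  # assumed header
    )


if __name__ == "__main__":
    key = data_source_key("com.axis.analytics_scene_description.v0.beta", source=1)
    request = build_publisher_request("192.168.0.90", "my_publisher", key, "my_mqtt_topic")
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```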
Optional: You can use Swagger UI to perform further operations on the Analytics MQTT API, such as deleting existing data publishers. Please refer to the following resource:
Connect to MQTT Broker
Once the data publisher has been created, configure the Axis device to connect to an MQTT server/broker in the web interface, following the instructions in the AXIS OS knowledge base.
The MQTT server/broker connection configurations can be found under Settings > System > MQTT in the web interface.
From AXIS OS 10.11 and onwards, it's possible to choose dedicated analytics producers within the Axis camera or to have multiple analytics metadata producers enabled simultaneously. There are currently three methods available to configure the metadata producers within the camera. Additional information about each of these methods can be found below.
Once the metadata stream has been configured according to your needs, you can access the output using one of the available methods as described in the previous section.
You now possess the necessary information to develop solutions based on the analytics metadata content generated by Axis cameras. The components making up a solution can vary drastically depending on the specific needs of your project.
Analytics metadata can be used in real time to help operators respond quickly to situational changes. It can also provide valuable input to support decision making or enable automated action.
Real-time edge analytics that work with high-quality metadata can help you secure people, sites, and buildings and protect them from intentional or accidental harm. You can rapidly detect, verify, and evaluate threats so they can be efficiently handled.
The analytics metadata is typically stored in a database-type component, which can be queried regularly to extract the data of interest and present it visually in a dashboard.
In the case of leveraging cloud services to perform advanced analysis on a detected object, however, the database component might not be required if there is a need to present the data in as close to real-time as possible.
An example implementation is represented below.
1) An Axis camera with MLPU or DLPU, generating analytics metadata
2) The analytics metadata is transmitted to consumers through the available communication protocols
3) The analytics metadata is further processed and stored, then consumed by different applications
4a) Collected data is visualized in a graphical dashboard to analyze trends and gain insights
4b) Access to a sensitive area is restricted based on license plate information included in the analytics metadata
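Steps 3 and 4a can be sketched with an in-memory SQLite database standing in for the storage component. The table layout and field names (`track_id`, `object_class`, `color`, `first_seen`, `last_seen`) are hypothetical and do not mirror the actual consolidated metadata schema; a real solution would map the fields it needs from the sample data files.

```python
import sqlite3


def create_store():
    """Create an in-memory database with a hypothetical tracks table."""
    connection = sqlite3.connect(":memory:")
    connection.execute(
        "CREATE TABLE tracks ("
        "track_id TEXT PRIMARY KEY, object_class TEXT, color TEXT, "
        "first_seen TEXT, last_seen TEXT)"
    )
    return connection


def store_track(connection, track):
    """Insert or update one consolidated track (a dict with the table's keys)."""
    connection.execute(
        "INSERT OR REPLACE INTO tracks "
        "VALUES (:track_id, :object_class, :color, :first_seen, :last_seen)",
        track,
    )


def count_by_class(connection):
    """Aggregate query a dashboard could run periodically."""
    rows = connection.execute(
        "SELECT object_class, COUNT(*) FROM tracks "
        "GROUP BY object_class ORDER BY object_class"
    )
    return dict(rows.fetchall())


if __name__ == "__main__":
    db = create_store()
    store_track(db, {"track_id": "1", "object_class": "human", "color": "blue",
                     "first_seen": "2024-01-01T10:00:00Z", "last_seen": "2024-01-01T10:00:05Z"})
    store_track(db, {"track_id": "2", "object_class": "vehicle", "color": "yellow",
                     "first_seen": "2024-01-01T10:01:00Z", "last_seen": "2024-01-01T10:01:09Z"})
    print(count_by_class(db))
```

Storing one row per consolidated track keeps the database small compared with storing every frame, which is why the consolidated output is a natural fit for this part of the architecture.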
You can view upcoming changes in the AXIS OS Portal.
AXIS OS 12.2 | Introduced a dedicated VAPIX Analytics MQTT API, making it easier for applications to efficiently use analytics metadata from cameras supporting AXIS Scene Metadata. Read more |
AXIS OS 12.1 | Changes in the analytics metadata stream. Read more |
AXIS OS 12.0 | Changes in the analytics metadata stream. Read more |
AXIS OS 11.9 | Introduced the Message Broker API (in beta). This API lets you build applications that can easily access analytics metadata of detected objects in the scene. |
AXIS OS 11.5 | Upper and lower clothing color has been added as an attribute to the human object class within the analytics scene description metadata stream. Read more |
AXIS OS 11.1 | Vehicle color is included in the Axis analytics metadata stream. Read more |
AXIS OS 11.0 | The source parameter in the Axis analytics metadata stream changed to "AnalyticsSceneDescription". Read more |
AXIS OS 10.11 |
|
AXIS OS 10.10 | Restructure of AXIS Object Analytics and Motion Object Tracking Engine (MOTE) metadata producers. AXIS Object Analytics ACAP is no longer the producer for object classification metadata. The metadata from the Axis Object Analytics provider will add two new classifications and associated bounding boxes for:
Note: The additional classifications will only be available on DL cameras. For releases after AXIS OS 10.10, it is good to be aware that:
For more information, see changes in Metadata Analytics stream AXIS OS 10.10. |
AXIS OS 10.9 | Motion Object Tracking Engine (MOTE) data got a source name. The main difference is Source=VideoMotionTracker, the rest is the same. See example package. |
AXIS OS 10.6 | With the release of AXIS OS 10.6, you can retrieve object classification data, e.g. humans and vehicles. To retrieve object classification data, you need a device with MLPU or DLPU. Please refer to the Product selector... |