Updated: Mar 31
PwC predicts AI will generate $15.7 trillion by 2030, making it the biggest commercial opportunity in the future economy. Meanwhile, Gartner predicts 85% of AI projects will fail. There is immense value in understanding what is and is not possible with AI. This article explores one of the primary use cases driving the 15% of projects that commercialize successfully: the AI Camera.
Over the last seventy years of computing history, expensive isolated computing (mainframes) gave way to cheaper isolated computing (PCs), which in turn gave way to centralized computing (cloud). We are now seeing a move to a hybrid of cheap micro computing (edge computing) and cloud. This combination captures the scalability and power of the cloud and the on-site processing capability of edge computing. The AI Camera uses machine learning models that classify what is seen on the video stream, which requires intense computation. The cloud has enabled the outsourcing of this computing, and edge computing makes it economical. Only with the edge + cloud hybrid could the AI Camera enter the mainstream.
We’ve also seen the barriers to building machine learning models fall. SageMaker is a prime example. Rather than building a machine learning model from scratch with R or Python (and labeling the data by hand), a data science team can label the data, build the model, train it, and deploy it in one environment. The workflow is becoming increasingly plug-and-play: just bring data and a problem.
Add improved image and video recognition models from the data science community, and a brand-new business use case, the AI Camera, is born.
Where we are at today
Markets across the economy are being impacted by the AI Camera. We will highlight several: home & enterprise security, video conferencing, photography, retail, auto, and a newly emerging “everything” camera. We have put together a timeline of the last eight years in the industry; as you can see, it is new and rapidly commercializing.
AI Camera Home & Enterprise Security
The major player in AI home security is Amazon with its 2018 acquisition of Ring. On the enterprise side, Patriot One Technologies has built a widely used full-service security system that includes the AI-powered camera. And of course, Hikvision, arguably the company that has found the most commercial success with the AI Camera, has a wide lineup of smart security products.
Amazon’s Ring collects a live stream of footage, stored in the cloud for a monthly fee, and notifies homeowners of important events like package theft. The device uses Amazon-trained ML models to identify these moments of interest, and Ring works with local law enforcement to augment the product’s reach.
The PATSCAN VRS is part of a larger collection of IoT devices deployed by Patriot One Technologies for full-service enterprise protection. It combines its video feed and in-house ML models with radar, magnetic, and chemical detection devices.
The Hikvision AI Cameras use Hikvision’s facial, body, clothing, vehicle, speed, license plate, and other ML detection models to give security personnel a full understanding of what is happening in an area. Hikvision reportedly has 16,000 engineers pushing the boundaries of hardware and machine learning for a superior AI Camera capability.
AI Video Conferencing
AI video conferencing has seen an emergence of both consumer and enterprise use cases. The fundamental innovation is the ability to tailor the participant experience to get one step closer to the in-person experience.
The Meeting Owl Pro uses facial recognition ML models to create a dynamic participant focus experience. The camera identifies the participant speaking and orients the audio input and video output (what the other participants are seeing) around the speaking participant.
Facebook’s Portal uses facial/body recognition for “digital pivoting,” where the wide-angle lens crops to the area in which it sees activity, without physically moving the lens. The feature builds on Mask R-CNN, a model developed by Facebook’s AI research group. The result is a fluid video conferencing experience that feels as if a camera operator is following the participant around. Previous iterations paired a mechanical system with facial/body recognition, but it could not react fast enough to a person’s movements. All processing is done in-unit to address the initial uproar over privacy concerns.
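The "digital pivoting" idea can be sketched as a crop window that follows a detected person inside a fixed wide-angle frame. This is a minimal illustration and not Facebook's implementation: the `Box` type, the function names, and the smoothing factor `alpha` are all hypothetical, and the detection box is assumed to come from some upstream person detector.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned rectangle in pixel coordinates (hypothetical helper type)."""
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


def crop_for_subject(subject: Box, frame_w: int, frame_h: int,
                     crop_w: int, crop_h: int) -> Box:
    """Center a fixed-size crop on the subject, clamped to stay inside the frame."""
    cx = subject.x + subject.w / 2
    cy = subject.y + subject.h / 2
    x = min(max(cx - crop_w / 2, 0), frame_w - crop_w)
    y = min(max(cy - crop_h / 2, 0), frame_h - crop_h)
    return Box(x, y, crop_w, crop_h)


def smooth(previous: Box, target: Box, alpha: float = 0.2) -> Box:
    """Move the crop a fraction `alpha` of the way toward the target each frame.

    Smoothing is an assumption here; it keeps the virtual camera from
    jumping when the detection jitters from frame to frame.
    """
    return Box(
        previous.x + alpha * (target.x - previous.x),
        previous.y + alpha * (target.y - previous.y),
        target.w,
        target.h,
    )


# Usage: a person detected on the right side of a 3840x2160 frame.
crop = Box(0, 0, 1280, 720)
detection = Box(3000, 900, 200, 400)  # assumed output of a person detector
for _ in range(3):
    target = crop_for_subject(detection, 3840, 2160, 1280, 720)
    crop = smooth(crop, target)
```

Because only the crop rectangle moves, nothing mechanical has to keep up with the subject, which is the advantage the paragraph above attributes to the digital approach.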
AI Camera Photography
Smartphone companies are in an arms race to build the best camera, and AI provides a competitive edge. AI augments a phone's hardware capability and requires newly specialized chips to match.
Apple began integrating ML models into iPhone cameras with the iPhone 7 generation. Portrait mode, first available on the iPhone 7 Plus, separates people from the background in the split second of picture capture. This separation enables a bokeh effect to be applied to the background. Later chips, starting with the A11 Bionic, added a dedicated Neural Engine to accommodate this advanced image (and speech, for a superior Siri) processing.
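The person-versus-background separation described above can be sketched as a mask-driven composite: subject pixels stay sharp and background pixels are replaced with a blurred copy. This is an illustration only and not Apple's pipeline. The box blur stands in for a real bokeh effect, the image is a small grayscale grid, and the subject mask is assumed to be supplied by a segmentation model; all names are hypothetical.

```python
from typing import List

Image = List[List[float]]  # grayscale intensities
Mask = List[List[bool]]    # True where a pixel belongs to the subject


def box_blur(image: Image, radius: int = 1) -> Image:
    """Average each pixel with its neighbors within `radius` (edges use fewer neighbors)."""
    height, width = len(image), len(image[0])
    blurred = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [
                image[ny][nx]
                for ny in range(max(0, y - radius), min(height, y + radius + 1))
                for nx in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            row.append(sum(window) / len(window))
        blurred.append(row)
    return blurred


def portrait_composite(image: Image, subject_mask: Mask, radius: int = 1) -> Image:
    """Keep subject pixels sharp and replace background pixels with their blurred value."""
    blurred = box_blur(image, radius)
    return [
        [image[y][x] if subject_mask[y][x] else blurred[y][x]
         for x in range(len(image[0]))]
        for y in range(len(image))
    ]


# Usage: a single bright "subject" pixel in the middle of a dark background.
photo = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
mask = [[False, False, False], [False, True, False], [False, False, False]]
result = portrait_composite(photo, mask)
```

The quality of the effect depends almost entirely on the mask, which is why the segmentation step is where the ML model earns its keep.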
AI Camera Retail
While Amazon is garnering media attention with its fully functional AI Camera-integrated stores, the technology is making inroads into the broader retail market. Companies like AnyVision are using existing retailer surveillance networks to provide insights like customer loyalty, automated shoplifting detection, and store hotspots.
Amazon made a custom-built camera unit to do basic computer vision work, like motion detection and basic object identification, on the edge. These are the cameras you will see built into the ceiling of the Amazon Go store. The cameras route images of interest (e.g., a customer picks up a drink) to a central processing unit. There, the more advanced processing runs: high-confidence facial recognition and item recognition, with the results verified against the customer’s “virtual shopping cart”.
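The split between cheap on-camera filtering and heavier central processing can be sketched with simple frame differencing. This is a minimal sketch under stated assumptions, not Amazon's method: frames are tiny grayscale grids, the thresholds `pixel_delta` and `min_fraction` are illustrative values, and the function names are hypothetical.

```python
from typing import Iterable, Iterator, List, Optional

Frame = List[List[int]]  # grayscale pixels, 0-255


def changed_fraction(prev: Frame, curr: Frame, pixel_delta: int = 25) -> float:
    """Fraction of pixels whose brightness changed by more than `pixel_delta`."""
    total = 0
    changed = 0
    for prev_row, curr_row in zip(prev, curr):
        for a, b in zip(prev_row, curr_row):
            total += 1
            if abs(a - b) > pixel_delta:
                changed += 1
    return changed / total if total else 0.0


def frames_of_interest(frames: Iterable[Frame],
                       min_fraction: float = 0.05) -> Iterator[Frame]:
    """Yield only frames that differ enough from the frame before them.

    In an edge + central design, only these frames would be forwarded
    for the more expensive recognition models.
    """
    prev: Optional[Frame] = None
    for frame in frames:
        if prev is not None and changed_fraction(prev, frame) >= min_fraction:
            yield frame
        prev = frame


# Usage: a static scene, then one row of pixels changes (something moved).
still = [[10] * 4 for _ in range(4)]
moved = [row[:] for row in still]
moved[1] = [200] * 4
to_forward = list(frames_of_interest([still, still, moved, moved]))
```

The edge device discards the unchanged frames itself, so the central unit only pays for recognition on the moments that might matter.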
AI Camera Auto
“Keep your eyes on the road!” needs to be fulfilled by something if not a human driver. AI Cameras are shifting that responsibility to the computers and enabling an autonomous driving future. Tesla is the pioneer.
All Tesla cars come equipped with eight AI-enabled cameras for a variety of uses. Most ambitiously, these cameras underpin Tesla’s Autopilot ambitions, in combination with several other sensors onboard. The cameras provide a 360-degree view of the vehicle’s surroundings, interpreted by Tesla-built object detection models. Tesla’s secret to autonomous driving supremacy lies in the nearly 1 million cars shipped to date, each one an object detection trainer. Each Tesla records images encountered on the road, including edge cases like horses. These images are uploaded to Tesla’s data warehouse at the next Wi-Fi connection and used to further develop the models underpinning autonomous driving, like horse detection. It is edge cases like horse detection (in addition to political considerations) that are keeping us from a fully operational autonomous driving experience.
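The "record on the road, upload at the next Wi-Fi connection" pattern is a store-and-forward queue. The sketch below is a generic illustration of that pattern and makes no claim about Tesla's software: the `UploadQueue` class, its capacity limit, and the `upload` callback are all hypothetical.

```python
from collections import deque
from typing import Callable, Deque


class UploadQueue:
    """Holds captured image IDs on the device until a Wi-Fi connection is available."""

    def __init__(self, capacity: int = 1000) -> None:
        # A bounded deque drops the oldest entries first when storage fills up.
        self._pending: Deque[str] = deque(maxlen=capacity)

    def record(self, image_id: str) -> None:
        """Store a captured image reference for later upload."""
        self._pending.append(image_id)

    def flush(self, wifi_connected: bool, upload: Callable[[str], None]) -> int:
        """Upload everything pending if Wi-Fi is up; return how many items were sent."""
        if not wifi_connected:
            return 0
        sent = 0
        while self._pending:
            upload(self._pending.popleft())
            sent += 1
        return sent


# Usage: images accumulate while driving and are sent once the car is on Wi-Fi.
queue = UploadQueue(capacity=500)
queue.record("frame-horse-0001")  # hypothetical image identifier
uploaded = []
queue.flush(wifi_connected=False, upload=uploaded.append)  # nothing is sent
queue.flush(wifi_connected=True, upload=uploaded.append)   # queue is drained
```

Deferring the upload keeps the cellular link free while the fleet still contributes training data at scale.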
Custom-Purpose AI Camera
In each previous product, the seller ships proprietary models for specific use cases, like identifying a customer picking up a drink or a burglar stealing a package from the front porch. It takes thousands of images and expensive processing to build these models, so the narrow use cases were warranted. In comes Amazon with SageMaker, which enables the easy building of ML models, and AWS, Amazon’s cloud arm, for easy off-site AI computing power.
“The world’s first wireless, deep-learning camera for developers.” The AWS DeepLens is a consumer product with functionality similar to the AI Camera products above, plus easy creation of custom ML models.
Where are we headed?
The big tech players are best positioned to continue the commercialization of AI Cameras and have every reason to do so. Ring solves Amazon Prime’s petty crime problem. Amazon Go puts Amazon in the arena for offline sales without the brand restrictions of Whole Foods. Facebook Portal gives users a better social simulation. The iPhone’s AI camera gives Apple a competitive edge. Tesla’s AI cameras and the company's scale are the most promising route to autonomous driving to date. The niche players Hikvision and Patriot One Technologies are safe so long as big tech stays out of full-system security (chemical, metal detection, etc.). The politically sensitive nature of security is an additional competitive moat for these companies. Of course, Western skepticism and the 2019 U.S. government ban on select Chinese AI players pose challenges for Hikvision and others.
The bandwidth required to upload video to the cloud has necessitated the edge computing we are seeing today. It is better to send one hour of relevant video to the cloud for processing than 24 hours of mostly irrelevant content. 5G will change the equation. “It will give us access to higher bandwidth on the edge. That will change the game significantly,” said Jumbi Edulbehram, former VP of Motorola Global Cloud Services.
For industry, a world of deployed AI Cameras generating insight is inevitable. But where does a business go after deployment? The answer is an integration of those insights into the business, which won’t always be easy. “If you get a really great heatmap, how is this going to make the business better? Who is the right person to give it to? This requires deep vertical expertise,” said Edulbehram. AI & video analytics consulting services designed to answer these questions will see new business roll in.
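To put rough numbers on the one-hour-versus-24-hours comparison, the arithmetic can be sketched as follows. The 4 Mbps bitrate is an assumed figure for a single HD stream, chosen only for illustration, and the function name is hypothetical.

```python
def upload_gigabytes(hours: float, bitrate_mbps: float) -> float:
    """Data volume in gigabytes for `hours` of video at `bitrate_mbps` megabits per second."""
    megabits = hours * 3600 * bitrate_mbps
    return megabits / 8 / 1000  # megabits -> megabytes -> gigabytes (decimal units)


ASSUMED_BITRATE_MBPS = 4.0  # assumption: one HD camera stream

full_day = upload_gigabytes(24, ASSUMED_BITRATE_MBPS)      # everything the camera saw
relevant_only = upload_gigabytes(1, ASSUMED_BITRATE_MBPS)  # one hour picked out at the edge
savings_factor = full_day / relevant_only
```

Under this assumption a single camera goes from roughly 43 GB per day to under 2 GB, and the 24x reduction holds at any bitrate. Multiplied across a site with dozens of cameras, that gap is what makes edge filtering economical today.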
The Big Winner
Amazon. In so many areas Amazon has superior strategic positioning. As data science commoditizes with tools like SageMaker (and Google’s TensorFlow), the winner will be the arms provider. Amazon sells the full workflow, lowering the skill required for data science from a PhD to a bachelor’s degree. Amazon sells the engine behind the work, AWS. And Amazon sells the hardware that brings the power of machine learning into consumers’ hands, the AWS DeepLens. In the words of Jamie Dimon, the CEO behind JPMorgan Chase’s rapid ascension to banking dominance, “size, scale, and staying power matter”.