PwC predicts that AI will generate $15.7 trillion by 2030, making it the biggest commercial opportunity in the economy of the future. At the same time, Gartner predicts that 85% of AI projects will fail. There is immense value, then, in understanding what is and is not possible with AI. This article explores one of the primary use cases driving the 15% of projects that result in successful commercialization: the AI Camera.
Developments Leading to the AI Camera
Over the last seventy years of computing history, we’ve seen expensive isolated computing (mainframes) give way to cheaper isolated computing (PCs), which in turn gave way to centralized computing (cloud). We are now seeing a move to a hybrid of cheap micro computing (edge computing) and cloud. This combination captures the scalability and power of the cloud and the on-site processing capability of edge computing. The AI Camera uses machine learning models that classify what is seen on a video stream, which requires intense computation. The cloud has enabled the outsourcing of this computing, and edge computing makes it economical. Only in the edge + cloud hybrid could the AI Camera enter the mainstream.
We’ve also seen the barriers to building machine learning models fall. Amazon SageMaker is a prime example. Rather than building a machine learning model from scratch with R or Python (and labelling the data themselves), a data science team can label the data, build the model, train it, and deploy it in one environment. The workflow is becoming increasingly plug-and-play: just bring data and a problem.
Add improved image and video recognition models from the data science community, and a brand new business use case, the AI Camera, is born.
Where we are today
The AI Camera is reshaping markets across the economy. We will highlight several: home & enterprise security, video conferencing, photography, retail, auto, and a newly emerging “everything” camera. We have put together a timeline of the last eight years in the industry; as you can see, it is new and rapidly commercializing.
AI Camera Home & Enterprise Security
The major player in AI home security is Amazon, in part because of its 2018 acquisition of Ring. On the enterprise side, Patriot One Technologies has built a widely used full-service security system that includes the AI-powered camera. And of course, Hikvision, arguably the company that has found the most commercial success with the AI Camera, has a wide lineup of smart security products.
Amazon’s Ring collects a live stream of footage, which is stored in the cloud for a monthly fee. Ring notifies homeowners of important events like package theft. The device uses Amazon-trained ML models to identify these moments of interest and works with local law enforcement to augment the product’s reach.
The PATSCAN VRS is part of a larger collection of IoT devices deployed by Patriot One Technologies for full-service enterprise protection. It combines its video feed and in-house ML models with radar, magnetic, and chemical detection devices.
The Hikvision AI Cameras use Hikvision’s facial, body, clothing, vehicle, speed, license plate, and other ML detection models to give security personnel a full understanding of what is happening in an area. Hikvision reportedly has 16,000 engineers pushing the boundaries of hardware and machine learning for a superior AI Camera capability.
AI Video Conferencing
AI video conferencing has seen both consumer and enterprise use cases emerge. The fundamental innovation is the ability to tailor the participant experience so that it comes one step closer to an in-person experience.
The Meeting Owl Pro uses facial recognition ML models to create a dynamic participant-focused experience. The camera identifies the participant speaking and orients the audio input and video output (what the other participants are seeing) around the speaking participant.
Facebook’s Portal uses facial/body recognition for “digital pivoting,” a process in which the wide-angle lens focuses on the area of the frame where it sees activity, without physically moving the lens. An in-house ML model, Mask R-CNN, was invented for this purpose. The result is a fluid video conferencing experience that feels as if a camera operator is following the participant around. In previous iterations, a mechanical system was used in conjunction with facial/body recognition but could not react fast enough to a person’s movements. All processing is done in-unit to address the initial uproar of privacy concerns.
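The idea behind digital pivoting can be sketched in a few lines. This is a minimal illustration, not Portal's actual implementation: it assumes a detector has already produced a bounding box for the person, and the names `Box`, `crop_around`, and `smooth` are hypothetical. The crop window is centered on the subject, kept inside the wide frame, and eased toward its target so the virtual pan does not jerk.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned rectangle in pixel coordinates (left, top, width, height)."""
    x: float
    y: float
    w: float
    h: float


def crop_around(subject: Box, frame_w: int, frame_h: int,
                out_w: int, out_h: int) -> Box:
    """Return an out_w x out_h crop centered on the subject, clamped to the frame."""
    cx = subject.x + subject.w / 2
    cy = subject.y + subject.h / 2
    x = min(max(cx - out_w / 2, 0), frame_w - out_w)
    y = min(max(cy - out_h / 2, 0), frame_h - out_h)
    return Box(x, y, out_w, out_h)


def smooth(prev: Box, target: Box, alpha: float = 0.25) -> Box:
    """Move the crop a fraction alpha of the way toward the target position."""
    return Box(prev.x + alpha * (target.x - prev.x),
               prev.y + alpha * (target.y - prev.y),
               target.w, target.h)


# Hypothetical numbers: a 3840x2160 wide frame and a 1280x720 output view.
view = Box(0, 0, 1280, 720)
person = Box(3000, 900, 200, 400)  # assumed output of a person detector
target = crop_around(person, 3840, 2160, 1280, 720)
view = smooth(view, target)  # called once per frame, the view glides to the person
```

Because only the crop window moves, there is no motor to lag behind the subject, which is the advantage over the earlier mechanical approach described above.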
AI Camera Photography
Smartphone companies are in an arms race to build the best camera, and AI provides a competitive edge. AI augments a phone's hardware capability and requires newly specialized chips to match.
Apple began integrating ML models into iPhone cameras with the iPhone 7 generation. Portrait mode, first available on the iPhone 7 Plus, separates people from the background in the split second of picture capture. This distinction makes it possible to render a bokeh effect in the background. The iPhone 7 Plus ran on the A10 Fusion chip; beginning with the A11 Bionic, Apple added a dedicated Neural Engine to its chips to accommodate this kind of advanced image processing (and speech processing for a superior Siri).
AI Camera Retail
Amazon is garnering media attention with its fully functional AI Camera integrated stores, but the technology is making inroads in the broader retail market. Companies like AnyVision are using existing retailer surveillance networks to provide insights like customer loyalty, automated shoplifting detection, and store hotspots.
Amazon custom-built a camera unit that does basic computer vision work, such as motion detection and basic object identification, on the edge. These are the cameras you will see built into the ceiling of the Amazon Go store. The cameras route images of interest (e.g., a customer picking up a drink) to a central processing unit. There, the more advanced processing runs: high-confidence facial recognition and item recognition, with the results verified in the customer’s “virtual shopping cart”.
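The split between cheap edge filtering and heavier central processing can be illustrated with simple frame differencing. This is a sketch under stated assumptions, not Amazon's pipeline: frames are small grayscale grids, and the function names and thresholds (`pixel_delta`, `min_changed`) are hypothetical. Only the frames that change enough from the previous one would be forwarded for the expensive recognition step.

```python
from typing import Iterable, Iterator, List

Frame = List[List[int]]  # grayscale pixels, 0-255


def changed_fraction(prev: Frame, curr: Frame, pixel_delta: int = 25) -> float:
    """Fraction of pixels whose brightness changed by more than pixel_delta."""
    total = 0
    changed = 0
    for row_prev, row_curr in zip(prev, curr):
        for a, b in zip(row_prev, row_curr):
            total += 1
            if abs(a - b) > pixel_delta:
                changed += 1
    return changed / total if total else 0.0


def frames_of_interest(frames: Iterable[Frame],
                       min_changed: float = 0.05) -> Iterator[int]:
    """Yield indices of frames that differ enough from the previous frame."""
    prev = None
    for i, frame in enumerate(frames):
        if prev is not None and changed_fraction(prev, frame) >= min_changed:
            yield i
        prev = frame


# A 4x4 scene that is static, then changes in its bottom row, then is static again.
still = [[10] * 4 for _ in range(4)]
moved = [[10] * 4 for _ in range(3)] + [[200] * 4]
print(list(frames_of_interest([still, still, moved, moved])))  # [2]
```

In this toy run only one of four frames leaves the edge device, which is the economy the edge + cloud hybrid is meant to deliver.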
AI Camera Auto
'Keeping the eyes on the road' needs to be fulfilled by something if not a human driver. AI Cameras are shifting that responsibility to the computers and enabling an autonomous driving future. Tesla is the pioneer.
All Tesla cars come equipped with eight AI-enabled cameras for a variety of uses. Most ambitiously, these cameras, in combination with several other sensors onboard, underpin Tesla's Autopilot ambitions. The cameras provide a 360-degree view of the vehicle’s surroundings using Tesla-built object detection models. Tesla’s secret to autonomous driving supremacy lies in the nearly 1 million cars shipped to date, each of which doubles as an object detection trainer. Each Tesla car records images encountered on the road, including edge cases like horses. These images are uploaded to Tesla’s data warehouse at the next Wi-Fi connection and used to further develop the models underpinning autonomous driving, like horse detection. It is edge cases like horse detection (as well as political issues) that are keeping us from a fully operational autonomous driving experience.
Custom-Purpose AI Camera
In each of the previous products, the seller ships proprietary models for specific use cases like identifying a customer picking up a drink or a burglar stealing a package from the front porch. It takes thousands of images and expensive processing to build these models, so the narrow use cases were warranted.
In comes Amazon with SageMaker, which enables the easy building of ML models, and AWS, its cloud arm, which provides easy off-site AI computing power.
The result is the AWS DeepLens, described as “the world’s first wireless, deep-learning camera for developers.” It is a consumer product with functionality similar to the AI Camera products above, but the use case is left to whoever builds the model.
Where are we headed?
The big tech players are best positioned to continue the commercialization of AI Cameras, and they have every reason to do so. Ring solves Amazon Prime’s petty crime problem. Amazon Go puts Amazon in the arena for offline sales without the brand restrictions imposed by Whole Foods. Facebook Portal gives users a better social simulation. The iPhone's AI camera gives Apple a competitive edge. Tesla’s AI cameras and the company's scale are the most promising route to autonomous driving to date.
The niche players, Hikvision and Patriot One Technologies, are safe so long as big tech stays out of full system security (chemical, metal detection, etc.). The politically sensitive nature of security is an additional competitive moat for these companies. Of course, Western skepticism, as well as the 2019 U.S. government ban on select Chinese AI players, poses challenges for Hikvision and others.
The bandwidth required to upload video to the cloud necessitates the edge computing we are seeing today. It is better to send one hour of relevant video to the cloud for processing than to scan 24 hours’ worth of mostly irrelevant content. 5G will change the equation. Jumbi Edulbehram, former VP of Motorola Global Cloud Services, argues that “It will give us access to higher bandwidth on the edge[, which] will change the game significantly.”
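A rough calculation makes the one-hour-versus-24-hours trade-off concrete. The 4 Mbit/s bitrate below is an assumed figure for a single compressed HD stream, chosen only for illustration, and `upload_gigabytes` is a hypothetical helper.

```python
def upload_gigabytes(hours: float, bitrate_mbps: float) -> float:
    """Data volume in gigabytes for `hours` of video at `bitrate_mbps` megabits per second."""
    megabits = hours * 3600 * bitrate_mbps
    return megabits / 8 / 1000  # megabits -> megabytes -> gigabytes (decimal units)


BITRATE_MBPS = 4  # assumption: one compressed HD camera stream

full_day = upload_gigabytes(24, BITRATE_MBPS)      # every frame goes to the cloud
relevant_only = upload_gigabytes(1, BITRATE_MBPS)  # edge device filters first

print(f"24 hours: {full_day:.1f} GB per camera per day")      # 24 hours: 43.2 GB per camera per day
print(f"1 hour:   {relevant_only:.1f} GB per camera per day")  # 1 hour:   1.8 GB per camera per day
```

Under these assumptions, filtering on the edge cuts the daily upload per camera by a factor of 24, and the gap widens with every additional camera on the same uplink.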
For industry, a world of deployed AI Cameras generating insight is inevitable. But where does a business go after deployment? Integrating the insights generated by the camera into core business decisions won’t always be easy. Edulbehram again: “If you get a really great heatmap, how is this going to make the business better? Who is the right person to give it to? This requires deep vertical expertise.” AI & video analytics consulting services designed to answer these questions will see new business roll in.
The Big Winner
Amazon. In so many areas Amazon has superior strategic positioning. As data science commoditizes with tools like SageMaker (and Google’s TensorFlow), the winner will be the arms provider. Amazon sells the full workflow, taking the skill required for data science down from a PhD to a Bachelor's. Amazon sells the engine behind the work: AWS. And Amazon sells the hardware that brings the power of machine learning into consumers’ hands: the AWS DeepLens. In the words of Jamie Dimon, the CEO behind JPMorgan Chase’s rapid ascension to banking dominance, “size, scale, and staying power matter”.