This framework provides developers with optimized code to make the development of vision-enabled checkout solutions more accessible and seamless.
![Retail Innovation Unlocked, Intel,](https://d1qg7561fu8ubi.cloudfront.net/blog/retail-innovation-unlocked-1.jpg)
Photo by Tara Clark on Unsplash
Computer vision is revolutionizing various industries, and retail is no exception. By automatically extracting and analyzing important characteristics from images and videos, it has the potential to bring significant changes across many scenarios in retail stores.
While the application of artificial intelligence has many benefits, we often don’t see proofs of concept (PoCs) scale successfully into larger, implemented deployments. This is because vision workloads are complex, requiring specific infrastructure and a combination of hardware and software expertise to develop, deploy, scale, and maintain these systems.
Intel Corporation invests significant effort in developing and refining software frameworks that take advantage of the underlying hardware’s capabilities. The range of software choices for developers is constantly expanding, and Intel aims to continuously improve and optimize these frameworks to increase hardware utilization and boost code efficiency.
Use case: Automated Self-Checkout
To make it easier for software developers to understand the optimizations and hardware capabilities, we’ve kicked off an open source initiative for vision-enabled use cases in retail. The Intel® Automated Self-Checkout Reference Package provides optimized code blocks, documentation (including videos and blog posts), and performance data for developers to bootstrap their product and project work.
The first use case features an automated self-checkout reference implementation, one of many in the computer-vision-enabled checkout space.
For details on this open source reference implementation, take a look at these resources:
By using this reference implementation, developers can choose the hardware that minimizes the cost of each vision stream. This, in turn, speeds up software development by building on the available core building blocks. It is the first of many planned self-checkout use cases.
Software Optimizations
As the number of cameras grows and AI models become more diverse and complex, compute requirements increase proportionately. For use cases that require near-real-time inference, deploying at the edge has become increasingly popular. Even so, software architects still face many decisions: where should the software services be deployed, and what compute is needed to enable the use case? Architects and developers must decide whether services run on-device or are distributed across systems, what video quality is needed, what latency is acceptable, and how many video streams must be handled at a minimum throughput.
The ideal scenario would have the highest-performing GPUs run inferencing against the algorithms developed to fulfill these use cases. However, this isn’t just about running an AI model. A vision workload goes through many stages, as shown in Figure 1. Video streams are first decrypted, decoded, and pre-processed to prepare the images for feature extraction; only then can inferencing be performed with the chosen deep learning algorithms. Many workloads involve multiple algorithms, each requiring its own pre-processing steps, before the inference results are sent back to the application.
![Retail Innovation Unlocked, Intel,](https://d1qg7561fu8ubi.cloudfront.net/blog/retail-innovation-unlocked-2.png)
Figure 1: Vision Data Flow. Source: Intel.
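The pre-processing stage in the flow above typically resizes each decoded frame, rescales pixel values, and reorders the layout into the NCHW tensor most detection models expect. The sketch below illustrates those steps with plain NumPy; the 640×640 input size is a hypothetical model requirement, and a production pipeline would use an optimized resize (e.g. from OpenCV or oneVPL) rather than index-based nearest-neighbor sampling.

```python
import numpy as np

def preprocess(frame: np.ndarray, size: tuple = (640, 640)) -> np.ndarray:
    """Prepare a decoded BGR frame for inference: resize, scale to [0, 1],
    and reorder HWC -> NCHW with a leading batch dimension."""
    h, w = frame.shape[:2]
    # Nearest-neighbor resize via integer index maps (illustrative only).
    rows = np.arange(size[0]) * h // size[0]
    cols = np.arange(size[1]) * w // size[1]
    resized = frame[rows][:, cols]
    scaled = resized.astype(np.float32) / 255.0
    # HWC -> CHW, then add the batch dimension expected by most models.
    return np.expand_dims(scaled.transpose(2, 0, 1), 0)

# A dummy 1080p frame stands in for a decoded video frame.
frame = np.zeros((1080, 1920, 3), dtype=np.uint8)
blob = preprocess(frame)
print(blob.shape)  # (1, 3, 640, 640)
```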
Taking the above flow as an example: for decode and pre-processing, Intel provides the Intel® Media SDK and the Intel® oneAPI Video Processing Library (oneVPL), which handle video decode, encode, processing, and format conversion on hardware. This hardware abstraction enables efficient programming for developers, building on network protocols and lower-level libraries such as Libva or DirectX Video Acceleration* (DXVA) that utilize the hardware at its optimum performance.
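On Linux, application code often reaches Libva-backed hardware decode through a GStreamer pipeline. The sketch below builds such a pipeline string for a hypothetical H.264 RTSP camera; the `vaapih264dec` and `vaapipostproc` element names come from the gstreamer-vaapi plugin set and depend on which plugins are installed on the target system.

```python
def vaapi_decode_pipeline(rtsp_uri: str, width: int = 1920, height: int = 1080) -> str:
    """Build a GStreamer pipeline string that decodes an H.264 RTSP stream
    on the GPU via VA-API (Libva) and hands raw frames to the application."""
    return (
        f"rtspsrc location={rtsp_uri} ! rtph264depay ! h264parse "
        f"! vaapih264dec "                                  # hardware decode
        f"! vaapipostproc width={width} height={height} "   # hardware scaling
        f"! video/x-raw,format=BGRA ! appsink"              # frames to the app
    )

pipeline = vaapi_decode_pipeline("rtsp://camera.local/stream")
print(pipeline)
# The string could then be consumed by, e.g.,
# cv2.VideoCapture(pipeline, cv2.CAP_GSTREAMER).
```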
For deep learning, the Intel® Distribution of OpenVINO™ toolkit helps make AI models lightweight, optimizes neural networks for accelerated inferencing, and reduces latency. Recent enhancements in OpenVINO also let developers reduce their applications’ footprint by minimizing the components required for deployment through the toolkit’s runtime plugin inference engine.
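A minimal OpenVINO inference loop might look like the sketch below. It assumes the `openvino` Python package is installed and that a model has already been exported to OpenVINO IR format; the `model.xml` path, the 640×640 input shape, and the `"AUTO"` device choice are all illustrative assumptions, not part of the reference package.

```python
# Sketch only: assumes the `openvino` package and a pre-exported IR model
# ("model.xml" plus its companion "model.bin") are available locally.
import numpy as np
import openvino as ov

core = ov.Core()
model = core.read_model("model.xml")          # hypothetical model path
compiled = core.compile_model(model, "AUTO")  # let OpenVINO pick CPU/GPU

blob = np.zeros((1, 3, 640, 640), dtype=np.float32)  # a preprocessed frame
results = compiled([blob])                    # run inference
output = results[compiled.output(0)]          # first output tensor
```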
These are just a few examples of the software optimizations and frameworks that Intel brings to developers, highlighted in the reference implementation. The pre-built containers remove compatibility issues between software and hardware architectures.
Hardware for Workloads
Using appropriate hardware is crucial for achieving optimal performance and cost-efficiency as the number and variety of workloads expand, both when deploying and when scaling. Here we provide performance data across multiple Intel SKUs to highlight key parameters that developers can use to determine where to land their code. Depending on the industry their applications serve, certain factors must be prioritized. Take retailers, for example, who must work with tight profit margins: architects must carefully balance performance and cost instead of focusing solely on achieving the highest performance.
In the age of heterogeneous compute, how hardware platforms are constructed to handle compute-intensive and memory-intensive applications is a crucial consideration. Analyzing the performance data is a valuable first step toward identifying the necessary hardware and, more significantly, making well-informed choices about how to expand from a handful of self-checkout stations to hundreds of them.
![Retail Innovation Unlocked, Intel,](https://d1qg7561fu8ubi.cloudfront.net/blog/retail-innovation-unlocked-4.png)
Figure 2: Performance Data. Source: Intel benchmark results.
The reference code provides a mechanism for developers to recreate the performance data using the sample models provided. Figure 2 shows the parameters extracted during our performance runs. The performance scripts can be configured to run with custom pre-trained models, at varying precisions and resolutions, and across heterogeneous architectures. Stay tuned: more are on the way later this quarter.
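The reference package ships its own performance scripts, but the kind of per-stream numbers shown in Figure 2 (throughput in FPS, average and tail latency) can be illustrated with a small standard-library harness. The sketch below stubs out the decode-plus-inference call with a fixed 2 ms sleep purely to show the measurement pattern.

```python
import statistics
import time

def run_inference(frame_id: int) -> None:
    """Stand-in for a real decode + inference call on one frame."""
    time.sleep(0.002)  # pretend each frame takes roughly 2 ms

def benchmark(num_frames: int = 200) -> dict:
    """Measure per-frame latency and overall throughput for one stream."""
    latencies = []
    start = time.perf_counter()
    for i in range(num_frames):
        t0 = time.perf_counter()
        run_inference(i)
        latencies.append((time.perf_counter() - t0) * 1000.0)  # milliseconds
    elapsed = time.perf_counter() - start
    return {
        "fps": num_frames / elapsed,
        "avg_ms": statistics.mean(latencies),
        # last of 19 cut points with n=20 is the 95th percentile
        "p95_ms": statistics.quantiles(latencies, n=20)[-1],
    }

stats = benchmark()
print(f"{stats['fps']:.1f} FPS, avg {stats['avg_ms']:.2f} ms, "
      f"p95 {stats['p95_ms']:.2f} ms")
```

Sweeping such a harness over model precision, input resolution, and target device is essentially how per-SKU comparisons like those in Figure 2 are assembled.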
Get Involved
This reference implementation aims to demonstrate the effectiveness of combining software optimizations and frameworks with heterogeneous architectures. By doing so, it can help deliver optimal performance at low costs, while also providing a user-friendly approach for developing, deploying, and scaling solutions.
This is one of many use cases that will be released as open-source software to increase productivity and reduce complexity for developers.
We’d love to hear your feedback to help guide our roadmap, and we invite you to join the Intel Edge Community for updates.
About the Author
Farhaan Mohideen is a Product Manager at Intel’s Network and Edge Group (NEX) leading initiatives for the retail vertical. Previously, he was the Director of Product Management at Aepona Ltd*, an API management and payments startup acquired by Intel. He holds a Ph.D. in applied mathematics and B.Eng. (Hons) in electrical engineering.