Computer vision is a rapidly growing field within artificial intelligence. Widely used applications include autonomous driving, social media face filters, and medical diagnosis (e.g., detecting cancerous tumors), to name a few. At AppZen, we apply cutting-edge technology to reduce business spend by auditing expense reports, invoices, and contracts.
While each computer vision specialization has its own nuances, below are the ways we apply similar technologies to the AppZen spend auditing platform.
Autonomous driving
The biggest difference between AppZen and the technology used in self-driving vehicles is the number of external factors that object detection must handle. In autonomous driving, variations in objects on the road, brightness, weather conditions, and other moving objects call for a more dynamic computer vision solution. The goal there is a combination of object detection and object avoidance, so that vehicles navigate streets and freeways safely.
At AppZen, our computer vision focuses on 2D images. Our top priority is extracting data cleanly from a wide variety of documents. Doing this well relies on techniques such as shadow reduction, background removal, and noise reduction. AppZen’s advantage in applying computer vision is that text documents inherently have consistent structures, which allows for increased accuracy.
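To make two of those cleanup steps concrete, here is a minimal sketch in plain Python. It is not AppZen's actual pipeline: the function names `median_filter` and `remove_background`, the 3x3 window, and the threshold of 128 are all assumptions chosen for illustration. A median filter is one common way to reduce speckle noise, and a fixed threshold is about the simplest possible form of background removal.

```python
def median_filter(image, radius=1):
    """Noise reduction: replace each pixel with the median of its neighborhood.

    `image` is a grayscale image as a list of rows, values 0 (black) to 255 (white).
    """
    height, width = len(image), len(image[0])
    cleaned = []
    for y in range(height):
        row = []
        for x in range(width):
            # Collect the neighborhood, clipped at the image borders.
            window = sorted(
                image[ny][nx]
                for ny in range(max(0, y - radius), min(height, y + radius + 1))
                for nx in range(max(0, x - radius), min(width, x + radius + 1))
            )
            row.append(window[len(window) // 2])
        cleaned.append(row)
    return cleaned


def remove_background(image, threshold=128):
    """Background removal (assumed fixed threshold): ink -> 1, paper -> 0."""
    return [[1 if pixel < threshold else 0 for pixel in row] for row in image]
```

Run on a scan with an isolated dark speck, the median filter removes the speck, while a stroke several pixels thick survives and shows up as ink in the binarized mask. Real receipts usually need adaptive thresholds because lighting varies across the page.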
Social media filters
Social media face filters gained widespread popularity through Snapchat in 2015, and the technology seems to be here to stay. Children and adults alike enjoy silly and fun facial changes, from dog ears and long lashes to face swapping with a friend. Social media face filters apply computer vision to a 2D image using local region coordinates, just as AppZen does with expense report documents. The difference is that face filters apply a single “Active Shape Model,” a facial model trained on hundreds to thousands of images to determine “average” facial features. Once a person’s facial features are located, the decorative accessories of a face filter are applied accordingly.
At AppZen, no single “average” document model exists that could apply to all the receipts our system reads. Our AI receives meal, hotel, airfare, taxi, retail, and various other documents that come in many different shapes and sizes. To address this, we have several different models available, organized by expense type or document type, which allows for greater accuracy as an expense and AP audit platform.
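The idea of picking a model by expense type can be sketched as a simple lookup table. Everything in this example is hypothetical: the model functions are stand-ins for trained extractors, and the names `MODELS`, `route`, and the field lists are invented for illustration rather than taken from AppZen's system.

```python
from typing import Callable, Dict

# Each function is a placeholder for a trained, type-specific extraction model.
def hotel_model(text: str) -> Dict[str, str]:
    return {"model": "hotel", "fields": "check-in, check-out, nightly rate"}

def meal_model(text: str) -> Dict[str, str]:
    return {"model": "meal", "fields": "merchant, subtotal, tip, total"}

def taxi_model(text: str) -> Dict[str, str]:
    return {"model": "taxi", "fields": "pickup, drop-off, fare"}

def generic_model(text: str) -> Dict[str, str]:
    # Fallback for document types without a dedicated model.
    return {"model": "generic", "fields": "merchant, date, total"}

MODELS: Dict[str, Callable[[str], Dict[str, str]]] = {
    "hotel": hotel_model,
    "meal": meal_model,
    "taxi": taxi_model,
}

def route(expense_type: str, text: str) -> Dict[str, str]:
    """Send the document text to the model registered for its expense type."""
    model = MODELS.get(expense_type.strip().lower(), generic_model)
    return model(text)
```

Calling `route("Hotel", receipt_text)` would use the hotel stand-in, while an unrecognized type falls back to the generic one. A table like this keeps each model free to specialize on the layout quirks of its own document type.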
Medical image processing
Advances in medical image processing that detect anomalies and disease are highly promising and eagerly awaited, e.g., detecting tumors and other anomalous growths by reading common medical scans and recordings such as MRIs, fMRIs, and EKGs. In medical image processing, the challenge is to find objects or anomalies that are out of place against a difficult (but still similar) background. Accurately segmenting a tumor from the surrounding brain or other tissue in an MRI image is challenging because the two look so similar.
At AppZen, computer vision text extraction is the foundation of our ability to detect fraud in expense reports, contracts, and invoices. Our technology must accurately detect text against noise and background. Fortunately, text has similar structures and sizes, while noise and background do not. These similarities in structure and size allow a more accurate distinction between the pertinent semantic text that should be retained and everything else that can be left behind.
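One simple way to exploit that size consistency is to group ink pixels into connected blobs and keep only the blobs whose size is close to the typical one. The sketch below assumes a binarized image (1 for ink, 0 for paper) and uses hypothetical names (`connected_components`, `keep_text_like`) and an arbitrary 50% tolerance. It illustrates the general idea only and does not describe the models AppZen actually uses.

```python
from collections import deque


def connected_components(mask):
    """Group ink pixels (value 1) into 4-connected blobs of (row, col) pairs."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    components = []
    for r in range(height):
        for c in range(width):
            if mask[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited ink pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components


def keep_text_like(components, tolerance=0.5):
    """Keep blobs whose pixel count is within `tolerance` of the median size."""
    if not components:
        return []
    sizes = sorted(len(pixels) for pixels in components)
    typical = sizes[len(sizes) // 2]
    return [p for p in components if abs(len(p) - typical) <= tolerance * typical]
```

On a mask with a handful of similarly sized glyphs, a one-pixel speck, and one large smudge, the speck and the smudge fall outside the tolerance and are dropped, while the glyphs are kept.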
A key takeaway from these examples, and many others not mentioned here, is that image processing and computer vision are used in many different applications and fields today. Each field has its own challenges and its own tricks for handling them. Here at AppZen, our scope is very specific: document classification and detection. Our AI leverages this narrow scope and combines it with our complex deep learning models to accurately understand everything on a receipt.