Guide to Unlocking Your Automation Potential with 3D Scanning

What is Machine Vision?

Machine vision is an enabling technology in industrial automation that can perceive the environment and help solve real-world problems, transforming the physical world into digital data. It consists of integrating hardware and software products to perform automated visual inspection, guidance or optimization tasks. Typical applications include industrial automation, automated guided vehicles, or even medical imaging.

Manufacturers have long relied on 2D machine vision to perform inspection tasks, making true/false or pass/fail decisions with a minimal margin of error. With technological advancements in machine vision over the past decade, 3D scanning is proving to be an affordable and reliable solution. Spatial information generated by an array of 3D measurement techniques can be reliably used for many industrial analyses, including orientation determination, spatial coordination, and shape/dimensional measurements. The extracted information is powerful in aiding object/feature identification, vision-guided robotics, or process optimization.

With the onset of Industry 4.0, manufacturers are looking for digital strategies that will improve productivity, quality and workplace safety. Machine vision will continue to be at the forefront of advancing industrial automation in every industry. In particular, 3D machine vision systems that embrace the Edge Computing paradigm will have even more processing capability, facilitating seamless integration with other Industrial Internet of Things (IIoT) technologies.

2D Machine Vision

2D machine vision is commonly used in inspection applications where objects are examined for a pass/fail (go/no-go) decision on attributes such as their features, dimensions, or the presence of defects. The examination process involves taking two-dimensional images of the area of interest and comparing them with pre-programmed thresholds or results stored in memory.

Lighting is a critical factor in achieving sufficient imaging quality. As the system camera takes a snapshot of the target object on a flat surface, it relies on the right combination of lighting sources and techniques to create a clear contrast of the target area. Currently, LED illumination is the most common lighting source due to its low installation cost and long lifespan. Depending on the object’s material, different lighting techniques can be incorporated to amplify the desired feature for inspection. Directional lighting, also known as partial bright field lighting, is the most extensively used lighting technique for enhancing contrast and revealing details in the area of interest. However, it can be less effective on specular surfaces because it generates hotspot reflections.

Most system designs require taking both the camera sensor and the focal length into consideration. Depending on the application, the system engineer must assess the sensor size and the focal length of the lens suitable for image capturing. These two components will, in turn, determine the resolution, size of the field of view (FoV) and working distance needed for the system. For a 2D machine vision system to maintain image consistency and clarity, all of the critical components mentioned above have to be appropriately integrated. Without a clear, sharp image, the analysis can frequently produce false positives that reject good products along with defective ones.
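As a rough illustration of how sensor size, focal length and working distance interact, the sketch below uses the common pinhole (thin-lens) approximation, which assumes the working distance is much larger than the focal length. The function names and the example values are hypothetical and are not taken from any particular camera or lens.

```python
def field_of_view_mm(sensor_size_mm, focal_length_mm, working_distance_mm):
    """Approximate FoV along one sensor axis (pinhole approximation).

    Assumes working_distance_mm is much larger than focal_length_mm.
    """
    return sensor_size_mm * working_distance_mm / focal_length_mm


def spatial_resolution_mm_per_px(fov_mm, pixels):
    """Size of one pixel projected onto the object plane."""
    return fov_mm / pixels


# Hypothetical setup: 8.8 mm wide sensor, 16 mm lens, 400 mm working distance.
fov = field_of_view_mm(8.8, 16.0, 400.0)
resolution = spatial_resolution_mm_per_px(fov, pixels=2048)
print(f"FoV: {fov:.1f} mm, resolution: {resolution:.3f} mm/pixel")
```

Halving the focal length or doubling the working distance doubles the FoV, at the cost of coarser resolution per pixel.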

Outlook of 2D machine vision

While 2D machine vision requires lower initial capital investment than 3D machine vision, the system is sensitive to changes in the object’s dimensions. 2D machine vision systems generally use fixed-focal-length lenses, requiring the object’s dimensions to remain relatively constant. The system may need significant adjustments before redeployment to accommodate different products. In many cases, manufacturers who opt for a 3D system benefit from a higher return on investment as the system requires minimal re-engineering for various applications.

Since a 2D system is dependent on creating a clear contrast of the object’s features, 2D machine vision systems frequently require shielding to prevent interference from ambient lighting or shadowing in the factory. Depending on the object’s material, different lighting techniques, such as backlighting or dark field lighting, may be required to create enough feature contrast.

2D machine vision takes a flat image of a three-dimensional object, losing valuable volumetric information in the process. Without the third dimension, many complex real-world applications that require spatial data, such as robotic guidance or measurement, cannot be carried out with accuracy. However, 2D machine vision will remain an integral technology in factory automation, especially in applications involving print inspection: barcode scanning, label verification or character recognition.

3D Machine Vision

Although 3D machine vision requires more processing power and software engineering, its ability to capture depth information gives a more realistic representation of the physical world. The additional dimension provides spatial information about the object, making automation of many repetitive tasks possible.

3D machine vision builds on three base technologies: interferometry, time of flight (ToF) and triangulation. While interferometry surpasses the other two in resolution and accuracy, it is very sensitive and generally requires expensive equipment. Interferometry is commonly used in the semiconductor industry for assessing film thickness. Because of its high sensitivity, it is often applied in optics analysis and in experimental mechanics, where very small changes need to be observed (e.g., measurement of strain).

Time of flight is known for its high measurement speed and can produce relatively accurate images of distant objects. ToF sensors capture the surroundings by emitting light pulses and measuring the time the signals take to return to the sensor. ToF collects information on a static scene and delivers one X, Y, and Z point per scan. The image quality can suffer from noise caused by stray reflections of light. ToF is a suitable scanning technique for industrial applications that involve medium to long distances and stationary target objects, such as 3D representations of statues or digital twins of fixed infrastructure.
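The time-of-flight principle reduces to a single relationship: the pulse travels to the target and back, so the distance is half the round-trip time multiplied by the speed of light. The sketch below is a minimal illustration of that arithmetic; the function name is hypothetical, and real sensors add calibration and noise handling that are not shown here.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def tof_distance_m(round_trip_time_s):
    """Distance to the target from the measured round-trip time of a light pulse."""
    if round_trip_time_s < 0:
        raise ValueError("round-trip time cannot be negative")
    # The pulse covers the sensor-to-target distance twice.
    return SPEED_OF_LIGHT_M_PER_S * round_trip_time_s / 2.0


# A round trip of about 66.7 nanoseconds corresponds to roughly 10 m.
print(f"{tof_distance_m(66.7e-9):.2f} m")
```

Because light covers about 30 cm per nanosecond, millimeter-level accuracy requires timing resolution in the picosecond range, which is one reason ToF tends to suit medium to long distances.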


Triangulation is the process of calculating the distance of a point by forming triangles to it from controlled points. As early as the 6th century BC, scholars used triangles along with crude measuring instruments to estimate distances. In the 16th century, Dutch physician and mathematician Gemma Frisius became the first scholar to describe the method of triangulation as a way of mapmaking. Until the invention of GPS, triangulation remained a reliable method for producing accurate maps for almost four centuries.

Several scanning technologies have been developed based on triangulation principles: structured light, stereovision and laser triangulation, to name a few. Structured light can be further divided into different techniques according to the pattern it projects: coded, beam, fan, modulated, pulsed and more. Coded structured light is the most commonly used subset of structured light. It projects a set of coded patterns onto a surface. By examining the deformation between the coded pixels and the imaged pixels using the principles of triangulation, the distance of the object can be calculated. This scanning method is fast, and the scan area can be large. Structured light is commonly found in manufacturing and industrial settings. It is also used in consumer electronics, such as Microsoft Kinect and iPhone X (for facial recognition).

Active stereovision typically consists of two cameras and a light source (often a laser) with the baseline of the two cameras calibrated. The cameras’ FoVs overlap to define the scan zone. Successive images are acquired with different light patterns being projected on the target object. By identifying common features and examining the differences between these images, triangulation can be performed to determine the depth of the scene. Stereovision works best indoors on stationary objects and is often employed for robotic bin picking applications.
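For two rectified cameras, the triangulation step is often written as depth = focal length × baseline / disparity, where disparity is the horizontal pixel shift of the same feature between the two images. The sketch below illustrates that relationship under the assumption of an ideal, rectified pinhole camera pair; the function name and example values are hypothetical.

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Depth of a matched feature for a rectified stereo pair.

    focal_length_px: focal length expressed in pixels
    baseline_m:      calibrated distance between the two camera centers
    disparity_px:    horizontal shift of the feature between the two images
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical rig: 1000 px focal length, 10 cm baseline, 50 px disparity.
print(depth_from_disparity(1000.0, 0.1, 50.0), "m")
```

Since depth is inversely proportional to disparity, distant objects produce small disparities and their depth estimates become more sensitive to matching errors.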

Laser triangulation is an established way of measuring and capturing 3D images of objects at small to medium distances, and a reliable alternative to measurements traditionally carried out by humans. Scanners are designed based on triangulation principles. The baseline of a triangle is formed by a sensor and a light source. The laser is projected onto the object at an angle and reflects back to the sensor, forming a triangle with two known angles and one known side. Simple trigonometry can then be used to calculate the distance of the object from the sensor. The result is a set of digital data representing the dimensional information of the object. System components of a laser scanner, such as the region of interest, can be configured to accommodate various applications. As a general rule, the larger the baseline relative to the feature size, the better the image’s resolution will be. An additional feature of laser triangulation is the use of narrowband interference filters to dramatically reduce the effect of ambient lighting.
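The trigonometry for a triangle with two known angles and one known side can be sketched with the law of sines. In this illustration, the laser and the sensor sit at the two ends of the baseline, and each angle is measured between the baseline and the ray to the laser spot. The function name and the angle convention are assumptions made for the example, not a description of any specific scanner.

```python
import math


def triangulated_distance(baseline_mm, laser_angle_deg, sensor_angle_deg):
    """Perpendicular distance from the baseline to the laser spot.

    laser_angle_deg:  angle between the baseline and the projected laser ray
    sensor_angle_deg: angle between the baseline and the ray seen by the sensor
    """
    alpha = math.radians(laser_angle_deg)
    beta = math.radians(sensor_angle_deg)
    if alpha <= 0 or beta <= 0 or alpha + beta >= math.pi:
        raise ValueError("angles must be positive and sum to less than 180 degrees")
    # Law of sines gives the sensor-to-spot range; projecting it onto the
    # normal of the baseline yields b * sin(alpha) * sin(beta) / sin(alpha + beta).
    return baseline_mm * math.sin(alpha) * math.sin(beta) / math.sin(alpha + beta)


# Hypothetical geometry: 100 mm baseline, laser fired perpendicular to it,
# sensor sees the spot at 45 degrees.
print(f"{triangulated_distance(100.0, 90.0, 45.0):.1f} mm")
```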

As with structured light, many scanning techniques rely on laser triangulation. For example, a single point laser captures an object’s range data, a sheet of light laser creates a profile, while a projected pattern laser generates point cloud data. A sheet of light scanner projects a laser plane that is viewed at an offset angle with a camera to measure the shape of an object in the scan zone. The scan zone is defined by the intersection of the laser plane and the camera’s FoV. By processing the image captured by the camera, a list of X and Y points representing the profile of the intersection of the object and the laser plane can be generated.

Laser triangulation scanners can capture one point or a profile of points at a time, making them an ideal solution in industrial settings where objects are in motion. By tracking the relative motion of the object with respect to the scanner, a Z value can be added to each captured profile to render a 3D point cloud. Their ability to produce high-resolution 3D images has opened up a breadth of complex applications involving moving objects that were previously impossible with 2D machine vision. However, as all scanning methods mentioned above rely on measuring light reflected from the object, specular surfaces or transparent objects can pose a problem since they may direct little or no reflected light toward the image sensor. Most industrial applications employ ToF, structured light, or laser triangulation, which output three-dimensional point cloud data that can be easily rotated, translated and stitched.
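Adding a Z value to each profile can be sketched as follows. The example assumes that profiles arrive as lists of (X, Y) pairs and that the object advances by a fixed, known distance between consecutive profiles, for instance when acquisition is triggered by a conveyor encoder. Both the function name and that constant-travel assumption are illustrative only.

```python
def profiles_to_point_cloud(profiles, travel_per_profile_mm):
    """Stack 2D profiles into a 3D point cloud using the tracked motion.

    profiles: sequence of profiles, each a sequence of (x, y) tuples
    travel_per_profile_mm: assumed object travel between consecutive profiles
    """
    cloud = []
    for index, profile in enumerate(profiles):
        z = index * travel_per_profile_mm  # position along the direction of travel
        cloud.extend((x, y, z) for x, y in profile)
    return cloud


# Two hypothetical profiles captured 0.5 mm apart along the conveyor.
profiles = [
    [(0.0, 10.0), (1.0, 10.2)],
    [(0.0, 10.1), (1.0, 10.3)],
]
for point in profiles_to_point_cloud(profiles, travel_per_profile_mm=0.5):
    print(point)
```

If the line speed varies, the per-profile travel would come from the encoder count recorded with each profile rather than from a constant.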

What is a point cloud?

A point cloud is a digital representation of the objects being scanned. It consists of a collection of points, each representing a single coordinate in space. Some scanners can also provide reflectance or color values with each data point.

By capturing the object’s depth value, the shape, position and orientation of an object can be accurately determined and compared against a 3D CAD model. Typically, the raw data generated by a 3D scanner is called a point cloud. Understanding how to work with point cloud data helps system engineers expand the capabilities of a machine vision system beyond what 2D machine vision could achieve.

Many file formats exist for storing offline point cloud data. They fall into two categories: ASCII and binary. The most common ASCII point cloud file formats are *.XYZ and *.PLY. Each line of text in an ASCII file typically represents a scanned point with its X, Y and Z spatial data. Many existing machine vision software platforms can process ASCII files, which is why ASCII is generally the preferred format: it supports interoperability across platforms. However, ASCII files can be large and sometimes contain less metadata than their binary counterparts.
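Reading a simple ASCII point cloud takes only a few lines of code. The sketch below assumes the common *.XYZ layout in which every non-empty line holds whitespace-separated X, Y and Z values, optionally followed by extra columns such as reflectance or color that are ignored here. The function name is hypothetical, and files from a given scanner may use a different delimiter or column order.

```python
def parse_xyz(text):
    """Parse ASCII XYZ text into a list of (x, y, z) float tuples."""
    points = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        # Only the first three columns are read; a line with fewer raises ValueError.
        x, y, z = (float(value) for value in fields[:3])
        points.append((x, y, z))
    return points


sample = """\
0.0 0.0 12.5
0.5 0.0 12.7 200
1.0 0.0 12.6
"""
print(parse_xyz(sample))
```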

How to work with point cloud data?

A point cloud is a non-intrusive representation of the target object, meaning the data can be manipulated without destroying its integrity. Commonly used methods include filtering, truncating, or even stitching the data. Filtering and stitching can be especially useful in industrial applications where the scanners are designed to monitor a process in real time.

Filtering the point cloud data helps remove outlier points that interfere with the evaluation of a task. 3D filtering removes unwanted data while preserving the fine details of the target objects. The data can then be fed downstream to image processing software for analysis.
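One widely used way to remove outliers is a statistical filter: compute each point's mean distance to its nearest neighbors and discard points whose value lies far above the average for the whole cloud. The sketch below is a simple brute-force version of that idea, written for clarity rather than speed. It is not necessarily the filter any particular scanner or software package applies, and the function name and default parameters are assumptions.

```python
import math
import statistics


def remove_statistical_outliers(points, k=3, std_ratio=2.0):
    """Drop points whose mean distance to their k nearest neighbors is unusually large."""
    if len(points) <= k:
        return list(points)  # too few points to judge
    mean_knn = []
    for i, p in enumerate(points):
        distances = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        mean_knn.append(sum(distances[:k]) / k)
    # Points beyond mean + std_ratio * standard deviation are treated as outliers.
    threshold = statistics.fmean(mean_knn) + std_ratio * statistics.pstdev(mean_knn)
    return [p for p, d in zip(points, mean_knn) if d <= threshold]


# A flat 5 x 5 grid of points plus one stray point far above it.
grid = [(float(x), float(y), 0.0) for x in range(5) for y in range(5)]
noisy = grid + [(2.0, 2.0, 50.0)]
filtered = remove_statistical_outliers(noisy)
print(len(noisy), "->", len(filtered))
```

This brute-force search compares every pair of points, so production systems typically rely on a spatial index such as a k-d tree for large clouds.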

Point cloud stitching, a process also known as registration, involves aligning two or more point clouds to create a complete scene. Stitching is especially valuable in factories where real-time monitoring plays a crucial role in quality control, waste reduction, and process optimization. As the production process moves at its top speed, a stream of 3D geometric data is captured by the scanner. The data is continually clocked out by the scanner and sent downstream for analysis. 3D machine vision systems often consist of multiple scanners in order to maximize coverage of the target. Data captured by different sensors can be rotated and translated into a common coordinate system, enabling the stitching of data from all scanners. This eliminates visual occlusion and allows the scanning of large objects. Vision-guided robots rely on the stitched data as they move relative to an object. Stitched point cloud data is a valuable asset in enabling process optimization and, eventually, machine learning.
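Bringing data from several scanners into a common coordinate system amounts to applying a rigid transform (a rotation followed by a translation) to each scanner's points and then merging the results. The sketch below assumes the rotation and translation for each scanner are already known from calibration; estimating them from the data itself is a separate problem that is not shown. All names here are hypothetical.

```python
import math


def rotation_about_z(angle_deg):
    """3 x 3 rotation matrix for a rotation about the Z axis."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def to_common_frame(points, rotation, translation):
    """Rotate, then translate, each point into the common coordinate system."""
    transformed = []
    for x, y, z in points:
        transformed.append(tuple(
            row[0] * x + row[1] * y + row[2] * z + offset
            for row, offset in zip(rotation, translation)
        ))
    return transformed


def stitch(scans):
    """Merge scans given as (points, rotation, translation) triples."""
    merged = []
    for points, rotation, translation in scans:
        merged.extend(to_common_frame(points, rotation, translation))
    return merged


# Two hypothetical scanners: one defines the common frame, the other is
# rotated 90 degrees about Z and mounted 10 units away along X.
scan_a = ([(0.0, 0.0, 5.0)], rotation_about_z(0.0), (0.0, 0.0, 0.0))
scan_b = ([(1.0, 0.0, 5.0)], rotation_about_z(90.0), (10.0, 0.0, 0.0))
print(stitch([scan_a, scan_b]))
```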

Industry 4.0 and Beyond

Driven by consumers’ ever-changing demands, manufacturers are looking to incorporate sensing technology as one of their initiatives to migrate towards Industry 4.0. Over the years, 3D printing, the Industrial Internet of Things (IIoT) and machine vision technologies have made many previously unattainable industrial applications a reality. These technologies add flexibility or even portability to production lines, enabling factories to respond efficiently to customers’ preferences. The demand for higher throughput, better quality and safer working conditions will endure as the onset of Industry 4.0 propels manufacturers to embrace connectivity between smart components.

Machine vision plays an integral role in capturing the physical world and transforming it into networked, digital data. The process of digitization enables companies to create new value for their customers. Manufacturers have already seen significant improvements in waste reduction and profit margin with the integration of 3D machine vision. The next step in driving industrial applications forward is to embrace Edge computing, which moves computer processing closer to the sensors to decrease latency and improve efficiency. Machine vision technologies must then innovate to incorporate as many computing, storage, and connectivity capabilities as possible for a future-proof solution that understands, translates, and interacts with the real world.
