Fujitsu Laboratories has announced a technology that automatically generates image-recognition programs for automated assembly processes. Working from camera images of electronic components and IT equipment, the generated programs accurately detect the positions of components. Until now, image-processing programs generated automatically through machine learning have not been able to detect positions, so experts had to develop each image-recognition program individually. As a result, any change to the manufacturing setup, such as a machine's operating parameters, could mean more than a week spent revising the program, during which the production line would sit idle. Fujitsu Laboratories' technique automatically generates position-detecting image-processing programs by controlling the order in which the image-processing functions that make up a program are combined, and by using machine learning based on the similarity of shapes. Samples of the object to be detected are presented as teaching materials, which makes it possible to generate an image-recognition program automatically in roughly eight hours, or one-tenth the time previously required. Fujitsu Laboratories plans to use this technology to help production lines respond to changes in their operating environment without long downtime.
Details of this technology are being presented at the Autumn Meeting of the Japan Society for Precision Engineering, opening September 16 in Tottori, Japan.
Modern production lines use automated assembly equipment and automated inspection equipment that rely on cameras. Processing the images from these cameras has in the past required custom software developed by experts, which meant a lengthy period to get a line into operation and made it impossible to respond quickly to changes during operation or in peripheral conditions. This has led to a clear need for a technology that can automatically generate these image-processing programs and thereby enable continuous line operation (Fig. 1).
Existing methods for automatically generating image-processing programs already use genetic algorithms, a type of machine learning (Fig. 2). With genetic algorithms, two programs, which will become the parents, are randomly selected from a pool of image-recognition programs (the parent generation). The parents are then merged to create a number of child image-recognition programs. Each child program is evaluated using previously prepared training images and a set of target images, which show the features that are meant to be extracted from the training images (the correct-answer set). The training images are fed into the child programs, and the output images are scored on how well they match the target images. The child programs are then culled based on their scores, with the highest-scoring ones becoming the parents of the next generation, and the process is repeated until a child with a passing score evolves. This method is well suited to image manipulations, such as emphasizing a target region of the image, but the time required for the machine-learning process depends heavily on the number of members in the parent generation, and the process can therefore take a long time. Furthermore, this method cannot be used to automatically generate programs that accurately detect the position of a component within an image.
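As a rough illustration of the conventional genetic-algorithm loop described above, the following Python sketch evolves a short chain of toy image-processing functions until its output matches a target image. The function names (`invert`, `brighten`, `darken`, `threshold`), the pixel-match score, the population sizes, and the cap on program length are all assumptions made for this example and are not taken from Fujitsu's system. The sketch also differs from the description in one respect: a parent is allowed to survive if it outscores the children.

```python
import random

# Hypothetical toy "image-processing functions" on a flat list of 0-255 pixels.
OPS = {
    "invert": lambda img: [255 - p for p in img],
    "brighten": lambda img: [min(255, p + 40) for p in img],
    "darken": lambda img: [max(0, p - 40) for p in img],
    "threshold": lambda img: [255 if p >= 128 else 0 for p in img],
}
MAX_LEN = 6  # assumed cap so programs cannot grow without bound


def run(program, img):
    """Apply each named function in order."""
    for name in program:
        img = OPS[name](img)
    return img


def score(program, pairs):
    """Mean fraction of output pixels that equal the target pixels."""
    total = 0.0
    for train, target in pairs:
        out = run(program, train)
        total += sum(o == t for o, t in zip(out, target)) / len(target)
    return total / len(pairs)


def crossover(a, b, rng):
    """Merge two parents: a prefix of one followed by a suffix of the other."""
    child = (a[:rng.randint(0, len(a))] + b[rng.randint(0, len(b)):])[:MAX_LEN]
    return child or [rng.choice(list(OPS))]


def mutate(program, rng):
    """Replace one randomly chosen function."""
    program = list(program)
    program[rng.randrange(len(program))] = rng.choice(list(OPS))
    return program


def evolve(pairs, pop_size=8, n_children=6, passing=1.0, max_gen=200, seed=0):
    rng = random.Random(seed)
    names = list(OPS)
    population = [[rng.choice(names) for _ in range(3)] for _ in range(pop_size)]
    initial_best = max(score(p, pairs) for p in population)
    for _ in range(max_gen):
        # Randomly select two parents from the parent generation.
        i, j = rng.sample(range(pop_size), 2)
        parents = [population[i], population[j]]
        children = [mutate(crossover(parents[0], parents[1], rng), rng)
                    for _ in range(n_children)]
        # Cull: the two best of parents and children go back into the pool.
        family = sorted(parents + children,
                        key=lambda p: score(p, pairs), reverse=True)
        population[i], population[j] = family[0], family[1]
        if score(family[0], pairs) >= passing:
            break
    best = max(population, key=lambda p: score(p, pairs))
    return best, score(best, pairs), initial_best


if __name__ == "__main__":
    # One training image and its target image (the correct-answer set).
    pairs = [([10, 200, 90, 160], [255, 0, 255, 0])]
    best, final_score, initial_best = evolve(pairs)
    print(best, final_score, initial_best)
```

Because every candidate is re-scored against all training pairs in every generation, the running time grows with both the pool size and the number of training images, which is the cost the article attributes to this approach.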
About the New Technology
To make the machine-learning process more efficient, Fujitsu Laboratories devised three building blocks: the teacher, the grader, and the teaching material. Key features of the technology are as follows.
- Creating a tree structure for automatically generated programs (teacher building block): A program that finds the desired image can be constructed by combining image-processing functions in a tree structure, as shown in Fig. 3, but with conventional methods the number of possible combinations is practically unbounded, making an exhaustive search impractical. Fujitsu Laboratories imposes limits on the order of processes when forming the tree structure, based on expert knowledge of feature emphasis, such as the type of process and the flow. This dramatically reduces the number of combinations so that the target program can be generated quickly (combined, the three building blocks cut the generation time to one-tenth).
- Evaluating the generated programs (grader building block): Rather than evaluating image quality, as has been done in the past, Fujitsu Laboratories developed a technology that evaluates the programs created in the automatic-generation process on the basis of the shape of the component and its similarity to the target, in order to detect position. For example, if a program detects a straight line, and if the line's angle (θ) and position (ρ) match up well with the target, the program is considered to be good. This makes it possible to automatically generate programs that can be used for positional detection even if image quality is poor (Fig. 4).
- Selecting training data (teaching-material building block): In order to reduce learning time by minimizing the number of pairs of training images (used as training data) and target images, Fujitsu Laboratories categorized multiple learning-candidate images based on feature values for factors such as brightness, contrast, and detail. Representative images were selected from each category to ensure that programs would be generated that could handle the full variety of images based on a relatively small amount of training data (Fig. 5).
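The three building blocks can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and their stage assignments, the use of a linear chain in place of a full tree, the tolerance-based scoring formula for θ and ρ, and the brightness/contrast binning are all hypothetical stand-ins, since the announcement does not give the actual rules, formulas, or feature definitions.

```python
import itertools
import math

# --- Teacher: restrict the order in which functions may be combined ---
# Hypothetical functions and stages; a linear chain stands in for the tree.
STAGE = {
    "smooth": 0, "sharpen": 0,        # feature emphasis
    "threshold": 1, "edges": 1,       # feature extraction
    "fit_line": 2, "fit_circle": 2,   # shape detection
}


def is_ordered(chain):
    """True if the stage never goes backwards along the chain."""
    stages = [STAGE[name] for name in chain]
    return all(a <= b for a, b in zip(stages, stages[1:]))


def count_chains(length):
    """Return (unconstrained, constrained) counts of chains of this length."""
    chains = list(itertools.product(STAGE, repeat=length))
    return len(chains), sum(is_ordered(c) for c in chains)


# --- Grader: score a detected line by shape similarity, not image quality ---
def line_score(theta, rho, theta_t, rho_t,
               theta_tol=math.radians(10.0), rho_tol=20.0):
    """Score in [0, 1]; 1.0 means the detected line matches the target.

    (theta, rho) and (theta + pi, -rho) describe the same line, so the
    equivalent parameterizations of the target are tried as well.
    The tolerances are assumed values.
    """
    best = 0.0
    for shift, sign in ((0.0, 1.0), (math.pi, -1.0), (-math.pi, -1.0)):
        d_theta = abs(theta - (theta_t + shift))
        d_rho = abs(rho - sign * rho_t)
        s = (max(0.0, 1.0 - d_theta / theta_tol)
             * max(0.0, 1.0 - d_rho / rho_tol))
        best = max(best, s)
    return best


# --- Teaching material: pick one representative image per category ---
def features(img):
    """Brightness (mean) and contrast (max - min) of a flat pixel list."""
    return sum(img) / len(img), max(img) - min(img)


def pick_representatives(images, bin_size=64):
    """Group images by coarse feature bins; return one index per group."""
    groups = {}
    for idx, img in enumerate(images):
        b, c = features(img)
        key = (int(b // bin_size), int(c // bin_size))
        groups.setdefault(key, []).append((idx, b, c))
    reps = []
    for members in groups.values():
        cb = sum(m[1] for m in members) / len(members)
        cc = sum(m[2] for m in members) / len(members)
        # Choose the member closest to the category's mean feature values.
        closest = min(members, key=lambda m: (m[1] - cb) ** 2 + (m[2] - cc) ** 2)
        reps.append(closest[0])
    return sorted(reps)
```

With these six hypothetical functions, `count_chains(4)` returns `(1296, 240)`: the ordering rule alone removes more than 80% of the candidate chains of length four, and the saving grows with chain length. `line_score(0.5, 110.0, 0.5, 100.0)` returns `0.5`, reflecting a correct angle but a position error of half the assumed tolerance.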
In trials of positional detection of components during assembly, a task that had not previously been amenable to automation, recognition rates improved dramatically, from below 50% to 97% or higher. The time required to revise image-recognition programs was also cut to one-tenth the previous time. As additional benefits of this very high recognition rate, positional deviations during component assembly can be halved and assembly time can be reduced to two-thirds of what it was. The trial demonstrates that it will be possible to run a production line with stable recognition rates, without stoppages, resulting in high-quality, high-efficiency manufacturing.
This technology has potential applications beyond component assembly, in machining, inspection, and other industrial processes where image processing can be used on a production line. Fujitsu Laboratories plans to continue refining the performance of this technology with the goal of practical implementation on Fujitsu's own production lines during this fiscal year. At the same time, the company plans to investigate broader applications of this technology as a solution for vehicle onboard cameras, monitoring cameras, and cameras for medical purposes.