A steel pipe manufacturing company provides sturdy pipes used in oil drilling. The ends of the pipes have to be shaped according to specific drilling needs. Hence, the company makes the pipes and keeps them in the warehouse. Once an order arrives, they carve the ends suitably and ship the pipes to the client.
A pipe may not be straight enough for carving. Hence, once an order arrives but before carving, the pipes go through a “straightening” process (heating, straightening through rolling, and cooling). Cooling the pipe after straightening takes time (roughly 45 minutes). So, by the time the output straightness of a pipe can be measured, several more pipes will already have gone through the process.
If they find that the output is not satisfactory, they will have straightened many pipes under the same conditions and may have lost most of them.
Even after finding that the straightness is not satisfactory, they can only guess a different set of input conditions (temperature, pressure, etc.) and repeat. That set may not be the right one for the best straightness either.
The client wanted us to provide a better solution.
Prima facie, this looks like a regression problem where one needs to predict the output straightness as a function of the input straightness, the “straightening” process conditions and other parameters (such as the type of steel).
However, the firm used different numbering systems for the pipes in the two departments (pipe making and pipe straightening). Pipes could therefore be matched only at the batch level, not at the individual pipe level, so there was no way for the data scientists to know the input straightness of a given pipe. We were given 100 input pipes with their straightness and 100 output pipes with their straightness, and the IDs of the two sets were different!
This is a classic scenario that illustrates the issues data scientists face while solving real-world problems.
INSOFE developed an original optimization solution (a variant of the classical assignment problem) to map the inputs to the outputs. We assigned costs to the various combinations of input straightness and output straightness after consulting thoroughly with the operations teams. Once developed, the engine provided a simple solution to map pipe IDs between departments. This took 35% of the total project time!
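The mapping step can be pictured as a standard assignment problem. The sketch below is only an illustration under stated assumptions, not the engine built in the project: `pair_cost` and the sample straightness values are hypothetical (the real costs came from the operations teams), and `solve_assignment` is a textbook Hungarian-method solver rather than the variant INSOFE developed.

```python
def solve_assignment(cost):
    """Minimum-cost one-to-one assignment (Hungarian method, O(n^3)).

    cost[i][j] is the cost of pairing input pipe i with output pipe j.
    Needs len(cost) <= len(cost[0]). Returns a list where entry i is the
    output index matched to input i.
    """
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    u = [0.0] * (n + 1)      # row potentials
    v = [0.0] * (m + 1)      # column potentials
    match = [0] * (m + 1)    # match[j] = row (1-based) assigned to column j
    way = [0] * (m + 1)      # back-pointers along the augmenting path
    for i in range(1, n + 1):
        match[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0 = match[j0]
            delta, j1 = inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            # Shift potentials so at least one new zero-cost edge appears.
            for j in range(m + 1):
                if used[j]:
                    u[match[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if match[j0] == 0:
                break
        # Flip the matching along the augmenting path just found.
        while j0 != 0:
            j1 = way[j0]
            match[j0] = match[j1]
            j0 = j1
    result = [None] * n
    for j in range(1, m + 1):
        if match[j] != 0:
            result[match[j] - 1] = j - 1
    return result


def pair_cost(input_dev, output_dev):
    # Hypothetical cost: assume straightening leaves about 30% of the
    # input deviation and penalise departures from that expectation.
    return abs(output_dev - 0.3 * input_dev)


# Made-up straightness deviations for one small batch (illustration only).
inputs = [3.0, 1.2, 2.1]
outputs = [0.4, 0.9, 0.7]
cost = [[pair_cost(a, b) for b in outputs] for a in inputs]
mapping = solve_assignment(cost)
for i, j in enumerate(mapping):
    print(f"input pipe {i} -> output pipe {j}")
```

Because pipes can be matched only within a batch, a solver like this would be run once per batch, which keeps each cost matrix small.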
A host of regression techniques were tested. Gradient boosting machines provided the best solution.
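To give a feel for how gradient boosting works, here is a minimal from-scratch sketch for squared-error regression using one-split trees (stumps). It is not the model or library used in the project: the feature layout (input straightness, temperature), the synthetic data, and the names `fit_stump`, `fit_gbm` and `predict` are all assumptions made for illustration.

```python
def fit_stump(X, residuals):
    """Best single-feature threshold split for the current residuals."""
    best = None  # (sse, feature, threshold, left_value, right_value)
    for f in range(len(X[0])):
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[f] <= thr]
            right = [r for row, r in zip(X, residuals) if row[f] > thr]
            lm = sum(left) / len(left)
            rm = sum(right) / len(right)
            sse = (sum((r - lm) ** 2 for r in left)
                   + sum((r - rm) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, f, thr, lm, rm)
    return None if best is None else best[1:]


def fit_gbm(X, y, n_rounds=50, learning_rate=0.1):
    """Each round fits a stump to the residuals, which are the negative
    gradient of the squared-error loss, and adds a damped copy of it."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        if stump is None:  # no feature has two distinct values
            break
        stumps.append(stump)
        f, thr, lv, rv = stump
        preds = [p + learning_rate * (lv if row[f] <= thr else rv)
                 for row, p in zip(X, preds)]
    return base, stumps, learning_rate


def predict(model, row):
    base, stumps, lr = model
    return base + sum(lr * (lv if row[f] <= thr else rv)
                      for f, thr, lv, rv in stumps)


# Synthetic toy data (invented): [input deviation, temperature] -> output.
X = [[i % 5, 800 + 20 * (i // 5)] for i in range(25)]
y = [0.5 * a + 0.01 * abs(t - 860) for a, t in X]
model = fit_gbm(X, y)
print(predict(model, [2, 860]))
```

In practice one would use an established library implementation and evaluate on held-out pipes rather than on the training data.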
A lookup table was prepared for plant operations engineers with variables they can play with for a variety of input conditions. The app recommends the conditions for best output and also allows operators to play with the recommendations to see how outputs change.
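One way such a lookup table could be built is to sweep a grid of candidate conditions through the trained model and keep the setting with the best predicted output for each input straightness. The sketch below assumes that approach; `toy_predict` is a hypothetical stand-in for the real model, and the temperature and pressure grids are invented values.

```python
from itertools import product


def toy_predict(input_dev, temperature, pressure):
    # Hypothetical stand-in for the trained regression model: returns a
    # predicted output deviation (lower means straighter).
    return (0.2 * input_dev
            + 0.0001 * (temperature - 850) ** 2
            + 0.01 * (pressure - 30) ** 2)


def recommend(predict, input_dev, temperatures, pressures):
    """Return the (temperature, pressure) pair with the lowest prediction."""
    candidates = product(temperatures, pressures)
    return min(candidates, key=lambda c: predict(input_dev, *c))


def build_lookup(predict, input_devs, temperatures, pressures):
    """Map each input deviation to its recommended process conditions."""
    return {d: recommend(predict, d, temperatures, pressures)
            for d in input_devs}


temperatures = [800, 850, 900]   # assumed candidate settings
pressures = [20, 30, 40]         # assumed candidate settings
table = build_lookup(toy_predict, [1.0, 2.0, 3.0], temperatures, pressures)
for dev, (temp, pres) in table.items():
    print(f"input deviation {dev}: temperature {temp}, pressure {pres}")
```

An operator-facing app could call `predict` directly with modified settings to show how the predicted output changes relative to the recommendation.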
The client productized the system and used it as a guide for the production operators. With the solution, the number of failed components dropped drastically (by more than 50%).