BERL Lightning Talk at SimBuild 2026

Ryan Dubois will present a preview of his upcoming paper, "Do Large Multimodal Models Understand Construction Drawings? An Evaluation of Visually Grounded Workflows," during a lightning talk at SimBuild 2026. The full notes paper will be published as part of the proceedings of BuildSys 2026 later this year.

Abstract:

Automated information extraction from construction drawings is a critical yet challenging task. While traditional approaches, from image processing to deep learning, have made great progress, they often suffer from inconsistent performance, extensive task-specific training, and narrow scope. Large Multimodal Models (LMMs) offer a promising alternative due to their vast knowledge bases and instruction-following capabilities; however, they struggle with spatial reasoning and dense, high-fidelity images. This preliminary study evaluates the performance of two off-the-shelf Multimodal Generative AI (GenAI) workflows against two specialized architectures: an image-based Retrieval Augmented Generation (RAG) system and an image-based Multi-Agent workflow. By evaluating these workflows against a targeted set of benchmarking questions, this work provides a preliminary assessment of current GenAI capabilities in drawing interpretation and highlights specific areas for improvement in developing more robust automated workflows.

By
Ryan Dubois, Bianca Howard
April 21, 2026
Columbia Affiliations
The Department of Mechanical Engineering