BERL Lightning Talk at SimBuild 2026

By
Ryan Dubois, Bianca Howard
April 21, 2026

Ryan Dubois will present a preview of his upcoming paper, "Do Large Multimodal Models Understand Construction Drawings? An Evaluation of Visually Grounded Workflows," during a lightning talk at SimBuild 2026. The full notes paper will be published as part of the proceedings of BuildSys 2026 later this year.

Abstract:

Automated information extraction from construction drawings is a critical yet challenging task. While traditional approaches, from image processing to deep learning, have made great progress, they often suffer from inconsistent performance, extensive task-specific training, and narrow scope. Large Multimodal Models (LMMs) offer a promising alternative due to their vast knowledge bases and instruction-following capabilities; however, they struggle with spatial reasoning and dense, high-fidelity images. This preliminary study evaluates the performance of two off-the-shelf Multimodal Generative AI (GenAI) workflows against two specialized architectures: an image-based Retrieval Augmented Generation (RAG) system and an image-based Multi-Agent workflow. By evaluating these workflows against a targeted set of benchmarking questions, this work provides a preliminary assessment of current GenAI capabilities in drawing interpretation and highlights specific areas for improvement in developing more robust automated workflows.

Columbia Affiliations
The Department of Mechanical Engineering