[DL] [CFP] VINALDO: 3rd edition of International workshop on Machine vision and NLP for Document Analysis
rafika boutalbi
boutalbi.rafika at gmail.com
Thu Mar 19 06:22:36 CET 2026
*VINALDO: 3rd edition of International workshop on Machine vision and NLP
for Document Analysis*
*Website:*
https://sites.google.com/view/vinaldo3rdeditionofinternation/call-for-papers
*In conjunction with ICDAR 2026 <https://icdar2026.org/> - Aug 30 - Sep 04,
2026 • Vienna, Austria*
*Document understanding is
essential in areas such as invoice extraction, medical record analysis, and
legal document processing. While many workshops focus on pure vision-based
tasks (OCR, layout analysis) or pure NLP tasks, VINALDO emphasizes the
synergistic integration of computer vision and natural language processing
for structured information extraction and semantic understanding of
documents.

This third edition of VINALDO highlights structured knowledge
extraction from documents using multimodal approaches, with a focus on:
- Knowledge Graphs (KGs) built from visual and textual cues
- Integration of Large Language Models (LLMs) with visual document understanding
- Multimodal representation learning for semantic retrieval

Our goal is to move beyond
traditional document analysis by exploring how vision and language jointly
enable structured, relational understanding, particularly in complex
documents like invoices, forms, and reports.

Novelty for this edition:
After
the success of VINALDO 2023
<https://sites.google.com/view/vinaldo-workshop-icdar-2023/home>,
and VINALDO 2024
<https://sites.google.com/view/vinaldo-workshop-icdar-2024/home>, in this
third edition of the VINALDO workshop, we encourage the description of
novel problems or applications for document analysis in the area of
information retrieval that have emerged in recent years. In the last
edition, VINALDO 2024
<https://sites.google.com/view/vinaldo-workshop-icdar-2024/home>, we
highlighted a particular topic, namely “Knowledge Graphs and Multimodal
approaches”. In this new edition, we aim to encourage novel and recent
research on document analysis including, but not limited to, approaches
that intersect with areas such as Large Language Models (LLMs), Knowledge
Graphs (KGs), and Natural Language Processing (NLP). The VINALDO workshop
focuses on the joint exploitation of visual and textual information for
document understanding, while remaining open to a wide range of methods and
perspectives.

In particular, we highlight the growing importance
of structured representations such as Knowledge Graphs extracted from
document context, which are still underexplored despite their relevance
across many application domains. We therefore welcome contributions that
explore the combination of computer vision, NLP, and structured knowledge
representations, as well as works that integrate NLP and vision techniques
in innovative ways.

We also encourage submissions that introduce new
datasets, benchmarks, or real-world applications related to document
analysis. Overall, the VINALDO workshop aims to bring together researchers
and practitioners from academia, industry, and applied research to exchange
ideas, share experiences, and discuss ongoing challenges and advances
in document analysis at the intersection of Computer Vision and
NLP.

The workshop is intended for researchers and practitioners all over the
world, from both academia and industry, working in the areas of document
and textual analysis.

Topics of
interest include, but are not limited to, the following:
- Multimodal Knowledge Graph Construction from documents
- Vision-Language Models for Document Understanding
- Joint Entity and Relation Extraction from visual and textual content
- Structured Document Understanding with LLMs
- Multimodal Document Representation Learning
- Graph-Based Spatial and Semantic Reasoning in documents
- Integration of Knowledge Graphs and Vision Transformers
- Multimodal Invoice and Form Analysis
- Cross-modal Retrieval in Document Collections
- Benchmarks and Datasets for Multimodal Document Understanding

Note: Topics that are purely vision-based (e.g., OCR, table
detection, handwriting recognition) or purely NLP-based are better suited
to other ICDAR workshops. VINALDO focuses on their intersection.

Important Dates
Submission Deadline: May 5th, 2026, at 11:59pm AoE Time
Decisions Announced: May 15th, 2026, at 11:59pm AoE Time
Camera Ready Deadline: July 1, 2026

Contact:
Rim Hantach <rim.hantach at gmail.com>
Rafika Boutalbi <boutalbi.rafika at gmail.com>
Karima Boutalbi <karima.boutalbi1 at gmail.com>*