Tabular Representation, Noisy Operators, and Impacts on Table Structure Understanding Tasks in LLMs

Table Representation Learning Workshop at NeurIPS 2023


Large language models (LLMs) are increasingly applied to tabular tasks using in-context learning. How a table is represented in the prompt may affect an LLM's ability to process it. Inspired by prior work, we generate a collection of self-supervised table structure understanding tasks (e.g., navigating to a cell or row; transposing the table) and evaluate performance differences across eight table formats. In contrast to past work, we also introduce eight noise operations inspired by real-world messy data and adversarial inputs, and show that these can impact LLM performance across formats on the different structural understanding tasks.
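As a rough illustration only (not the paper's actual evaluation harness), the sketch below shows the general idea: serialize a small table into several prompt formats, apply a simple noise operation, and build a structural probe such as a transpose request. The specific formats, the `shuffle_columns` noise operation, and the prompt wording are assumptions chosen for the example.

```python
import random

import pandas as pd

# Toy table used to illustrate prompt serialization (hypothetical data).
df = pd.DataFrame(
    {"city": ["Oslo", "Lima", "Pune"], "population_k": [709, 10719, 3124]}
)

# A few serialization formats a prompt might use (illustrative subset,
# not necessarily the eight formats studied in the paper).
formats = {
    "csv": df.to_csv(index=False),
    "json_records": df.to_json(orient="records"),
    "markdown": df.to_markdown(index=False),  # needs the optional `tabulate` package
    "html": df.to_html(index=False),
}

def shuffle_columns(table: pd.DataFrame, seed: int = 0) -> pd.DataFrame:
    """Example noise operation (assumed): permute column order before serialization."""
    cols = list(table.columns)
    random.Random(seed).shuffle(cols)
    return table[cols]

# A structural probe: ask the model to transpose the (possibly noised) table,
# then compare its answer against the ground-truth transpose.
noisy = shuffle_columns(df)
prompt = (
    "Here is a table in CSV format:\n"
    f"{noisy.to_csv(index=False)}\n"
    "Return the transposed table in CSV format."
)
expected = noisy.T.to_csv()
print(prompt)
```

In a full evaluation one would repeat this over every (format, noise operation, task) combination and score the model's responses against the ground truth.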