Large Language Models (LLMs) are increasingly used to rewrite, simplify, and rephrase scientific content for different audiences and communication goals. While these reformulations improve accessibility and engagement, they also introduce challenges related to content attribution, provenance tracking, and source recovery.
This shared task focuses on title reconstruction and retrieval from LLM-generated reformulations, where participants must infer the original research title from multiple stylistic variants generated by LLMs. The dataset consists of original research paper titles and their corresponding reformulated versions created under three controlled stylistic cues.
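To make the retrieval setting concrete, here is a minimal sketch of one possible baseline, not the task's official method or scoring. It pools the tokens of all variants for one hidden title and returns the candidate title with the highest Jaccard overlap. The function names (`tokens`, `jaccard`, `retrieve_title`) and the example strings are invented for illustration and do not come from the dataset.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity between two token sets (0.0 if either is empty)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def retrieve_title(variants: list[str], candidates: list[str]) -> str:
    """Return the candidate title that best overlaps the pooled variant tokens.

    Assumption: lexical overlap is a usable signal. Heavily paraphrased
    variants would likely need a stronger similarity measure.
    """
    pooled = set().union(*(tokens(v) for v in variants))
    return max(candidates, key=lambda c: jaccard(pooled, tokens(c)))


# Hypothetical example data, invented for illustration only.
candidates = [
    "Graph Neural Networks for Molecular Property Prediction",
    "Low-Resource Machine Translation with Back-Translation",
]
variants = [
    "Predicting molecular properties via message-passing graph neural architectures",
    "How AI reads molecules like maps to predict what they do",
    "Molecules as graphs: teaching networks chemistry",
]

best = retrieve_title(variants, candidates)
print(best)
```

Pooling the variants before scoring lets terms that appear in only one style still contribute to the match. A real system would probably replace the token overlap with learned embeddings or a generative reconstruction step.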
Style Categories
Technical
Methodology-focused titles using domain-specific vocabulary, precise terminology, and formal academic language targeting expert readers.
Accessible
Reader-friendly titles written in plain language, designed to be understood by a broad audience without deep domain expertise.
Catchy
Creative and engaging titles that use hooks, analogies, and evocative language to draw readers in and spark curiosity.
Each original title is associated with 30 reformulations: 10 variants for each of the three style categories.
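The layout described above (one original title, three style categories, 10 variants each) could be represented as follows. This is a sketch under assumptions: the class name `TitleRecord`, its field names, and the lowercase style keys are hypothetical and may not match the released data files.

```python
from dataclasses import dataclass, field

# Style categories named in the task description; the lowercase keys are an assumption.
STYLES = ("technical", "accessible", "catchy")
VARIANTS_PER_STYLE = 10


@dataclass
class TitleRecord:
    """One original title plus its reformulations, grouped by style."""

    original_title: str
    reformulations: dict[str, list[str]] = field(default_factory=dict)

    def validate(self) -> None:
        """Raise ValueError unless every style has exactly 10 variants."""
        if set(self.reformulations) != set(STYLES):
            raise ValueError(f"expected styles {STYLES}, got {sorted(self.reformulations)}")
        for style in STYLES:
            count = len(self.reformulations[style])
            if count != VARIANTS_PER_STYLE:
                raise ValueError(f"{style}: expected {VARIANTS_PER_STYLE} variants, got {count}")

    def all_variants(self) -> list[str]:
        """Return all reformulations as one flat list, in STYLES order."""
        return [v for style in STYLES for v in self.reformulations[style]]


# Placeholder strings stand in for real titles and reformulations.
record = TitleRecord(
    original_title="<original title>",
    reformulations={
        style: [f"<{style} variant {i}>" for i in range(1, VARIANTS_PER_STYLE + 1)]
        for style in STYLES
    },
)
record.validate()
print(len(record.all_variants()))
```

Keeping the variants grouped by style makes it easy to study how much each style helps recover the original title, while `all_variants` gives the flat list of 30 inputs when style is ignored.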