Paolo Papotti
Associate Prof at EURECOM and 3IA Côte d'Azur Chair of Artificial Intelligence. ELLIS member.
Data management and NLP/LLMs for information quality.
eurecom.fr/~papotti/
- 🛑 𝐒𝐭𝐨𝐩 𝐭𝐡𝐫𝐨𝐰𝐢𝐧𝐠 𝐚𝐰𝐚𝐲 𝐲𝐨𝐮𝐫 𝐫𝐞𝐭𝐫𝐢𝐞𝐯𝐚𝐥 𝐬𝐜𝐨𝐫𝐞𝐬. RAG uses embedding scores to pick the Top-K, then treats all retrieved chunks as equal. Parallel Context-of-Experts Decoding (PCED) uses the retrieval scores to move evidence aggregation from attention to decoding. 🚀 180× faster time-to-first-token! Sketch of the idea below 👇
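How I read the mechanism, as a minimal sketch (the mixing rule and the name `pced_step` are my assumptions, not the paper's exact algorithm): decode each retrieved chunk in its own short context and combine the per-chunk next-token distributions using the retrieval scores as weights, so no long concatenated prompt has to be prefilled before the first token.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def pced_step(model, tokenizer, question, chunks, scores, generated_ids):
    """One decoding step that mixes per-chunk predictions, weighted by
    retrieval scores. Hypothetical sketch, not the paper's exact method."""
    weights = F.softmax(torch.tensor(scores), dim=0)  # normalize retrieval scores
    mixed_logprobs = None
    for w, chunk in zip(weights, chunks):
        # Each chunk is processed in its own context (the loop is trivially
        # parallelizable), so attention never spans the full concatenation.
        prompt = tokenizer(chunk + "\n\n" + question, return_tensors="pt").input_ids
        input_ids = torch.cat([prompt, generated_ids], dim=-1)
        logits = model(input_ids).logits[:, -1, :]
        logprobs = F.log_softmax(logits, dim=-1)
        contrib = w * logprobs
        mixed_logprobs = contrib if mixed_logprobs is None else mixed_logprobs + contrib
    return mixed_logprobs.argmax(dim=-1)  # greedy next token
```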
- New PhD position on Tool-Augmented LLMs for Enterprise Data AI 🚨 Starting in early 2026 under my academic supervision and hosted by the fantastic team at AILY LABS in Madrid or Barcelona. Details in the link - please ping me with any questions! www.linkedin.com/jobs/view/43...
- Ask any LLM for a single fact and it’s usually fine. Ask it for a rich list and the same fact is suddenly missing or hallucinated because the output context got longer 😳 LLMs exceed 80% accuracy on single-value questions, but accuracy drops linearly with the number of output facts. New paper, details 👇
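The effect is easy to probe on your own model; a toy harness, assuming you have some `ask_llm(prompt) -> str` client (hypothetical name) and a tiny hand-made gold set:

```python
# Toy probe: ask for longer and longer lists, measure how many gold facts survive.
CAPITALS = {"France": "Paris", "Italy": "Rome", "Spain": "Madrid",
            "Germany": "Berlin", "Portugal": "Lisbon", "Greece": "Athens"}

def probe_output_length(ask_llm):
    """ask_llm(prompt) -> str is whatever LLM client you already use (hypothetical)."""
    countries = list(CAPITALS)
    for k in range(1, len(countries) + 1):
        subset = countries[:k]
        prompt = "Give the capital city of each country: " + ", ".join(subset) + "."
        answer = ask_llm(prompt).lower()
        hits = sum(1 for c in subset if CAPITALS[c].lower() in answer)
        print(f"{k} facts requested -> {hits / k:.0%} recovered")
```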
- 🚨 𝐖𝐡𝐚𝐭 𝐡𝐚𝐩𝐩𝐞𝐧𝐬 𝐰𝐡𝐞𝐧 𝐭𝐡𝐞 𝐜𝐫𝐨𝐰𝐝 𝐛𝐞𝐜𝐨𝐦𝐞𝐬 𝐭𝐡𝐞 𝐟𝐚𝐜𝐭-𝐜𝐡𝐞𝐜𝐤𝐞𝐫? New paper: "Community Moderation and the New Epistemology of Fact Checking on Social Media" with I. Augenstein, M. Bakker, T. Chakraborty, D. Corney, E. Ferrara, I. Gurevych, S. Hale, E. Hovy, H. Ji, I. Larraz, F. Menczer, P. Nakov, D. Sahnan, G. Warren, G. Zagni
- Platforms like X are outsourcing fact-checking to users via tools like Community Notes. But what does this mean for truth online? We argue this isn’t just a technical shift — it’s an epistemological transformation. Who gets to define what's true when everyone is the fact-checker?
- Our new @sigmod2025.bsky.social paper tackles a fundamental challenge for the next generation of data systems: "Logical and Physical Optimizations for SQL Query Execution over Large Language Models" 📄 As systems increasingly expose declarative interfaces over LLMs, traditional query optimization falls short. Details 👇
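One way to see why classical optimizers fall short here (an illustrative rewrite of my own, not the paper's algorithm): once a WHERE clause mixes ordinary predicates with LLM-evaluated ones, cost is dominated by the number of LLM calls rather than selectivity alone, so a cheap structured filter should run before the semantic predicate.

```python
# Toy illustration of one logical rewrite: evaluate cheap relational predicates
# before expensive LLM-backed ones. `llm_filter` and the predicates are hypothetical.
def llm_filter(rows, question, llm):
    """Keep rows for which the LLM answers 'yes' to `question` about the row."""
    return [r for r in rows
            if llm(f"{question}\nRow: {r}\nAnswer yes or no.").strip().lower().startswith("yes")]

def naive_plan(products, llm):
    # LLM predicate first: one call per row in the whole table.
    semantic = llm_filter(products, "Is this product description family-friendly?", llm)
    return [p for p in semantic if p["price"] < 50]

def optimized_plan(products, llm):
    # Cheap filter first: the LLM only sees rows with price < 50.
    cheap = [p for p in products if p["price"] < 50]
    return llm_filter(cheap, "Is this product description family-friendly?", llm)
```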
- Presenting at #NAACL2025 today (April 30th) 🎤 ⏰ 11:00, Session B. Our work, "An LLM-Based Approach for Insight Generation in Data Analysis," uses LLMs to automatically find insights in databases, outperforming baselines in both insightfulness and correctness. Paper: arxiv.org/abs/2503.11664 Details 👇
- Think2SQL: Bridging the Reasoning Gap in Text-to-SQL for Small LLMs. Leveraging RL with our reward mechanism, we push Qwen2.5-Coder 7B to performance on par with much larger LLMs (>400B) on the BIRD dataset! 🤯 Model: huggingface.co/simone-papic... Paper: huggingface.co/papers/2504.... Details 👇
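For intuition, here is a sketch of the kind of execution-grounded reward that text-to-SQL RL setups commonly use (a simplification I'm assuming, not necessarily Think2SQL's exact reward):

```python
import sqlite3

def sql_reward(pred_sql: str, gold_sql: str, db_path: str) -> float:
    """Execution-based reward: 1.0 if result sets match, small credit if the
    query merely runs, 0.0 if it fails. Simplified sketch, not the paper's exact reward."""
    con = sqlite3.connect(db_path)
    try:
        gold = set(map(tuple, con.execute(gold_sql).fetchall()))
        try:
            pred = set(map(tuple, con.execute(pred_sql).fetchall()))
        except sqlite3.Error:
            return 0.0                       # query does not even execute
        return 1.0 if pred == gold else 0.1  # partial credit for executable SQL
    finally:
        con.close()
```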
- 🗜️New LLM compression paper "Beyond RAG: Task-Aware KV Cache Compression for Comprehensive Knowledge Reasoning" RAG struggles with broad, multi-hop questions. We surpass RAG by up to 20 absolute points in QA performance, even with extreme cache compression (64x smaller)! Details 👇
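A toy sketch of the task-aware flavor of the idea (my guess at how such compression can work, not the paper's method): instead of retrieving chunks at query time, keep only the cached key/value positions most relevant to the task description.

```python
import torch

def compress_kv(keys, values, task_query, keep_ratio=1 / 64):
    """Toy task-aware KV compression (hypothetical, not the paper's algorithm).

    keys, values: [seq_len, d] cached tensors for one attention head.
    task_query:   [d] embedding of the task/question description.
    Keeps the top `keep_ratio` positions by attention score to the task query.
    """
    scores = keys @ task_query / keys.shape[-1] ** 0.5   # relevance of each position to the task
    k = max(1, int(keys.shape[0] * keep_ratio))          # e.g. 64x smaller cache
    top = torch.topk(scores, k).indices.sort().values    # keep original order
    return keys[top], values[top]
```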
- NOVAS is a new venue for papers bridging the gap between data management and generative AI research! It will be in Berlin on June 22nd, co-located with @sigmod2025.bsky.social Submission deadline: 28 March 2025
- Tropes, such as "Hidden Motives", are recurring narrative elements used to evoke familiar patterns in communication. Our #COLING paper finds that tropes appear in 37% of social posts debating immigration and vaccination 📄 coling-2025-proceedings.s3.us-east-1.amazonaws.com/main/pdf/202... 👇