Nick Vincent
Studying people and computers (https://www.nickmvincent.com/)
Blogging about data and steering AI (https://dataleverage.substack.com/)
- OpenAI launching an Overleaf competitor seems like it could be a big deal, and particularly interesting in the wake of the NeurIPS hallucinations discourse (an important issue, though much of the back-and-forth I saw seemed to miss important factors): openai.com/index/introd...
- Here's the post: dataleverage.substack.com/p/the-coding...
- Writing a follow-up post on data aspects of coding agents. One thing that's really under-discussed, IMO: as far as I can tell, NO coding agent allows consumer users to trigger server-side deletion of transcripts or even metadata. Anyone seen anything to the contrary?
- It seems plausible that part of the motivation for labs to restrict usage of subscription auth tokens is the value of structured data from the official app, but it's unfortunate that current data controls for agents are so limited (30-day or 5-year retention, no individual deletions, etc.)
- Coding agents are (1) a big deal, (2) very relevant to data leverage, and (3) able to help build tools that support data leverage! dataleverage.substack.com/p/coding-age...
- Reposted by Nick Vincent: A bunch of us are working to advance #PublicAI: AI that is publicly accountable, accessible, and sustainable. A lot of us are interested in local-first, community-governed, and more open models of what this technology could be. We welcome allies in the @publicai.network! publicai.network/whitepaper/
- Reposted by Nick Vincent: Happening now! Join us in Upper Level Room 4 for our workshop on Algorithmic Collective Action #NeurIPS2025. We will have stellar talks to kick off the day, followed by contributed talks and posters from authors before the lunch break.
- Reposted by Nick Vincent: TODAY is the first-ever #NeurIPS position paper track! Come hear thoughtful arguments about “digital heroin,” the nature of innovation, protecting privacy, machine unlearning, & how we can do ML research better as a community. See you in Ballroom 20AB from 10-11a & 3:30-4:30p! #NeurIPS2025 #NeurIPSSD
- Longer blog post: AI companies and data creators actually have aligned incentives re: establishing clearer "Data Rules" (norms, rules, contracts that control use of both "fresh" data and of model outputs). Good Data Rules can also support commons! dataleverage.substack.com/p/almost-eve...
- Reposted by Nick Vincent: "There are many challenges to transforming the AI ecosystem and strong interests resisting change. But we know change is possible, and we believe we have more allies in this effort than it may seem. There is a rebel in every Death Star." 🗣️ @b-cavello.bsky.social in our #4DCongress
- Heading to AIES, excited to catch up with folks there!
- New blog (a recap post): "How collective bargaining for information, public AI, and HCI research all fit together." Connecting these ideas, plus a short summary of various recent posts (of which there are many, perhaps too many!). On Substack, but also posted to Leaflet.
- Substack: dataleverage.substack.com/p/how-collec... Leaflet: dataleverage.leaflet.pub/3m2wpj7l7c22w
- Reposted by Nick Vincent: Very interesting twist on MCP! “user data is often fragmented across services and locked into specific providers, reinforcing user lock-in” - enter the Human Context Protocol (HCP): “user-owned repositories of preferences designed for active, reflective control and consent-based sharing.” 1/
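To make the HCP idea concrete, here's a minimal sketch of a user-owned preference repository with consent-based sharing. This is my own illustration, not the actual Human Context Protocol spec; all class and method names here are hypothetical.

```python
# Toy illustration of a user-owned preference repository with
# consent-based sharing. All names are hypothetical; this is NOT
# the actual Human Context Protocol specification.
from dataclasses import dataclass, field


@dataclass
class PreferenceRecord:
    topic: str            # e.g. "news", "dietary"
    value: str            # the preference itself
    shared_with: set[str] = field(default_factory=set)  # services the user consented to


class PreferenceRepository:
    """User-owned store; services only see records the user consented to share."""

    def __init__(self) -> None:
        self.records: list[PreferenceRecord] = []

    def add(self, record: PreferenceRecord) -> None:
        self.records.append(record)

    def grant(self, service: str, topic: str) -> None:
        # Active, reflective control: consent is granted per topic, per service.
        for r in self.records:
            if r.topic == topic:
                r.shared_with.add(service)

    def revoke(self, service: str) -> None:
        for r in self.records:
            r.shared_with.discard(service)

    def export_for(self, service: str) -> list[tuple[str, str]]:
        return [(r.topic, r.value) for r in self.records if service in r.shared_with]


repo = PreferenceRepository()
repo.add(PreferenceRecord("news", "prefer long-form analysis"))
repo.add(PreferenceRecord("dietary", "vegetarian"))
repo.grant("assistant-app", "news")
print(repo.export_for("assistant-app"))  # only the consented "news" record
```

The point of the sketch: the user's data lives in one place they control, and lock-in is reduced because any service gets the same consent-filtered export.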
- Anyone compiling discussions/thoughts on emerging licensing schemes and preference signals? E.g., rslstandard.org and github.com/creativecomm... I'm externalizing some notes at datalicenses.org, but want to find where these discussions are happening!
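For readers unfamiliar with what a machine-readable preference signal might look like, here's a toy sketch. The key/value format below is invented purely for illustration; it is not RSL's actual syntax or any Creative Commons scheme.

```python
# Toy parser for a hypothetical machine-readable data-use preference file.
# The format is invented for illustration; it is NOT the actual RSL or
# Creative Commons preference-signal syntax.
HYPOTHETICAL_SIGNAL = """\
ai-training: disallow
ai-inference: allow
search-indexing: allow
"""


def parse_preferences(text: str) -> dict[str, bool]:
    """Map each declared use to True (allowed) or False (disallowed)."""
    prefs = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        use, _, decision = line.partition(":")
        prefs[use.strip()] = decision.strip() == "allow"
    return prefs


def may_use(prefs: dict[str, bool], purpose: str) -> bool:
    # Conservative default: an undeclared purpose is treated as disallowed.
    return prefs.get(purpose, False)


prefs = parse_preferences(HYPOTHETICAL_SIGNAL)
print(may_use(prefs, "ai-training"))      # False
print(may_use(prefs, "search-indexing"))  # True
```

The hard questions the real schemes face are exactly the ones this toy dodges: which purposes get standardized, and what defaults apply when a signal is absent.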
- Excited to be giving a talk on data leverage to the Singapore AI Safety Hub. Trying to capture updated thoughts from recent years, and have long wanted to better connect leverage/collective bargaining to the safety context.
- About a week away from the deadline to submit to the ✨ Workshop on Algorithmic Collective Action (ACA) ✨ acaworkshop.github.io at NeurIPS 2025!
- 🧵 In several recent posts, I speculated that dataset details may eventually become an important quality signal for consumers choosing AI products: "This model is good for asking health questions, because 10,000 doctors attested to supporting training and/or eval." Etc.
- It looks like some skepticism was warranted (not much progress towards this vision yet). I do think "dataset details as quality signals" is still possible though, and could play a key role in addressing looming information economics challenges.
- The core challenge: many inputs into AI are information, and it's hard to design efficient markets for information. Information is hard to exclude (pre-training data remains very hard to exclude, and even post-training data may be hard to exclude without sufficient effort).
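Here's a minimal sketch of what "dataset details as quality signals" could look like in practice. The record fields and the attester threshold are hypothetical illustrations, not any existing scheme.

```python
# Minimal sketch of "dataset details as quality signals". The record
# structure and the threshold are hypothetical illustrations only.
from dataclasses import dataclass


@dataclass
class DatasetAttestation:
    domain: str          # e.g. "health"
    attester_role: str   # e.g. "physician"
    n_attesters: int     # how many people attested to supporting training/eval
    covers_eval: bool    # whether the attestation covers evaluation data too


def quality_signal(attestations: list[DatasetAttestation], domain: str,
                   min_attesters: int = 1000) -> str:
    """A consumer-facing label derived from attestations for one domain."""
    relevant = [a for a in attestations if a.domain == domain]
    total = sum(a.n_attesters for a in relevant)
    if total >= min_attesters and any(a.covers_eval for a in relevant):
        return f"{domain}: {total} attesters, training + eval covered"
    if total >= min_attesters:
        return f"{domain}: {total} attesters, training only"
    return f"{domain}: insufficient attestation ({total} attesters)"


atts = [DatasetAttestation("health", "physician", 10_000, covers_eval=True)]
print(quality_signal(atts, "health"))  # health: 10000 attesters, training + eval covered
```

The excludability problem above is why the attestation (a scarce, verifiable act) rather than the data itself (easily copied) carries the signal.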
- Follow-up, tying together "AI as ranking chunks of human records" with "eval leverage" and "dataset details as quality signals": dataleverage.substack.com/p/how-do-we-... And related, "eval leverage": dataleverage.substack.com/p/evaluation...
- Around ICML with loose evening plans and an interest in "public AI", Canadian sovereign AI, or anything related? Swing by the Internet Archive Canada between 5p and 7p lu.ma/7rjoaxts
- [FAccT-related link round-up]: It was great to present on measuring Attentional Agency with Zachary Wojtowicz at FAccT. Here's our paper on ACM DL: dl.acm.org/doi/10.1145/... On Thursday, Aditya Karan will present on collective action (dl.acm.org/doi/10.1145/...) at 10:57 (New Stage A)
- These blog posts expand on attentional agency: - genAI as ranking chunks of info: dataleverage.substack.com/p/google-and... - utility of AI stems from people: dataleverage.substack.com/p/each-insta... - connection to evals: dataleverage.substack.com/p/how-do-we-...
- And we have a blog post on algorithmic collective action with multiple collectives! dataleverage.substack.com/p/algorithmi...
- Finally, I recently shared a preprint that relates deeply to the above ideas, on Collective Bargaining for Information: arxiv.org/abs/2506.10272, and have a blog post on this as well: dataleverage.substack.com/p/on-ai-driv...
- “Attentional agency”: talk in New Stage B at FAccT, in the session right now!
- Off to FAccT; excited to see faces old and new!
- Another blog post: a link roundup on AI's impact on jobs and power concentration, another proposal for Collective Bargaining for Information, and some additional thoughts on the topic: dataleverage.substack.com/p/on-ai-driv...
- New data leverage post: "Google and TikTok rank bundles of information; ChatGPT ranks grains." dataleverage.substack.com/p/google-and... This will be post 1/3 in a series about viewing many AI products as all competing around the same task: ranking bundles or grains of records made by people.
- This has implications for Internet policy, for understanding where the value in AI comes from, and for thinking about why we might even consider a certain model to be "good"! This first post leans heavily on recent work with Zachary Wojtowicz and Shrey Jain, to appear at the upcoming FAccT.
- arxiv.org/abs/2405.14614 Follow-ups coming very soon (already drafted): would love to discuss these ideas with folks. Is this all repetitive with past data labor/leverage work? Are some aspects obvious to you?
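For a rough feel of the bundles-vs-grains framing, here's a toy of my own (not anything from the paper): the same crude relevance function can rank whole documents, search-engine style, or individual sentences drawn from those documents, chatbot style.

```python
# Toy illustration of ranking "bundles" (whole documents) vs. "grains"
# (individual sentences). The scoring function is a deliberately crude
# word-overlap measure, just to show both products share one task.
def score(query: str, text: str) -> int:
    return len(set(query.lower().split()) & set(text.lower().split()))


docs = {
    "doc-a": "Data leverage gives creators bargaining power. Weather is nice today.",
    "doc-b": "Collective bargaining for information could reshape AI training.",
}

query = "collective bargaining and data leverage"

# Search-engine style: rank bundles (whole documents made by people).
bundles = sorted(docs, key=lambda d: score(query, docs[d]), reverse=True)
print("bundle ranking:", bundles)

# Chatbot/agent style: rank grains (sentences) drawn from the same records.
grains = [(d, s.strip()) for d, text in docs.items()
          for s in text.split(".") if s.strip()]
grains.sort(key=lambda pair: score(query, pair[1]), reverse=True)
print("top grain:", grains[0])
```

Same records, same relevance task; the products differ mainly in the granularity of what gets surfaced, which is exactly why they compete over the same underlying human-made data.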
- Sharing a new paper (led by Aditya Karan): there's growing interest in algorithmic collective action, when a "collective" acts through data to impact a recommender system, classifier, or other model. But... what happens if two collectives act at the same time?
- Preprint now on arXiv and to appear at FAccT 2025: arxiv.org/abs/2505.00195 ("Algorithmic Collective Action with Two Collectives" by Aditya Karan, Nicholas Vincent, Karrie Karahalios, Hari Sundaram)
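For intuition on the setup, here's a toy sketch in the spirit of the signal-planting framing from the algorithmic collective action literature: two collectives each plant a distinct trigger token with a target label in shared training data, and we check whether a naive learned rule serves both. This is an illustration only, not the paper's actual experiments.

```python
# Toy sketch of two collectives planting signals in shared training data.
# Mirrors the general "signal planting" setup from the algorithmic
# collective action literature, not this paper's specific experiments.
from collections import Counter, defaultdict
import random

random.seed(0)


def make_example(trigger: str | None, label: str) -> tuple[list[str], str]:
    words = random.sample(["alpha", "beta", "gamma", "delta", "epsilon"], 3)
    if trigger:
        words.append(trigger)
    return words, label


# Background data plus two collectives, each planting its own trigger -> label.
data = [make_example(None, random.choice(["pos", "neg"])) for _ in range(500)]
data += [make_example("TRIG_A", "pos") for _ in range(50)]  # collective A
data += [make_example("TRIG_B", "neg") for _ in range(50)]  # collective B

# "Train" a naive model: per-token label counts, majority vote at inference.
token_labels: dict[str, Counter] = defaultdict(Counter)
for words, label in data:
    for w in words:
        token_labels[w][label] += 1


def predict(words: list[str]) -> str:
    votes: Counter = Counter()
    for w in words:
        votes += token_labels[w]
    return votes.most_common(1)[0][0]


# Each collective succeeds if triggered inputs get its target label.
print("A:", predict(["gamma", "TRIG_A"]))  # collective A hopes for "pos"
print("B:", predict(["gamma", "TRIG_B"]))  # collective B hopes for "neg"
```

With disjoint triggers both collectives can win here; the interesting regimes the paper asks about are when the collectives' signals overlap or their target labels conflict.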
- New early draft post: "Public AI, Data Appraisal, and Data Debates" "A consortium of Public AI labs can substantially improve data pricing, which may also help to concretize debates about the ethics and legality of training practices." dataleverage.substack.com/p/public-ai-...
- Reposted by Nick Vincent: “Algorithmic decision-making systems are ‘leviathans’, harmful not for their arbitrariness or opacity, but for the systematicity of their decisions” - @christinalu.bsky.social on the need for plural #AI model ontologies (sounds technical, but has big consequences for human #commons) www.combinationsmag.com/model-plural...
- New Data Leverage newsletter post. It's about... data leverage (specifically, evaluation-focused bargaining) and products du jour (deep research, agents). dataleverage.substack.com/p/evaluation...
- I have some new co-authored writing to share, along with a round-up of important articles for the "content ecosystems and AI" space. I'm doing an experiment with microblogging directly to a GitHub repo that I can share across platforms...
- Here's my round-up as a markdown file: github.com/nickmvincent... Here's the newsletter post, Tipping Points for Content Ecosystems: dataleverage.substack.com/p/tipping-po...
- Reposted by Nick Vincent: Global Dialogues has launched at the Paris #AIActionSummit. Watch @audreyt.org give the announcement via @projectsyndicate.bsky.social youtu.be/XkwqYQL6V4A?... (starts at 02:47:30)
- On Monday, I wrote a post on the live-by-the-sword, die-by-the-sword nature of the current data paradigm. On Wednesday, there was quite a development on this front: OpenAI came out with a statement that they have evidence DeepSeek "used" OpenAI models in some fashion (this was faster than I expected!)
- Given that data protection technologies (such as the techniques OpenAI used to gather this evidence) seem certain to play a role in the near term, I put together another post with a simple proposal that could reduce some of the tension in the current paradigm.
- AI labs and tech companies should open-source their data protection techniques so that content creators can benefit from new and old advances in this space: dataleverage.substack.com/p/ai-labs-co...
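As one example of the kind of technique that could be open-sourced, here's a generic canary-string sketch (the general idea only; not any lab's actual detection method): a creator embeds unique markers in published content, then checks whether a model reproduces them.

```python
# Generic canary-string sketch: a creator plants unique markers in their
# published content, then checks model outputs for them. This illustrates
# the general idea only; it is NOT any lab's actual detection method.
import hashlib


def make_canary(creator_id: str, doc_id: str) -> str:
    """Derive a unique, innocuous-looking marker for one document."""
    digest = hashlib.sha256(f"{creator_id}:{doc_id}".encode()).hexdigest()[:12]
    return f"ref-{digest}"


def embed(text: str, canary: str) -> str:
    return f"{text}\n(document id: {canary})"


def detect(model_output: str, canaries: set[str]) -> set[str]:
    """Return the canaries that the model reproduced verbatim."""
    return {c for c in canaries if c in model_output}


canaries = {make_canary("nickmvincent", f"post-{i}") for i in range(3)}
article = embed("Some blog post text.", next(iter(canaries)))
# If a model later emits one of these markers, that's (weak) evidence it
# trained on, or retrieved, the marked content.
print(detect(article, canaries))
```

Open-sourcing even simple tools like this would let individual creators run the same style of provenance check that labs currently reserve for themselves.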