Max Fürst
Asst. Prof. Uni Groningen 🇳🇱
Comp & Exp Biochemist, Protein Engineer, 'Would-be designer' (F. Arnold) | SynBio | HT Screens & Selections | Nucleic Acid Enzymes | Biocatalysis | Rstats & Datavis
https://www.fuerstlab.com
orcid.org/0000-0001-7720-9
- Must admit I love it
- Seeing that even Claude Code consistently doesn't get git commands right the first time round makes me feel much less stupid
- Getting asked about how academics can continue to do science & inspire trainees even in the midst of a continued (escalated) assault on science, reason, truth, & human rights. I don’t have great answers. I would love to hear from mentors about advice they’re giving to trainees/ colleagues.
- Sadly, here in Europe, one piece of advice to ECRs is: whatever action you decide to take, think very hard before posting about it on social media, because, you know, you may want to go to a GRC at some point. Is this hypocritical or just survival mode?
- You can mess up a kinase structure in any which way with MD and claim you’ve “discovered” a new state. Doesn’t make it real or relevant or interesting. Has to correlate with experimental data.
- with MD, with AI MDemulators, with AF hacks, ...
- Thanks for sharing. Not surprised but good to see data also for antibodies that squares really well with our analysis on protein design more generally bsky.app/profile/maxf...
- The only format worse than pdb is cif
- Exactly
- Industrial computational protein engineering position in the south of the Netherlands synsilico.com/storage/app/...
- Yea this is not the same of course, but it just popped up in my feed and was close enough in topic ;) Still, there might be something there. Sample unbiased random flex and score with IF?
- Personally skeptical of a paper making broad claims about such methods capturing mutation effects when their data is a single mutation in a single enzyme where effect is extensively studied & captured in many training data structures. AI structure/dynamics prediction tools are very poor VAEs imo
- Inverse folding much more promising bsky.app/profile/alic...
- Caught between excitement that "one of us" appears on a major podcast and cringe at parts of the discussion (mainly the self-promotion, sadly a hallmark of the otherwise enjoyable #hardfork)
- I'm really excited to break up the holiday relaxation time with a new preprint that benchmarks AlphaFold3 (AF3)/“co-folding” methods with 2 new stringent performance tests. Thread below - but first some links: A longer take: fraserlab.com/2025/12/29/k... Preprint: www.biorxiv.org/content/10.6...
- Fantastic! Had 3 Qs after the thread & think I understood from the paper myself:
  1. Were the 500 ligands very similar? - No, quite diverse
  2. Did they do "fair" real-life docking when comparing? - I think so
  3. Is one limitation that all results are for one receptor & might not generalize? - Probably
  Would u agree?
- Nice first step indeed. But were the ribosomal proteins ever the bottleneck? I'd expect rRNA and tRNAs first, and you can't rebuild without all the RNA modification enzymes. But IVTT probably stops due to ATP depletion anyway, right?
- New preprint🚨 Imagine (re)designing a protein via inverse folding. AF2 predicts the designed sequence to a structure with pLDDT 94 & you get 1.8 Å RMSD to the input. Perfect design? What if I told u that the structure has 4 solvent-exposed Trp and 3 Pro where a Gly should be? Why to be wary🧵👇
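A minimal sketch of the kind of sequence-level sanity check this post hints at, assuming you still have the input sequence to compare against (toy sequences; the flagged substitution types are illustrative examples, not the preprint's method):

```python
# Flag chemically suspicious substitutions an inverse-folding model
# introduced, independent of what pLDDT/scRMSD report.

def red_flags(native: str, designed: str) -> list[str]:
    """Compare a designed sequence to its input and report substitutions
    worth a second look (new Trp; Pro replacing a flexible Gly)."""
    assert len(native) == len(designed), "sequences must be length-matched"
    flags = []
    for i, (n, d) in enumerate(zip(native, designed), start=1):
        if n == d:
            continue
        if d == "W":
            flags.append(f"pos {i}: {n}->W (new Trp; check solvent exposure)")
        if n == "G" and d == "P":
            flags.append(f"pos {i}: G->P (Pro where the backbone may need Gly)")
    return flags

for flag in red_flags("MKGLSGAGAT", "MKPLSWAGWT"):  # toy sequences
    print(flag)
```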
- Precisely. And that's what we want to encourage: implement custom scoring suitable for your goal to try to mitigate the shortcomings of refolding metrics. Or, for novices who may not be able to do much in this regard, at least be skeptical of these metrics.
- Well, many sequences require chaperones or only get structured in complex with something, but it's not so easy to compile data for those cases. Synthetic nonsense would probably be easiest, but whether this can be well implemented in training without compromising positive performance is a question.
- Ohh nice, hadn't seen that then! Might have reconsidered terminology in the title if I had.. 😅
- I'd say so - AF2 just hasn't ever seen any negative data. We discuss this a bit in the paper, although we refrain from speculating too widely.
- Thanks to Seva and Kerlen for their tireless effort to get the data as watertight as possible. Here is the preprint link again: www.biorxiv.org/content/10.6... 19/19
- 20/19 Or, you know, if bioRxiv is down / extremely slow again, download the pdf here: www.fuerstlab.com/uploads/2025...
- MSA struc as template to ss I'd expect to behave similarly to MSA, but maybe worth a shot. Energy-based I guess will not fold nonsense, thus hard to implement in our tests, except maybe in the very low mutation regime; could test that in a pMPNN background. Will think, thanks!
- These results reveal a broader issue: scRMSD is a very inflexible metric (pun intended). If not-being-as-stable-as-a-rock is a design goal of yours (as it should be, if you want to eventually design enzymes), this metric makes little sense as is. 17/19
- Taken together, we hope our results highlight the current limitations of the self-consistency evaluation that is so commonly used in the field, and thus encourage the field to establish new/additional criteria, or at least foster more awareness of the downsides of AF & co metrics when assessing designs. 18/19
- Indeed, we show that if you truncate AFDB strucs to their corresponding PDB entry length, you get the same designability. Without a PDB to compare to, you can also rescue via a simple trick: just flag statistical outliers from an initial alignment, realign without them, and use the median RMSD 15/19
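A sketch of what that rescue trick can look like, assuming two length-matched CA coordinate arrays and a plain Kabsch superposition; the 2-sigma outlier cutoff is an illustrative choice, not necessarily the preprint's:

```python
import numpy as np

def kabsch(P, Q):
    """Optimally rotate/translate P onto Q (both N x 3 arrays)."""
    Pc, Qc = P - P.mean(axis=0), Q - Q.mean(axis=0)
    U, _, Vt = np.linalg.svd(Pc.T @ Qc)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return Pc @ R.T + Q.mean(axis=0)

def robust_rmsd(pred, ref, n_sigma=2.0):
    """Align, drop per-residue outliers (e.g. floppy termini),
    realign on the remaining core, and report the median deviation."""
    dev = np.linalg.norm(kabsch(pred, ref) - ref, axis=1)
    core = dev < dev.mean() + n_sigma * dev.std()
    aligned = kabsch(pred[core], ref[core])
    return float(np.median(np.linalg.norm(aligned - ref[core], axis=1)))
```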
- Interestingly, even though this demonstrates that the low designability of AFDB is essentially an artefact of RMSD calcs after alignment, more sophisticated aligners (TM-align, SHEBA) do not achieve the same rescue. 16/19
- We also question whether the refolding pipeline is always a robust evaluation metric to begin with. As e.g. noted in @moalquraishi.bsky.social's Genie2 paper, the designability of PDB structures is on average much higher than that of the AFDB. Now, we have a pretty good guess why: 13/19
- Realizing that this gap is RMSD-, not pLDDT-driven, we speculated the cause to be a PDB artefact: in xray strucs, flexible termini often lack density / unstructured parts were truncated to begin with. As a result, the designability of PDB structures gets very high compared to full-length AFDB structures 14/19
- Sequences from design models like ProteinMPNN boost folding success, but for anything beyond medium-sized proteins, you can almost never fold accurately in ss mode. 11/19
- To sum up so far Designers: AF & co are bad at spotting poor designs. If u use ss mode, false positives go down, but maybe no design at all will fold Devs: if ur new seq design algo makes seqs with “regular” seq-struc mapping (as in nature) it gets worse, and users may *think* your tool is 💩 12/19
- We next looked at a set of literature-reported experimentally tested designs and compared folding models’ ability to act as “oracles”. Again, evo info was detrimental. MSA mode’s poor performance can be obfuscated though, for seqs where the MSA is very shallow / empty. Duh: empty MSA == ss mode 9/19
- So, AF2 ss mode wins. Problem solved? Nope. Besides still being overly confident, it also has another major issue: it generally does not work very well. Even for small natural proteins, AF2ss can barely fold sequences that pass the commonly used pLDDT- and scRMSD-based “designability” criteria. 10/19
- We found that this trend is exacerbated by the availability of evolutionary info to models: AF2 MSA behaves the worst, ESMfold is a bit better, and AF ss is the best (of a bad bunch). As others noted: the signal from these MSAs / pLM embeddings overrules “reason” 7/19
- Worse: if you repeat this experiment starting from sequences designed with ProteinMPNN, the effect is even stronger. The reason is likely that ProteinMPNN designs very strongly and unambiguously encode the intended structure, which can make folding models overconfident about their predictions. 8/19
- We probed this for ESMfold, and for AF2, where two modes can be employed: the default MSA mode (often seen for redesign of native proteins) and single sequence mode (often seen for de novo protein design). First, we checked how good they are at identifying clearly bad designs 5/19
- This is easy: just take natural protein seqs & randomly swap letters. In reality, you get a non-folder after a few exchanges. Yet, folding models very stubbornly insist that such seqs fold into the same structure despite nonsensical residues all over the place. @sokrypton.org saw this for Ala 6/19
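The scrambling experiment is easy to reproduce in spirit; a sketch with a made-up wild-type sequence and illustrative mutation counts:

```python
# Generate "nonsense" variants of a natural sequence by random
# substitutions, then feed them to a folding model and watch how
# slowly its confidence reacts.
import random

AA = "ACDEFGHIKLMNPQRSTVWY"

def scramble(seq: str, n_mut: int, seed: int = 0) -> str:
    """Return seq with n_mut positions swapped to a different random residue."""
    rng = random.Random(seed)
    s = list(seq)
    for i in rng.sample(range(len(s)), n_mut):
        s[i] = rng.choice([a for a in AA if a != s[i]])
    return "".join(s)

wt = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"  # made-up example, not a real protein
for n in (2, 5, 10, 20):
    print(f"{n:>2} mutations: {scramble(wt, n)}")
```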
- After refolding the designed sequence, two metrics are typically computed: pLDDT (structure confidence) and scRMSD (backbone similarity to the input). If both are favorable, i.e. the sequence fulfills the “designability” criteria, it (or the seq-struc combo) is considered a good design 3/19
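For reference, that filter usually boils down to something like the snippet below; pLDDT >= 80 and scRMSD <= 2 Å are common but by no means universal thresholds (hence the complaint in 4/19):

```python
def is_designable(plddt: float, sc_rmsd: float,
                  plddt_min: float = 80.0, rmsd_max: float = 2.0) -> bool:
    """Typical pass/fail "designability" check after refolding a design."""
    return plddt >= plddt_min and sc_rmsd <= rmsd_max

# The numbers from the thread opener pass comfortably, Trp/Pro oddities and all:
print(is_designable(plddt=94.0, sc_rmsd=1.8))  # True
```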
- While widely used, we've not seen a systematic analysis of how successful this evaluation step is. Are folding models indeed good at this task? Does it really sort out bad designs? Can you quantify how “good” a sequence is? And why is everyone using different thresholds?! 4/19
- Anecdotes about this kind of biophysics-ignorant confidence of AlphaFold & Co in certain sequences have been around for a while. We have now systematically assessed this and other undesirable behavior of folding models in the context of (de novo) protein design. 1/19
- Led by @kerlenkorbeld.bsky.social and Seva Viliuga, we started this project under the premise of the field's most common protein design evaluation workflow: the self-consistency pipeline (function->backbone->sequence->evaluation), where folding models are used for the last step. 2/19
- Let's hypothesise on the dominant correlating metric, make a prediction, and check again next year? Could be an artefact: database indexing, people using "biology" less for whatever reason, etc. Or something real: funding decline, a tendency to write bigger instead of more papers, ...
- You sure they *do* produce better models or do they report to be confident that they produced a better model? 😉 Beyond a certain level of accuracy and in particular if flexible parts are involved, who can confidently tell anymore what's a better model anyway?
- Wtaf
- Backward translation (3' to 5' Translation of Circular RNAs?) www.biorxiv.org/content/10.6...

- If you ever need to fuzzy search some DNA, sassy is your tool. Please spread the word; I think many people just outside my own circle could benefit from this :) cc @rickbitloo.bsky.social github.com/RagnarGrootK...
- Hell yeah! Now for amino acid alphabet please 🥹
- Next week, we welcome @rebeccasear.bsky.social in our Lecture Series. Rebecca will talk about 21st century eugenics, scientific racism and the role of academia in promoting political ideology. Just register here to participate 👉 rotorub.wordpress.com/roto-lecture... #PhilSci #HPBio
- Really important, thanks. It would be great if the scientific credibility weren't somewhat compromised by using AI-generated left-handed helices, though.
- You referring to the AF3-hallucinated helices for unstructured regions? And so you mean AF2 with prior gen? If so, can you AF2, then template Boltz with that for complex/ptm prediction (the only reason why you'd want AF3 over 2 anyway, I guess)?
- Fun! Tldr: AI researchers are pissed bc some AI research papers submitted to an AI conference by AI researcher colleagues are AI-written & many are AI-reviewed, as found by an AI company's AI model, described in a paper for said AI conference. Said paper was also AI-reviewed (but deffo not AI-written)
- What can researchers do if they suspect that their manuscripts have been peer reviewed using AI? go.nature.com/4pxUNyD
- As if sequencing cost were the bottleneck.. looking at you UPS/DHL/FedEx delivery drivers who keep losing samples at least once a month
- We've reached the bizarre state of things where poorly written emails in broken English are the ones that are the most relevant
- Students applying for grad school, or reaching out to professors. I have an important piece of advice for you: STOP DOING THIS 👇 (a thread) #STEM #PhD #gradschool #academictips
- Emails that don't pass my foolproof (enough) sanity test don't get an answer. It's nearly always exactly the AI-written ones bsky.app/profile/maxf...