I'm a researcher working on problems to do with deep learning and proteins at Microsoft Research New England. You'll see me post about interesting papers, pictures of food, and whatever else I'm interested in at the moment.

A protein language model is able to generate diverse functional lysozymes from 5 different Pfam families + predict whether generated enzymes will be functional.

Thrilled to finally share my work in the @VivianGradinaru lab at @Caltech - we developed a spatial transcriptomics approach to visualize and quantitatively analyze pooled systemic AAVs (capsids and cargos) in intact tissue across species. 1/n

The 4yo once again confirms that if a kidnapper speaks Mandarin to him, he'll go happily and quietly

Something something dark forest?!
Gliese 581c is an exoplanet just 20 light-years away that orbits in its star’s habitable zone.

In 2008, scientists decided to send a radio signal containing 501 messages that will reach the planet in 2029.

(Artist render)

Next Tuesday (1/31) @ 4 pm ET we'll have @ginaelnesr talk about SVD of sequences to visualize sequence/residue space!

Sign up at to receive Zoom links in your inbox and add events to your calendar!

Large masked language models for genomes! And nicely curated benchmarks for them!

These are the only good pens
OK, you can only write with ONE type of pen for the rest of your life, which one are you picking? (ignoring the ink color)

ChatGPT for biology? Excited to share our work on LLMs for protein design out today @NatureBiotech

+ Proud to publicly announce @ProfluentBio with a $9M seed round to tackle meaningful challenges in biology with AI. Join us!

Research idea: given a text description of an object, I want a robot to fold that origami object for me.

Idea brought to you by making way too many origami for a toddler

Improve conditional generation from FoldingDiff by inserting a layer that converts internal angles to atomic directions.

Tiny Tweets, a DEI initiative.

Tweeters from underrepresented backgrounds are not necessarily equipped with the same resources from the start of their Twitter journeys. Therefore, we propose a 140-character format to attract underrepresented and underresourced tweeters.

Use the distance maps from AF to reweight the members of a structural ensemble for a disordered protein. Generate the members using MD or even by doing unconditional generation from FoldingDiff.

Our team working on ML for chemistry/drug discovery was unfortunately affected by the recent layoffs at Google.

I’m still very much interested in how new technologies can accelerate research in the life and natural sciences.

I need everyone reporting on the devastating shooting to know a few things about (thread)

