Testing Hypothesis Testing

I recently wrote about hypotheses that are beautifully simple, and how this kind of thinking can make us believe we understand the workings of Nature even when we don't.  Today we're going to appreciate some hard core biology that's so complex it's beautiful.  It's also validated against empirical testing.  Today is going to be a good day.

Before we begin, I've been making the point for a while now that it's easier to craft an accurate-sounding hypothesis that fits current evidence than it is to craft a hypothesis that matches reality.  Today I want to approach this idea from the opposite side.  Once I present the problem to you, I'll ask you to pre-register your hypothesis in the comments before you go on to read the solution.  Of course, if you know the solution already there's no reason to pre-register your hypothesis, so you can skip that step.  The exercise might not be as meaningful for you in this case.

Here's the central question we'll be developing our hypothesis around: how does your immune system create antibodies against specific antigens?  (An antigen is just the name we give for microscopic structures that are unique to a pathogen and can be recognized by the antibody.)

If you already have an idea of how you might design this kind of system - a hypothesis, if you will - go straight to the comments and record it there.  It's okay if you have to hand-wave a few things, so long as you get at the general problem here.  Don't worry, this is a no-judgement zone.  Part of the point of this exercise is to record something that's probably wrong.

Maybe you don't yet feel up to the task of recording a hypothesis.  I find that inspiration for me comes with learning a few new things.  If that's the case with you, I'll explain some of the constraints you're under from the underlying biology in the next few paragraphs.  If you read down below before registering your hypothesis, you're probably going to get closer to the real mechanism since I'll be giving away important clues.  I'm interested in how close you might come, so please let me know if you got the extra clues in your comment so we can see how much help knowing the constraints provides.

Let's unlearn some wrong things you've been taught.  I think the process of unlearning something requires you to revisit what you think you know while asking questions that are incompatible with what you learned before.  We need to break those wrong concepts down before we can begin to build something new in their place.  Go back to your high school biology classes (maybe even college) to when they were talking about how your body 'creates' antibodies against new threats it has never seen before.  I want you to think about how you would actually design a biological process that can do something like this in practice.  It's not as easy a problem as it sounds like at first, because biology at a molecular scale imposes some surprising challenges.

The first problem is probably something you read right past, which is the stipulation that you're creating defenses against something "never seen before".  That implies your body has some way to keep track of everything it has seen before, otherwise how would it know what it hasn't seen before?  The next problem is this idea of 'creating' something new.  If this antibody is protein-based (spoiler: it is) then it has to be genetically hard-coded, which means you inherited the sequence it's based on from your parents, and they from theirs, going back thousands of years.  Besides, you can't just make up new genes for every new entity you see.  You're limited to the roughly 20,000 protein-coding genes in the genome.  Somehow you need a mechanism that tracks everything you've already encountered and covers everything you've never seen before, yet relies on only a specific gene or small set of genes to create this incredible diversity of outputs.  That's only the first problem.

The next issue with 'creating' a new antibody is that your body's ability to 'see' something at a molecular level isn't the same as simply looking at it.  We can divide this problem into two parts.  First, your body can't intelligently go in and design a new antibody by looking at its target.  (It can't see some structure and design a protein around that.)  The only way it interacts with a structure is by 'touch', where a molecule it has already created interacts with a new molecule.  But this isn't like skin touch, where you can tell whether something is rough or smooth, wet or dry, etc.  This only gives you an answer of 'matches' or 'does not match'.  The second part of this problem of seeing is that there's no editing mechanism for a protein after it's made.  You can't check a protein to see if it fits, notice there's a small error, and exchange a positive charge to match a negative charge in the structure you're trying to fit against.  Protein 'editing', such as it is (complicated topic), can't go back and splice in new amino acids after the molecule is formed.  So whatever the gene is that creates your antibodies, it has to create an antibody that detects the antigen (the thing that your body is able to recognize) straight off the production line.  It has to be good enough as-is, which creates some real challenges for you to design this mechanism where you create new antibodies from scratch, and only after you first encounter the new antigen.

Okay, now you should have enough information to understand the problem and the tools you have at your disposal to tackle it.  Take a minute to imagine a system that might be able to solve it within these limitations.  Remember that it's okay to be wrong.  Record your hypothesis in the comments before you go on.  Be as detailed (or not) as you want.  When you're ready, the answer is below.

A Biological Solution - Clonal Selection Theory

First, let's tackle the problem of genetically encoding an antibody for every possible antigen, but producing only the one we need at the moment we first see its target.  We're going to cheat, and just make them all.  And by that I mean all of them - an antibody against every possible short arrangement of amino acids we can think of.  We're just not going to make them all in the same place.  In order to understand this next part, we need to zero in on a single cell type, called the B-cell.  In fact, we need to zero in on a single cell.  B-cells create antibodies, but not all B-cells create all antibodies.  In fact, if you were to take a single B-cell out of your body and compare the antibody it makes with the one from another B-cell, you'd find that the two antibodies are different in an important way: they recognize different antigens.  In other words, each antibody-producing B-cell is unique, creating its own proprietary antibody that targets a unique antigen.  A B-cell that can create an antibody to detect cholera can't make an antibody to detect measles, or vice-versa.  And there are billions of these things circulating around your body, each able to recognize something unique.  But at this point they are all completely naive as to what that thing might be.  How do we even get to this point, given we need a specific gene for every protein we ever make?

Every antibody has a region at the tip of each arm of the "Y" shape, and that's where the 'detector' part of the protein comes into play.  Everything else is standard antibody and is the same for every antibody; it's just the detector part that changes.  (Actually, no ... but I'm not getting into that here.  It's true enough to just go with me on this one for now.)  In short, the tips of the "Y" vary and the rest of the protein is invariant.  (And yes, it really does look like a "Y"!  This is one of the few times when the cartoon drawing we use is really close to the actual shape of what something looks like inside the body; it's just always drawn orders of magnitude too large.)

Those two variable regions are also encoded genetically, but that genetic code allows for some random shuffling of some of the actual DNA in the genes of the B-cells.  While they're developing, B-cells go in and shuffle those detector genes so they can create a semi-random 'detector' region.  This might seem like the cell is cheating to get around the gene-limit parameter we started with, and it is!  It's just cheating in a way that still technically qualifies as part of the rules so it doesn't get disqualified.

While this is happening, the shuffling machinery introduces some other random mutations into the code.  The gene is then stitched back together and it's set for the rest of the life of that B-cell.  Now the gene is hard-coded forever more.  This is like if you gave out a bunch of paint-by-number templates that were all the same, but you allowed the class to randomly switch around the numbers on each person's individual paint buckets.  Some students would do a little ad-hoc mixing to boot, with the result that no two pictures would look exactly alike, even as they all fit within a defined initial framework.  Once all the work is done and the paint dries, you can't go back and change the picture.
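To get a feel for how far shuffling can stretch a small amount of DNA, here's a toy sketch in Python.  The segment pools, their sizes, and the function name are all made up for illustration; they are not real gene counts.  The random three-letter 'junctions' are a crude stand-in for the extra random changes the shuffling machinery introduces.

```python
import random

# Hypothetical pools of gene segments; names and sizes are illustrative only.
SEGMENT_POOLS = {
    "A": [f"A{i}" for i in range(40)],
    "B": [f"B{i}" for i in range(25)],
    "C": [f"C{i}" for i in range(6)],
}
BASES = "ACGT"


def shuffle_detector_gene(rng):
    """Pick one segment per pool, with a short random junction between segments."""
    parts = []
    for pool in SEGMENT_POOLS.values():
        parts.append(rng.choice(pool))
        # Random letters stand in for the sloppy joins made while stitching.
        parts.append("".join(rng.choice(BASES) for _ in range(3)))
    return tuple(parts[:-1])  # drop the trailing junction


# Combinations from segment choice alone, before any junction randomness.
segment_only = 1
for pool in SEGMENT_POOLS.values():
    segment_only *= len(pool)

# Two junctions of three letters each multiply the count further.
with_junctions = segment_only * (len(BASES) ** 3) ** 2

rng = random.Random(0)
gene = shuffle_detector_gene(rng)
print(segment_only, with_junctions, gene)
```

With these made-up numbers, 71 segments give 6,000 combinations from segment choice alone, and about 24.6 million once the two junctions are counted.  The point is only the multiplication: a short list of parts, mixed and then smudged at the seams, produces far more outcomes than there are parts.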

Negative Selection

Now you've got a library of billions of different random B-cells, all capable of making antibodies to who-knows-what.  What do you do with it?  Here's where the 'selection' part of the Clonal Selection Theory comes into play.

First, let's select out all the cells that make something that would attack kidney, liver, or heart cells.  Remember that you made antibodies to everything.  That includes benign things like house dust and even good things you'd never want to attack, like proteins from your own body.  You can imagine that attacking these things could be disastrous.  This negative selection part gets ... complicated.  I'm not even going to get into the details, other than to say that in the bone marrow (that mysterious organ nobody but immunologists know what it does) there's a type of cell that expresses every possible protein in the human body.  Every cell that recognizes something on that cell has to kill itself.
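The negative selection step can be sketched as a simple filter.  This is a toy model, not the real biology: it assumes a detector 'recognizes' a target only when the two strings are identical, and the four-letter alphabet and the list of self patterns are hypothetical.

```python
import itertools

ALPHABET = "ABCD"   # toy stand-in for amino acids
DETECTOR_LEN = 3

# Every possible detector: the 'make them all' library.
library = {"".join(p) for p in itertools.product(ALPHABET, repeat=DETECTOR_LEN)}

# Hypothetical patterns found on the body's own proteins.
self_patterns = {"ABC", "BCD", "AAA"}


def negative_selection(cells, self_set):
    """Keep only the cells whose detector matches nothing in self_set."""
    return {cell for cell in cells if cell not in self_set}


survivors = negative_selection(library, self_patterns)
print(len(library), len(survivors))
```

Notice that the filter never needs to know what a pathogen looks like.  It only needs a catalog of 'self', and everything that survives is by definition pointed at something else.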

Now we have a library that includes everything except self-detecting cells.  Great!  We can't stop here, though, because if you pump out a bunch of antibodies against everything that isn't you, you'd be launching a full-scale nuclear assault on everything everywhere.  The tools of the immune system are inherently harmful, so if we don't target our approach we'll end up killing ourselves in the process.  We need to do some more selecting.

Once your library of B-cells has fully developed, the billions of different cells start wandering about the blood stream.  They'll randomly visit all sorts of locations, but they're all on the same mission: find a match for their antibody.  The best places to find that match are all the other organs in your body they never explain when you're learning basic anatomy: the spleen, lymph nodes, tonsils, and appendix among others.  ("Teacher, what does this organ do?"  Well Timmy, let's first talk about Clonal Selection Theory...)


There's a type of white blood cell called an Antigen Presenting Cell (APC) that wanders around various tissues like a cop on the beat.  Whenever it finds garbage hanging about, it eats the garbage, chews it up, and coats itself with all the pieces, like a really messed-up serial killer.  But trust me, these APCs are the good guys.  Once they grab something, they move out of the lung, intestine, skin, or wherever, and catch a ride in the lymph over to the nearest lymph node (or one of those other mystery-organs mentioned earlier).  They stick around in the lymph node, waiting as millions of B-cells wander around near them.  Let's add in another simile here.  This part is like a singles bar.  Everyone is wandering around looking for their perfect match.  Inside the spleen and lymph nodes, B-cells check each APC for their perfect 'soul mate' match, wandering around and around year after year trying to find it.

(For those of you keeping track here, we have a police-officer serial killer covered in the chopped-up parts of its dead victim hitting up a singles bar filled with other serial killers all trying to find their soul mates.  This is either the plot of a twisted Netflix series, or a prime example of why it's dangerous to mix metaphors.)

As anyone who has ever been single knows, most attempted matches aren't the right fit.  For B-cells, few will ever find a perfect match in their lifetimes and they don't have the option to 'settle' so they'll be 'lonely' forever.  The point isn't to find love for B-cells, but to make sure that if there is antigen out there you have a B-cell making antibody to match it.  And when they find that match something truly magical happens.

The B-cell starts reproducing like crazy.  It makes huge numbers of copies of itself, which translates into huge amounts of antibodies.  Ever wonder why your lymph nodes swell when you get sick?  This is what's causing that to happen.  It's a sign that your immune system found the culprit and is massing its forces for the final attack.  Since cells can divide about every eight hours, the process of finding and then expanding B-cell numbers to the level needed to protect against infection takes about two weeks total.
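The arithmetic behind that expansion is worth a quick look.  Here's an idealized sketch that takes the eight-hour division time from above and assumes every cell divides on schedule and none die, which real cells won't manage.  The function name is mine.

```python
HOURS_PER_DIVISION = 8  # figure from the text above


def clone_size(days, start_cells=1):
    """Cells after `days` of uninterrupted doubling, assuming no cell death."""
    divisions = (days * 24) // HOURS_PER_DIVISION
    return start_cells * 2 ** divisions


for days in (1, 3, 7):
    print(days, clone_size(days))
```

Under those assumptions a single matching cell becomes 8 cells after one day, 512 after three, and 2,097,152 after a week.  Doubling is slow at the start and overwhelming at the end, which is part of why the buildup takes days rather than hours.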

That's right, I said two weeks to generate immunity.  You probably read that and thought, "Wait a minute, I'm almost always over a cold/flu in less than two weeks."  You're right.  And the reason you got over that infection had nothing to do with your body making antibodies.  The first time you're infected by something, you fight off that infection without help from antibodies or what we call the 'adaptive immune system'.  In other words, the first time is always free, whether it's this season's flu or peanut antigen your body mistakenly makes antibodies to.  Because antibodies work so well, the only time we really notice them working is from allergies, where we can directly observe their function (and how bad that makes us feel).  You don't notice the times you didn't get sick.  But because of the way they work, if you get sick from something it's because you lacked antibodies against that thing, and you get better before the antibodies have a chance to kick in.

There are a lot of cool things antibodies can do, but what I want to focus on here is the amazing idea that each antibody starts from one specific cell, which goes through this elaborate dance of selection to find the exact B-cell that creates a matching antibody.  That specific antibody configuration is then 'clonally expanded' by having that one cell make massive numbers of copies of itself.  This pool of B-cells able to make one specific antibody circulates around the blood standing sentinel for another infection.  It's a system that creates a 'memory' of what you've been infected with in the past.  (For any infection there will be more than one possible antigen you could make antibodies against - possibly dozens or scores.)

Now, the number of specific B-cells of a certain antibody type you need to keep around varies.  Right after you get over an infection you probably want really high levels of protective antibodies, otherwise you'd keep getting re-infected soon after your last infection from other sick people so long as the sickness keeps going around.  However, within a few months of that first infection your body starts looking at all those B-cells it made and does a little spring cleaning.  No need to keep trillions of cells around any longer because the threat level isn't that high, so it starts reducing the number of that one kind of memory B-cell it keeps around for that old season's version of the flu.  This process of slowly killing off extra B-cells continues for years, to the point where if you don't get infected by the same thing for long enough (about ten years) the B-cell numbers get low enough that they can't prevent infection any more.  This is one reason everyone - including adults - needs to get boosters for their vaccinations at least every ten years.  If you got the polio vaccine as a child, but then go volunteer to fight off the last few cases of polio in your 30s without getting a booster, you might mistakenly think you're protected against something your body long ago forgot how to defend against.

(I have no vaccine-related disclosures to make.  I've never worked in vaccine development, and the company I work for does nothing in the vaccine space.  Maybe later I'll write a vaccine-related post, but this one is about clonal selection theory.)

Complicateder and complicateder

I've been intentionally misleading you this whole time by leaving out about half of the story: T-cells.  We've been talking about B-cells as if they're the only kind of cell that makes antibodies.  Technically they are, since their cousin T-cells make a molecule we don't call an 'antibody' (it's called the T-cell receptor), but it's really close to the same thing.  It has a similar 'Y' shape, complete with variable regions on the top ends of the 'Y'.  The T-cell rearranges its genes in order to create a random library capable of recognizing every possible antigen (except instead of doing this in the bone marrow it happens in the thymus - another of those mystery organs).  The T-cell even circulates through the blood and lymph testing itself against APCs to find its perfect antigen match.

They're not exactly the same, though.  The primary difference between B-cells and T-cells is that when B-cells produce extra copies of the antibody they matched to polio (or whatever) they release those antibodies into the blood stream to circulate randomly throughout the body.  Certain special types of these antibodies can even make it across the placenta and into breast milk, providing early protection for babies whose B- and T-cells will take time to develop.  (I'll say more about antibodies and what they can do in another post.)

T-cells, on the other hand, don't release their 'antibodies'.  Instead, they cover their cell surface with them, and personally visit all the possible sites of infection looking to see if they get a match.  There are a few advantages to this, which I'll talk about in another post, but one of the main advantages is that since it's a whole cell doing the detection and not just one protein they can send more complicated signals far and wide.  In effect, a T-cell can recognize specific antigens and then direct the immune system how to respond to that antigen.  How do they know the difference?

Remember back to that situation where the antigen-presenting cell met its perfect match in the lymph node, spleen, or tonsils?  For T-cells, part of that interaction is the APC giving the T-cell a report about the context it found that antigen in:

"Yeah, I saw a lot of cell death nearby, plus there was some evidence a virus might have been involved, because there was some double-stranded RNA at the scene of the crime, which of course we never see unless there's a virus around."  It might give a different report for a bacterium, a fungus, a parasite, etc.  Alternately, it could give a more mundane report, "Nothing interesting here.  Just some harmless pollen."

That report programs the T-cell so that when it next encounters antigen it can make 'decisions' about how to eliminate the threat.  For example, it might call in other cell types that will be helpful at fighting this particular type of threat.  It might tell nearby cells to take steps to protect themselves or attack the invader.  The end result is a more robust attack strategy against a specific threat.

This whole process continues throughout your life and is happening simultaneously whether you're fighting off an infection right now or not.  Just like with B-cells, after the initial exposure is cleared your body doesn't need to keep around the same number of T-cells that it did when it was fighting the infection.  If it tried that, your lymph nodes would swell up into hideous basketball-sized bumps all over your body.  So the immune system pares down that huge initial army of specific T-cells (remember they recognize one and only one target) to a fraction of its size during the infection.  As with the B-cells, this is still many orders of magnitude greater than those naive cells, and these new T-cells are already pre-programmed to recognize what the antigen means when they encounter it again.

Measuring your hypothesis

Okay, now compare your initial hypothesis against what's actually going on in the body to solve the problem.  Did you get the general idea right, even if you didn't really understand the details?  To help define that more narrowly, here's the simple but audacious idea: create billions of random variations encapsulated in individual cells and then select matching cells to expand out.
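If it helps to see that idea stripped down to its skeleton, here's a minimal sketch.  Everything in it is a simplification I'm assuming for illustration: detectors and antigens are three-letter strings, 'matches' means the strings are identical, and the function names, library size, and number of divisions are arbitrary.

```python
import random
from collections import Counter

ALPHABET = "ABCD"   # toy stand-in for amino acids
DETECTOR_LEN = 3


def random_library(n_cells, rng):
    """Each naive cell gets one random detector, fixed for life."""
    return ["".join(rng.choice(ALPHABET) for _ in range(DETECTOR_LEN))
            for _ in range(n_cells)]


def select_and_expand(library, antigen, divisions):
    """Find cells whose detector equals the antigen, then double them `divisions` times."""
    matches = [cell for cell in library if cell == antigen]
    return matches * (2 ** divisions)


rng = random.Random(42)
library = random_library(2000, rng)      # built before any antigen is known
before = Counter(library)["ABD"]         # naive cells that happen to fit
clones = select_and_expand(library, "ABD", divisions=10)
print(before, len(clones))
```

Nothing in the code looks at the antigen while building the library.  The only question ever asked is 'matches' or 'does not match', and the only response to a match is to make more of what already exists.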

If you came up with a good, clever way of solving this problem but it wasn't the way the body actually uses, that's okay.  The point of the exercise is to demonstrate how difficult it is to think of the real solution to a problem a priori.

As I'm writing this, it occurs to me that my own hypothesis might be wrong.  I'm guessing most people won't pre-register the central idea behind clonal selection theory, because most hypotheses are wrong when tested against reality.  Maybe that's not what we'll see in this case.  Maybe I'm implicitly giving it away somewhere in the expanded explanation section.  Maybe the idea of random selection from infinite variation is a concept people are primed to resort to because they bring it over from other domains of science and biology.  Whatever your personal experience with this exercise was, leave a comment on your original hypothesis comment analyzing whether you felt you got close to explaining how antibody specificity is achieved or not.

