The critical nature of the tumor suppressor p53 is indicated by the fact that it is present in a mutated form in somewhere around one third of all human cancers (and in virtually 100% of all ovarian cancers). The "p" in its name stands for "protein"; the "53" for 53,000 daltons, the molecular weight of the protein that was estimated when it was first discovered. However, the technique that had been used to measure its size is imprecise. In actuality, its molecular weight is closer to 48,500 daltons. That's a bit misleading too. 48,500 is the size of a single p53 chain. But the protein doesn't function as a monomer. It is a tetramer inside cells (4 x 48,500, or about 194,000 daltons), thereby increasing the error in p53's name still further. The picture below shows a half molecule of p53 attached to DNA.
p53 is an amazing protein. It, and the gene that specifies its sequence, have been the subject of tens of thousands of papers to date. And the list is growing. It is a transcription factor that somehow monitors the health of cells, particularly the integrity of the genetic material. If it finds something amiss, it can slow division so that the cell can repair existing damage. Alternatively it can direct the cell into a state called "senescence", where it ceases to divide. Or, if the problem is so serious that the damage can't be undone or avoided, it sets the cell on the pathway for programmed cell death, or apoptosis. Since the molecular apparatus for apoptosis is built into the circuitry of most human cells, cancer cells can also be forced to undergo the process. And since cancer cells often carry damaged DNA or appear unhealthy in other ways, p53 acts as an essential guardian against tumors, ridding the body of tumor cells or stopping their division in its tracks. It's no wonder that avoiding the p53 barrier is one of the primary concerns of cancer cells.
A New Reference
In this post, I'm going to offer an overview of the way that p53 works. Some of the information that I hope to convey on the subject comes from a source that I recently discovered: a book published in 2013 called "p53: The Gene that Cracked the Cancer Code" by Sue Armstrong.
Armstrong's book tells the story of p53 from a historical perspective. She interviewed many of the scientists who made significant contributions to our understanding of how p53 works. The book sparkles because many offered candid tales about the errors that they had made, the troubles that they encountered in trying to convince others, and the contributions of people in their laboratories that are often unacknowledged. It's written in an easygoing style and is aimed at readers that don't have a strong grounding in molecular biology. And there's little jargon! All in all, I highly recommend it.
A New Kind of Tumor Suppressor
From early on p53 didn't act like other tumor suppressors. For one thing, tumor suppressors usually require two separate events in order to cause malignancies. There's an initial step that renders the protein inactive, and another that often knocks out the corresponding gene on the other chromosome. However, some p53 mutants didn't seem to work like that. They sometimes increase the risk of cancer when present in only a single copy. For another, most tumor suppressors profoundly perturb normal embryonic development when both copies are absent. Mice lacking both copies of the p53 gene develop normally, although they are very prone to cancer after birth. This second peculiarity was particularly instructive (I'm going to ignore the mechanism governing the first because it isn't essential to this narrative). Because it wasn't required during embryonic and fetal life, it appeared that p53 was solely specialized for cancer suppression, not for normal cell activity.
An observation that fits with this second characteristic is that p53 is present in extremely minute quantities in most cells under normal conditions. Some sophisticated studies showed that its lack of abundance wasn't because it was synthesized at a low rate. Its scarcity was due to the fact that most of it was being destroyed as soon as it appeared. But isn't that wasteful? What's the point of making a lot of protein and then rapidly degrading it? Actually, other proteins have been shown to turn over quickly and the explanation offered by molecular biologists is that it allows cells to react swiftly to events. If you're constantly making a protein at a rapid rate and breaking it down just as fast, you can cause the protein to quickly accumulate if you stop its degradation. By contrast, if you want to increase a protein's concentration by increasing its rate of synthesis, you first have to increase the transcription of its gene, push its mRNA out of the nucleus, and begin translating it in the cytoplasm. If that process is already underway, it makes for a more expeditious response.
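The turnover argument can be put into numbers. What follows is only a minimal sketch, assuming the simplest possible model: a protein is made at a constant rate and destroyed in proportion to how much of it is present. The function name and all of the rate values are invented for illustration; they are not measurements of p53.

```python
def simulate(k_syn, k_deg, p0, hours, dt=0.001):
    """Euler integration of dP/dt = k_syn - k_deg * P; returns the final level."""
    p = p0
    for _ in range(int(round(hours / dt))):
        p += (k_syn - k_deg * p) * dt
    return p

# Hypothetical rate constants, chosen only so both proteins rest at 10 units.
FAST = {"k_syn": 100.0, "k_deg": 10.0}   # made quickly, destroyed quickly
SLOW = {"k_syn": 1.0, "k_deg": 0.1}      # made slowly, destroyed slowly

fast_rest = FAST["k_syn"] / FAST["k_deg"]   # steady state = k_syn / k_deg
slow_rest = SLOW["k_syn"] / SLOW["k_deg"]

# Stress (an assumption of this sketch): degradation stops entirely for one hour.
fast_after = simulate(FAST["k_syn"], 0.0, fast_rest, hours=1.0)
slow_after = simulate(SLOW["k_syn"], 0.0, slow_rest, hours=1.0)

print(f"fast turnover: {fast_rest:.0f} -> {fast_after:.0f} units")
print(f"slow turnover: {slow_rest:.0f} -> {slow_after:.0f} units")
```

With these made-up numbers, both proteins start at the same low level, but the one that turns over quickly climbs about elevenfold within the hour while the slowly made one rises by only a tenth. The cost of the quick response is the constant synthesis and destruction beforehand.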
Increasing the Levels of p53
What specifically are the factors that decrease the rate of degradation of p53 and thereby increase its concentration? The list is remarkably long. It includes radiation, lack of oxygen, DNA damaging agents of various sorts, blockage of transcription, blockage of replication, and many more. p53 appears to monitor the health of cells, and when that health is impaired, it acts. That's what makes it so important, especially, but not exclusively, for the prevention of cancer. How does it sense these disturbances? What does it do about them? How does it do it? These are all questions for the next post.
You'll not be surprised to learn that cell division works much like many of the other complex biochemical pathways that I've described previously. Namely, it is mediated by a Rube Goldberg procession of events, one reaction cascading into another, to produce a final endpoint or endpoints. Moreover, like many of the processes I've previously described, the individual steps are often controlled by protein kinases, enzymes that add a phosphate group onto a protein, causing it to transition from an inactive to an active state (or, less frequently, vice versa). However, one major difference between a Rube Goldberg machine and the pathways previously noted is that most biological processes make use of mechanisms that check to see if a step has been completed before going on to the next. The cell cycle offers no exception: it proceeds by a complex series of steps; it utilizes protein kinases; and there are checkpoints along the way that monitor progression through the cycle. Cancer cells are particularly adept at subverting these last surveillance mechanisms. In addition, all the complex pathways that I've discussed can be slowed or stopped by molecules that inactivate one or more steps. Tumor suppressors are a case in point.
I'll begin with the role of kinases because they are the main players in the cell cycle. There are lots of them and they are involved with virtually all of the steps that occur during cell division. For example, the nuclear membrane breaks down as mitosis gets underway, a process that is promoted by particular kinases that have the job of transferring phosphate groups to the nuclear membrane. Kinases also phosphorylate transcription factors that help to turn on genes appropriate to particular steps during the cell cycle. There are as many as a hundred other examples. But what are the regulatory molecules that turn on these kinases? As you might have guessed, the answer is other kinases. Foremost among them are a group of enzymes called cyclin dependent kinases. As their name implies, these proteins are inactive by themselves. Only when bound by a second protein, one devoid of kinase activity, can they transfer phosphate groups onto other molecules. This second protein, first discovered by Tim Hunt and named by him, is called cyclin.
In actuality there are four cyclins, called D, E, A, and B, that change in concentration (or availability) at various stages of the cell cycle. A figure illustrating how they fluctuate is shown below.
It is cyclin D that initiates the cell cycle after a cell has completed division. Its concentration is dependent on the presence of growth factors outside of cells and the subsequent cascades of reactions that follow their binding to their appropriate receptors. Ultimately, cyclin D synthesis is driven by transcription factors that are activated by these cascades. As its concentration increases it binds to two cyclin dependent kinases, cdk4 and cdk6. The pairing of a cyclin with these kinases activates them. They add phosphate groups to a variety of molecules that take the cell from the start of G1 to a point near the end of G1 called "R", the restriction point. If a cell makes it past R, it will continue its way through the cycle, through the remainder of G1 and then through S, G2, and M, without further input from growth factors. In fact, it will be largely unresponsive to growth inhibitors that are only effective before R. If a cell fails to pass the restriction point, it will depart the cell cycle, and will enter a stage called G0 where it may differentiate.
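The logic of the restriction point can be sketched as a tiny program. This is a deliberately crude illustration, not a model of real cells: time is chopped into arbitrary steps, the number of steps needed to reach R is a made-up parameter, and the assumption that growth factors must be present at every step before R is a simplification.

```python
PHASES_AFTER_R = ["late G1", "S", "G2", "M"]

def run_cycle(signal_timeline, steps_to_r=3):
    """Trace a cell's path given growth-factor presence (True/False) per step.

    steps_to_r is a hypothetical number of signalled steps needed to reach R.
    """
    path = []
    for step in range(steps_to_r):
        has_signal = step < len(signal_timeline) and signal_timeline[step]
        if not has_signal:
            path.append("G0")          # fails to reach R, leaves the cycle
            return path
        path.append("G1")              # cyclin D/cdk4,6 keep G1 moving
    path.append("R")                   # committed from here on
    path.extend(PHASES_AFTER_R)        # no further growth-factor input needed
    return path

# Growth factors withdrawn after R: the cell still finishes the cycle.
print(run_cycle([True, True, True, False, False]))
# Growth factors withdrawn before R: the cell exits to G0.
print(run_cycle([True, False, True]))
```

The point of the sketch is the asymmetry: the signal is consulted only in the loop that runs before R, and never again afterwards.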
Because the restriction point is so critical, it is thought that most cancers have figured out ways to bypass it. Before describing how the restriction point works, I should make clear that in addition to the various cyclins shown above, there are a variety of cyclin dependent kinases. These associate with the cyclins, become active, and move the cell cycle along (see the figure at the right). The black arcs in the illustration represent the times in which these cyclin/cdk complexes operate during the cell cycle.
The Restriction Point
The protein in charge of the restriction point is the product of the retinoblastoma gene, RB. After a cell leaves mitosis, RB lacks phosphate groups. As it progresses through G1, more and more phosphate groups are added to it, until, at the restriction point, most of the possible sites of phosphorylation are filled. It turns out that RB in its unphosphorylated form acts as a cell cycle inhibitor, preventing the cell from leaving G1. When it gets fully phosphorylated, it loses its inhibitory behavior, thereby allowing the cell to pass R and progress through the remainder of the cell cycle. As Weinberg so beautifully puts it, RB serves as a guardian of the R gate, keeping it closed through G1 until it is inactivated by the addition of phosphate groups.
But this explanation raises two further questions. First, what is it that unphosphorylated RB does to prevent cells from progressing through G1? Second, what proteins add the phosphate groups to RB? The answer to the first question is that RB is a "pocket protein". It has the ability to bind other proteins, among them a transcription factor that is important for the G1 to S transition. Because this transcription factor is sequestered, it cannot act. Upon phosphorylation, RB releases its pocketed transcription factor which, in turn, switches on the synthesis of cyclin E. As to which protein places phosphate groups on RB, initially, it is the cyclin D/cdk4 and 6 complex that begins the job. As the cell progresses to R, the activity of the newly appearing cyclin E complexed with cdk2 completes the process. Since cyclin D concentration is directly linked to growth factors, the complex series of events from growth factor to initiation of the cell cycle should be apparent. For further clarification, take a look at the figure above.
The RB protein is depicted as a "pocket", holding onto transcription factors that are necessary for the cell to pass through the restriction point. Growth factors, like epidermal growth factor mentioned previously, start several cascades of reactions that eventually result in the appearance of cyclin D. Complexed with cdk4 and 6, it begins to phosphorylate RB, resulting in the release of other transcription factors that increase transcription of the cyclin E gene and production of the cyclin E protein. In turn, cyclin E complexes with cdk2, producing an enzyme that further phosphorylates RB, inactivating it, thereby allowing the cell to pass through R and proceed through the other steps of the cell cycle.
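The hand-off from cyclin D to cyclin E described here has the shape of a self-reinforcing loop, and a few lines of code can make that shape visible. This is a toy sketch under loudly stated assumptions: the number of phosphorylation sites, the threshold at which RB lets go of its transcription factor, and the per-step rates are all invented, and the function name is hypothetical.

```python
def rb_gate(cyclin_d_steps, total_steps=12, sites=10, release_threshold=4):
    """Return True if RB ends up fully phosphorylated (the cell passes R).

    cyclin_d_steps: how many steps growth factors keep cyclin D around.
    All numbers are illustrative assumptions, not measured values.
    """
    phosphates = 0
    cyclin_e = False
    for step in range(total_steps):
        if step < cyclin_d_steps:
            phosphates += 1            # cyclin D with cdk4/6 starts the job
        if cyclin_e:
            phosphates += 2            # cyclin E with cdk2 finishes it
        phosphates = min(phosphates, sites)
        if phosphates >= release_threshold:
            cyclin_e = True            # released factor turns on cyclin E
    return phosphates == sites

print(rb_gate(cyclin_d_steps=2))   # too little cyclin D: gate stays shut
print(rb_gate(cyclin_d_steps=4))   # enough to trip the loop: gate opens
```

In the second call cyclin D disappears after four steps, yet RB still becomes fully phosphorylated, because by then cyclin E is sustaining the process on its own. That is one way to picture why a cell past R no longer needs growth factors.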
All this is complicated, but I've actually left out several steps and only described the essence of the process. For example, I haven't mentioned the role of protein degradation in controlling cyclin D. And I've left out several key components. However, what should be clear is that the retinoblastoma protein (RB) plays a critical role in blocking the cell cycle until it is signaled to advance by the end products of growth factor stimulation. If something were to happen to prevent the retinoblastoma protein from doing its duty, a mutation in the retinoblastoma gene for example, the cell cycle would lose one of its main control elements. Cells then could proliferate inappropriately and cancer might result. It is for this reason that Weinberg writes that it may well be that "virtually all human tumors" bear defects in RB signalling.
By way of emphasis, I want to repeat one point mentioned previously. Recall that an oncogene stimulates carcinogenesis by acquiring mutations that increase its growth promoting activities. RB, as a representative of tumor suppressors, increases growth when it loses activity via mutation. In the next post, I'll write about another important tumor suppressor, one that acts entirely differently than RB, but whose absence seems to be correlated with nearly all cancers.
Oncogenes were first discovered in a tumor virus that had captured an animal's proto-oncogene, appended it to its own genome, and subverted it to its own ends. Proto-oncogenes are cell growth promoters. They stimulate cell proliferation. It's no wonder that viruses use them to encourage growth of the cells that they infect. But non-virally induced cell growth has to be carefully controlled and organisms have developed wonderfully complex ways of accomplishing that mission. That's where tumor suppressors come in. Their name is something of a misnomer. Their role is to make sure that normal cell growth is limited and properly regulated.
Tumor suppressor genes were first discovered in individuals that had a hereditary disposition to cancer. For example, the first tumor suppressor gene was found as a result of analyzing youngsters with familial retinoblastoma, a disease that afflicts approximately 1 in 20,000 children aged six and younger. Actually, there are two forms of the disease. In about 55% of cases, it's not hereditary; no close relatives show evidence of the disorder. In addition, most often only one eye is affected. This condition is termed "sporadic" retinoblastoma. The remaining cases exhibit a dominant pattern of inheritance and often both eyes harbor tumors. In the 1970's, the retinoblastoma gene was mapped to the 13th human chromosome. And in 1986, the gene was cloned. It was soon shown that children with the hereditary form of the disease harbor mutations in the retinoblastoma gene that render the protein that it specifies inactive.
In order to explain these discoveries, scientists guessed that the retinoblastoma protein somehow acts to curtail cell proliferation. When it is inactivated via mutation, cells continue to divide and soon form tumors. But this hypothesis met with some difficulties. Humans have two 13th chromosomes. Analysis of the genome of affected individuals showed that only one of the two bore a defect in the retinoblastoma gene. Why wasn't the normal version of the gene making up for the deficit in the other?
The explanation was simple, but surprising. Ordinarily, DNA analysis is carried out on white blood cells because they are a convenient source of genetic material. That's how the defect in the retinoblastoma gene was detected. However, when the DNA of the tumor itself (rather than that of other cells in an affected individual) was analyzed, it was found that, in many cases, both chromosomes bore mutant copies of the retinoblastoma gene! Apparently, tumor cells somehow eliminated the normal copy of the retinoblastoma gene and replaced it on the homologous 13th chromosome with its defective partner. This phenomenon is called "loss of heterozygosity". It's caused by genetic exchange between two homologous chromosomes. Tumors use this strategy so often that it has been used to find other tumor suppressor genes. (There are other mechanisms that tumors use to eliminate or repress the normal copy of a tumor suppressor gene. They include complete loss of the homologous chromosome, loss of large sections of the homologous chromosome, and repression of the expression of the normal gene by the addition of methyl groups to its promoter regions.)
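The comparison at the heart of this finding, blood DNA versus tumor DNA from the same person, can be written as a simple check. This is a minimal sketch: genotypes are reduced to a pair of allele labels, and the labels "RB" (normal) and "rb" (mutant) are hypothetical shorthand, not standard nomenclature.

```python
def is_heterozygous(genotype):
    """A genotype is a pair of alleles; heterozygous means the two differ."""
    return len(set(genotype)) == 2

def loss_of_heterozygosity(blood, tumor):
    """True when the blood carries two different alleles but the tumor only one."""
    return is_heterozygous(blood) and not is_heterozygous(tumor)

# Hypothetical patient with hereditary retinoblastoma:
blood = ("RB", "rb")    # one normal copy, one mutant copy
tumor = ("rb", "rb")    # the normal copy has been replaced by the mutant one

print(loss_of_heterozygosity(blood, tumor))
```

The same comparison, applied to markers scattered across the chromosomes, is the idea behind using loss of heterozygosity to hunt for other tumor suppressor genes.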
The retinoblastoma gene is one of the most important tumor suppressors. I intend to discuss it in more detail in the next post. But in order to better explain its role in the cell and in tumorigenesis in general, I'm going to take a detour into the world of cell division. That's because the retinoblastoma protein plays a determining role in deciding whether cells should divide or not. For those of you who want a historical overview of how scientists discovered the mechanism of cell division, I recommend a short book, "The Cell Cycle" by Andrew Murray and Tim Hunt. It's 25 years old, but it remains a wonderful introduction to the subject. Professor Hunt was awarded the Nobel Prize in Physiology or Medicine in 2001.
As we've seen, growth factors are the means by which some cells tell others to proliferate. They act through growth factor receptors that transmit information from outside the cell, through the cytoplasm, to the nucleus by a complex series of intracellular cascades. Ultimately, these various signals have to be integrated so that the cell can take one of three paths - to divide further; to stop dividing and differentiate; or to commit suicide. If a cell opts to divide, it goes through a series of steps called the cell cycle. First it must prepare for DNA synthesis. The interval between cell division and DNA replication is called "Gap1" or more commonly, "G1". In the next step, it replicates its DNA so that its daughters can each get a full complement of genetic material. The period in which DNA is being replicated is called "S" for DNA synthesis. Next there is another gap in which the cell prepares for the actual physical division of the cell. It's called "Gap2" or "G2". Finally, the cell splits in two. This process, called "M" for mitosis, ensures that each daughter is dealt the proper chromosome constitution. This series of steps is repeated each time a cell divides. In mammalian cells at body temperature, a complete round of cell division takes about a day, although the timing varies depending on the cell type.
I've drawn a cartoon of one cell division cycle and it's shown at the right. Each sector represents the time that the cell spends in a particular stage of the cycle. In subsequent posts, I plan on adding to this figure to illustrate some of the biochemical events that drive the process.
Up till about 50 years ago, cell division was carefully examined microscopically in a variety of cell types and meticulously described. But there was virtually nothing known about the biochemical events that drove the process. Most of the progress in elucidating the mechanism behind the cell cycle came from a combined biochemical and genetic attack. Again, the fascinating story of how the scientific community discovered how cell division worked is told in Murray and Hunt's book. I'll not get into the history, but will try to present what is known starting in the next post.
I've written about two proteins: the EGF receptor and Ras. The former reacts to growth factors by dimerizing and transferring phosphate groups to one another's tails. The latter becomes activated by binding GTP, thereby initiating a cascade of protein kinase reactions leading to phosphorylation of transcription factors in the nucleus. How are these two related? Or more specifically, how does the positioning of phosphate groups on the EGF receptor lead to activation of Ras?
The answer to these questions came, peculiarly enough, from a series of studies on eye development in fruit flies. The story is too long to relate here, but the upshot was the discovery of two proteins that bound to the tail of the EGF receptor. The binding changed their position in the cell and allowed them to interact with the Ras protein, specifically catalyzing the exchange of the GDP bound to Ras for GTP, and thereby activating it.
In the course of these and later investigations, an important and previously unknown property of proteins was discovered: There are domains in proteins whose function is to recognize and bind to specific sequences of amino acids and thereby position one protein near another. Two such domains are called "SH2" and "SH3" (SH stands for "Src homology" because the domains were first discovered in the protein encoded by the Src oncogene). Since their initial identification many similar protein binding domains have been found but I'll focus on SH2 and SH3.
SH2 is a sequence of about 100 amino acids. When present in a protein, it folds up into a structure that can bind to a section of a protein carrying one phosphorylated tyrosine followed by a sequence of three specific amino acids. There are over 100 such SH2 domains found in the human protein set, each capable of binding to a different sequence of amino acids (but all requiring a phospho-tyrosine at one end of their recognition sites). A protein called "Grb2" possesses a single SH2 domain. It uses this domain to grab on to the tail of an EGF-receptor, recognizing a specific phospho-tyrosine containing sequence. It's important to know that Grb2 possesses no enzymatic activity. But it does carry, in addition to its single SH2 domain, two SH3 domains. These are about half the size of SH2's. They recognize and bind to a short protein sequence that includes the amino acid proline.
The sequence of events that occurs after the EGF-receptor binds the EGF growth factor should be clearer if you examine the figure above. First, one or more of the tyrosine amino acids in the receptor tail become phosphorylated via the kinase domain of the receptor. Next, using its SH2 domain, the Grb2 protein recognizes one of these phosphotyrosines and its nearby amino acids. Since the Grb2 protein also possesses two SH3 domains, it can serve as a bridge to bind the Sos protein, which, of course, has two proline rich sites capable of being bound. Finally, Sos, which is a guanine nucleotide exchange enzyme, is now in position to catalyze the removal of GDP from the Ras protein. GTP, present in much greater abundance than GDP, takes its place and Ras becomes active and promotes the kinase cascade described in the last post.
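The steps above amount to a matching game between binding domains and the sequences they recognize, and that can be sketched in code. This is only an illustration under simplifying assumptions: each protein is reduced to two sets of labels, the class and function names are invented, and real binding depends on the specific amino acids around each site, which this sketch ignores.

```python
from dataclasses import dataclass, field

@dataclass
class Protein:
    name: str
    domains: set = field(default_factory=set)   # what it can grab with
    motifs: set = field(default_factory=set)    # what others can grab on it

# Which motif each binding domain recognizes (simplified from the text).
RECOGNIZES = {"SH2": "phosphotyrosine", "SH3": "proline-rich"}

def can_bind(binder, target):
    """True if any of the binder's domains recognizes a motif on the target."""
    return any(RECOGNIZES.get(d) in target.motifs for d in binder.domains)

def ras_nucleotide(egf_bound):
    """Return the nucleotide on Ras, 'GTP' (active) or 'GDP' (inactive)."""
    receptor = Protein("EGF receptor", domains={"kinase"})
    if egf_bound:
        receptor.motifs.add("phosphotyrosine")   # tail gets phosphorylated
    grb2 = Protein("Grb2", domains={"SH2", "SH3"})
    sos = Protein("Sos", domains={"exchange"}, motifs={"proline-rich"})
    # Grb2 must hold the receptor tail and Sos at once to act as a bridge.
    sos_in_position = can_bind(grb2, receptor) and can_bind(grb2, sos)
    return "GTP" if sos_in_position else "GDP"

print(ras_nucleotide(egf_bound=True))
print(ras_nucleotide(egf_bound=False))
```

Notice that Grb2 does nothing enzymatic in the sketch, just as in the cell. Its whole contribution is to be in two places at once.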
I don't know about you, but I find these complex cascades of reactions overwhelming. But the example that I illustrated is only one of a larger number of parallel signalling pathways operating to promote growth. Ras itself is involved in two other cascades, each of which triggers another series of reactions. One of these, called PI3K, instigates a series of reactions that increases cell proliferation while also lessening cell death via apoptosis.
And that's not all. We know about additional pathways with names such as "Jak-STAT", "Wnt-Beta-Catenin", and "Notch". And there are undoubtedly others that are unknown. For example, most breast cancers grow in response to estrogen. There is evidence that estrogen activates the Ras-Raf-MEK-Erk1 pathway described here and in the last post, but no one seems to understand the molecular mechanism involved despite the fact that breast cancer has been intensively studied for many years.
The last two posts have been about oncogenes - genes that promote proliferation by stimulating growth and inhibiting cell death. In the next post, I'll begin a discussion of tumor suppressors, genes that act as brakes for cell growth.
In the last post, I used the epidermal growth factor receptor as an example to illustrate how signals outside of cells can traverse the cell membrane and provide information that can be used inside. Recall that, as a result of EGF binding, two EGF receptors embrace, and the close proximity of their internal kinase domains causes each partner to attach phosphate groups onto specific amino acids (tyrosines) of the other. The question I want to get to in this post is: What happens next? How is the addition of phosphates used to provide instructions to the other cell constituents, including the DNA in the nucleus, so that cells can begin to proliferate?
Before addressing these questions, allow me to devote a paragraph to the strategy that scientists took (and are taking) to solve these riddles. In essence, biologists have traditionally used two fundamentally different approaches. One, that associated with the discipline of biochemistry, is to take the individual components suspected of acting in some process out of the milieu in which they ordinarily act, purify them, and study their characteristics in a container at the laboratory bench. Biochemical analysis has the advantage that it can obtain detailed information about a particular protein and its operation. It has the disadvantage that it can miss the big picture. A second strategy, used by scientists who call themselves geneticists, is to induce errors in DNA (mutations) that perturb a process in an organism. By studying the effects of changes in individual genes on how a given process is disturbed, geneticists can often deduce the function of these genes and the proteins that they encode. This second approach requires that the process being studied occurs in an organism that can be genetically manipulated. For that reason, many of the internal signalling pathways that lead to cancer have been elucidated in what are called "model organisms", like yeast, fruit flies, and worms. As a former fruit fly geneticist, I'm proud of the central role that the little creature that I studied has played.
A major contributor to the cell signalling pathway that lies downstream of the growth factor receptor is a gene called "Ras". It was discovered more than 50 years ago in two different viruses that induced rat sarcomas, hence its name. Robert Weinberg, among others, showed that there are three cellular counterparts to the viral genes. And over the years, it has become apparent that the three Ras genes, these proto-oncogenes, play a pivotal role in growth promotion and in cancer progression. In fact, mutations in the cellular version of the Ras gene are found in about a quarter of all cancers.
The Ras protein sits inside the cell at the cell membrane (unlike the EGF receptor, it doesn't traverse the membrane) and exists in two states: on and off. When it binds GTP (guanosine triphosphate) it becomes active; when the bound GTP loses a phosphate and becomes GDP (guanosine diphosphate), it turns inactive (see the figure above for the structure of these two nucleotides). Recall that GTP and GDP are small molecules that are omnipresent in cells, but GTP is usually 10 times more abundant than GDP. The active form of Ras has an affinity for another protein called Raf and when the two bind together, Raf becomes an active protein kinase. It, in turn, transfers a phosphate group to a protein called MEK, which becomes an active protein kinase, transferring a phosphate group to yet another protein called Erk1. This Rube Goldbergian parade is shown in the accompanying illustration.
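The on/off behavior of Ras and the relay that follows can be captured in a few lines. This is a bare-bones sketch that treats every step as all-or-nothing; the class and method names are invented for illustration, and nothing about timing or amounts is modelled.

```python
class Ras:
    """A two-state switch: 'GTP' bound means on, 'GDP' bound means off."""

    def __init__(self):
        self.nucleotide = "GDP"        # resting state in this sketch

    def exchange(self):
        self.nucleotide = "GTP"        # GDP leaves, GTP takes its place

    def hydrolyze(self):
        self.nucleotide = "GDP"        # bound GTP loses one phosphate

    @property
    def active(self):
        return self.nucleotide == "GTP"

def cascade(ras):
    """Return the kinases switched on downstream of Ras, in order."""
    relay = ["Raf", "MEK", "Erk1"]     # each one activates the next
    return relay if ras.active else []

ras = Ras()
print(cascade(ras))     # off: nothing downstream is active
ras.exchange()
print(cascade(ras))     # on: the whole relay fires
ras.hydrolyze()
print(cascade(ras))     # off again once GTP becomes GDP
```

In this picture a Ras that could not carry out the hydrolysis step would stay on and keep the relay firing.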
But I haven't told you the whole story. Normally, this pathway is shut off when growth factors aren't present. Ras is inactive because it and some other enzymes are continually removing one phosphate from any bound GTP that happens to stray into its grasp. How, then, does Ras become activated? The answer to this question requires a short detour that I'll take in the next post.
When evolution first fashioned multicellular organisms, a problem arose. The main goal of unicellular organisms like bacteria and protozoans is to reproduce; to leave as many descendants as possible to ensure that their genes are passed to the next generation. While that aim continued with the advent of multicellularity, the individual cells in a multicellular creature are constrained. They are parts of a community and cannot go off, willy-nilly, reproducing on their own. Their ultimate job is to make sure that the genes carried by germ cells, sperm and eggs, not their own genes, are propagated. They live to sacrifice their own welfare for the greater good. As a part of this behavior, they must interact with their fellows, adjusting their rate of proliferation to match their neighbors so that tissues and organs form and function properly. Moreover, as cells die, new ones must replace them. That process too, tissue renewal, must be carefully controlled.
In order for organisms to regulate the growth of their tissues and organs, cells must communicate. They must inform each other when to divide, when to stop, even when to die. The major signalling molecules in this process are called "growth factors", and they generally are proteins. But proteins are much too large to cross cell membranes. How can a molecule like a protein affect the behavior of a cell if it can't pass through the cell membrane?
We've already seen the answer. Cells have proteins embedded in their membranes, called receptors, that function to pass information from outside to inside. How growth factors achieve this aim was first worked out by studying the mechanism of action of a growth factor, called EGF (for "epidermal growth factor"). EGF, as its name implies, serves to stimulate the proliferation of a variety of epidermal cells. It works by interacting with a receptor molecule, called logically enough, the EGF-receptor.
The figure above right shows the EGF-receptor protein before it has bound a molecule of growth factor. The yellow rectangle is supposed to represent a portion of the cell membrane. The receptor protein chain is shown as a dashed line. The portion of the receptor outside the cell is drawn in green and labelled "Binding Domain". (A domain is a portion of a protein that folds up independently of the rest, often performing a specific function. Domains will play a prominent role in future postings.) EGF can affix to the binding domain. A small area of the protein, drawn in red, traverses the membrane (the "Transmembrane Domain"). A third section of the receptor that lies inside the cell is called the "Kinase Domain". It is shown as a blue bulge. Its function is to enzymatically transfer phosphate groups from ATP or GTP to another protein. Below the kinase domain is a tail of amino acids that is located at the terminus of the protein.
As mentioned in a previous post in conjunction with immune receptors, the EGF receptor is free to move laterally within the membrane. Occasionally, it contacts another similar receptor and briefly forms a dimer (see the second illustration at the right and compare it with the one above). As shown, these transient dimers are good at binding EGF. When EGF binds, it both stabilizes the association between the two receptors and activates the two closely apposed kinase domains. They add phosphate groups onto specific amino acids (tyrosines) in the tails of their adjoining partners. We'll delve into the consequences of this transphosphorylation in the next post, but suffice it to say, they are profound, allowing the receptor to set up a cascade of reactions that promote cell growth.
There are many growth factors that interact with their specific receptors and operate in a manner similar to the EGF receptor. Weinberg claims that DNA sequencing of the human genome has identified 59 proteins that bear close resemblance and presumably have similar functions. If that isn't enough, there are a host of other unrelated receptors that also send growth signals from the outside of cells to the inside. Some bear kinase domains and transfer phosphates to proteins; others use alternative mechanisms for signalling. A detailed description of these other receptors is beyond the scope of this blog. However, a key take home lesson is that many cancers harbor errors in receptors in general, errors that contribute to the malignancy of cells. I'll discuss this further in the upcoming posts.
After many cancer-causing genes - oncogenes - were found in tumor viruses and their cellular counterparts - proto-oncogenes - identified in normal cells, it seemed possible that mutations in proto-oncogenes might be responsible for cancer. The idea was that these genes could become corrupted and, instead of performing their normal function, could cause proliferation to run amok. If so, all tumors, not only those that were induced by viruses, should bear such mutated genes.
The task of picking out oncogenes from among the thousands of genes in the genome appeared daunting. But a technique was soon developed that did the job (see figure). DNA was extracted from tumor cells, fragmented, and introduced into a line of mouse cells growing on a Petri plate. The recipient cells were carefully chosen. They had to grow well in culture and readily take up foreign DNA and integrate it into their chromosomes. If a cell happened to take up a piece of DNA carrying an oncogene, it would take on some of the characteristics of a cancer cell. Namely, it would change shape and its progeny would pile up on one another, forming little clumps among a flat background of normal cells.
But were the cells in these clumps really cancerous? To find out, scientists injected the cells into mice as Rous had done (albeit with chickens) decades previously. And, like Rous, they found that the injected cells did, in fact, produce tumors. With this information in hand, research now turned to a far bigger and more difficult question: How do oncogenes turn normal cells into cancerous ones?
The answer to that question is complex. For example, a gene called myc, named for its discovery in an avian myelocytomatosis virus, was found to be a potent oncogene. Its cellular counterpart, c-myc, can become cancer-causing by several mechanisms. One way is by an increase in the number of copies of the gene through a process called gene amplification. Apparently, the presence of multiple copies of the myc gene results in an increase in the protein that myc specifies. Since the myc protein is a growth promoter, the result is an acceleration of cell proliferation and cancer. Gene amplification seems to be a general phenomenon. Amplification of a variety of oncogenes besides myc is known to be associated with at least 20 different malignancies including breast, colon, lung and pancreatic cancer.
Another way of turning myc into an oncogene is by modifying the sequences that control its transcription. Normally, genes are regulated by nearby DNA regions that manage the rate of RNA synthesis. In several cancers, myc comes under the control of sequences that increase the rate of transcription, thereby flooding the cell with growth promoters, causing unregulated proliferation.
Still another mechanism of perverting a proto-oncogene is via a change in the amino acid sequence of the protein that it specifies. The ras gene offers an instructive example. The ras gene, named for the rat sarcoma virus in which it was first found, plays an important role in normal cells. However, it can become a cellular oncogene and can be detected as such via the assay described above. DNA sequencing revealed that a single base change, a G to a T, was responsible for turning Dr. Jekyll into Mr. Hyde. The mutation results in a single amino acid change, a glycine into a valine, and that is sufficient to make a difference. The ras gene has been found to be so altered in 90% of pancreatic and 45% of colorectal cancers.
In the next post, I'll begin to delve into the intricacies of how cell growth is controlled by signals from the cells' surroundings.
The genes that were associated with tumor formation were first found in viruses. So-called "tumor viruses" don't kill their hosts directly. Instead they force the cells that they infect to multiply uncontrollably. It turns out that virus-induced cancers are rare, accounting for less than one fifth of all malignancies. Nevertheless, scientists in the 1970's recognized that they might provide clues to the mechanism of carcinogenesis in general. The idea was that if one could understand how a gene in a virus led to cancer then that knowledge could be more widely applied. Ultimately, these studies resulted in the discovery of oncogenes.
Peyton Rous and the Plymouth Rock Hen
The history of the connection between viruses and cancer actually begins in the second decade of the twentieth century. One day a chicken farmer arrived at the Rockefeller Institute in Manhattan bearing a Plymouth Rock hen in her arms. The bird had a large mass (a sarcoma) growing out of its right breast muscle. It isn't clear what the farmer expected from her visit to that venerable institution but she and her bird were shown into Peyton Rous' laboratory. Weinberg writes that Rous "dispatched" the hen but that's not quite correct. Rous etherized the chicken and removed the mass. He sliced it into fragments and injected a few into the chicken's other breast and several into its peritoneal cavity. The hen died a month later. By that time, the cancer had grown in the injected locations and was likely responsible for the fowl's demise. Injections of fragments of the tumor into similar locations in closely related chickens also caused tumors. By continually transferring pieces of tumor from one chicken to another, the cancer could be propagated indefinitely.
These experiments had shown that the agent promoting the cancer was transmissible. Rous wondered what the biochemical nature of the agent was. Could it have been the tumor cells themselves? Or chemicals emanating from the tumor? Could it have been bacteria or something more exotic?
To find out, Rous took a relatively straightforward approach. He took the tumor, ground it up in a weak salt solution, and passed the resulting slurry through a filter. The filter that he used was sufficiently fine that it didn't allow anything bacteria-sized or bigger to pass through. Subsequently, he injected the filtrate (the solution that came through), into the breasts of susceptible chickens. The result: The chickens got tumors! He concluded that the cancer causing agent was smaller than a bacterium and probably a virus (he didn't use the term because it wasn't popular at the time), or, less likely, some chemical given off by the tumor.
The scene now shifts ahead several decades to a laboratory at Cal Tech in Pasadena, California. There two scientists found that Rous' virus, now called RSV (for Rous Sarcoma Virus), could be used to infect chicken cells grown in a Petri dish. The infected cells acted in several ways as if they had become cancerous. They didn't stop growing after they covered the entire dish in a single layer. Instead, they piled on top of one another, unlike normal chicken cells. They acquired a distinctive rounded shape that differed from normal cells. And they grew indefinitely. What was it in the virus that caused these dramatic changes?
RSV and similar viruses have a very small genome consisting of fewer than five genes (RSV is an RNA virus, a retrovirus, that reproduces by making a DNA copy of its genome and integrating it into the chromosome of its host). It seemed to researchers that if a viral gene was responsible for transforming normal cells into cancerous ones, it would be a simple task to identify which one did the deed. The genes that specify the viral coat were ruled out, as were the genes responsible for converting the RNA into DNA and integrating a DNA copy of the virus into the chicken genome. What was left was a gene that researchers called src (pronounced "SARK"), short for "sarcoma", named for its probable role in cancer formation.
All now seemed clear. The scientists involved reasoned that the virus attaches its DNA to the chicken chromosome, and the newly introduced src gene somehow causes the infected cells to become cancerous. To demonstrate that their reasoning was correct, they infected some chicken cells with the virus and, using a well-worn molecular technique, assayed the cells for the src gene. As expected, they found it. However, as a control, they looked for the src gene in uninfected cells. To their astonishment, it was there too.
What soon became clear was that every animal they looked at harbored a src gene. Mammals, fish, insects, even sponges had it, or at least a close relative: a gene with almost the same sequence. What was going on? To shorten a long explanation, it appears that the cellular version of the src gene, called c-src to distinguish it from its viral cousin (v-src), is an important gene that regulates cell growth in a great variety of organisms. Somehow RSV had picked up the gene and corrupted (mutated) it, causing it to promote unwarranted proliferation after viral infection. The implications of this interpretation were clear: perhaps some cancers were caused by mutations in the normal c-src gene. Perhaps, by looking in other tumor viruses, other genes could be identified that had cellular counterparts.
And so it was. The myc gene, a gene present in another avian RNA virus, was found to induce bone marrow cancers in chickens. And in time more than 30 similar genes were found in viruses that infect chickens, mice, cats, and monkeys. The cancer-causing genes in these viruses were termed "oncogenes", and their normal equivalents "proto-oncogenes". Scientists predicted that these proto-oncogenes could be converted into oncogenes by mutations.
These observations opened up a key question: How do oncogenes, either of cellular or viral origin, change a cell from normal to malignant? That is the subject of the next post.
We all begin life as a single cell - a zygote - the result of fertilization of one of our mother's eggs by a sperm from our father. In time, and after many cell divisions, a multitude of different cell types form, each destined to carry out a specific function, each with distinct physical and biochemical properties. This process, whereby a cell becomes committed to a specific identity, is called differentiation. The mechanism of differentiation has been worked out: A cell becomes differentiated by transcribing a limited portion of its genetic endowment. Each of the two hundred or so different cell types employs a specific and limited set of genes.
Normally, differentiation is irreversible. Liver cells don't become muscle cells, and nerve cells don't become skin cells. So it was surprising to the scientific community that cells don't lose genetic information during differentiation, despite the fact that much of their genomes goes unused. It's now firmly established that every cell, regardless of its differentiated state, carries the same complement of DNA as a zygote. It certainly appears to be an inefficient design strategy since every time a cell replicates its DNA it uses a considerable amount of energy to insert the correct bases in the right order into the newly forming strands. If a cell eliminated all the DNA that it didn't need during differentiation, it would save a lot of wasted energy.
But carrying a zygote's worth of DNA in every differentiated cell has another, more serious, drawback. It opens up the possibility that a rogue cell could make use of genes that it normally has no access to and thereby behave inappropriately. One such abnormal behavior is called cancer. Cancer begins with a single cell. Somehow one of its genes has mutated (acquired a change in DNA sequence). Suddenly, instead of behaving as a neighborly member of a multicellular society, it begins to act selfishly, proliferating abnormally. New mutations occur as the cells divide. Then evolution via natural selection takes over. Offspring of the original mutated cell that are best able to survive and reproduce are favored over those cells that are less capable. Soon a tumor forms. If it is not removed, it may continue to grow. Additional mutations may occur that allow the cells to escape the original mass of tumor cells and travel through the lymph system or blood vessels to other sites, a process called metastasis. Without medical intervention, the outcome may be dire.
Types of Cancer
More than 80% of cancers arise in epithelia - tissues that "line the outer surfaces of organs and blood vessels throughout the body, as well as the inner surfaces of cavities in many internal organs." (Wikipedia). Cancers that originate in epithelial tissues are called carcinomas, and their prevalence in this type of tissue probably reflects the fact that epithelial cells divide rapidly. Breast, prostate, lung, colon, liver, stomach, and pancreatic cancers are carcinomas. Only about 1% of cancers arise from connective tissues (sarcomas) and a little less than 10% from blood forming tissues (leukemia, lymphoma, multiple myeloma). In addition, there are cancers of nervous tissue, pigmented cells of the skin (melanoma), and even eggs and sperm. Practically any cell capable of cell division can switch from a normal member of the cellular community to one that pursues its own selfish interests and proliferates uncontrollably.
Carcinomas, sarcomas, and the other members of the cancer panoply are caused by errors in genes - mistakes in the sequence of DNA. But what causes these errors, these mutations? I think that many people when asked this question would answer that most mutations are caused by harmful agents that originate from industrial processes. Air pollution, atomic reactors, artificial chemicals added to our foods, pesticides, and the like would be examples that many would cite. After all, they would say, haven't cancer rates been rising in the modern era?
Cancer is now more prevalent than it was in previous centuries, but that's because we're living longer and cancer is a disease primarily of the old. The fact is that there is strong evidence that the factors listed above are not the primary drivers of carcinogenesis. One surprising case in favor of this view is cited in chapter 20 of "Molecular Biology of the Cell". The authors argue that in the absence of agents that cause mutations (mutagens) an error in DNA sequence occurs on the average once per cellular division. You can calculate that this means that on average any given gene will experience a change in sequence once every million cell generations. Alberts et al estimate that in a lifetime a normal human will experience 10 quadrillion divisions (that's a one with 16 zeroes). Dividing the number of divisions by the million generations between mutations results in the amazing estimate that every gene in our body will be subject to a change in sequence ten billion times! All this without any external mutagens. If all it took was a single mutation in a growth promoting gene, cancer would be much more common. The reason it isn't is that more than one event must occur independently in the same cell. That's one reason that cancer rates increase with age.
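For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The two inputs are simply the figures quoted above (10 quadrillion lifetime divisions, one change per gene per million divisions); the function name is my own invention for illustration, not something taken from Alberts et al.

```python
def lifetime_hits_per_gene(lifetime_divisions: int, divisions_per_mutation: int) -> int:
    """Expected number of times a given gene changes sequence over a lifetime."""
    return lifetime_divisions // divisions_per_mutation


# Figures quoted in the text, not new data.
LIFETIME_DIVISIONS = 10**16      # about 10 quadrillion cell divisions
DIVISIONS_PER_MUTATION = 10**6   # a given gene changes about once per million divisions

hits = lifetime_hits_per_gene(LIFETIME_DIVISIONS, DIVISIONS_PER_MUTATION)
print(f"{hits:,} changes per gene over a lifetime")  # 10,000,000,000 changes per gene over a lifetime
```

The result is a body-wide total spread across all of our cells, which is why it says nothing, on its own, about whether any single cell collects the several independent hits that cancer requires.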
Of course I'm not saying that external agents aren't ever responsible for cancer. On the contrary. The chemicals in cigarette smoke and the ultraviolet radiation from the sun are major contributors to lung and skin cancer. Asbestos and X-rays are proven carcinogens. There are viruses that have been implicated in cancer. And even some common components of our diet, like burnt toast and grilled meats, are known mutagens and are suspected carcinogens. In addition, as I'll discuss later, genes that are defective in repairing DNA errors can increase the rate of cancers significantly.
In summary, cancer is a disease caused by mutations in genes. While cancer may begin with a single mutation, it takes several errors in multiple genes for the disease to progress to where it is harmful. Which genes? Next time.
I began reading Robert Weinberg's book, "The Biology of Cancer", second edition, a few days ago. It was an illuminating experience. It's a formidable book of over 900 pages that summarizes much of the enormous amount of material that was known about the molecular biology of cancer as of four years ago. But instead of being a dry accounting of the facts, it reads almost like a mystery novel. In many chapters, Weinberg feeds us the data and experimental facts that were known at a given time, and then presents alternative explanations from which to choose. Which one turned out to be right? And why? I really like this approach. It got me involved. In addition, Weinberg regularly intersperses questions throughout, many of which remain unanswered by the scientific community. To my mind, pointing out what the scientific community doesn't know is just as important as describing what is known. As a former research scientist, I started thinking about experiments designed to address some of these issues.
Weinberg writes beautifully. He also takes the time to explain appropriate techniques and concepts that other authors skip over. And the book isn't as long as the page numbers suggest. It is filled with numerous tables, illustrations and photos. All in all, it is one of the best textbooks in molecular biology that I've ever encountered. Of course, it isn't for everyone. While there's an introductory chapter that attempts to cover the basics of genetics, biochemistry, and molecular biology, beginners without a reasonable foundation in these fields will find the remainder of the book tough going. And even for people with a good understanding of these matters, many of the subjects covered are extraordinarily complex. Weinberg makes a valiant effort to cope, but as he writes in the introduction to chapter six: "The present chapter will perhaps be the most challenging of all chapters in this book. The difficulty comes from the sheer complexity of signal transduction biochemistry, a field that is afflicted with many facts and blessed with only a small number of unifying principles. So absorb this material in pieces, the whole is far too much for one reading." Nevertheless, if you have the time and some background, I highly recommend "The Biology of Cancer". It's the definitive book on the subject.
In the remainder of this post, I've set out my agenda for subsequent entries in this blog. My aim will be to more or less follow the order of chapters in Weinberg's book culminating with his chapter 15 on immunology and immunotherapy. From there, I hope to discuss more recent ways the immune system has been used to fight cancer. Here's what I hope to cover (however, I reserve the right to add or delete topics at my discretion):
Just as it was with immunology, my ignorance of cancer biology was, and is, profound. Accordingly, I looked around for both an introductory text to provide an overview of the subject and a more comprehensive source to fill in details. While it probably isn't suitable for someone with a limited background in modern biochemistry, Alberts et al's book, "Molecular Biology of the Cell", has a chapter appropriately entitled "Cancer" that nicely summarizes the field. Like the bulk of the book, it's beautifully written and, so far, it has proven invaluable. However, I needed a more exhaustive account of the subject. At 960 pages and $162 (for the paperback version), Robert Weinberg's tome, "The Biology of Cancer", second edition, is massive both in volume and cost. By all accounts, it's the definitive authority on the subject. But it seemed to be a bit much, so I opted for a less weighty book, and chose "Molecular Biology of Cancer", 4th edition, by Lauren Pecorino. It had gotten good reviews on Amazon, had a newer publication date, was less expensive, and promised more or less the same information as Weinberg in a smaller package. Unfortunately, after perusing a few chapters, I didn't care for the author's style. So, just this morning, I ordered the Weinberg book. It'll arrive the day after tomorrow. I'll have additional comments about it in the next post.
While waiting for Amazon to deliver Weinberg's book, let me introduce some important general principles.
I'll elaborate on these themes in subsequent posts, thereby providing enough of a decent introduction so that I can discuss how cancer and the immune system interact. Ultimately, I plan on getting to how the scientific community hopes to engineer the cells of the immune system to serve as therapeutic agents against cancer.
As an aside: Just this morning (Oct 1, 2018) it was announced that the Nobel Prize had been awarded to James P. Allison and Tasuku Honjo "for their discovery of cancer therapy by inhibition of negative immune regulation". I hope to add their studies to the list of topics that I will cover eventually.
For a summary of the entirety of the immune system I recommend the first chapter in each of the three books on immunity that I mentioned in an earlier post: "Undergraduate Immunology" by C. Erridge, "How the Immune System Works" by L. Sompayrac, and "Cellular and Molecular Immunology" by Abbas et al. Another helpful resource is a website called "Bite Sized Immunology". It has a ton of short articles on most of the topics I've covered in this blog. For those looking to get an overview of the adaptive cellular system, there's a well-written summary of the T cell response in an excellent paper that appeared in the journal "Advances in Physiology Education" by Nathan Pennock and others.
Here's my version of the major points covered so far.
Now I'm going to switch subjects. It's been a long time since the initial posting, but recall that the ultimate aim of this blog is to understand how the immune system can be engineered to react against cancer cells. Hence the next postings on cancer.
CD4 and CD8
In order for activation to take place, a T cell not only has to bind to the peptide carried by a presenting cell (mostly dendritic cells), it also must attach to the MHC itself. This feat is accomplished by one of two proteins found on the surface of T cells. They're called CD4 and CD8. CD4 only fastens to MHC II molecules, and CD8 only to MHC I's. Both CD4 and CD8 help to hold the dendritic cell and T cell tightly together. In time, in the various lymph nodes throughout the body, the bond between these two kinds of cells increases even further as a result of the recruitment of additional co-stimulatory molecules. Their embrace groups the intracellular parts of the T cell receptors, thereby initiating a cascade of chemical signals that ultimately gain access to the nucleus and cause the T cells to propagate, leave the lymph nodes, and perform their functions.
Just a few words about the nature of this "cascade". In a previous post in conjunction with a discussion of complement, I discussed proteolytic cascades, where one protein cleaves an inactive target, thereby turning it into an active proteolytic enzyme. In turn, this newly activated enzyme goes on to perform the same operation on a downstream protein, and on and on. The cascade that is initiated by activation of T cells operates in a similar manner, but instead of sequential proteolysis it makes use of successive phosphorylations. It turns out that many proteins can be controlled by the addition and removal of phosphate groups. These act as a kind of on/off switch. The enzymes that perform the addition of the phosphates are called "kinases". In the T cell activation cascade, one inactive kinase gets a phosphate added, thereby becoming active. In turn, the newly awakened kinase adds a phosphate to another inactive enzyme, turning it on. And so on. Such phosphorylation cascades are very commonly used when a cell needs to transmit a signal from its exterior to its nucleus. In this way, it can respond to external events by switching appropriate genes on and off.
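The on/off logic of such a cascade can be caricatured in a few lines of Python. This is only a toy sketch of the idea described above: the class, the function, and the kinase names are hypothetical placeholders of mine, not the actual enzymes of the T cell pathway, and real cascades also involve phosphatases, amplification, and branching that are ignored here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Kinase:
    name: str
    active: bool = False  # a kinase stays off until it receives a phosphate

    def phosphorylate(self, target: "Kinase") -> None:
        # Only an active kinase can add a phosphate to its target.
        if self.active:
            target.active = True


def run_cascade(kinases: List[Kinase]) -> List[str]:
    """Pass the signal down the chain and return the names switched on, in order."""
    switched_on = []
    for upstream, downstream in zip(kinases, kinases[1:]):
        upstream.phosphorylate(downstream)
        if downstream.active:
            switched_on.append(downstream.name)
    return switched_on


# Hypothetical three-step chain; the first kinase has been switched on by the receptor.
chain = [Kinase("kinase_1", active=True), Kinase("kinase_2"), Kinase("kinase_3")]
print(run_cascade(chain))  # ['kinase_2', 'kinase_3']
```

If the first kinase in the chain is left inactive, nothing downstream switches on, which is the sense in which the cascade relays a signal that starts at the cell surface.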
Most textbooks divide T cells into two major categories: helper and cytotoxic T cells. Cells with CD4 on their surface are destined to become helper T cells, while CD8 bearing cells become killers (I'll refer to cytotoxic T cells as killer T cells from now on. The name presents such a powerful image I just can't resist. Other names that these cells go by include CTL's and CD8+ cells. However, don't confuse killer T cells with natural killer cells that are part of the innate immune system).
Killer T Cells
Once activated, killer T cells and helper T cells have different missions. Killer T cells, after they have undergone many rounds of proliferation, move into the blood stream and search for cognate peptides presented on MHC I proteins. Since virtually all cells have MHC I's, any cell infected by a virus will display viral peptides on its surface and be vulnerable to passing killer T's. Like natural killer cells and professional hit men, killer T cells destroy their victims neatly, without leaving a mess and without too much collateral damage. They cozy up to their targets and inject special enzymes through the cell membrane into their quarry's cytoplasm, causing the victim to commit suicide (apoptosis). Alternatively, receptors on their surface can bind special proteins on the surface of their targets that in turn will elicit a suicide response.
Helper T Cells
CD4+ helper T cells play a more complicated role. They secrete a variety of cytokines that instruct other cell types, B lymphocytes, other T cells, macrophages, and dendritic cells, on ways to fight infection. There are three major types of helper T cells: Th1's, Th2's, and Th17's, each of which secretes a different spectrum of cytokines, and each of which is specialized to combat different foes. For example, Th1 helper T cells secrete a variety of cytokines that activate macrophages, which, in turn, gobble up cells infected with intracellular pathogens. In addition, they interact with B cells, causing them to switch to making IgG antibodies. Th2 helper cells, which specialize in defending against parasitic attacks, secrete other cytokines that, among other effects, cause class switching in B cells to IgE. Th17's, only recently discovered, seem to focus on fungal invaders.
Remarkably, it turns out that the helper T cells are instructed to assume their various roles by dendritic cells. The numerous receptors that I previously described in the innate immunity section that are located on dendritic cells cause them to secrete specific cytokines appropriate to the triggering pathogen. In turn, these cytokines cause the helper T cells to become committed to one or another of the tracks described above.
After T cells have been activated they proliferate and go off to do their job. After their mission is completed, most get no social security or pension and just wither away via apoptosis. A fraction remains in the site where they first encountered the invader. If another pathogen of the same kind infects again, they are already activated and resume their attack. Another fraction of activated T cells takes up residence in lymph nodes. If they encounter the same invader again, they are much more readily activated than naive T cells, and seek out their quarry with increased efficiency. This immunological "memory" is one of the hallmarks of the adaptive immune system.
That's it. That's what I've learned about the immune system. Of course, all the information that I've covered is just a tiny fragment of what is known. But still, it's a lot to learn and remember. I'll present a brief overview in the next post. After that, I'll begin a discussion of cancer and the immune system response.
T Cell Receptors
T cells detect their targets by way of their T cell receptors. I've shown a cartoon of this structure in a previous post. Recall that it consists of two chains, alpha and beta, that each bear a variable and constant region. Like their B cell counterparts, T cell receptors can assume enormous diversity in sequence due to random recombination of their variable gene segments. However, the T cell receptor doesn't refine its binding site after activation like the B cell receptor. Nor is there switching of constant regions. And, as I've already mentioned, the T cell receptor only binds to peptides that are affixed to the MHC. A model of the T cell receptor bound to a peptide that is being presented by a MHC is shown at the right. The molecular modelling program, Chimera, was used to generate the picture.
There are several other parts to the T cell receptor aside from the peptide recognition chains shown in blue in the figure above. In fact there are six additional protein chains that associate with the alpha and beta chains. They function to signal the inside of the cell that the recognition part has found a target. When properly stimulated, they signal to the cell nucleus to begin synthesizing the appropriate molecules that "activate" the various kinds of T lymphocytes so that they can begin fulfilling their functions. I'll discuss what these functions are in a subsequent post. For now I'll discuss the mechanism of the activation process.
The first step in T cell activation is binding of its receptor to an MHC-borne antigen. Previously I mentioned that almost all cells have MHC I proteins on their surface and many have MHC II's. But naive T cells, ones that haven't ever encountered their cognate antigens, cannot be activated by any old MHC-bearing cell. Activation requires that they interact with what immunologists have quaintly termed a "professional" antigen presenter. There are three kinds of these professional cells, but one is most important. And it's one that we've encountered previously in the innate immune system: the dendritic cell.
Dendritic cells are located all over the body. In the absence of a microbial invader they assume a resting state, and, as such, are not very good at activating T cells. But they keep one eye open for trouble. Using their pattern recognition receptors they can detect an attack that is directed on them. Or they can react to an indirect attack on a neighbor by responding to chemicals given off by cells that are in distress. They also can respond to cytokines secreted by macrophages and neutrophils (remember them?). All of these situations result in dendritic cell activation. Yes, that's right. Dendritic cells must be activated before they, in turn, can activate T cells.
The results of this activation are profound. The dendritic cells begin a journey to the nearest lymph node. As they move they transfer additional peptide-loaded MHC I and II molecules to their cell surfaces. In addition, dendritic cells begin to mobilize several surface proteins (co-stimulators) that are complementary to receptors on the surface of T cells. When they reach their destination they encounter numerous T cells. If a T cell with the appropriate receptor encounters an activated dendritic cell presenting a corresponding peptide and co-stimulators, the two cells will cozy up to each other. It is the junction between the two cells when they are in close proximity that activates the T cell. A portrait of their embrace is shown.
There is one additional feature of T cell/dendritic cell interaction that I haven't explained. It has to do with the protein labelled "CD4 or CD8" in the diagram. I will discuss this matter along with the role of activated T cells in the next post.
In order for T cells to recognize proteins located inside of cells as antigens they must be properly "presented". What does that mean? In brief, presentation consists of two steps. First the protein must be split into fragments of the proper size. And second, the fragments have to be positioned on the cell surface so that the T cells can interact with them. Two impressive molecular machines and a cell vacuole achieve these objectives.
As I mentioned a while back, all of our proteins are being degraded and replaced all the time. Some "turn over" faster than others. In particular, defective proteins, ones that carry errors in sequence, are particularly short lived and are chopped up into pieces rapidly. (Amazingly, I've seen several references including this one that estimate that 30 - 70% of proteins normally synthesized by cells are defective.) Viral proteins that are made within cells also degrade quickly. There is a mechanism that I won't describe that marks these, and other, proteins for discard. But the machine that actually does the dirty work, the one that breaks the protein apart, is called a "proteasome". I show two views of proteasomes on the right. The one labelled "A" comes from a website of the U.S. Department of Energy Genome Programs. It shows a cartoon version of a proteasome caught in the act. The green snake-like line at the top is meant to represent a protein as it undergoes digestion. It enters at one end of what looks like an in-sink garbage disposal and gets cut into pieces in a hollow in the middle. Out come protein fragments - peptides - consisting of ten to a few dozen amino acids. The second image, "B", is a portrait of the central portion of the proteasome as depicted by the molecular modelling tool, Chimera.
All human cells carry these machines inside them, but the cells of the immune system carry a modified proteasome in which some of the components have been replaced. The protein fragments that are produced by these specialized machines are better suited for presentation by the major histocompatibility complex.
What about viral proteins that haven't been synthesized in cells and are simply floating around in the extracellular fluid? Nature has devised an entirely different mechanism for dealing with this situation. Although I wasn't aware of it before, cells continually take sips of the liquid surrounding them. They enclose a tiny drop and transfer it into their interior, surrounded by a pinched off portion of the cell membrane. The vesicles that result from this action, called endosomes, slowly change their internal composition, accumulating a host of digestive enzymes and becoming more acidic. In this way, they become capable of breaking down any proteins that may have been captured into fragments that are suitable for presentation.
It is the proteins of the MHC that bind these protein fragments and display them on the surface of cells so that they can be detected by T lymphocytes. But, as usual, there are some complications. For one, there are two kinds of T cells: Cytotoxic T lymphocytes and helper T lymphocytes. For another, there are two corresponding MHC's: MHC I and MHC II.
Let me tackle MHC I and cytotoxic T cells first. MHC I displays the fragments generated by the proteasome on the surface of cells. The MHC I is ideally suited for this purpose as shown in the illustration above. The pictures were created using Chimera, the molecular modelling program that I alluded to earlier. On the left is a view of MHC I from above. The protein fragment held on a platform near the top of the molecule is clearly shown in green. It looks to me like an Incan sacrificial animal on an altar, truly a presentation to the gods. The picture on the right shows the same molecule as seen from the side. Neither view is of the complete molecule. There's an additional part of the protein that is not shown. It is located near its base. It traverses the cell membrane and extends a small segment into the cell for anchoring the MHC to the surface.
Here's the way presentation works in more detail. Proteins that are marked for destruction enter the proteasome where they are enzymatically cut into pieces. These protein fragments are transported into the endoplasmic reticulum, a network of membrane-bound sacs that occupies a good proportion of most cells. Here the MHC I proteins bind the peptides, one peptide to one MHC molecule. After a series of additional steps, the MHC-peptide complex is transported to the cell surface where it hangs out, waiting for a wandering T cell to recognize the peptide that the MHC carries. But it doesn't wait long. Old MHC/peptide complexes are constantly being degraded. New ones carrying some other peptide take their place. The process ensures that the internal protein composition of the cell is continually being sampled.
Now, I learned in graduate school that proteins bind substances with exquisite specificity. That's true of enzymes and of antibodies, for example. But the MHC's are exceptional in that they bind a wide variety of protein fragments. But not every peptide will adhere equally well. To bind to MHC I, peptides have to be about 10 amino acids long and there are some additional requirements as to which amino acids are at their ends.
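The binding rule just described can be pictured as a simple filter. What follows is only a toy sketch, not a model of any real MHC I variant: the length window of 8 to 10 residues, the set of permitted end residues, and the function names fits_mhc1 and presentable are all assumptions that I made up for the illustration.

```python
# Toy model of peptide selection by an MHC I molecule.
# ASSUMPTIONS (not from the post): the 8-10 residue length window and the
# anchor residue set below are illustrative placeholders only.

def fits_mhc1(peptide, min_len=8, max_len=10, c_terminal_anchors=frozenset("LVIM")):
    """Return True if the peptide passes the toy length and end-residue rules."""
    if not (min_len <= len(peptide) <= max_len):
        return False
    return peptide[-1] in c_terminal_anchors


def presentable(peptides, **rules):
    """Keep only the fragments that the toy MHC I could display."""
    return [p for p in peptides if fits_mhc1(p, **rules)]


# Example fragments in one-letter amino acid code (chosen for illustration).
fragments = ["SIINFEKL", "GILGFVFTL", "AAAK", "KLGGALQAKGGGGGGGGGGG", "NLVPMVATV"]
print(presentable(fragments))
```

Changing the c_terminal_anchors argument stands in for a different MHC variant: the same pool of fragments then yields a different set of displayed peptides, which is the point taken up in the next paragraph.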
There's more. I've said that each human has six MHC I genes and therefore six corresponding MHC I proteins. Because of their diversity in sequence (there are over 1,000 MHC variants in the human population), the chances are that each of our MHC I genes specifies a different MHC protein. And, for the same reason, unrelated people will carry different MHC's. It turns out that each of these various MHC's differs in its ability to bind specific peptides, a fact that has an important consequence. If there is a viral attack on the human population, the wide range of specificities of the MHC's means that some individuals will have an MHC that can successfully bind one of the viral peptides and thereby fend off disease. Of course, that means that some people will be less able to do so.
After the MHC I rises to the cell surface it can interact with a cytotoxic T cell. I'll pursue this matter further in the next post.
Mature B lymphocytes secrete antibodies in great quantity into the blood stream. In aggregate they are capable of synthesizing proteins that are capable of binding to and dealing with almost any molecular intruder. But they have one great limitation: they can't see inside of cells. Antibodies can bind to the surface of foreign microbes, but when invaders, like viruses, penetrate the cell membrane and take up residence within, they are hidden from attack. Evolution has, of course, devised an appropriate solution. It has developed the second arm of the adaptive immune system: cellular immunity. The major players in this process are two kinds of T lymphocytes, so named because they mature in the thymus rather than the bone marrow.
T and B lymphocytes share many properties but also exhibit considerable differences. One critical distinction between the two cell types is that T cells don't secrete antibodies. In fact, T cells don't make antibodies at all. However, like B cells they have receptors that have constant and variable regions that derive from genes whose segments have been randomly rearranged during their development. Utilizing this mechanism, millions of T cell clones are generated, each bearing a different receptor. Because of the great diversity of binding sites in the T cell population, it would seem like T lymphocytes would be capable of recognizing many different molecules. Well, they do and they don't. T cells are specialists. They can't bind to molecules just floating in the blood. What's more, they're restricted in the kinds of molecules to which they can bind. They only (with a few exceptions) recognize small fragments of proteins. And only protein pieces that are properly "presented" to them, meaning that they only can bind peptides that are in the grasp of a special apparatus on the surface of another cell. I'll begin this post on cellular immunity with a short description of the genes that specify the proteins that act as the presenters, MHC I and MHC II. In the next post, I'll tackle the presentation mechanism itself.
MHC - Genes
I wrote briefly about the major histocompatibility complex in a previous post in connection with the natural killer cells of the innate immune system. Because the MHC plays a much bigger role in the workings of T cells, I'll expand upon my previous discussion here.
The MHC refers to two entities, and the distinction is not always clearly noted. First, there are the MHC proteins, a subject that I'll get into in the next post. And second, there is the region of the sixth human chromosome that codes for them - the major histocompatibility complex. As a geneticist, I was surprised to learn that the chromosomal MHC spans an enormous distance, some three million five hundred thousand base pairs. That's nearly the size of an average bacterial genome! Located in this region are over 200 genes, about half having a known function in immunity (the role of many has not yet been determined). The locations of the six MHC genes and several others are shown in the figure above. Notice that three of these genes specify the MHC I proteins (in blue), and three the MHC II's (in yellow).
Perhaps the most intriguing property of the MHC I and II genes is their enormous variation in sequence. Of course that's reflected in the proteins they code for as well. I've already written about the variation in antibody genes and T cell receptor genes. But the variety of the MHC's is a different story. Because we inherit two sixth chromosomes (one from our mother and one from our father), we humans have 12 MHC genes in all. We're born with these genes and, unlike antibody genes, we retain the same sequence and number throughout our lives. The variation that I'm writing about occurs within the human population. That is, if you were to determine the sequence of the MHC genes in thousands of unrelated individuals, the probability is that few would be the same.
Now to be clear, almost all genes show variability from person to person. For example, there are about 1,000 known variants in the beta hemoglobin genes (yes, there's more than one) within the human population. But most of these differences in DNA sequence are rare. Most people bear only one form of the gene. The MHC genes are different. More than 10,000 variants are known, and they're widely distributed among the populace. According to Abbas et al., the MHC genes are the most variable found in "any mammalian genome".
What intrigues me as a geneticist is how this variability is maintained. In most cases, if a change of sequence occurs in a gene – a mutation – it will either be more favorable than the existing sequence or not. If so, it will tend to replace the existing gene in the population over time, eventually becoming the dominant form. If not, it will be selected against and tend to disappear. (There may, of course, be mutations that are neutral, neither favorable nor harmful. These will replace the preexisting gene at a random rate partially dependent on how often they appear in the population). This maintenance of enormous variability almost certainly has to do with what the MHC proteins do and how they do it, subjects that I'll take up in the next post.
An antibody switches its class due to rearrangements in the constant region of the heavy chain gene. Intervening DNA segments are removed and new ones are appended to the variable regions (see the figure entitled "Class Switching" below). The various constant region segments are given Greek letter names. The names of the antibodies that result use the English equivalents. As a result, there are five major classes to which an antibody can belong: IgM (constant region C-mu), IgG (constant region C-gamma), IgD (constant region C-delta), IgA (constant region C-alpha), and IgE (constant region C-epsilon). You may have noticed a discrepancy. There are nine constant regions in all, but only five classes. That's because several of the constant region segments come in subclasses. There are four C-gamma constant regions, numbered 1-4 and two C-alpha's. Each of these subclasses plays a slightly different role, a complication that's outside the scope of this discussion. Again, remember that the binding site of an antibody from any given B cell (and its descendants) remains the same regardless of the class of the constant region.
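For readers who think in terms of data structures, class switching can be sketched as trimming a list. This is a toy illustration only. The left-to-right order of the nine constant segments is an assumption on my part (the text above gives only their names and number), and the names switch_class and antibody_class are hypothetical helpers.

```python
# Toy model of class switching in a heavy chain gene.
# ASSUMPTION: the left-to-right order of constant segments below is for
# illustration; the post only says that there are nine of them.

CONSTANT_SEGMENTS = ["mu", "delta", "gamma3", "gamma1", "alpha1",
                     "gamma2", "gamma4", "epsilon", "alpha2"]

CLASS_OF = {"mu": "IgM", "delta": "IgD", "epsilon": "IgE"}


def antibody_class(segment):
    """Map a constant segment name to its antibody class (subclasses collapse)."""
    if segment.startswith("gamma"):
        return "IgG"
    if segment.startswith("alpha"):
        return "IgA"
    return CLASS_OF[segment]


def switch_class(gene, target):
    """Remove the constant segments in front of `target`; keep the variable region."""
    remaining = gene["constant"]
    if target not in remaining:
        raise ValueError(f"{target} has already been removed from this gene")
    return {"variable": gene["variable"],
            "constant": remaining[remaining.index(target):]}


gene = {"variable": "rearranged-VDJ", "constant": list(CONSTANT_SEGMENTS)}
switched = switch_class(gene, "gamma1")
print(antibody_class(gene["constant"][0]), "->", antibody_class(switched["constant"][0]))
print(switched["variable"] == gene["variable"])
```

The sketch captures two points from the paragraph above: nine segments collapse into five classes, and the variable region passes through the switch untouched. Because the intervening segments are removed, the toy gene cannot switch back to a segment it has already lost.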
B cells start out by elaborating IgM's, the complicated pentamers pictured in the last posting. IgM antibodies are good at initiating the cascade of reactions that activate complement. I've already described how the innate immune system can activate the complement pathway. An alternative pathway to initiating complement activation is through IgM antibody binding. The initial steps are different, but the end result is the same: a complement cascade that can destroy invading microbes to which IgM has bound.
IgG antibodies represent a large fraction of all the antibodies in the blood. Their "Y" shape should be familiar to you by now. They're important in several ways. By binding to the outer walls of microbes, they mark pathogens for phagocytosis by cells of the innate immune system. Similarly, surface receptors on natural killer cells bind to them, marking those cells covered with antibody molecules for destruction. They also bind to viruses and some toxic products of bacteria, rendering them impotent. Like IgM's, IgG molecules can also spark the complement activation cascade, but they are less capable of doing so.
IgA antibodies are dimers - similar to two IgG monomers linked together. They are the most abundant of all antibodies - an average adult human makes two to three grams per day. Most of it is secreted into the gut where it acts to bind to pathogens. Apparently the structure of IgA makes it resistant to the degradative enzymes and harsh conditions that are present there.
IgE antibodies resemble IgG molecules but, of course, have their own distinctive heavy chain tail. They are produced in response to a variety of parasitic infections and bind to their quarry in great numbers, cloaking it like snowflakes adhering to a parked car after a winter storm. Receptors to IgE antibodies on mast cells bind to this surface coating. The newly recruited mast cells respond by releasing their toxic contents in the close vicinity of the parasite. In addition to the poisoning of their quarry, the result may be dilation of nearby blood vessels, contraction of smooth muscles, and fluid retention. These reactions are appropriate if some worm has infiltrated our body, but IgE is also expressed upon exposure to less harmful challengers. For example, pollen can elicit an IgE reaction and subsequent mast cell response. Our runny noses and watery eyes are evidence of the many pollens in the air interacting with our immune system. Another stimulant of IgE production is bee or wasp venom. Upon repeated stings, the body may respond massively, releasing mast cell contents in many tissues simultaneously. The result may be anaphylactic shock, a very dangerous condition.
Most activated B cells become plasma cells, highly productive workshops for synthesizing massive amounts of antibodies. They perform their duties with great energy, but when they have accomplished their job, when the invader has been vanquished, they die to be replaced by new B cells with different specificities. But by mechanisms unknown, a few B cells activated by helper T cells undergo profound changes. They stop secreting antibody. They change their surface proteins. They remain alive, sometimes for many decades. With the help of T cells, they undergo further class switching and hypermutation, thereby becoming capable of secreting even more potent (and appropriate) antibodies. And they patrol the blood stream and lymph nodes, looking for some substance that matches their antigen recognition site. When they encounter such a molecule, they respond with force, much more powerfully than the initial antibody response.
Edward Jenner was the first to take advantage of this immunological memory, in England in the late 18th century. We continue to use more sophisticated versions of his technique, now called "immunization", to protect populations from a great variety of diseases.
CD stands for "Clusters of Differentiation". Also "Certificates of Deposit" and "Compact Disc", but those subjects are outside the bounds of this blog. The CD nomenclature is principally applied to cells of the immune system and has a long back story that begins with entities called monoclonal antibodies - a topic that I could greatly expand on. However, I'll try to make it short.
When a foreign protein enters our bodies, we respond with a burst of antibody production. Many different antibodies may bind to this single molecule. That's because, as I mentioned in the last post, proteins are complicated with many nooks, crannies, cavities, and protrusions. A single antibody is capable of recognizing and binding to only one of these surface features as shown in the illustration on the right. It is for this reason that the antibody response is called "polyclonal", meaning that many different B cells, each with a distinct recognition site, bind to the antigen, each eventually elaborating different antibodies.
To make it "monoclonal", restricting the response to one B cell and its descendants, you would have to isolate a single B cell, place it in a culture vessel, and have it reproduce many times to yield a clone of identical progeny. While this approach is feasible in theory, and actually was done in practice, its utility depends on the ability of B cells to reproduce indefinitely in a culture vessel, something that they are incapable of. That's because all differentiated human cells have a limited lifespan. Most cancer cells, by contrast, are immortal, and can grow in culture forever. In 1975, Cesar Milstein, an Argentinian immigrant, and Georges Köhler, a German postdoctoral fellow in Milstein's laboratory at the Medical Research Council in the United Kingdom, fused a mouse B cell with a cancerous plasma cell, to create a hybrid that was immortal and capable of spitting out loads of antibody. For their efforts, they shared two-thirds of a Nobel Prize in 1984.
Why was their achievement so important? Monoclonal antibodies can be produced in great quantities and have an exquisite specificity. As such, they can act as "magic bullets" and home in on a specific therapeutic target. Since 1985, more than 73 monoclonals have been approved by governmental agencies, 33 of these in the last four years alone. Rheumatoid arthritis, multiple sclerosis, psoriasis, asthma, and several kinds of cancers are some of the diseases that are being treated by these reagents. Six out of 10 of the best selling drugs in the world are monoclonals. They're also extremely useful in diagnosis, acting as probes that allow scientists to identify specific cells. (For a more extensive discussion of monoclonal antibodies, I recommend "The Lock and Key of Medicine: Monoclonal Antibodies and the Transformation of Healthcare" by Lara V. Marks, Yale University Press, 2015).
And that's where clusters of differentiation comes in (did my digression distract you?). As monoclonal antibodies became more widely used, a number of laboratories used them to characterize the cell surface proteins of a variety of immune cell types. Many of these cell types were indistinguishable microscopically. They could only be told apart by the monoclonal antibodies that bound to the antigens that they bore on their surface. With many laboratories working with many monoclonals and a host of cell types, it soon became a mess. For example, lab A identified an immune cell that reacted with a specific monoclonal antibody that they had prepared. Lab B found a similar looking cell but used a different antibody. Were the two cells the same? Were the two antibodies binding to the same antigen? No one knew. There was a further contribution to uncertainty. Each laboratory gave its own favorite name to the cells they had characterized and the antigens that they had detected. Confusion reigned.
In 1982 immunologists met together in Paris in an effort to resolve this chaos. The meeting was called the "First International Workshop and Conference on Human Leukocyte Differentiation Antigens" and the result was the CD nomenclature. The term "Workshop" is appropriate. Each monoclonal antibody was tested in the lab. When two or more were found to bind to the same molecule, a CD number was assigned to that antigen. Since 1982, nine additional workshops have been organized. The last was held in Australia in 2014. Over that time, some 370 CD numbers have been allocated.
(I admit to having been bothered by the term "clusters of differentiation", both the "clusters" and "differentiation" parts. People using the term in different ways just added to my confusion. I eventually learned that surface proteins are often referred to as differentiation antigens. The cluster part of the term comes from the fact that the same molecule, most often a protein, may be bound by a group/cluster of different monoclonals (remember that proteins present a variety of different surface features and therefore may be recognized by different antibodies). The important point to remember is that CD's refer to surface proteins that can be used to classify different cell types. An article in Wikipedia states that some prefer to call CD's "Classification Determinants", a better name in my opinion.)
You'll recall that the heavy chain gene bears multiple constant regions. When a B lymphocyte is first activated, it carries a Cm segment on its heavy chain, and produces antibody of the IgM class (some IgD may also be made, but only in small quantities). IgM antibodies look a lot different than the ones that I've described previously. They are pentamers, consisting of five Y shaped antibody molecules bound together (see the figure). Later on, as the B cell matures, it can change its heavy chain constant region. By cutting out the intervening C regions, it can append one of the other C regions onto the heavy chain gene. The antibodies produced as a result of this switch may be IgG, IgE or IgA depending on which C segment remains abutted to the remainder of the gene. But remember, the variable region, the antigen binding site, remains as it was, meaning that the resultant antibody retains its specificity.
You may wonder why "class switching" occurs, especially since switched antibodies continue to bear the same variable region. The answer is that the various classes of antibodies have different functions. I'll discuss these functions as well as memory cell formation in the next posting.
T Cell Dependent B Cell Activation
I've had some difficulties with this topic. Abbas et al.'s book describes the subject in great detail and I've found it hard to follow. Sompayrac devotes only a paragraph to it. The clearest, and simplest, explanation comes from Erridge's book, supplemented by material from Wikipedia. Here's what I learned.
The antigens that invoke a T-dependent activation of B cells are mostly proteins. That's because proteins are not repetitive molecules in the same way that polysaccharides are. That is, their surface consists of a great variety of distinct three dimensional features. A different receptor binds to each one. Therefore, when a B cell encounters a protein antigen, it can't cluster its B cell receptors in the same way that it does when binding a polysaccharide. Instead, when a B cell receptor binds to a protein, it internalizes it, meaning that it draws it into its interior. Subsequently it processes the protein into small fragments that it displays on the cell surface (much more on this later when I talk about T cell functions). These fragments can be recognized by helper T cells. Upon recognition, the T cells become active, secrete cytokines, and, at the same time, express a ligand on their surface called CD40L. B cells have a receptor for this ligand called CD40, with the result that the T cell and B cell bind to each other. This in turn causes the B cells to become active. They begin dividing and churning out antibody. This process is complex and takes place over several days. The cells make good use of this time. They produce antibodies that are much better at binding antigen and more versatile than those produced via the T cell independent pathway, employing a mechanism that I'll describe below.
What's with the CD nomenclature? You promised to go light on the jargon and not use abbreviations. I'm glad you asked because it offers me the opportunity to discuss two other topics of interest. However, because the explanation is lengthy, I'll put it off until the next post.
Recall that antibodies are constructed from proteins that are specified by heavy and light chain genes. The heavy chain genes carry nine constant regions with names like Cm, Cd, Ca, etc. Newly activated B lymphocytes synthesize heavy chains bearing a protein coded for by the Cm region. Therefore the antibodies that they produce are said to be members of the IgM class, where the "Ig" part indicates "immunoglobulin" and the "M" stands for the Cm constant region. Again, I'm going to put off a discussion of how the different classes of antibodies differ from one another and what roles they play for later. For now it's important to know that the class of an antibody is dependent on the heavy chain it carries, and that while undergoing T cell dependent activation, the B cell can switch the class of its heavy chain (heavy chain switching doesn't occur when the B cell is activated in a T cell independent manner). This switching occurs via DNA rearrangement, causing the VDJ variable region of the heavy chain to become joined to a different constant region. During the switch, the variable region remains as it was, retaining its specificity for whatever antigen promoted the activation of the B cell. Any of the nine constant regions can be appended to the variable region, resulting in the production of a different class of antibody. Each of these classes plays a different role as I'll explain next time.
However, class switching isn't the only change that occurs to the antibody genes in T cell dependent activated B cells. The variable region also undergoes a change in DNA. As a B cell divides, some of the variable region DNA sequence begins to change (mutate) at a rate at least 1,000 times that of normal genes. The end result may be a variable region that differs by as much as 5% from that with which it began. Since these changes are more or less random, any given cell may carry an improved antibody (one which binds more tightly to its cognate antigen) or one that isn't any better at binding than it started out with, or one that is worse or loses its affinity for the antigen altogether. If that's the case, how does hypermutation improve the immune response?
The answer is: via selection. Those B cells that have managed to acquire an improved antibody via hypermutation continue to interact with T cells, which signal them via CD40L to remain alive. In the absence of a good T cell interaction, the B cells will cease proliferating and die. In this way, B cells with "improved" antibodies will dominate the population.
Next time, I'll discuss the benefits of class switching, what "CD" means, and immunological memory.
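The logic of random mutation followed by selection is easy to try out on a computer. The sketch below is a deliberately crude toy, not a model of a real germinal center. I assume that affinity can be captured by a single number, that each cell division nudges it by a random amount, and that only the better-binding half of the cells receives T cell help in each round. The function name affinity_maturation and all of its parameters are hypothetical.

```python
import random

# Toy simulation of somatic hypermutation followed by selection.
# ASSUMPTIONS: affinity is a single number, each division nudges it by a
# random amount, and only the top half of the cells gets T cell help.

def affinity_maturation(rounds=10, population=200, keep_fraction=0.5, seed=1):
    """Return the mean affinity of the population after each round."""
    rng = random.Random(seed)
    cells = [1.0] * population            # every cell starts with the same affinity
    history = [sum(cells) / len(cells)]
    for _ in range(rounds):
        # Hypermutation: each daughter may bind better, worse, or about the same.
        daughters = [max(0.0, a + rng.gauss(0.0, 0.2))
                     for a in cells for _ in range(2)]
        # Selection: only the best binders receive the survival signal.
        daughters.sort(reverse=True)
        cells = daughters[:max(1, int(len(daughters) * keep_fraction))]
        history.append(sum(cells) / len(cells))
    return history


means = affinity_maturation()
print([round(m, 2) for m in means])
```

Even though any single mutation is as likely to hurt as to help, the average affinity of the surviving cells climbs from round to round, because the poor binders are discarded each time.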
Millions of B cells that have passed through multiple checkpoints in the bone marrow will now enter lymph nodes. They're explosive devices, like naval mines, each capable of destroying a specific microbe or microbial product that they've targeted by virtue of the receptors bound to their surfaces. Most, however, will never encounter their specific prey. After a few months these unfulfilled B cells die, to be replaced by new generations that are constantly being spawned in the bone marrow.
B cells that have found and bound a target molecule first need to become activated, meaning that they have to undergo some drastic changes in order to transform into death dealing devices. Activation causes a burst of proliferation and the ability of individual cells to churn out soluble antibodies in enormous numbers (according to Abbas et al, a B cell can give rise to 5,000 descendants that can collectively synthesize a trillion antibody molecules in a week). The change is so dramatic that immunologists even give fully activated cells their own distinct name - "plasma cells". It's important to understand that a plasma cell continues to bear the same rearranged gene before and after it has met its cognate antigen. As I'll elaborate, this means the antibodies it produces are essentially the same as those borne by the receptor affixed to its surface (although, intriguingly, they may be improved under some circumstances).
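To get a feel for the numbers quoted from Abbas et al., here is a back-of-the-envelope calculation. It assumes, unrealistically, that all 5,000 descendants secrete at the same steady rate for the entire week, ignoring the time that the clone needs to grow, so the result is only a rough average.

```python
# Back-of-the-envelope rate implied by the figures quoted above.
# ASSUMPTION: all 5,000 descendants secrete at the same steady rate for the
# whole week, which ignores the time the clone needs to grow.

descendants = 5_000
antibodies_per_week = 1_000_000_000_000   # "a trillion"
seconds_per_week = 7 * 24 * 60 * 60

per_cell_per_week = antibodies_per_week / descendants
per_cell_per_second = per_cell_per_week / seconds_per_week

print(f"{per_cell_per_week:.0f} antibodies per cell per week")
print(f"{per_cell_per_second:.0f} antibodies per cell per second")
```

Under these assumptions each cell would turn out some two hundred million antibody molecules in the week, or a few hundred every second.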
But even after all it's been through, binding to an antigen is a necessary but not a sufficient action for activation of B cells to occur. One of two possible additional steps is needed. One route to activation requires the participation of a T cell, the second type of adaptive immune cell that I'll discuss in many subsequent posts. The other utilizes parts of the innate immune system. This second activation pathway is called "T cell independent". I'll discuss it next.
T cell Independent Activation
The antigens that generate the T cell independent activation pathway are generally repetitive chemical units, polysaccharides or lipids, borne on the surface of bacteria. As described in the last post, binding to these molecules causes the B cell receptors to cluster, priming the cells for activation. The actual activation process can be promoted by cytokines secreted by cells like macrophages that have detected an invader. Alternatively, a B cell that has detected a microbe can be activated by the simultaneous binding of its own Toll-like receptors to the antigen. Components of the complement system that are bound to a microbe provide yet another source of co-activation. These are recognized by special receptors on the B cell surface. The last two of these routes to activation are illustrated in the cartoon on the right. You'll notice that all these routes to T cell independent activation involve participation of the innate immune system, emphasizing the close relationship between the two.
From what I can gather from the sources that I have access to, B cells activated via the T cell independent pathway don't marshal as robust an immune response as is provoked by the T cell dependent route. While B cells activated via the T cell independent mechanism proliferate and secrete antibodies, they appear incapable of undergoing two processes that I'll discuss in a coming post: class switching and somatic hypermutation. They also can't become memory cells, another topic that I'll have to put off for another day. However, T cell independent activation of B cells is fast. And it is often sufficient to meet the challenge of some bacterial infections.
T cell Dependent Activation
T cell dependent activation is generally promoted by proteins. That's because proteins don't often bear repetitive antigenic groups. B lymphocytes take up some proteins from invasive microbes, process them, and present them to helper T cells. In turn, the T cells help complete B cell activation. The B cells activated in this way can undergo class switching and hypermutation, the terms I introduced above. How all this unfolds and what it means will be the subject of the next post.
A B lymphocyte must feel like a student in an uncompromising boarding school. It is forced to endure a series of tests. If it passes them it gets to advance to the next grade. But if it fails, the consequences are grave. In fact, the grave. I've described a number of these high stakes trials in the last post. When a cell has overcome them, and while still in the bone marrow, it makes use of its newly rearranged antibody gene to synthesize B cell receptors, proteins that embed in the cell membrane. The cartoon at the right shows one of these. Notice that the molecule traverses the cell membrane with only a tiny portion (three amino acids) of the heavy chain sticking into the cell. Notice also that most of the receptor looks just like a typical antibody as depicted in a previous post. It bears at one end an antigen binding site and at the other a tail consisting of heavy chains specified by one of the constant segments of the heavy chain gene. Once in the membrane, it pairs with two other transmembrane proteins, Ig-alpha and Ig-beta. And there it must sit awaiting yet another trial.
Why is another hurdle placed before a B lymphocyte before it can leave the bone marrow and take up residence in sites where it can detect invaders? We've already seen that antibodies can take on millions of different configurations in their variable regions, so many that they can bind to virtually any molecule. Of course, that means that molecules that are not foreign, not invading microbes, can also be detected. If that happens, the immune system may attack substances that are naturally part of the body (referred to as self molecules) resulting in potentially serious autoimmune diseases. The last step in B cell maturation helps to ensure that that doesn't occur. If a B cell carries a receptor that strongly reacts with one of the surrounding molecules in the bone marrow, it is directed to commit suicide. Only those cells that don't encounter a molecule to which they can bind pass this test. If a cell passes, it exits the bone marrow and heads toward more appropriate pastures. A summary of the checkpoints that a B cell must endure is shown in the figure at the right.
Two questions. First, how does the B cell receptor get positioned on the cell surface? Second, since its binding site is outside the cell, how does it communicate to the interior that it has detected a molecule to which it can bind?
The answer to the first question is relatively simple. The role that a particular antibody plays is dependent on the constant region of the heavy chain. Recall that the heavy chain gene bears nine different constant segments. Any one can potentially join onto the rearranged variable segments but at this stage of B cell development, either a special C-mu or C-delta segment attaches (it does so via RNA splicing, not DNA rearrangement). These heavy chains are special in that they bear a short tail of amino acids. The resultant protein is directed to the cell membrane because of the chemical nature of this end piece. It binds there, sticking just three amino acids across the membrane into the interior of the cell to help hold it in place. In addition, two other proteins, Ig-alpha and Ig-beta, join the receptor at the cell surface, aiding in its attachment to the membrane.
One clarification. B cell receptors are bound to the cell membrane not like notes tacked to a billboard but like rafts on a pond. Like rafts, they cannot sink or rise. Instead they float freely on the surface. The fact that B cell receptors can move about helps to provide an answer to the next question.
How does a cell pass information from the outside to the interior? The answer involves a discussion of "signal transduction", a term defined by an anonymous Wikipedia author as follows: "Signal transduction is the process by which a chemical or physical signal is transmitted through a cell as a series of molecular events, ... which ultimately results in a cellular response. Proteins responsible for detecting stimuli are generally termed receptors..." When a B cell receptor has bound an antigen in its variable region, it utilizes the two transmembrane proteins previously described, Ig-alpha and Ig-beta, to transmit a signal to the interior. The details of signal propagation are Rube Goldbergian in complexity. They're critical to learn for those carrying on research in the field, but aren't of sufficient general interest for me to want to either learn or teach them. However, the initial events are fascinating and relatively straightforward in principle.
Most invading microorganisms are surrounded by a picket fence of sorts that consists of multiple copies of the same molecule linked together to form a protective barrier. Each B cell receptor can bind one of these molecules. And, since there are many B cell receptors on any given B lymphocyte, many receptors may bind at the same time to adjoining "pickets". Since receptors can move about on the cell surface, they may clump together as shown in the figure at the right. This cluster of receptors brings the Ig-alpha and Ig-beta proteins of adjacent molecules close together. And when they are in proximity they enzymatically alter each other, causing a chain of enzymatic reactions that end up signaling to the cell that something wicked is afoot. These signals help activate the B cell, causing it to become an antibody synthesizing powerhouse, a topic for next time.
Adaptive immunity is so complex it's difficult to know where to begin. Do I start with the cells or the molecules? Actually, I've decided to do neither. Since my main interest is in genetics, I'm going to lead with genes. In particular, the genes for antibodies.
Recall that antibodies are proteins expressed by B lymphocytes (small white blood cells) that consist of four chains of two types: heavy and light. There's one kind of heavy chain and two light ones (called lambda and kappa). Since DNA sequences encode proteins, that must mean that there are three genes responsible for specifying antibody molecules. And there are. All big ones. The heavy chain gene is found on the 14th chromosome in humans. It's very long, about 1.24 million bases. One of the light chain genes, kappa, located on chromosome two, is even longer, at 1.8 million bases. The other light chain gene, lambda, is found on the 22nd chromosome and stretches over one million bases. (I'm omitting the fact that humans have two of each chromosome, and therefore twice as many antibody genes as I've described. The immune system deals with these two sets of genes in an interesting way. Stay tuned).
But there's a problem. I've already noted that there are at least millions of different antibodies. Put another way, there are millions of antibody proteins each with a distinct amino acid sequence. You'll recall from elementary molecular biology that the sequence of a protein is specified by a gene. That must mean that there must be millions of antibody genes. Which is it? Just three or millions of genes? The fact is that if there were millions of antibody genes and each one was more than a million bases long, the genome would need to be a thousand times bigger than it is. On the other hand, how can three genes dictate the sequence of millions of proteins?
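The size mismatch can be checked on the back of an envelope. The Python sketch below is only illustrative: the genome size of roughly 3 billion bases is my assumption (it isn't stated above), and "millions" of genes is arbitrarily taken to mean three million.

```python
# Back-of-the-envelope check; every figure here is a round-number assumption.
HUMAN_GENOME_BASES = 3_000_000_000    # assumed: roughly 3 billion bases
BASES_PER_ANTIBODY_GENE = 1_000_000   # "more than a million bases long"
HYPOTHETICAL_GENE_COUNT = 3_000_000   # assumed stand-in for "millions"


def required_genome_ratio(gene_count, bases_per_gene, genome_bases):
    """How many genomes' worth of DNA the hypothetical genes would occupy."""
    return (gene_count * bases_per_gene) / genome_bases


ratio = required_genome_ratio(
    HYPOTHETICAL_GENE_COUNT, BASES_PER_ANTIBODY_GENE, HUMAN_GENOME_BASES
)
print(f"Needed: about {ratio:,.0f} times the real genome")
```

With these round numbers the answer comes out at about a thousand genomes' worth of DNA, which is the scale of the problem described above.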
The answer to this conundrum is that antibody genes have some unique properties that allow a single DNA sequence to specify more than one protein. In simplest terms, it accomplishes this task by randomly mixing and matching gene segments, taking a piece from one region and pasting it onto a piece from another and deleting the region between the two. (There is another mechanism for generating multiple proteins from a single gene: alternative splicing. That process is entirely different than the one described here. Alternative splicing occurs in the RNA transcript. Antibody diversity in the variable regions of light and heavy chains is generated by changes in the DNA).
At the right is a diagram of the human heavy chain gene. It's composed of a series of different segments. There are 40 or so "V" or variable segments, each about 300 bases long. Adjoining them are about two dozen relatively short "D" or diversity segments. Next come six "J" or joining stretches, each some 3 to 6 dozen bases long. Finally, some nine or so "C" or constant regions make up a region near the end of the gene. Both the kappa and lambda light chains are similarly structured although light chains lack D sequences and have different numbers of V, J, and C regions.
In a complicated series of enzymatic reactions, these gene pieces are randomly assembled together into a different functional gene in each antibody producing cell (see three of the possible arrangements in the figure above). Notice that substantial portions of the gene are discarded in this process. It's also important to emphasize that these rearrangements only occur in B lymphocytes. In all other cells in the body the heavy chain and two light chain genes remain unmodified.
The events leading to antibody diversity begin with the joining of one of the D segments to one of the J's. Intervening DNA is thrown away. Subsequently, a V segment joins the group, again with the removal of the bases in between. The J's remain attached to the C regions. The result is an edited gene consisting of one V, D, and J region, the three of which code for the variable portion of the heavy chain, followed by the constant region segments.
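For readers who think better in code, here is a toy Python sketch of the two joining steps just described. It is a cartoon, not a model of the real enzymology: the segment labels (V1, D1, and so on) are made up, the counts are the rough ones given earlier, and the gene is treated as nothing more than an ordered list from which the intervening pieces are cut.

```python
import random

# Hypothetical segment labels; counts follow the rough figures in the text.
V_SEGMENTS = [f"V{i}" for i in range(1, 41)]   # 40 or so variable segments
D_SEGMENTS = [f"D{i}" for i in range(1, 24)]   # about two dozen diversity segments
J_SEGMENTS = [f"J{i}" for i in range(1, 7)]    # six joining segments
C_SEGMENTS = [f"C{i}" for i in range(1, 10)]   # nine or so constant segments


def rearrange_heavy_chain(rng):
    """Return (rearranged_locus, (v, d, j)) for one simulated cell."""
    locus = V_SEGMENTS + D_SEGMENTS + J_SEGMENTS + C_SEGMENTS

    # Step 1: join one D to one J, discarding everything between them.
    d = rng.choice(D_SEGMENTS)
    j = rng.choice(J_SEGMENTS)
    d_pos, j_pos = locus.index(d), locus.index(j)
    locus = locus[:d_pos + 1] + locus[j_pos:]

    # Step 2: join one V to the DJ unit, again discarding what lies between.
    v = rng.choice(V_SEGMENTS)
    v_pos, d_pos = locus.index(v), locus.index(d)
    locus = locus[:v_pos + 1] + locus[d_pos:]

    return locus, (v, d, j)


locus, vdj = rearrange_heavy_chain(random.Random(0))
print("Joined segments:", "-".join(vdj))
print("Segments left in the gene:", len(locus), "of", 40 + 23 + 6 + 9)
```

Running it with different seeds gives a different V-D-J trio each time, which is the whole point: each cell ends up with its own edited gene. The sketch ignores what happens at the junctions themselves.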
How many different antibody producing cells could result from these operations? Simple probability tells us that all the possible combinations in the variable region can be calculated by simply multiplying the number of potential V, D, and J segments. This comes to 45 X 23 X 6, or about 6,000. There's a lesser number of combinations possible from the V-J joining in light chains, say 200. Multiplying 6,000 by 200 yields a little more than a million. While this is an impressive number, it doesn't take into account the fact (see below) that many combinations of heavy and light chains don't yield functional antibodies. This leaves us short of the millions of different antibodies that are claimed for the immune system. Quite a bit of added diversity comes from the results of joining the D to J and V to DJ segments together. Because this process is imprecise, bases are added, lost, and changed at random at the junctions. While inefficient, often leading to antibodies that can't possibly bind to anything, it results in a tremendous increase in the number of DNA sequences. As we'll see, the immune system employs an evolutionary process that rids the body of defective antibodies and selects for the ones that work well.
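The multiplication is easy to verify. This short Python sketch uses only the segment counts quoted in the paragraph above; the 200 light chain combinations is the same rough guess, not a measured value.

```python
# Segment counts taken from the paragraph above.
HEAVY_V, HEAVY_D, HEAVY_J = 45, 23, 6
LIGHT_COMBINATIONS = 200   # the rough "say 200" figure for light chain V-J joining

heavy_combinations = HEAVY_V * HEAVY_D * HEAVY_J
total_pairings = heavy_combinations * LIGHT_COMBINATIONS

print(heavy_combinations)   # 6210
print(total_pairings)       # 1242000
```

The exact heavy chain product is 6,210, which rounds to the 6,000 used above, and pairing each with 200 light chains gives about 1.2 million combinations before any junctional changes are counted.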
As I've noted, all these gene rearrangements occur only in B lymphocytes, the cells responsible for antibody production. However, somewhat similar changes to DNA sequence occur in T cells, the other major player in the adaptive immune system, the one responsible for cellular immunity. In T cells, the proteins analogous to antibodies are the T cell receptors. There are two genes responsible for the synthesis of 95% of these proteins (I've omitted the two other genes that code for the minor forms of the T cell receptor for simplicity's sake). The T cell receptor beta chain gene, about 600,000 bases long, lies on chromosome 7. The T cell receptor alpha chain gene, about a million bases long, is located on chromosome 14. Like the antibody genes, the T cell receptor genes carry repeated V, D, and J segments (D is absent from the alpha chain sequence) and these are rearranged in a similar manner to produce many millions of T cell receptors. A cartoon portrait of the T cell receptor is shown at the right. As in antibodies, these receptors bind to their targets via the variable regions shown near the top of the figure.
Back to B lymphocytes... Their development occurs in the bone marrow in humans and in a specialized structure called the bursa of Fabricius in birds, hence the "B". One of the first events in a B cell's life is the rearrangement of the heavy chain's gene on one chromosome as described above. Remarkably, if this process is successful, the heavy chain gene on the other chromosome doesn't undergo rearrangement. If not, it does. In either case, only one of the two chromosome 14's bearing the heavy chain gene participates in antibody formation. In those cases where both chromosomes can't specify functional proteins, the cell commits suicide, programmed cell death.
If a pre-B cell survives this checkpoint, it somehow tells the kappa light chain gene to begin rearranging its segments. Again, this process is restricted to only one of the two chromosomes. If both chromosomes fail to make a complete kappa light chain protein, only then will the lambda light chain gene come into play. This mechanism ensures that the B cell will only bear one of the two light chains. If both kappa and lambda chains aren't functional, the cell will die. In summary, a B lymphocyte gets six shots at survival. Two come when the heavy chain genes on each of the two chromosomes rearrange, and four after rearrangement of the kappa and lambda light chain genes.
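The order of those six shots can be laid out as a small Python sketch. The function name b_cell_fate and the callback attempt_succeeds are hypothetical, invented for illustration; the callback simply stands in for whether a given rearrangement yields a functional chain, something the real cell finds out the hard way.

```python
def b_cell_fate(attempt_succeeds):
    """Walk through the six attempts in the order described in the text.

    attempt_succeeds(locus, allele) -> bool is a hypothetical stand-in for
    whether that rearrangement produces a functional chain.
    """
    # Shots 1 and 2: the heavy chain gene on each of the two chromosomes.
    heavy = next((a for a in (1, 2) if attempt_succeeds("heavy", a)), None)
    if heavy is None:
        return "dies: no functional heavy chain"

    # Shots 3 to 6: both kappa genes first, then both lambda genes.
    for locus in ("kappa", "lambda"):
        for allele in (1, 2):
            if attempt_succeeds(locus, allele):
                return f"survives with heavy allele {heavy} and {locus} allele {allele}"
    return "dies: no functional light chain"


# Example: the first heavy chain attempt and both kappa attempts fail.
failures = {("heavy", 1), ("kappa", 1), ("kappa", 2)}
print(b_cell_fate(lambda locus, allele: (locus, allele) not in failures))
```

In this example the cell falls back on its second heavy chain gene and on lambda, and still survives. Note that later attempts are never made once an earlier one succeeds, which is how the cell ends up with just one heavy chain and one light chain.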
A surviving B cell will now transcribe its rearranged gene to form an RNA. The protein specified by this processed transcript is capable of binding an antigen. But before it can do so, it must survive an additional test. I'll put that off for the next post.
Before entering further into the labyrinth of adaptive immunity, I'm pleased to share yet another text that has helped me explore the subject. It's called "Undergraduate Immunology: A Textbook for Tablets and other Mobile Devices". The author is Clett Erridge, a Senior Lecturer at Anglia Ruskin University in England. I downloaded it from Amazon to a Kindle app on my iPad. My impression so far, after perusing a half dozen chapters, is that it's aimed at the same audience as Lauren Sompayrac's book. And it looks good. I particularly liked the organization of Chapter 6, in which he describes the limits of innate immunity and writes how they are addressed by the adaptive system. I'm going to use a variant of his approach in what follows as a way of introducing the features of adaptive immunity.
Question 1 - The innate system can only respond to a fixed number of evolutionarily conserved features. If a microbe has developed a way of hiding these features or creating new ones, the innate system is largely helpless. How does adaptive immunity avoid this seemingly intractable problem?
The adaptive system gets around this issue by brute force. It creates millions of sentinel cells each bearing a different detector/receptor on its surface. It does so randomly. The system doesn't make use of prior knowledge about what kind of invader is threatening. The idea is to synthesize so many receptor shapes that one is bound to be complementary to whatever comes along. As you might imagine, this process is extraordinarily wasteful. The overwhelming bulk of cells never encounter a shape complementary to the receptor that they carry.
Question 2 - With millions of different cells each with a specific receptor, how can any given cell respond effectively?
This has an easy answer. Once a cell finds something that it can bind to, it responds by proliferating, making many identical copies of itself. These copies are a clone.
Question 3 - Receptors are proteins. Proteins are specified by genes. If an organism is going to make millions of different receptors, it will require millions of genes. But it's known that there are only tens of thousands of genes in humans and other vertebrates. How do you synthesize millions of proteins with only a limited set of genes?
The immune system is ingenious. It mixes and matches pieces from a relatively small number of gene segments in millions of combinations to build up a myriad of receptors, each with a different sequence. More on this next time.
Question 4 - But this strategy is sure to create receptors that bind to molecules that don't pose danger. How does the immune system avoid making antibodies to one's self?
Recognizing that reactions against self are a serious problem (they can lead to autoimmune diseases), adaptive immunity rids itself of cells whose receptors target the host. How this is accomplished is the subject of another post.
Question 5 - What about viruses and microbes that infiltrate the interior of cells? How does the adaptive immune system deal with these agents whose fingerprints are hidden?
It makes use of the major histocompatibility complex (the MHC) that I described earlier in connection with natural killer cells. Remember, the MHC displays little pieces of internal proteins on the surface of cells. These little pieces can represent viral or bacterial proteins that have invaded a cell.
Question 6 - What about memory, the ability of the adaptive response to react more strongly to a second attack, one that may have occurred many years previously?
Some cells from a clone that reacted to a foreign antigen are set aside. They're long lived and ready to proliferate rapidly if challenged with the same antigen again.
I'll address these matters in more detail in subsequent posts.
The adaptive immune response, at least the version practiced by humans, appears to be a novel development that is restricted to vertebrates. Innate immunity, on the other hand, is found in all organisms, but only vertebrates seem to have developed a sophisticated mechanism that is able to recognize nearly every molecule that enters their domain, an apparatus that remembers past events and can marshal an "improved" response upon encountering an invader for a second (and later) time. This state of the art form of immunity is widely distributed among vertebrates, and even "primitive" fish, like sharks and rays, share this same system with their more "advanced" cousins.
But is it true that only vertebrates have managed to develop an adaptive immune system? Perhaps not. A recent article suggests that fruit flies, my favorite organism, seem to make use of an adaptive response of sorts when faced with viral infection. Although the mechanism that they use for defense against this kind of attack is quite different than that of vertebrates, it does result in a kind of immunological memory that is characteristic of vertebrate immunity. There are other examples. For instance, there is some evidence that snails can synthesize a diverse group of proteins that have antibacterial activity. Even bacteria and archaea, organisms that are considered near the bottom of the evolutionary tree, make use of CRISPR to mount a specific defense against invading viruses, incorporating viral DNA into their genomes so that their response can target a specific marauder.
One other fascinating (to me) point is worth discussing before I get off the subject of the possible diversity of adaptive immunity. There is a tiny group of living vertebrates, one that includes hagfish and lampreys, that do adaptive immunity differently. These jawless fish (all other vertebrates have jaws and are placed in a separate superclass) don't employ immunoglobulins (see below for a definition of this term) in their immune system. Instead, they make use of an entirely different protein in the construction of the molecules they use for defense. What this indicates to me is that the adaptive response can be fashioned in quite different ways, and that the scientific community may well have missed finding adaptive immune systems in groups other than vertebrates because they are so different from the systems we know about. Of course that's only a guess from a relatively uninformed observer.
OK. Let's dig in. Adaptive immunity can be divided into two parts. There's the humoral response and there's cellular immunity. I wondered about the first term. It doesn't mean that this branch of immunity is funny. The word "humoral" has roots that go back to the fourteenth century when physicians thought that one's state of health was dictated by bodily fluids, the humors. The word now means fluid-based. In the current context it indicates that the responsible parties for humoral immunity occur in the non-cellular portion of the blood.
The main players in the humoral system are antibodies, a term that was coined in 1891 by the legendary physician/microbiologist/immunologist and Nobel Prize winner, Paul Ehrlich. Antibodies are proteins, immunoglobulins, that are specialized for sticking to entities called "antigens" - meaning any molecule that upon capture by an antibody can induce an immune response. A cartoon version of an antibody shown in the act of binding to an antigen is depicted at the right. Below it is a more detailed and realistic picture sans antigen. Notice that this particular antibody consists of four chains, two longer "heavy" chains, and two shorter "light" chains, all of which bind rather tightly together to make a functional molecule.
I should say a bit about the bottom part of the illustration at the right. What you're looking at is a three dimensional rendering of the surface of an antibody. It was produced using two resources. The first is a collection of all the "solved" protein (and nucleic acid) structures in the universe. Solved in this case means that the positions in three dimensions of virtually all the atoms in the molecule have been worked out. These coordinates, essentially a series of numbers assigned to each atom, are stored in the PDB, the Protein Data Bank, a public facility managed by Rutgers University and the University of California, San Diego. I searched the PDB for an antibody and found the one pictured. In order to display the structure and to manipulate it, I made use of a second public resource, "UCSF Chimera", a computer program from the University of California in San Francisco. It takes the raw data from a PDB file and creates a three dimensional depiction that can be manipulated on a computer screen. There are several such programs, but I've gotten to know Chimera fairly well and I use it often despite the fact that it is rather complicated. If you'd like to explore the structure of antibodies and other proteins using a simpler program, look up "molecular modeling software" in a search engine. Or simply go to the PDB site (the URL is above) and use the modeling program found there.
Antibodies are unique. All other proteins bear a specific sequence of amino acids, the monomeric units from which proteins are constructed. The amino acid sequence of a protein is dictated by a corresponding gene. In any particular organism, for any specific protein, there may be minor differences in the gene that codes for the amino acid sequence, but in general these are limited to one or two substitutions. Antibodies by contrast, occur in millions of forms, each with a different amino acid sequence, each one with the capability of binding a different antigen, each one dictated by the sequence of a different gene. The complex story of how this vast assembly of proteins is generated is the subject of the next posting.
Let me try to bring together the salient points that I've learned so far about innate immunity.
Point 1 - There are two parts to the immune system: innate and adaptive immunity. What's the difference? For one thing, innate immunity is older in evolutionary terms than the adaptive system. Organisms up and down the tree of life utilize various versions of it. It's also faster to respond to attack, taking minutes and hours rather than days. But the main difference between the two systems is that the innate defense is preprogrammed. It has an evolutionarily designed repertoire of defensive agents and can make no more (at least not in less than millions of years). In general, if confronted with a threat that it hasn't anticipated, it isn't able to respond. The adaptive system, which seems restricted to vertebrates, has adopted a different strategy that I'll expand upon in later posts. But briefly, its game plan is to construct millions of different weapons at random without worrying about whether they'll be effective or not. One other difference between the two systems: innate immunity apparently hasn't any memory, so a second attack by the same agent doesn't result in a heightened response. That's not true of the adaptive system.
Point 2 - The systems are distinguishable but inseparable. That's something that I haven't emphasized enough, but is extremely important. Although most of my references divide immunity in two, they also warn that there exists no real barrier between them. One often works in conjunction with the other. I'll try to point out where this occurs in later posts.
Point 3 - The various components of the innate system make use of receptors to distinguish between the harmful and harmless. Receptors are proteins that lie in the membranes of cells, with portions protruding inside and out, one end binding to intruders. Once an enemy has been detected, receptors pass signals across the membrane, thereby releasing a cascade of events that prime the cell for an appropriate response.
Point 4 - Macrophages are sentinel cells. They are long term residents in many tissues and send out signals upon a microbial invasion. Upon activation, they ingest microbes and cells that are killed or injured. And they secrete cytokines to alert other immune components to come to their aid.
Point 5 - Neutrophils are the major phagocytic cells of the body. They move swiftly in the bloodstream, awaiting a signal to slow down. When macrophages send out an alarm (through the release of cytokines), they slow down, stop, leave the blood stream, and become killing machines.
Point 6 - The complement system consists of blood borne proteins that contribute to the innate and adaptive immune response. It also is responsible for the direct killing of bacteria and some viruses. Complement makes use of proteolytic cascades to initiate and amplify its effects.
Point 7 - Natural killer cells target viruses, injured cells, and tumors. Unlike the other cellular components of innate immunity, their main function is to detect invaders that have gotten inside cells. They do their job by making use of the major histocompatibility complex, a system for displaying fragments of the proteins located in the interior of the cell on its outer membrane.
Point 8 - Mast and dendritic cells are sentinels. Mast cells rapidly release toxic chemicals upon activation. Dendritic cells are the major link between innate and adaptive immunity.
That's a lot of food to ingest in a relatively brief sitting. Take a deep breath. The next topic is even more filling.