Posts by Yingke Liang

I don't really read but I sometimes write.

The Experience of Research as an Undergrad

Ah, research! Cutting-edge technology, exciting chemicals, pushing the limits of knowledge with your own two hands! But is that all there is to it?

The reality is that pushing the limits of knowledge requires a lot of inspiration, and takes a really freaking long time. Doing research is not your run-of-the-mill undergraduate lab. There, you do one experiment, say, synthesize aspirin, which has already been well characterized and done numerous times by a vast number of people. You then write about your specific attempt and all is well and good. In research, you don’t have the luxury of redoing previous renditions of the same experiment: they’ve already been done, so what’s the point?

Instead, you need to find a new topic to study so that you can appease: 1) your supervisor, 2) your advisory panel, and 3) a funding agency, if you get there. Each of these requires a more original and “exciting” experiment, and often those are quite hard to find. In fact, losing your research topic because someone else has already studied it is a real risk. Usually, people tend to find new topics by looking into similar topics and tweaking them slightly, or delving deeper into a topic that has only been generally covered. This involves reading A LOT of papers so that you can become an expert on the current status of the research area you’re interested in. Also, keep in mind that while you will have help along the way, ultimately you must decide on the topic on your own because it is YOUR project, not your supervisor’s; otherwise, what’s the point?

Finally, after digging through troves of papers, you have solidified your research topic and you are pretty confident that it will be exciting enough to give you a degree (let’s not get ahead of ourselves to the grant stage yet). However, since your topic is so new and exciting, you have no idea how you’re going to do it or if it will even work. You can ask around for help from people in your surroundings, but odds are they are not familiar enough with your topic. After all, you chose this topic specifically because it is new, and nobody has really researched it yet. So how do you proceed? By, guess what, reading more papers! In this case reading papers is like an extension of asking people in your surroundings. You won’t get the exact answer you’re looking for, but you can get an approximation of what you can do to get results. Also, thanks to modern technology, there are now internet resources such as ResearchGate to help you in addition to reading a ton of papers, so all is not lost.

After much scrounging around, you are finally ready to plunge into research, exciting! Time to collect data!

…but collecting significant data also takes a long time and along the way you will inevitably have experiments that fail, reagents that degrade, part of your project getting scooped, etc. Eventually, you will succeed in enough experiments to get data to write a thesis and get your graduate degree, but if you plan on pursuing academia look forward to having to do this all again for your PhD! And then your post-doc! And maybe another post-doc! And then if a university accepts you into their faculty, your assistant professorship, which is like a more intense post-doc! And at the very end of the road, when you are finally offered tenure, you’ll realize that the things you have been doing this whole time are the same things you’ll be doing from now on as well: reading papers, creating experiment proposals, reading more papers, doing experiments, etc. This is also why professors always seem so old – getting to that stage takes a long time.

All this may sound really daunting and maybe even discouraging, but this is just a run-through of the drier parts of research. When you’re knee-deep in some sprawling experiment (and they always become sprawling), it’ll seem like there is never enough time. When your experiment fails and you have to read more papers, you’ll learn cool things you didn’t know even from all your previous education. You’ll meet people who will be experts about things you’ve never even heard about. You’ll get to use cutting-edge technologies and exciting chemicals just like you thought you would. And, at the end, when your experiments do succeed and you have collected enough data on top of all the knowledge you have amassed during the process, you will really have discovered something that nobody has ever seen before and nobody yet knows about, until YOU tell them about it! Now tell me that isn’t the coolest thing ever. (You really can’t.) The long path of research definitely has many downer moments and dry patches, but it is equally full of excitement and discovery. As long as you have patience and are undaunted by occasional failures, you truly will be on the frontline of pushing the boundaries of human knowledge.


Observing Molecular Machines using Cryo-EM

In 2017, the Nobel Prize in Chemistry was awarded to Jacques Dubochet, Joachim Frank, and Richard Henderson for their development of cryo-electron microscopy (cryo-EM). In 1986, Ernst Ruska, and Gerd Binnig and Heinrich Rohrer, won the Nobel Prize in Physics for designing the first electron microscope and the scanning tunneling microscope, respectively. In 1982, Aaron Klug won the Nobel Prize in Chemistry for developing crystallographic electron microscopy. With so many Nobel Prizes having been awarded for electron microscopy, what makes this recent development different from the other two?

The keywords for cryo-EM are proteins and resolution. Binnig and Rohrer’s scanning tunneling microscope, designed way back in 1981, had a maximal lateral resolution of 0.1 nm and a maximal depth resolution of 0.01 nm, a resolution high enough to resolve individual atoms. It does this by slowly hovering a needle only one atom wide at the tip over an extremely flat surface of a solid that is made of a uniform lattice of atoms and reading the disturbances in the voltage difference between the surface and the microscope tip.

However, proteins (and other macromolecules but we are less excited about them: “another DNA structure, woooooo,” said nobody ever after the double-helix structure was determined in 1953) are these wild things that require atomic-level resolution to determine how the individual amino acids are oriented, but are also absolutely gigantic molecules where the 3-dimensional configurations of said amino acids matter just as much as, if not more than, their identities. Therefore it is no surprise that just slowly hovering a really thin needle over a uniform layer of proteins doesn’t really get you to this 3D structure. Not to mention that proteins are really delicate flowers and will precipitate and become some amorphous aggregate if you so much as look at them funny (okay I exaggerate, but only slightly). A good resolution of a protein for structural biologists averages at around 2.5 angstroms (Å, 1 Å = 0.1 nm), though there are many that have a higher resolution. For reference, a covalently bonded carbon atom has a diameter of 1.5 Å.
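To keep the units straight, here is a minimal Python sketch that puts the length scales quoted so far on one scale. The numbers are the ones given in this post; the function and variable names are made up purely for illustration.

```python
# Convert between angstroms and nanometres (1 Å = 0.1 nm, as stated in the post).
ANGSTROM_PER_NM = 10.0


def nm_to_angstrom(nm: float) -> float:
    """Convert a length in nanometres to angstroms."""
    return nm * ANGSTROM_PER_NM


def angstrom_to_nm(angstrom: float) -> float:
    """Convert a length in angstroms to nanometres."""
    return angstrom / ANGSTROM_PER_NM


# Values quoted in the text; the dictionary keys are illustrative labels only.
length_scales_angstrom = {
    "STM lateral resolution": nm_to_angstrom(0.1),
    "STM depth resolution": nm_to_angstrom(0.01),
    "typical good protein structure resolution": 2.5,
    "covalently bonded carbon atom diameter": 1.5,
}

# Print from smallest to largest length.
for label, value in sorted(length_scales_angstrom.items(), key=lambda kv: kv[1]):
    print(f"{label}: {value:.2f} Å = {angstrom_to_nm(value):.3f} nm")
```

Seen this way, a 2.5 Å protein structure is resolved at a scale a bit larger than a single carbon atom, which is why it is enough to place individual amino acids.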

The current gold standard for protein structural determination is x-ray crystallography, where an x-ray beam is fired into a protein crystal and, due to the crystalline nature of the proteins, the single beam diffracts into many different directions which are caught on a screen. The beam’s diffraction angles and intensities can then be measured to produce a 3D electron-density map, from which the individual atoms can be reconstituted to eventually yield a complete 3D protein structure. However, protein crystallization often requires the use of poisonous salts and precipitants, and extreme pHs that would never be found in living organisms, and very large proteins and membrane proteins are often impossible to crystallize. (Also the actual crystallization process is very luck-based and even for regular sized proteins may or may not happen. Trying to grow protein crystals is very good for building character.) This is where cryo-EM, our star, finally comes into the scene.

Electron microscopy usually uses some form of electron interactions between a source and the object to be imaged. In order to not disturb these delicate interactions, the imaging has to take place in an absolute vacuum, which falls under the “will aggregate protein” condition category. Instead, a pure sample of a protein of interest is frozen down and an electron beam is fired onto the frozen protein to produce an image—a “trace”—on a detector. Other than the freezing instead of vacuum, this sounds like pretty standard electron microscopy, no? The key advancements were figuring out how to flash freeze water-soluble proteins (because normal freezing could also fall in the “will aggregate protein” category, and may produce ice crystals which are NOT protein) and how to get the detectors good enough to achieve the really high resolution but also really large width required to image proteins.

The protein in the frozen sample is found in various orientations and from taking an image of hundreds of different orientations of the same protein using the new, high-tech detector that was recently developed, a computer can be used to generate a complex electron density map comparable to those obtained through x-ray crystallography, which can then be used to generate the true structure of the protein in high resolution and accuracy.


An example of cryo-EM images of a protein that together can be used to generate a 3D structure. Image: Maofu Liao, Harvard Medical School.

The beauty of cryo-EM is that it can do everything that x-ray crystallography cannot: image proteins in a non-poisonous environment and image very large proteins. It also doesn’t require painstakingly screening every possible combination of precipitant, salt, and pH possible and hoping and praying that out of one of those potentially thousands of combinations a crystal will grow within your tenure in that lab.

But you may be wondering, “who cares about protein structures anyways?”. Protein structures are very important in drug development, and are also important in understanding the molecular mechanisms of both healthy and disease states. Viruses are also basically protein assemblies, and determining their structures is very important in understanding their pathology and behaviour on a molecular scale. Also, they look cool! (Protein aesthetics is usually how people get suckered into structural biology). Appreciate this nice picture of the Zika virus obtained through cryo-EM.

Cryo-EM structure of the Zika Virus. PDB: 5IRE Sirohi et al. (2016) The 3.8 angstrom resolution cryo-EM structure of Zika virus. Science. 352, 467-470

Happy lurking the cryo-EMs!

Research Awareness Day 2017

November 25th, 2017 marked the annual Research Awareness Day (RAD) held by the Biochemistry Undergraduate Society (BUGS). One of the most prominent undergraduate research events of the year with over 80 undergraduate attendees, RAD featured a full day of rapid-fire presentations by 10 different biochemistry professors, lunch, and a poster fair featuring graduate and undergraduate students alike. With the diversity in the research topics of the different professors, there was something for everybody, not just those majoring in biochemistry.

Once again, RAD 2017 was a great event to learn more about research, network with profs, and to get excited about science. You definitely do not want to miss out on RAD if you have the chance, but for those that didn’t make it to RAD 2017, here’s a glimpse at what the professors talked about:

Dr. Albert Berghuis

As the chair of the biochemistry department, Dr. Berghuis gave a brief snapshot of biochemistry at McGill, past (biochemistry is one of the oldest departments at McGill!) and present, before presenting his lab and his current research. The Berghuis lab centers around structural biology and drugs: the development of anti-cancer drugs, the identification of fungal drug targets, and various other drug related topics. But no topic is as pressing as the central feature of the Berghuis lab: antibiotic resistance. Taking a structural biology approach, using techniques such as x-ray diffraction, NMR, scattering, and electron microscopy, the lab seeks to use structures of bacterial enzymes that confer antibiotic resistance to develop new, better antibiotics.

Dr. Jose Teodoro

The Teodoro lab is in equal parts biochemistry and virology, as their primary focus is to learn how to kill cancer cells using viruses that only seem to kill cancer cells by homing in on specific cellular features that only cancer cells possess. For example, the chicken anaemia virus, which causes anaemia in chickens, only targets and kills rapidly dividing cells by interacting with the Anaphase Promoting Complex/Cyclosome. While this destroys chicken hematopoietic stem cells, it is fantastic news for cancer biologists since cancer cells also tend to divide rapidly. Furthermore, the chicken anaemia virus is small, and its only function is to target and destroy rapidly dividing cells. The Teodoro lab also works on p53, a very well known gene that encodes a tumour-suppressing transcription factor, and its effects on tumor angiogenesis.

Dr. Ian Watson

The Watson lab focuses on melanocyte biology in melanoma. 50% of melanomas have a hotspot mutation in BRAF, and 25% have a hotspot mutation in NRAS, both of which are mitogen-activated protein kinase (MAPK) regulators, and are druggable targets. The goal of the lab is to develop a therapeutic strategy for long-term survival, as many current techniques show initial promise but no increase in the rates of long-term survival. The Watson lab created stable Cas9-encoding mice with which genome manipulation can be easily done, and they also collect samples from patients who underwent checkpoint inhibition therapy, so they have excellent models for melanoma, the poster child for precision therapy of the future.

Dr. Alba Guarné

One of the newest additions to the McGill biochemistry department, hailing from McMaster University, the Guarné lab studies genome stability and DNA-protein interactions. DNA needs to be extremely condensed to fit into the tiny nucleus of the cell. Almost all DNA processes require the DNA to be decondensed. Once this occurs, the DNA is under constant attack by many components of the cell. Over time, this constant attack would lead to significant mutations in the DNA if it weren’t for the DNA repair mechanisms that prevent the accumulation of mutations. One of the projects the Guarné lab is currently undertaking is the analysis of DNA mismatch repair, specifically studying how the mechanism can discern which of the two DNA strands contains the mutation. All this is done through structural biology techniques such as x-ray diffraction and electron microscopy.

Dr. Janusz Rak

The Rak lab, at the Montreal Children’s Hospital, is a cancer and angiogenesis laboratory, asking questions related to the complexities of diseases. One disease the Rak lab studies specifically is glioblastoma, a type of brain cancer that kills nearly 100% of patients due to the tendency for the tumour to hemorrhage in the brain, and its peculiar penchant for forming blood clots elsewhere, such as the leg, demonstrating the interactivity of cancer. The lab is interested in the unconventional connectivity of cells, one that involves neither the neural nor the endocrine system. Glioblastoma cells exemplify this lack of convention as they seem to communicate using extracellular vesicles, which Dr. Rak described as “motherships that can change things in different ways”. Techniques used in Dr. Rak’s lab include atomic force microscopy and liquid biopsy.

Dr. Uri David Akavia

The Akavia lab is interested in metabolism bioinformatics in cancer, conducting computer modelling of the metabolism of the entire cell, specifically in cancer cells. The lab also intentionally changes genes known to be involved in metabolism using Cas9, and observes and models the consequences. (This leads to some pretty wild flow charts). Ultimately, the Akavia lab seeks to examine how cancer metabolism makes the cell resistant to treatment or developing cancer, and to develop treatment options from the results.

Dr. Bhushan Nagar

The Nagar lab uses structural biology techniques, specifically x-ray crystallography, to decipher molecular mechanisms that underlie diseases. The lab has a diverse range of research interests, such as analysis of IFIT proteins, members of the innate immune system which interact with viral RNAs to block their replication; AvrA, a bacterial protein that blocks immune signalling in the host cell to promote successful infection by the bacteria; and lysosomal enzymes, a subset of acid hydrolases whose mutations lead to lysosomal storage diseases. The Nagar lab hopes to use information gleaned through structural analysis to develop better therapeutics, such as drugs and pharmaceutical chaperones, for associated diseases.

Dr. Alain Nepveu

The ultimate goal of the Nepveu lab is to develop a novel cancer treatment by exploiting vulnerabilities of the cell (not JUST rapid divisions, but other characteristics as well), as well as examining base excision repair. The Nepveu lab uses mouse models and a lot of different assays to collect the data. Dr. Nepveu also stressed the importance of starting research early, and that you don’t need to have prior research experience to conduct interesting experiments in the lab — the skills you learn from making pizza at your part-time job can be transferred to running a PCR! But ultimately, you should not be shy to approach professors to ask about getting research experience.

Dr. Jason Young

The Young lab’s primary focus is on chaperones, specifically the Hsp70s and 90s, which anyone can learn about in GREAT detail if they take BIOC 212/ANAT 212 from the man himself, but Dr. Young’s RAD presentation was about how to get involved in research, in which he also stressed that you don’t need research experience to get involved in research at the undergraduate level, and that there are many classes such as the 396s and independent research courses available to students, providing a helpful and resourceful end to the rapid-fire talks.

This month on the PDB: December

Hi everybody! 162 structures were released from November 28th to December 5th, ranging from the typical Homo sapiens proteins to the zesty proteins of the Asian rice. Without further ado, let us take some time to peruse this newly released selection of never-before-seen macromolecule structures!

  1. HSP90 WITH [sic] indazole derivative, by Graedler, U., Amaral, M., & Schuetz, D. You know how part of what makes the PDB exciting is that the PDB releases never-before-seen structures? Well this is not one such case, but just like the lysozymes of last week, this title is notable in that it is actually the name of not one, but SIX separate releases for this week. Hsp90 (heat shock protein 90) is a chaperone protein that promotes the proper folding, stabilization, and activation of a client polypeptide in an ATP-dependent manner, commonly as a dimer. In teaching materials, Hsp90 usually looks like some weird ellipses lumped together to form some two-pronged rabbit head-looking thing, but now you can see what it really looks like, in six slightly different conformations! Indazole derivatives are Hsp90 inhibitors, and these monomeric structures of human Hsp90 are each binding a slightly different indazole derivative. Why are they monomeric and what’s with all these different indazole derivatives? Well, the paper hasn’t been published yet so you’ll have to keep on guessing for a while, but until then, you can entertain yourself by looking at the structures. The PDB accession codes are 5LNY, 5LNZ, 5LO0, 5LO1, 5LO5, and 5LO6. The following figure includes an approximate surface density of the protein, which is why it probably looks a little strange. You can see there are slight differences in conformation between the structures, but for the most part they look decently similar.
  2. Crystal structure of Os79 from O. sativa in complex with UDP, by Wetterhorn, K.M., Gabardi, K., Michlmayr, H., Malachova, A., Busman, M., McCormick, S.P., Berthiller, F., Adam, G., and Rayment, I. This is actually a generalization of the titles of four different structures, all of Os79 from O. sativa in complex with UDP, with and without different mutations and some with different sugar moieties. Oryza sativa, the humble Asian rice, one of the most essential cereal crops to society, is susceptible to head blight infections caused by fungi in the Fusarium genus, which affect many other cereal crops as well. Trichothecene toxins are a family of toxins that are responsible for the virulence of Fusarium head blight, and are toxic to humans and livestock as well as plants since they inhibit protein synthesis in the eukaryotic ribosome. Os79 is a UDP-glycosyltransferase: it adds a glycosyl moiety to a substrate using a co-substrate such as a UDP-glucose (a glucose bound to a uridine diphosphate). Glycosylating trichothecenes reduces their bioavailability and toxicity, so it would be super rad to engineer a protein that could easily glycosylate these toxins and prevent humanity from losing a lot of crops, some livestock, and a few humans every year. That’s a lot of lives saved! Os79 seems to be a good candidate for this type of protein engineering, but this goal is still in the fledgling stages as there are a lot of different trichothecenes and only one Os79 (Remember that enzyme-substrate specificity which we thought made enzymes so cool? Well now it’s sort of biting us in the butt. Enzymes are still cool though, obviously). The good news is that the group that released these structures also did some structural and activity analyses and determined what parts of the structure controlled its specificity. You can read all about it here: The PDB accession codes are 6BK0, 6BK1, 6BK2, and 6BK3.

Someone requested the experimental method metrics of the PDB releases, so here they are: out of 162 structures released this week, 146 were obtained through x-ray diffraction. 4 structures were obtained using cryo-electron microscopy, the up-and-coming technology in the world of structural biology (the 2017 Nobel Prize in Chemistry was awarded to the developers of cryo-EM), a lower number than usual. This week, there are a whopping 12 structures obtained through solution NMR. What are these mysterious 12 structures? The 4 cryo-EM structures? The other 144 x-ray diffraction structures? (You already saw 2 of them.) Go check them out yourself on this week’s PDB release! You can find it here: Happy lurking!
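For anyone who wants to double-check the tally, here is a minimal Python sketch of the breakdown. The counts are the ones quoted above; the dictionary keys and function name are illustrative, and the third category is labelled generically as the solution-based method.

```python
# Method counts as quoted in this post for the week's 162 released structures.
TOTAL_RELEASED = 162
method_counts = {
    "x-ray diffraction": 146,
    "cryo-EM": 4,
    "solution-based": 12,  # the third category quoted in the post
}


def method_shares(counts: dict[str, int]) -> dict[str, float]:
    """Return each method's share of all structures, in percent."""
    total = sum(counts.values())
    return {method: 100.0 * n / total for method, n in counts.items()}


# The three categories should account for every released structure.
assert sum(method_counts.values()) == TOTAL_RELEASED

for method, share in method_shares(method_counts).items():
    print(f"{method}: {method_counts[method]} structures ({share:.1f}%)")

# Two x-ray entries were featured above, leaving the rest to explore.
FEATURED_XRAY_ENTRIES = 2
remaining_xray = method_counts["x-ray diffraction"] - FEATURED_XRAY_ENTRIES
print(f"x-ray structures left to explore: {remaining_xray}")
```

The shares make the point of the paragraph concrete: x-ray diffraction accounts for roughly nine out of every ten structures this week.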

This week on the PDB: November 24th – November 30th

Welcome back to another week of “This week on the PDB”, where I show you a very small selection of new, never-before-seen protein structures uploaded to the Protein Data Bank, because proteins just keep on getting discovered.

  1. Lysozyme is an enzyme essential to our immune system that destroys bacterial cell walls. It is ubiquitous in the body: tears, mucus, blood, you name it, you can probably find lysozyme there. Lysozyme also holds a special place in the PDB, as not only was it the FIRST structure to be deposited in the PDB, back in the day when the whole PDB consisted of only 7 structures, but it is also the protein with the MOST different structures deposited, largely owing to a series of experiments conducted by Brian Matthews where he made various (and by “various”, I mean “hundreds of”) mutations to lysozyme. Lysozyme is also one of the most consistently crystallisable proteins and has consequently been used to study the protein crystallization process. This week, lysozyme has once again reared its head on the PDB, where Hosur et al. deposited four new structures of lysozyme, at different time increments in the guanidine hydrochloride and glycerol soaking process. The picture below is of the four structures and the ligands aligned with each other. The PDB accession codes are 5H6A, 5H6C, 5H6D, and 5H6E. The associated article has not been published yet, so the purpose of these structures is unclear, but be sure to check out all the other hundreds of lysozyme structures on the PDB, which can be accessed here.


    Hen Egg White Lysozyme native crystals soaked in precipitant solution containing 2.5 M guanidine hydrochloride and 25% glycerol.

  2. PolyA polymerase module of the cleavage and polyadenylation factor (CPF) from Saccharomyces cerevisiae. In order for a freshly transcribed piece of RNA to mature into a useful piece of mRNA for translation, its 3’ end must be cleaved, and a string of adenines added to form a polyA tail. CPF mediates this whole process, making it one of the MVPs of mRNA processing, but for some reason, we don’t really know how this amazing complex of proteins so important to our existence assembles itself (shocking, I know. I rank it high in my list of “why don’t we know this” along with the polymerases involved in DNA replication, where there has been ambiguity for quite some time now). At least, until Casañal and Kumar et al. solved the structure of nuclease, polymerase, and phosphatase modules of the CPF in the common baker’s yeast, S. cerevisiae, using electron microscopy with a resolution of 3.55 Å. You can read all about the structural features, such as the four beta propellers (ooooohhh), of the CPF from yeast (a great model eukaryote) here. The PDB accession code is 6EOJ.


    Cleavage and polyadenylation factor (CPF) from Saccharomyces cerevisiae

  3. Crystal Structure of pro-TGF-beta 1. This structure of the pro-transforming growth factor-beta 1 from the wild boar was obtained through x-ray diffraction by Zhao, B., Xu, S., Dong, X., Lu, C., Springer, T.A. with a resolution of 2.9 Å. TGF-beta 1 mediates many cellular functions, including growth, division and proliferation, differentiation, and apoptosis (programmed cell death). It is also important for the immune system, where it can interfere with other cytokines involved in the cellular immune response. In general, it is a very important cellular signalling protein, specifically a cytokine. It is also implicated in tumor development, and consequently holds relevance to cancer (as we all know, anything with a connection to cancer is a hotbed of potential biomedical research). This structure has a swap between the N-terminal prodomain and the C-terminal GF domain, which appears to affect how the proteins assemble with each other. The associated article is heavy on the structural biology and lighter on clinical implications, but can be read here. The PDB accession code is 5VQF.

The previous week, a total of 220 structures were released on the PDB. While many of them were simply different resolutions of the same protein (not to mention those four lysozyme structures), there is still a plethora of structures released every week. This just goes to show the expansiveness and diversity in the realm of macromolecules. You can find all the PDB releases this week here: See you next week!



This week on the PDB: November 7th – November 13th

Welcome back to “This Week on the PDB” for the week of Nov. 7th! This week on the PDB, 133 new structures were released, providing 3D visuals for macromolecules found in species ranging from the common human to the HIV virus. Here are some highlights:

  1. The solution NMR* structure of the Membrane Associated Segment of HIV-1 gp41 Cytoplasmic Tail was released by Murphy, Samal, Saad, and Vlach. This HIV-1 domain is important in mediating the recruitment and incorporation of the viral envelope protein (Env) into the virion. The full paper can be found in the November 7 issue of Structure, DOI: The PDB accession code for this macromolecule is 5VWL.
  2. Human glutathione s-transferase Mu2 complexed with BDEA in a monoclinic crystal form was released by Zhang et al. The structure was obtained through x-ray crystallography and had a resolution of 1.6 Å. The asymmetric unit consists of two chains, A and B. Glutathione s-transferases (GST) make up a family of proteins that add glutathione to proteins, which aids in their regulation. GST is important in cancer research, and deviations from normal GST structure appear to correlate with increased susceptibility to certain cancers such as lung, prostate, and colorectal. Though GSTs are generally well characterized, and many other GST structures can be found on the PDB, this appears to be the first structure of GST M2. Additionally, this GST is complexed with an inhibitor and glutathione, giving further insight into how GST M2 interacts with them. The article for this structure has yet to be published, but for now you can look at the structure by searching up its PDB accession code, 5HWL.


  3. Cryo-EM structure of human insulin degrading enzyme was released by Liang et al. The resolution is 6.5 Å and the asymmetric unit contains two chains. Insulin degrading enzyme, as the name suggests, is involved in the degradation of insulin in the cell, as well as IAPP, glucagon, bradykinin, kallidin and other polypeptides, which allows insulin degrading enzyme to affect certain intercellular signalling pathways. Insulin degrading enzyme also breaks down amyloid formed by APP and IAPP, which might have implications in neurology. An article has yet to be published, but for now you can see it on the PDB, with accession code 6B7Y.


  4. Human Apo-TRPML3 channel at pH 4.8 was released by Zhou, Li, Su et al. The technique used was cryo-electron microscopy and the resolution is 4.65 Å. TRPML3 channels are mostly found on the endolysosomes and are critical to the endocytic pathway and consequently cell signalling. This apo (unbound) structure at pH 4.8 is very different from the apo structure at the physiological pH of 7.4, and the lower pH appears to inhibit channel activity. Malfunctioning TRPML3 channels can cause deafness and interfere with proper pigmentation in mice, but do not appear to correlate with any human diseases yet. However, malfunction of the closely related TRPML1 channels in humans causes a severe lysosomal storage disease, mucolipidosis type IV. The paper was published November 6 online in Nature Structural & Molecular Biology, and can be found here: Its PDB accession code is 6AYG.


This just scratches the surface of this week’s releases on the PDB. If you are interested in seeing what the other 129 structures are, you can find them at Happy viewing, and come back next week for more structure releases!

*Solution NMR is a technique where the protein is suspended in a buffered solution and multidimensional nuclear magnetic resonance is applied to the solution. Isotope-labelled proteins (typically labelled with stable isotopes such as 13C and 15N, not radioisotopes) are also used during the process. Through resonance assignment and by creating different restraints for parameters such as distance and bond angles, a 3D structure can eventually be generated. Solution NMR doesn’t require a crystal to collect data, unlike x-ray crystallography, and also has the added benefit of being able to provide information on the dynamics of the protein, but is limited to small-sized proteins.


This Week on the PDB: An Introduction

Protein functions are largely dependent on their 3D structure, but where can we find these structures, and more importantly, manipulate them? The answer: The Protein Data Bank (PDB).

The original PDB was established in 1971, and contained a total of seven structures. Since then, it has grown to contain more than 133759 structures. Though the name suggests that only proteins are covered, the scope of the PDB is much wider than just polypeptides. Ligand-bound proteins, nucleic acid-bound proteins, viruses, ribosomes, and just about any other kind of macromolecule can be found on the PDB.

The PDB has a built-in 3D structure viewer accessible through a browser. Using the 3D viewer, you can spin the structure in any direction to get a full 360° view, zoom in, and zoom out. You can also view structures in different styles such as cartoon or ball-and-stick, and you can see the plot of the surface as well. Other features include the amino acid sequence of the protein and seeing how specific residues interact with ligands.


An alignment made in PyMol, with sequence view turned on.

The PDB’s built-in viewer has all the features needed for a cursory glimpse of the structure, but lacks features required to do in-depth structural analyses. As such, most people tend to download the .pdb file of the structure and view it in another program. The PDB is so ubiquitous in structural biology that structural analysis programs such as PyMOL can fetch PDB files just using the 4-character alphanumeric code associated with a given structure, called the PDB accession code.
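As a rough illustration of how little a program needs in order to find a structure, here is a minimal Python sketch that checks whether a string looks like a PDB accession code and builds a download link from it. Two things here are assumptions rather than anything stated in this post: the pattern (a digit from 1 to 9 followed by three letters or digits, the classic accession code format) and the URL layout of the RCSB file server. The function names are hypothetical. In PyMOL itself, typing `fetch` followed by the accession code should do the equivalent job.

```python
import re

# Assumed classic PDB accession code format: one digit (1-9) followed by
# three letters or digits, e.g. 5IRE or 6EOJ.
PDB_ID_PATTERN = re.compile(r"[1-9][A-Za-z0-9]{3}")


def is_pdb_id(code: str) -> bool:
    """Return True if the string looks like a 4-character PDB accession code."""
    return PDB_ID_PATTERN.fullmatch(code) is not None


def pdb_file_url(code: str) -> str:
    """Build a download link for a .pdb file (URL layout is an assumption)."""
    if not is_pdb_id(code):
        raise ValueError(f"not a valid PDB accession code: {code!r}")
    return f"https://files.rcsb.org/download/{code.upper()}.pdb"


# Accession codes that appear in these posts, plus one invalid example.
for candidate in ["5IRE", "6eoj", "lysozyme"]:
    if is_pdb_id(candidate):
        print(candidate, "->", pdb_file_url(candidate))
    else:
        print(candidate, "is not an accession code")
```

The sketch only builds the link and does not download anything, so it can be run offline.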

Most of the structures on the PDB are obtained through x-ray diffraction, but other techniques, such as electron microscopy, can be used to solve for the structures. There is also a database specifically for structures solved from electron microscopy called the EMDataBank, which is easily accessible from the PDB. The reverse also applies.

Though the PDB contains an immense database of protein and other macromolecular structures, there is still an extraordinary number of macromolecules that have yet to be catalogued. Consequently, the PDB updates once a week, adding newly solved, never-before-seen structures.

The main page of the Protein Data Bank.

Now that you have been primed on what the PDB is and how it can be used, I would like to introduce a new column on The Abstract: “This Week on the PDB”! Every week we will present to you the new structures in the PDB, and a short summary of their purpose. Together, let us explore the cutting-edge science of structural biology.

Check out the PDB here: