Staying up to date and organised with scientific literature

By Santiago Rompani

Biological research has never been more vibrant and interesting, but the flipside is an ever-increasing number of studies to keep track of. Gone are the days when the ponderous intellectual could leaf through the latest journals while twirling a goblet of brandy or smouldering pipe. Therefore, I had to come up with other ways to keep up with the scientific literature.

The first is to spend a few minutes learning how PubMed searches work (watch a tutorial here). Then register for “My NCBI”, which is free and allows you to set up automated PubMed searches and have the website email you any new results every day. I have searches set up for specific scientists I wish to follow, like “gross cornelius[Author]” (use without the quotation marks). I also have more complex searches for keywords in particular journals, such as “(Lateral Geniculate) AND (“Nature”[Journal] OR “Science (New York, N.Y.)”[Journal] OR “elife”[Journal])” (the list of journals in my actual search is very long; again, don’t include the quotation marks, but keep the square brackets).
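If you ever want to assemble such search strings programmatically rather than typing them, here is a minimal Python sketch. The helper names (author_query, keyword_in_journals, esearch_url) are my own inventions for illustration, not part of any PubMed tool. The last helper only builds a URL for NCBI's public E-utilities esearch endpoint, restricted to records added in the last few days; it does not replace the My NCBI email alerts described above, and nothing here contacts the network.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def author_query(name):
    """Build an author search such as: gross cornelius[Author]"""
    return f"{name}[Author]"


def keyword_in_journals(keyword, journals):
    """Build a keyword search restricted to a list of journals."""
    journal_clause = " OR ".join(f'"{journal}"[Journal]' for journal in journals)
    return f"({keyword}) AND ({journal_clause})"


def esearch_url(term, days=1, retmax=20):
    """Return an esearch URL for records added to PubMed in the last `days` days."""
    params = {
        "db": "pubmed",      # search the PubMed database
        "term": term,        # the query string built above
        "reldate": days,     # only the last N days...
        "datetype": "edat",  # ...measured by the date the record entered PubMed
        "retmax": retmax,    # maximum number of IDs to return
        "retmode": "json",
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    journals = ["Nature", "Science (New York, N.Y.)", "elife"]
    print(author_query("gross cornelius"))
    print(keyword_in_journals("Lateral Geniculate", journals))
    print(esearch_url(author_query("gross cornelius")))
```

The point of keeping the journal list in one place is that the real list gets very long, and editing a Python list is less error-prone than editing one enormous bracketed string.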

Not sure what bizarro-world logic is used to populate such results, because sometimes they suggest the weirdest stuff, but it works well enough as a starting point.

But how to find papers relevant to your work? This is in fact harder than it sounds, and PubMed mining can be very labor-intensive, boring, and prone to missing papers that did not quite describe what they studied in the way you searched for it. Because what kind of scientist wants to keep to the commonly accepted standard nomenclature in the field? My way to solve this is what I call the “Google Scholar, Web of Science, PubMed” circle of life, or GWP, because it’s fun to come up with pointless new terms. First, you go to Google Scholar and type in “lateral geniculate circuitry” because there is nothing more interesting in my completely unbiased and impartial opinion. You look at the first page of results, skim the abstracts, and pick a paper that seems important and seminal, the older the better. Do not pick reviews, since they are often cited in the first few years after being published, but less and less as the field progresses and new reviews get written. Also, you often already have a starting point, because people just hand you a couple of the seminal papers in the field.

Then you take that paper’s title and search for it in Web of Science. On the results screen, to the right of the paper you put in, you will see the number of times that paper was cited, and if you click there, it will list the papers that cited this seminal paper. Boom, now you have a curated list of papers very relevant to that first one. However, you will likely still miss papers. So, when you find even more relevant papers in this search, you need to not only plug them back into Web of Science, but also look at the papers they reference (the PubMed part of GWP). Occasionally you will stumble upon a term or concept related to your work, prompting another round of Google Scholar searching to unearth the granddaddies of the lot, and the circle of GWPing starts anew.
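For the programmatically inclined, the loop above is really just a walk over the citation graph in both directions. Below is a minimal Python sketch under loud assumptions: the two dictionaries are toy stand-ins for what you would look up by hand in Web of Science ("cited by") and in each paper's reference list, the paper labels are hypothetical, and is_relevant stands in for you skimming an abstract. It is an illustration of the idea, not a tool that talks to any of these services.

```python
from collections import deque


def snowball(seed, cited_by, references, is_relevant, max_rounds=2):
    """Collect relevant papers by following citations both ways from a seed.

    cited_by:    dict mapping a paper to the papers that cite it
    references:  dict mapping a paper to the papers it cites
    is_relevant: function deciding whether a paper is worth keeping
    max_rounds:  how many citation hops away from the seed to explore
    """
    found = {seed}
    frontier = deque([(seed, 0)])
    while frontier:
        paper, depth = frontier.popleft()
        if depth >= max_rounds:
            continue  # far enough from the seed; do not expand further
        neighbours = cited_by.get(paper, []) + references.get(paper, [])
        for other in neighbours:
            if other not in found and is_relevant(other):
                found.add(other)
                frontier.append((other, depth + 1))
    return found


# Toy data with hypothetical paper labels, purely for illustration.
cited_by = {"seminal": ["A", "B", "offtopic"], "A": ["C"]}
references = {"A": ["seminal", "older"], "B": []}

if __name__ == "__main__":
    papers = snowball("seminal", cited_by, references,
                      is_relevant=lambda p: p != "offtopic")
    print(sorted(papers))
```

The max_rounds cap matters in practice: without some limit, a few hops from any seminal paper will happily pull in half the field.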

Why not just dump keywords into PubMed, you ask? Well, if you just put “lateral geniculate circuitry” into PubMed, you only get 170 results, compared to that seminal paper’s 467 results from Web of Science, many of which are very relevant but didn’t appear in the 170 because they didn’t use the precise keywords you searched for. Of course, PubMed reduces the problem a bit by allowing searches using MeSH terms, which are the terms the US National Library of Medicine uses to index papers, a labor-intensive manual curation that is a heroic investment in organising the scientific literature. However, in my experience, even a very adept use of MeSH doesn’t eliminate the problem completely.

Ok, now you have a couple of dozen papers saved randomly in different folders on your desktop and the file name situation is completely out of hand. How do you organise this mess? I personally use Zotero; others might use Mendeley, EndNote, or Papers. I can just drag a pdf into the program and it renames the pdf to a standard filename, and I can move it around between different collections. Zotero also has a useful browser plugin: you can have it save any page a paper is on, and it will automatically download that pdf (this actually saves the metadata better than loading a pdf). Well, the saving from a webpage works most of the time, but it is often broken in the big fancy journals, because god forbid they should make the pdfs easy to download even after your institute has paid for access. Zotero also has a plug-in for Word, which makes it easy to insert and manage references in any paper or application (explained here). Zotero is free, but you can pay to save more of the files in a cloud, which I do, because it’s pretty cheap as far as these things go.

My notes from one paper, with more on that page. To be honest, there are probably more elegant systems around.

So you have a veritable library of Alexandria in your favourite paper-tracking receptacle, now what? You could closely read each one in such a way that your genius brain remembers each and every one, Sherlock Holmes-style. Alternatively, I would suggest some form of database to store your notes and thoughts on the papers you read (and everything else, really). I’ve recently been using OneNote, mostly because it comes free with Microsoft Office. I like that it lets you easily take snapshots of anything on your desktop, but very much don’t like that it doesn’t let you group objects like most other Microsoft products do.

My favourite thing about this account is that they have 7,000 followers and follow nobody. Also, they have not once tweeted about anything other than their papers: no pictures of cats, no off-topic banter, just the title of each paper and a link to it. Feels almost too productive for Twitter.

But we are living in the shiny future of the 21st century, and all of the options above seem so 1995. So there is also Twitter. I set up a Twitter account because somebody told me I should, yet its value has been rather mixed, except for one shining beacon of incredible usefulness: the bioRxiv feed. For each manuscript uploaded, they tweet the title and a link to the article: the main bioRxiv account (@biorxivpreprint) covers all of biology, and the bioRxiv neuroscience account (@biorxiv_neursci) covers just the neuro papers. I find it very easy to scroll through the manuscripts on my commute every day and quickly pick out those worth reading in more detail. I’ve caught some true gems while browsing this account’s posts. I highly recommend checking this pipeline out, even if it means having to join Twitter to do so. There are also new options for paper curation and discovery that use machine learning to search for articles, such as PubChase, which I imagine will become much more powerful in the future, but right now they still share many of the shortcomings of more traditional programs.
