Written by Tim Bushnell, PhD
“It would be possible to describe everything scientifically, but it would make no sense; it would be without meaning, as if you described a Beethoven symphony as a variation of wave pressure.”
― Albert Einstein
Any scientific process, as you know, requires communicating the data that supports or refutes the hypothesis being tested.
Before that data is deemed worthy of publication, it must survive peer review, where it is laid bare before a group of experts in the field who evaluate the material impartially (usually) and in secret, then pass judgment on its suitability for publication.
The presentation of your data must be clear.
As such, choosing the right flow figures to communicate your data is essential.
Good handwriting (formerly known as proper penmanship) and drawings might have been enough to convince peers in the distant past, but not today.
Today, the expectation is that you'll choose the right flow figures from all that are available, selecting the ones that reflect your data accurately and without confusion.
There is so much data and so little time that it is essential to present information in the clearest, most concise way.
As Einstein once said: “Everything must be made as simple as possible. But not simpler.”
The key is to present the data in the best possible format, highlighting your results while avoiding glitz that can cast doubt on the integrity of your data.
At first glance, flow cytometry data is very visual.
Analysis techniques rely on presentations using univariate (a.k.a. histograms), bivariate (a.k.a. dot plots) and even higher order plots (3D plots, SPADE trees, etc.).
The huge caveat to falling in love with any of these plot types is that the plots used for flow analysis are, more often than not, a means to an end.
Their purpose is to extract numeric values (such as percent positive or median fluorescence intensity) from the data, and those numbers are the real value to be presented.
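As a minimal sketch of what "extracting numeric values" means in practice, the snippet below computes a percent positive and a median fluorescence intensity for a single parameter. The function name `summarize_gate`, the event values and the threshold are all hypothetical and chosen only for illustration; real analysis software applies gates to compensated, transformed data.

```python
from statistics import median


def summarize_gate(intensities, threshold):
    """Return (percent positive, median intensity of the positive events)."""
    if not intensities:
        raise ValueError("no events to summarize")
    positive = [x for x in intensities if x > threshold]
    percent_positive = 100.0 * len(positive) / len(intensities)
    # The median is undefined when no event passes the threshold.
    mfi = median(positive) if positive else None
    return percent_positive, mfi


# Hypothetical fluorescence values for ten events and a made-up threshold.
events = [12, 15, 9, 480, 510, 22, 530, 18, 495, 11]
pct, mfi = summarize_gate(events, threshold=100)
print(f"{pct:.1f}% positive, median intensity {mfi}")
```

These two numbers, not the plot they came from, are what feed the downstream statistics.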
Here are the benefits and drawbacks of popular flow figures to consider when presenting your data:
1. Histograms.
Histograms tend to be the most abused of figures for presenting flow cytometry data.
These plots show the intensity of expression versus the number of events.
Typically, data from different conditions are overlaid on one graph, often with an offset, as below…
Histograms are useful for cell cycle and proliferation analysis, but are less useful for presenting data for several reasons:
- No relationship between different markers (can't identify double positive cells)
- Subtle populations lost in larger distribution (no rare events)
- Shape is dependent on binning (different for different instruments and analysis tools)
- Peak height is a function of the number of events and spread of the data
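The last two points can be illustrated with a small sketch. The `histogram` helper and the event values below are hypothetical; the snippet simply bins the same events two ways and then doubles the event count, to show how the apparent peak changes even though the underlying distribution does not.

```python
def histogram(values, n_bins, lo, hi):
    """Count events in n_bins equal-width bins spanning [lo, hi)."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        if lo <= v < hi:
            # min() guards against floating-point rounding at the top edge.
            counts[min(int((v - lo) / width), n_bins - 1)] += 1
    return counts


# Hypothetical intensity values for twelve events.
values = [1, 2, 2, 3, 3, 3, 4, 4, 5, 7, 7, 8]

coarse = histogram(values, n_bins=2, lo=0, hi=10)
fine = histogram(values, n_bins=5, lo=0, hi=10)
doubled = histogram(values * 2, n_bins=5, lo=0, hi=10)

print("2 bins:", coarse, "peak height", max(coarse))
print("5 bins:", fine, "peak height", max(fine))
print("5 bins, twice the events:", doubled, "peak height", max(doubled))
```

The same twelve events give a different peak height for each bin count, and acquiring twice as many events doubles the peak without changing the biology.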
2. Scatter Graphs.
The data that really matters is the set of numbers extracted from the flow plots. As such, scatter graphs should be seen as a way to summarize the real data.
The power of the scatter graph is that it shows several things at once:
- The number of the experiments that were performed in generating the data
- The average of the data
- The spread of the data
- The significance of the data
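The four quantities in this list can be computed directly from the extracted values. The sketch below uses hypothetical percent-positive values from four experiments per group. The article does not name a statistical test, so an exact permutation test on the difference of means is used here purely as a stand-in; the appropriate test depends on the experimental design.

```python
from itertools import combinations
from statistics import mean, stdev


def summarize(values):
    """Number of experiments, average and spread (sample SD)."""
    return {"n": len(values), "mean": mean(values), "sd": stdev(values)}


def permutation_p_value(a, b):
    """Exact two-sided permutation test on the difference of group means."""
    pooled = a + b
    total = sum(pooled)
    observed = abs(mean(a) - mean(b))
    extreme = 0
    count = 0
    # Enumerate every way of relabeling len(a) of the pooled values as group a.
    for idx in combinations(range(len(pooled)), len(a)):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / len(a) - (total - sum_a) / len(b))
        count += 1
        if diff >= observed - 1e-9:  # small tolerance for float rounding
            extreme += 1
    return extreme / count


# Hypothetical percent-positive values, one per independent experiment.
control = [4.1, 5.0, 4.6, 5.3]
treated = [9.8, 11.2, 10.5, 12.1]

control_stats = summarize(control)
treated_stats = summarize(treated)
p = permutation_p_value(control, treated)
print(control_stats, treated_stats, f"p = {p:.3f}")
```

With only four experiments per group, the smallest p-value this test can return is 2/70, which is a reminder that the number of experiments limits how strong a claim the figure can support.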
3. Bivariate plots.
Bivariate plots are useful for showing how the populations of interest were identified.
Bivariate plots show the relationship between two different markers, allowing more complex phenotypes to be identified and important populations of interest to be isolated via gating.
The original bivariate plot was the 'dot plot', a figure that showed the relationship between two variables but gave no sense of how many events fell in a given region.
4. Density Plots.
The dot plot led to the development of the 'density' plot ― a way to show not just expression levels, but the relative number (i.e. density) of events in a given region.
Three such density plots are shown below (generated in FlowJo v9)…
Each of these plots shows the same thing, just in a slightly different way, so pick the one you are most comfortable with and use it.
5. Contour Plots.
The other way to show the density of your data is to use a contour plot. Like the above density plots, these show the relative density of the data, in this case using contour lines. Each line encloses a fixed percentage of the events (x%, as defined by the plot settings).
In the plot below, the lines are drawn at 5% intervals of the population, so the outermost line contains 95% of the cells, the second line 90% and so on.
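The arithmetic behind those contour levels is simple enough to sketch. The helper below is hypothetical and follows the description in the text (each line encloses a fixed step less than the one outside it); actual analysis packages may define their contour levels differently.

```python
def contour_fractions(step_percent):
    """Percent of events enclosed by each contour line, outermost first."""
    if not 0 < step_percent < 100:
        raise ValueError("step_percent must be between 0 and 100")
    levels = []
    enclosed = 100 - step_percent
    while enclosed > 0:
        levels.append(enclosed)
        enclosed -= step_percent
    return levels


levels = contour_fractions(5)
print(levels[:3], "...", levels[-1])
print("events outside the outermost line:", 100 - levels[0], "%")
```

Under this description, the events outside the outermost line (5% of the population here) are not drawn at all.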
The closer the lines are together, the steeper the 'island' of cells. Unfortunately, contour plots are not good at showing the outliers. The best strategy here is to couple a contour plot with a dot plot, allowing your rare events to be displayed (shown below in the plot on the right).
One concern reviewers may raise about contour plots, and one that can keep your data from being published, is that they give no sense of how many events are on the plot. This is a common criticism of all bivariate plots.
As shown in this figure, only a few points can make a seemingly compelling plot…
The solution to this problem is to indicate the number of events on a given plot. This will give reviewers and all readers an indication of the magnitude of the data involved in the analysis.
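One way to do this is to put the event count next to every percentage reported on a plot. The `gate_label` helper and the counts below are hypothetical, a minimal sketch of the kind of annotation text that could accompany a gate.

```python
def gate_label(gated_events, total_events):
    """Build an annotation that reports a percentage with its event counts."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    pct = 100.0 * gated_events / total_events
    return f"{pct:.1f}% (n = {gated_events:,} of {total_events:,} events)"


# Two hypothetical plots with the same percentage but very different depth.
sparse_label = gate_label(3, 12)
dense_label = gate_label(25000, 100000)
print(sparse_label)
print(dense_label)
```

Both gates read 25.0%, but only the annotation reveals that one of them rests on three events.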
6. Gating Strategy (All Plots).
The gating strategy used is of great interest to the reader of a paper or grant. It is also a common criticism of flow cytometry data in general. Why? Because…
Gating is a subjective art form.
At least, gating can be a subjective art form. In a Nature Immunology paper, Maecker and other researchers performed a series of studies concluding that…
In other words…
Since the conclusions of a study rest on the populations of interest as defined by the gating strategy, applying that strategy consistently and communicating how it was established are both critical.
An excellent example of this can be seen in any of the published OMIPs, such as OMIP-3 by Wei et al. (see below).
The above presentation of the gating strategy is valuable for dispelling the myth that gating is a subjective art form.
As new automated analytical techniques become more widespread, they will also help in addressing this issue while adding a level of confidence that the data extracted for downstream statistical analysis has come from a robust, vetted process.
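A gating strategy can also be communicated numerically, by reporting how many events survive each gate in the hierarchy. The sketch below assumes a simple sequential hierarchy; the marker names, thresholds and events are hypothetical and are not taken from any published panel.

```python
def apply_gates(events, gates):
    """Apply gates in order; report count and percent of parent at each step."""
    report = [("All events", len(events), 100.0)]
    current = events
    for name, keep in gates:
        parent_count = len(current)
        current = [e for e in current if keep(e)]
        pct = 100.0 * len(current) / parent_count if parent_count else 0.0
        report.append((name, len(current), pct))
    return report


# Hypothetical events and thresholds, for illustration only.
events = [
    {"fsc": 80, "ssc": 40, "cd3": 900, "cd4": 700},
    {"fsc": 85, "ssc": 45, "cd3": 950, "cd4": 20},
    {"fsc": 10, "ssc": 5, "cd3": 15, "cd4": 10},  # low scatter, e.g. debris
    {"fsc": 90, "ssc": 50, "cd3": 12, "cd4": 8},
    {"fsc": 75, "ssc": 42, "cd3": 870, "cd4": 650},
]
gates = [
    ("Cells (FSC > 50)", lambda e: e["fsc"] > 50),
    ("CD3+ (CD3 > 100)", lambda e: e["cd3"] > 100),
    ("CD4+ (CD4 > 100)", lambda e: e["cd4"] > 100),
]

gating_report = apply_gates(events, gates)
for name, count, pct in gating_report:
    print(f"{name}: {count} events ({pct:.1f}% of parent)")
```

Writing the thresholds into the gate names keeps the report self-describing, so a reader can see exactly where each boundary was drawn.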
When preparing figures for publication, the scientific question and hypothesis that form the basis of the paper must be central, and all the figures must support them. The flow cytometry data that forms the basis of the conclusions should be presented clearly and concisely. While flow cytometry provides pretty pictures and colorful layouts, the meat of the data is the numbers: percentages of populations, fluorescence intensity levels and the like. These are what will convince the reader that the hypothesis tested is valid and well thought-out.