This document aims to help readers avoid some pitfalls in presenting data and in creating scientific articles and slides. The intended audience is undergraduate, graduate, and PhD students, and possibly also working scientists, in Computer Science and Bioinformatics. It is based partly on [Sanders, 1999] and partly on what we consider major mistakes often made in slides and articles.
All of the points presented here will revolve around the following fundamental principle:
The main aim is to transport information in a concise way.
In theses, it is okay to use numeric style, e.g. [1]. For slides, citing the authors is better.
One often sees the result of the following "algorithm": Give the author names and the year of publication. For one author, give their name. For two authors, give both names, separated by a comma. For more authors, give the first author's name followed by "et al." Put the result in (square) brackets and maybe use a different color.
Examples are [Smith, 2001], [Doe, Smith, 2001], [Smith et al. 2001].
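The citation "algorithm" above can be sketched in a few lines of Python. This is only an illustration: the function name `citation_label` and its parameters are hypothetical, and the punctuation simply follows the three examples given above.

```python
def citation_label(authors, year):
    """Build an author-year label such as "[Smith, 2001]".

    `authors` is a list of family names in publication order.
    The name and signature are assumptions made for this sketch.
    """
    if not authors:
        raise ValueError("at least one author is required")
    if len(authors) == 1:
        # One author: just the name.
        return f"[{authors[0]}, {year}]"
    if len(authors) == 2:
        # Two authors: both names, separated by a comma.
        return f"[{authors[0]}, {authors[1]}, {year}]"
    # More authors: first author followed by "et al."
    return f"[{authors[0]} et al. {year}]"


print(citation_label(["Smith"], 2001))                # [Smith, 2001]
print(citation_label(["Doe", "Smith"], 2001))         # [Doe, Smith, 2001]
print(citation_label(["Smith", "Doe", "Roe"], 2001))  # [Smith et al. 2001]
```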
At the end of the presentation, there should be a "Literature" slide. You might show the slide to the audience only very briefly or not at all, but it is good to have as a reference or when you make your slides available for download.
Giving comprehensive advice on how to prepare a good presentation with good slides is out of the scope of this paper. Still, we want to give some landmark rules that apply to most cases. Remember, rules are there to make you think before you break them.
And again: Your aim is to transfer your central points as clearly and tersely as possible.
If something in your presentation does not support these core points, ask yourself whether it needs to be there at all. Note that this is a very soft rule. You might leave informative or "funny" facts in, for example.
Note that it is also a useful trick when attending talks: Remember and focus on the takeaways. Structure your memory and understanding around these takeaways.
Generally, a slide should not have more than a handful of bullet points and should rarely have more than 50 words. In most cases, do not put more than two graphics on one slide, and then only if you are making a comparison.
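If your slides exist as plain text, such a rule of thumb can be checked mechanically. The following is a minimal sketch; the function `slide_warnings` is hypothetical, and reading "a handful" as five bullet points is our assumption, not a fixed rule.

```python
def slide_warnings(bullets, max_bullets=5, max_words=50):
    """Return warnings for a slide given as a list of bullet strings.

    The default limits are assumptions: "a handful" is read as 5 bullets.
    """
    warnings = []
    word_count = sum(len(bullet.split()) for bullet in bullets)
    if len(bullets) > max_bullets:
        warnings.append(f"too many bullet points: {len(bullets)} > {max_bullets}")
    if word_count > max_words:
        warnings.append(f"too many words: {word_count} > {max_words}")
    return warnings


slide = ["Tables suit small data sets", "Graphs suit large data sets"]
print(slide_warnings(slide))  # [] since the slide is within both limits
```

Treat the result as a prompt to reconsider the slide, not as a hard constraint.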
First, too many gimmicks and too much playfulness can distract your audience. Second, things like gradients can also interfere with clear recognition. You might also give a thought to what happens if you print out your slides. Unnecessary gradients or images in the background waste ink and, if the slides are scaled down, might make them unreadable.
It is generally okay to be playful on the first and last slides, which give the title of the talk and say "thank you for your interest" respectively.
Use the same fonts and colors for the same things everywhere.
MS Sans Serif looks very informal and has not been designed to be easy to read. Your computer ships with fonts that are easy to read and that are probably the default for your slide template anyway.
Graphics become easier to read once one has understood the style. Use the same fonts and axis style in all your graphics if possible. For example, if one graphic has a box around it (matching with the axes) then all graphics should have one.
Italic and bold text are good means to highlight text in printed work, where black and white is still dominant over color. On slides, however, color can be very useful if used properly.
Highlighting text can be very useful. Consider the example of giving two definitions on one slide that differ in only one or two words. When the difference is highlighted, the slide becomes easy to read. When the difference is not highlighted, the slide becomes very hard to read: Your audience has to read the two texts word by word at the same time to find all differences.
For formulas, consider the ratio between complexity and importance.
The arithmetic mean is very simple, so showing its formula is probably useless. The geometric mean is also very simple, but some people might like to be reminded of it. The normal distribution is pretty complex, and its formula is probably useless if you are not actually giving a talk on it.
If you give a complex formula on a slide then make sure you understand it and can explain it. If the formula is not so important, consider giving it a symbolic name and putting it in the appendix. When giving the talk, describe roughly what the formula describes and show it only when requested.
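For reference, the standard textbook definitions of the three examples are given below. The notation (values $x_1, \dots, x_n$, mean $\mu$, standard deviation $\sigma$) is our choice for this sketch; the difference in visual complexity between the first two and the third is the point.

```latex
% Arithmetic mean of x_1, ..., x_n
\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i

% Geometric mean of positive values x_1, ..., x_n
\left( \prod_{i=1}^{n} x_i \right)^{1/n}

% Density of the normal distribution with mean \mu and standard deviation \sigma
f(x) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left( -\frac{(x - \mu)^2}{2 \sigma^2} \right)
```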
If you decide on using formulas, make sure that they are not too painful to read:
Grids are non-data ink. As a corollary of Tufte's rule, they probably are not a good idea. If you use them, use lighter colors, e.g. gray.
Each axis should be labeled with an explanation such as "time", "input size" or more complex descriptions where it makes sense ([Sanders, 1999] gives some examples).
Give the unit for each axis if it makes sense to do so: "time [ns]", "time per operation [ns]", "input size", "input size / 1000" etc.
Use appropriate font sizes: Axis labels and tick marks are useless if they cannot be read! For slides, remember that while projectors are catching up resolution-wise, you are still limited to a lower resolution than on your monitor.
On the x axis, one often uses a logarithmic scale, e.g. when the input sizes are powers of ten. Since $10^9$ is 1'000'000'000, most plotting tools will abbreviate this to 1E09. It is more readable to actually write $10^9$ at the x axis than 1E09. You can do this with gnuplot, for example, using the "enhanced" option in the output driver.
On the y axis, one sometimes has very small or very large values in a relatively small range, e.g. 0.034 to 0.045. Many tools would plot this as 34E-03 to 45E-03, which is hard for the human mind to "parse". Additionally, a small change in the exponent, say between 34E-04 and 34E-03, has a large impact on the represented value, yet it is hard to pick this out of the "noise". In these cases, one could simply multiply the values by one thousand, yielding tick marks from 34 to 45. If another graph has a value of 0.0045, this becomes 4.5 and it is easier to see the difference. An axis label "running time [ms]" documents the "zoom".
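Both tricks, readable powers of ten and rescaled values, are easy to prepare before the data reaches the plotting tool. The sketch below uses only the Python standard library; the helper names `power_of_ten_label` and `seconds_to_milliseconds` are hypothetical, and the sample values are made up for illustration, assuming the running times were measured in seconds.

```python
import math


def power_of_ten_label(value):
    """Return a tick label like '$10^{9}$' for an exact power of ten."""
    exponent = round(math.log10(value))
    if 10 ** exponent != value:
        raise ValueError(f"{value} is not an exact power of ten")
    return f"$10^{{{exponent}}}$"


def seconds_to_milliseconds(values):
    """Scale running times from seconds to milliseconds.

    Rounding to 6 decimals hides floating-point noise in the tick values.
    """
    return [round(v * 1000, 6) for v in values]


print(power_of_ten_label(1_000_000_000))              # $10^{9}$
print(seconds_to_milliseconds([0.034, 0.045, 0.0045]))  # [34.0, 45.0, 4.5]
```

After rescaling, remember to change the axis label accordingly, e.g. to "running time [ms]".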
The range of an axis should cover all realized values and if possible a margin for aesthetics. If you have more than one graph, and the y values are to be compared then make sure that the scale is the same. Otherwise, the advantage of graphics is strongly reduced.
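One way to guarantee a common scale is to compute the axis range once, over all data series, and pass the same range to every plot. A minimal sketch, assuming each series is a non-empty list of numbers; the function name `shared_axis_range` and the default margin of 5% are our assumptions.

```python
def shared_axis_range(series_list, margin=0.05):
    """Return (low, high) covering all values in all series plus a margin.

    `margin` is the fraction of the data range added on each side.
    """
    lowest = min(min(series) for series in series_list)
    highest = max(max(series) for series in series_list)
    pad = (highest - lowest) * margin
    return lowest - pad, highest + pad


# Two result sets that will be shown in separate graphs.
run_a = [1, 3]
run_b = [2, 11]
print(shared_axis_range([run_a, run_b], margin=0.1))  # (0.0, 12.0)
```

Using the returned pair as the y range of both graphs keeps them directly comparable.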
Do not use pie charts. It is much easier for humans to compare lengths than angles or areas.
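To illustrate the point about lengths, here is a minimal sketch that renders shares as horizontal text bars instead of pie slices. The function name `text_bar_chart` and the sample data are made up for illustration, and all values are assumed to be positive.

```python
def text_bar_chart(shares, width=40):
    """Render a dict of label -> positive value as horizontal text bars.

    The largest value gets a bar of `width` characters; the others are
    scaled proportionally, so values can be compared by bar length.
    """
    largest = max(shares.values())
    lines = []
    for label, value in shares.items():
        bar = "#" * round(value / largest * width)
        lines.append(f"{label:<10} {bar} {value}")
    return "\n".join(lines)


print(text_bar_chart({"sorting": 50, "parsing": 25, "output": 10}, width=20))
```

In the printed chart it is immediately visible that the second bar is half as long as the first, a comparison that is much harder to make between two pie slices.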
The wording of [Sanders, 1998] on tables is concise and comprehensive:
Tables are easier to produce than graphs and perhaps this advantage causes that they are often overused. Tables are more difficult to interpret and too large for large data sets. [...] Nevertheless, tables have their place. Tufte [Tufte, 1983] gives the rule of thumb that "tables usually outperform a graph for small data sets of 20 numbers or less". Tables give very accurate values which make it easier to check whether some experiments can be reproduced. Furthermore, one sometimes wants to present some quantities, e.g. solution quality as a function of problem instances which cannot be meaningfully arranged on the axis of a graph. In that case, a graph or bar chart may look nicer but does not add utility compared to a more accurate way. Furthermore, there may be an appendix or a link to a web page containing larger tables for detailed documentation of the results.
If you can answer any question with "yes" then try to improve your work using this article or [Sanders, 1998].