When to use log scales
Log scales seem to have caused a lot of debate recently. When should we use them?
I think there are two reasons:
- Your data is exponentially distributed.
- You want to transform your data to make certain regions easier to see.
Exponentially distributed data
The first is a bit of a tautology. For independent variables (usually the x-axis) it’s easy. Perhaps you have collected data at points 1, 2, 4, 8, 16 etc. It makes sense to plot this on a log2 scale (0, 1, 2, 3, 4). Collected at 0.1, 1, 10, 100 etc? Plot on a log10 scale (-1, 0, 1, 2). Examples are minimum inhibitory concentrations (MICs), which are measured directly in log steps, and parameter searches, which often cover large ranges.
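As a minimal sketch (standard library only; the variable names are just illustrative), this computes the log positions for the two sampling schemes mentioned above:

```python
import math

# Sampling points from the text: doubling steps and tenfold steps.
doubling = [1, 2, 4, 8, 16]
tenfold = [0.1, 1, 10, 100]

# On a log scale, each multiplicative step becomes one unit on the axis.
log2_positions = [math.log2(x) for x in doubling]
log10_positions = [round(math.log10(x), 6) for x in tenfold]

print(log2_positions)   # [0.0, 1.0, 2.0, 3.0, 4.0]
print(log10_positions)  # [-1.0, 0.0, 1.0, 2.0]
```

Each multiplication by the base moves a point exactly one unit along the logged axis, which is why the points end up evenly spaced.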
But what about dependent variables? Anything that’s a geometric progression (multiplied by a constant factor at each step) will be exponential, and probably needs to be logged to see its range properly.
An example might be cases in an epidemic (multiplied, roughly, by R each time interval).
Often when you do this you’ll see a nice linear relationship emerge, which is easier to analyse (visually and analytically).
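To see why the relationship becomes linear, here is a small sketch with made-up numbers: the growth factor `R` and the starting count `initial` are assumptions chosen for illustration, not real epidemic data. Raw step sizes keep growing, while the steps between logged values are constant.

```python
import math

R = 2.5       # assumed growth factor per time interval (illustrative)
initial = 4   # assumed starting number of cases (illustrative)
cases = [initial * R**t for t in range(8)]

# Differences between consecutive raw values keep getting bigger...
raw_steps = [b - a for a, b in zip(cases, cases[1:])]

# ...but differences between consecutive logged values are all log10(R),
# i.e. the logged series is a straight line in time.
log_cases = [math.log10(c) for c in cases]
log_steps = [b - a for a, b in zip(log_cases, log_cases[1:])]

print([round(s, 1) for s in raw_steps])
print([round(s, 4) for s in log_steps])
```

The constant step on the logged series is the slope of the straight line, and it equals log10(R), so the growth factor can be read straight off the plot.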
In the top row, the independent (x) variable has been collected in logs, and should be plotted as such so the linear relationship is clear. This also spaces out points on the x-axis better. Similarly in the bottom plot, the growth of a dependent variable which rises along an exponential curve is better seen when logged. This also spaces points out on the y-axis better. It is usually clearer to label the scales with the untransformed values.
Transforming for visualisations
Another reason you may want to log a scale is more purely for visualisation purposes – nothing mechanistically to do with the variables being plotted or how they were collected.
Broadly, logging a variable expands the range of the smaller values and shrinks the range of the larger values.
This is very useful when looking across a large range, or when your values are bunched up at the lower end!
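A rough way to see this numerically: the sketch below uses made-up values and a hypothetical helper, `axis_fraction`, to work out where each point lands along the axis (0 = left edge, 1 = right edge) with and without logging.

```python
import math

values = [1, 2, 5, 10, 100, 1000]  # made-up data, bunched at the low end

def axis_fraction(x, lo, hi, transform=lambda v: v):
    """Position of x along an axis running from lo to hi, as a fraction 0..1."""
    return (transform(x) - transform(lo)) / (transform(hi) - transform(lo))

lo, hi = min(values), max(values)
linear = [round(axis_fraction(v, lo, hi), 3) for v in values]
logged = [round(axis_fraction(v, lo, hi, math.log10), 3) for v in values]

print(linear)  # the first four points are crammed into under 1% of the axis
print(logged)  # the same four points now take up a third of the axis
```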
Really this has nothing directly to do with logs/exponentials; log is just a nice function whose shape has this useful property. In fact, you are often free to use any transformation you want. If you want one that shrinks/grows less than log(x), use sqrt(x). If you want one that does it even more than log(x), use 1/x (bearing in mind that 1/x also reverses the order of the values).
In this case we transform the x variable so that the data fills the plot evenly and uses the space as well as possible. Each successive transform spreads out the small values and squeezes the large values more strongly. Here the data was actually collected at log-spaced points, so the log transform gives perfectly even spacing.
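The relative strength of these transforms can be checked with a short sketch. The data and the helper `gap_fractions` are made up for illustration; it reports what share of the axis each gap between neighbouring points takes up under each transform.

```python
import math

x = [1, 10, 100, 1000, 10000]  # illustrative log-spaced data

transforms = {
    "identity": lambda v: v,
    "sqrt": math.sqrt,
    "log10": math.log10,
    "reciprocal": lambda v: 1 / v,  # note: reverses the order of the points
}

def gap_fractions(values, f):
    """Share of the axis taken by each gap between neighbouring points."""
    t = [f(v) for v in values]
    span = abs(t[-1] - t[0])
    return [abs(b - a) / span for a, b in zip(t, t[1:])]

results = {name: gap_fractions(x, f) for name, f in transforms.items()}
for name, gaps in results.items():
    print(name, [round(g, 3) for g in gaps])
```

With no transform, almost all of the axis goes to the gap between the two largest values. sqrt reduces this, log10 makes every gap equal, and 1/x overshoots so that the gap between the two smallest values dominates instead.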
More intuition
To get a sense of what logging looks like, try using log graph paper.
Also very handy: in base 10, multiplying by 3 is a log change of about 0.5, as log10(3) ≈ 0.48. So if you had a progression 10, 30, 100, 300 etc, these are about equally spaced on a log scale.
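A quick numerical check of this rule of thumb, using only the standard library:

```python
import math

progression = [10, 30, 100, 300, 1000]
positions = [math.log10(v) for v in progression]

# Spacing between neighbouring points on a log10 axis.
steps = [b - a for a, b in zip(positions, positions[1:])]

print(round(math.log10(3), 3))       # 0.477
print([round(s, 3) for s in steps])  # [0.477, 0.523, 0.477, 0.523]
```

The steps alternate slightly either side of 0.5, which is close enough that 1, 3, 10, 30, ... makes a convenient set of roughly evenly spaced tick marks.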
Log-log scales
Log-log scales can be used to analyse power laws (a relationship y = a·x^k becomes a straight line with slope k when both axes are logged), or when both variables are exponentially distributed or need to be spread out better.
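For the power-law case, here is a minimal sketch. The parameters `a` and `k` are assumed values picked for illustration. It shows that the slope between any two neighbouring points on log-log axes is the exponent.

```python
import math

a, k = 3.0, 1.5  # assumed power-law parameters (illustrative)
xs = [1, 2, 5, 10, 50, 100]
ys = [a * x**k for x in xs]

# Log both variables, as a log-log plot would.
points = [(math.log10(x), math.log10(y)) for x, y in zip(xs, ys)]

# Slope between each pair of neighbouring points on the log-log axes.
slopes = [(y2 - y1) / (x2 - x1)
          for (x1, y1), (x2, y2) in zip(points, points[1:])]

# At x = 1 (log10(x) = 0) the logged y value is log10(a), the intercept.
intercept = points[0][1]

print([round(s, 6) for s in slopes])
print(round(10**intercept, 6))
```

So on a log-log plot the exponent k can be read off as the slope of the line, and a from where the line crosses x = 1.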