Sunday, August 26, 2018
Drug crazed mapping
I had told myself I wasn't going to bite when @Amazing_maps screamed once more for my attention. But the more I tried to ignore it, the more it reeled me in, so eventually I thought it worth a few comments.
Here's the so-called amazing map:
I've no idea who made it. It doesn't really matter. What I feel matters is the impact maps like this have on those who view them. This is more about the consumption of maps but, of course, their design and construction go a long way to underpinning the message people take away.
Quick look and take away: Holy drug barons Batman...San Bernardino is full of crack-heads! So are a few smaller areas I don't even know...but they're really small so they can't be as important eh? Right, must be time for Alaska State Troopers, turn on the TV...
That's how a lot of people will look at this map. Message delivered. Warped view of reality perpetuated. Job done. Wait for the next Amazing Map.
Here's the longer look and take aways I formulated...
Hmm. Something's not quite right with this map. Let's talk it through. It's a choropleth. We can assume from the title...well, the line that doubles as the legend title, what the subject matter is. It's about the labs, not the population, so it's about production, not consumption. And the colour scheme goes from light to dark so we see where there are more meth labs and where there are fewer. I'll not repeat myself like a cracked record about it being totals (it is) and not being normalised (it isn't). Suffice to say it needs to have the data transformed into per capita or something equally sensible to allow us to compare like for like. Though that's critical for a choropleth, let's ignore it for the purposes of this because there are other take aways in this map.
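For anyone wondering what "transformed into per capita" looks like in practice, here's a minimal sketch. The county names and figures are invented purely for illustration (I don't have the real county-level data), and `rate_per_100k` is just a name I've made up for the helper.

```python
def rate_per_100k(count, population):
    """Express a raw count as a rate per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    # Multiply before dividing so tidy inputs give tidy outputs.
    return count * 100_000 / population


# Invented figures for illustration only; not real lab counts or populations.
counties = {
    "Large County": {"labs": 300, "population": 2_000_000},
    "Small County": {"labs": 30, "population": 50_000},
}

rates = {
    name: rate_per_100k(c["labs"], c["population"])
    for name, c in counties.items()
}

for name, rate in rates.items():
    print(f"{name}: {rate:.1f} labs per 100,000 people")
```

With these made-up numbers the county with ten times as many labs ends up with a quarter of the rate, which is exactly the sort of reversal a map of totals hides.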
Look at San Bernardino County again...jeez, it's heaving with meth labs.
This makes me a little more interested (perhaps concerned) as it's where I live. Notwithstanding it's totals, look at that large, expansive area filled with loads of meth labs. How many?...there's about...errr, well, let me look at the legend. Hmm. It's dark blue. Does that make it 300, 500, 1000 meth labs?
It's impossible to tell without doing some assessment of the actual RGB values. It's actually closest to the RGB value about 1/3 along the legend colour ramp, which would make it about 330ish...though there are no RGB values in the legend that match those found in San Bernardino County so it's impossible to be certain. And why am I having to do an RGB analysis of a legend anyway? It shouts out from the map yet is nearer the lower end of the legend. That doesn't seem right.
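The sort of RGB detective work I'm grumbling about can be sketched roughly as follows. Everything here is an assumption for illustration: the legend samples and the county colour are invented RGB triples (not values read from the actual map), the legend is assumed to run from 0 to 1000, and position along the legend is assumed to map linearly to value.

```python
def nearest_legend_position(target, legend_samples):
    """Return the fraction (0..1) along the legend whose colour is closest to target.

    legend_samples is a list of RGB tuples sampled at equal steps, left to right.
    """
    def dist2(a, b):
        # Squared Euclidean distance in RGB space (crude, but fine for a sketch).
        return sum((x - y) ** 2 for x, y in zip(a, b))

    best = min(range(len(legend_samples)),
               key=lambda i: dist2(target, legend_samples[i]))
    return best / (len(legend_samples) - 1)


# Hypothetical colours sampled along a legend, light to dark.
legend_samples = [
    (240, 248, 255), (120, 160, 220), (40, 70, 160), (20, 40, 120),
    (10, 25, 90), (5, 15, 70), (0, 10, 60),
]
county_rgb = (38, 68, 158)          # hypothetical colour picked from the map
legend_min, legend_max = 0, 1000    # assumed legend range

position = nearest_legend_position(county_rgb, legend_samples)
estimate = legend_min + position * (legend_max - legend_min)
print(f"about {position:.2f} along the legend, so roughly {estimate:.0f}")
```

Note that the nearest sample isn't an exact match, so the estimate is only ever a best guess, and nobody reading a map should need to do this.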
So San Bernardino leaps out because (1) it's the largest county in the US by area, (2) it has a lot of meth labs (though possibly not per capita or in relation to counties with many more) and (3) it's dark blue and that means more, except there's virtually no differentiation between the blue used at 330 and that for 1000. All the variation in colour value is at the lower end.
The map uses an unclassified choropleth approach. That means every data value is given its own position along the chosen colour ramp. I'm not a huge fan of unclassified choropleths. Choropleths are generally used to show where places are similar and that relies on classifying your data into groups that display similar characteristics. All you can really see from an unclassed choropleth is the extremities...which areas tend to the maximum and which to the minimum. It's really difficult to assess where those in-between values might sit...and that's assuming the scale is linear and the colour scheme is applied linearly. Of course, you can stretch colour to be applied non-linearly but then it's an even more confusing picture that's arguably more difficult to interpret visually. If you don't classify data before mapping it then you're painting by numbers and it's a bypass to considering your data and teasing out the message through careful classification and symbolisation.
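To make "every data value is given its own position along the colour ramp" concrete, here's a rough sketch of the linear case. The values, the two ramp end colours and the function names are all invented for illustration; I'm not claiming this is how the map in question was made.

```python
def ramp_position(value, vmin, vmax):
    """Linear position (0..1) of a value between the data minimum and maximum."""
    if vmax == vmin:
        return 0.0
    return (value - vmin) / (vmax - vmin)


def shade(value, vmin, vmax, light=(239, 243, 255), dark=(8, 48, 107)):
    """Interpolate an RGB colour between a light and a dark blue (assumed ramp ends)."""
    t = ramp_position(value, vmin, vmax)
    return tuple(round(l + (d - l) * t) for l, d in zip(light, dark))


# Invented, heavily skewed counts: lots of small values and a couple of big ones.
values = [0, 2, 5, 9, 14, 330, 1000]
positions = [ramp_position(v, min(values), max(values)) for v in values]

for v, p in zip(values, positions):
    print(f"{v:>5} -> position {p:.3f}, colour {shade(v, min(values), max(values))}")
```

With skewed data like this, five of the seven values land in the first 2% of the ramp and are all but indistinguishable, so only the extremes register.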
I'm going to add a caveat here - if the map is for interactive web display and the user can hover or click an area to retrieve the value directly, then unclassed choropleths are, arguably, less problematic because people can retrieve values across the map. I'd still contend, however, that if we know the map is classified into, say, 5 classes using natural breaks then every county symbolized in the same shade of blue is similar. It's an important metric we can easily see in the map and it's a good default. Other classification schemes exist to suit alternative purposes. If we use, say, a quantile scheme of 5 classes then we know each class shows 20% of the data values in rank order - again, similarity between values, across the entire range of values, can be easily seen and it's simple to see which areas are in the top 20% of values. If you make two choropleths then using something like a quantile scheme allows you to compare the two maps on a comparable cognitive basis. Clicking to retrieve a value is an additional step in the map reading process. Trying to remember values from one hovered-over area to another is equally taxing because our short term recall is not our best cognitive function (think of memorizing and recalling a pack of cards in order...it's not easy!). I like maps to show and tell rather than require further processing or actions by the user to reveal the message.
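As a rough illustration of the quantile idea, here's a minimal sketch that puts roughly 20% of the values into each of 5 classes. The counts are invented, the function names are mine, and real data with lots of tied values would need more care than this (a mapping package's own classifier would be the sensible choice in practice).

```python
import math


def quantile_breaks(values, k=5):
    """Upper class limits so that each class holds roughly len(values) / k values."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[math.ceil(n * i / k) - 1] for i in range(1, k + 1)]


def classify(value, breaks):
    """Return the index (0 = lowest class) of the first class whose upper limit covers value."""
    for idx, upper in enumerate(breaks):
        if value <= upper:
            return idx
    return len(breaks) - 1


# Invented counts for ten areas.
values = [1, 3, 4, 7, 9, 12, 20, 45, 330, 1000]
breaks = quantile_breaks(values, k=5)
classes = [classify(v, breaks) for v in values]

print("class upper limits:", breaks)
print("class per value:   ", classes)
```

Every area in the top class is in the top 20% of values, and that reading stays the same from one quantile map to the next, which is what makes two such maps comparable.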
Onto the colours. Because there are just so many different shades of blue across the map we get a sense of some overall pattern but we can't really tell which are similar to which. How similar is San Bernardino County's colour compared to the other dark blues across the other side of the map? The difficulty is called simultaneous contrast and it's a problem for our map reading. Our perception of a colour (or shades of a colour) varies as we look across the map due to the colours that surround it. Look at the following two grey squares and how they are affected by the surrounding shades:
The grey square differs in perception depending on whether it's surrounded by dark or light. A darker surround makes us see it lighter than if it has a lighter surround. Now look at how different colours modify the grey square:
The grey squares, despite being the same, take on a perceived tinge of colour based on what's around them. And when the image gets even more complex we have even more difficulty processing what we see. In the following animation, which grey square, A or B, is darker?
Of course, the greys in A and B are the same. In the above diagrams all the grey squares are seen differently simply because of their surroundings. The map of meth labs has over 3,000 counties, each shade of blue being surrounded by its own different mix of blues.
These perceptual issues are also a problem in classed choropleths of course - but not nearly to the same degree because it's much easier to distinguish and differentiate 5 or 6 shades of blue across a map than it is to try and make sense of several hundred (thousands?) different shades of blue.
And what about labels? Yes, we can probably all recognise it's the U.S. I know where my home is so I recognise San Bernardino County. I've no real way of describing where other patterns exist in language that makes sense. Giving people context is important. Interactive maps support this through basemap labels or, again, hover and click...but however you deliver the map, give people a way to reference the patterns they see.
So the take-aways for me...
- It's totals. If you can't or won't change to a rate or ratio then use something other than a choropleth, like a dot density, proportional symbol, dasymetric or cartogram.
- If you have to use unclassed choropleths then scale your data across the range of colour so that extremities don't dictate the way values map onto the colours. Make the legend more useful by providing labels at key positions and make your map interactive so people can retrieve values.
- Go with a classed choropleth if you want people to see more than just the extremities in your data and how different areas are similar to others for all values that display similar characteristics. Learn which classification techniques are going to manage your data most appropriately for the message you want to share.
- Be aware of the problems of simultaneous contrast.
- Include some form of labelling to give people a way of referencing the geographical patterns they see.
Other problems...no real title, no source, no credits, no dates, no contact details. Nothing. Like I said, I don't know where the map came from but as is, it's a fail in every respect.
Finally, I tried to get the data to recreate this as a per capita map but after a quick search I wasn't able to find it at county level. Instead I came across this abomination on the Drug Enforcement Administration web site:
I don't even know where to start with this one, and they've made one per year for the last few years. They were clearly on something or other. And if we assume the DEA reporting is accurate (and the most current) AND that the Amazing_Maps one is broadly of the same time period (OK, a lot of assumptions) then what's with San Bernardino having over 300 meth labs given California as a whole has only 79?
Clearly something's wrong somewhere. Amazing map? Possibly. It's just poorly designed and constructed and gives a totally misleading impression of a dataset that cannot be verified. It's another potentially mildly interesting dataset that's poorly mapped.
And by the way, San Bernardino County is the 5th most populous county in the US so per capita...we may even have a paucity of meth labs so a different map might support the assertion we need more to get our supply increased*. Additionally, while the overall area of the county is about 20,000 sq miles, the populated areas are predominantly crammed into the south west corner in an area roughly 450 sq miles...which makes a choropleth map of totals covering mostly desert even less useful (unless the meth labs are in the desert). And then there are all those less important smaller areas...Seattle, St Louis, Tulsa and Grand Rapids. But because of the way the boundaries lie, choropleths are always going to cause difficulties in interpretation. That's the Modifiable Areal Unit Problem...and a whole different blog entry.
* this is a joke