"ex-libris"​ of a Data Scientist, part V: Visualization

Pre visualization dashboard sketch, Francois Dion (c) 2015


abstract: I will cover some of the essential books for data science in a 6 part series. This part V covers visualization (see part I for the introduction plus data and databases, part II: model, part III: technology, part IV: code, and part VI: communication).

"Le plus court croquis m'en dit plus long qu'un long rapport", Napoleon Ier


The above quote from Napoleon appeared in L'Excelsior in 1910, a little over 100 years after he said it. It means "the most summarized sketch tells me more about the matter than a long report". It is quite similar to the saying "a picture is worth a thousand words". It is one reason for the effectiveness of visualizations.

That is, if they are well made. The following books will help, for sure, but you will need to spend time doing visualizations, not just reading about them. One of the most productive paths I've taken was to take a class by Alberto Cairo (two of his books are included in this list). It forced me week after week to make time to prototype and create visualizations, but also to interact with many people, get feedback and push myself to produce better charts. A similar thing happened when I did the Tufte workshops: I continued exchanging with other participants over months.

Suggestions? Find a mentor or friend who will be able to give you feedback. Take the time to draw sketches on paper, with pencils and rulers and your hands (yes, shocking). Also, create new visualizations or tools and do historical research to improve your skills (as a side note, I have a few interesting timelines and surveys in the works to be released later this year). And speaking of historical research, let's start there. I'll spare you years ? to 1899, we'll start at 1900:

Historical


Books with visualizations, particularly those with folding inserts and color charts, were always expensive to make. The subject was also fairly specialized. For that reason, early books in the domain tend to be scarce and expensive. Still, with some patience and some effort, most of these can be found, at least in a library. One caveat: reproductions are typically in black and white and of average quality. Who knows, you might find something interesting while visiting a small town antique shop. (Picture: The chartroom at du Pont de Nemours, circa 1920)

  • 1900 Statistical Atlas (1903), Washington US Census Office, prepared under the supervision of Henry Gannett, geographer of the twelfth census (radar plots, check. Treemap-like charts, check. Really nice color visualizations, check. Later years did include some nice visualizations too, at least for a while)
  • The Construction of Graphical Charts, 1st edition (1910), McGraw Hill, J. B. Peddle (Just an amazing book, covering all kinds of visualizations, including tangible 3D representations, all in 1910!)
  • Croscup's Synchronic Chart of United States History (1910), Windsor, George E. Croscup (a good example of parallel timeline charts)
  • How to Make and Use Graphic Charts, 1st edition (1919), Codex, Allan C. Haskell (An interesting snapshot of the state of the art from the era, with a whole chapter on trilinear charts - Haskell also published Graphic Charts in Business and would preface Weld's How to Chart, many years later)
  • Graphic Methods for Presenting Business Statistics (1926), McGraw Hill, John Riggleman (some interesting thoughts on planning visualizations for maximum impact, and for doing comparative charts. Riggleman also states that what is easy to do is typically hard to read, while doing an easy to read visualization is the hard thing to do)
  • How to Use Pictorial Statistics (1937), Rudolf Modley (The 1920s saw the rise of Otto Neurath's Isotype and the Vienna Circle manifesto. By the 30s Neurath had a staff of data transformers, visualizers, graphic artists etc. In the US, the movement gained traction through Pictorial Statistics, Inc, with Modley as executive director)
  • Principles of Charting (1939), Barron's, Walter E. Weld (the democratization of visualization is peaking, just as the 2nd World War starts)
  • Graphic Presentation (1939), Willard C. Brinton (A very solid book on the subject, with color illustrations - unfortunately the reprints are black and white, but I own the original publication - a sort of second edition to his "Graphic Methods for Presenting Facts" from 1914, also part of my ex-libris)
  • Flight Thru Instruments (1946), Harley Earl (Before he became the head of design at General Motors, Harley did some interesting information visualization work, making something hard seem easy)
  • Charting Statistics (1952), McGraw-Hill, Mary Eleanor Spear (A solid book. Her 1969 Practical Charting Techniques is also highly recommended)
  • Handbook of Graphic Presentation, 1st edition (1954), Calvin F. Schmid (No fewer than 29 editions of this book were published from 1954 to 1979! A beautiful book covering everything from statistical maps to pictorial charts, to binning and classification. Schmid also covered important social subjects, like suicide rates)
Recent reproductions of classic material (added 2019):
  • W.E.B. Du Bois's Data Portraits, Visualizing Black America (2019), Whitney Battle-Baptiste and Britt Rusert editors, Princeton Architectural Press. (A collection of avant-garde data portraits made by W.E.B. Du Bois for the Paris International Exposition of 1900. All were handmade, obviously, and are fascinating, from both a historical and a design of information visualization perspective. This is the first time they are available.)
  • The Minard System (2019), Sandra Rendgen (A collection of most of Charles-Joseph Minard's flow maps and other information visualizations from the 1800s. You might be familiar with his visualization of Napoleon’s army. This covers so much more. This is a large size book, and the only reference of the kind. The only negative to me is the fact that many of the prints are spread over two pages, and the binding prevents seeing all the information near the center. I could see a future print with folding statistical charts and maps...)

The Classics


The books in this section are pretty much all must-reads.
  • Graphical Rational Patterns: A New Approach to Graphical Presentation of Statistics (1968), Israel Universities Press, Roberto Bachi
  • Exploratory Data Analysis (1977), Addison-Wesley, John W. Tukey
  • Graphical Methods for Data Analysis (1983), Chapman and Hall, John M. Chambers, William S. Cleveland, Beat Kleiner, Paul A. Tukey
  • Semiology of Graphics (1983), Jacques Bertin (An absolute must have. This was first published in French, in 1967. I own the original edition in French and the 1983 print in English)
  • The Visual Display of Quantitative Information (1983), Edward Tufte (I think that if somebody knows one book in visualization, it is highly likely to be this one)
  • The Elements of Graphing Data (1985), William S. Cleveland (there is also a more recent 2nd edition of this classic book - Cleveland is known for his systematic approach to the effectiveness of methods of encoding data)
  • The Collected Works of John W. Tukey Vol. V: Graphics 1965-1985 (1988), Chapman and Hall, John W. Tukey (this covers all his important work related to graphic and semi-graphic statistics)
  • Envisioning Information (1990), Edward Tufte (the 2nd book by Tufte, takes Visual Display to the next level)
  • Information Graphics: A Comprehensive Illustrated Reference (1999), Robert L. Harris (as close as it gets to an encyclopedia of visualization, a reference, even if some terminology has changed)
  • Grammar of Graphics (1999), Springer, Leland Wilkinson (most people are familiar with software based on this, ggplot2, but this is a language agnostic approach)
  • Information Visualization: Perception for Design 1st edition (1999), Colin Ware (I also own the 2nd edition, and there is now a more recent edition. Recommended, and/or alternatively the book below)
  • Information Visualization (2001), Robert Spence (a professor of Information Engineering at the time, Spence wrote many research papers in visualization. This book wraps up our classics, even though it is only 16 years old)

Graphic Design




Books on the subject of colors along with typography/fonts will not be included here. Instead, they will be covered in the next part of this series (on communication).

Graphic Design is a huge field and could easily be its own list with subdivisions. But again, we are looking at this from a data science perspective.

If you want to dig deeper, do check out the Awesome Design list, which covers fonts, books, blogs, resources, icons, logos, tools, prototyping, style guides (you do have one, yes?), awards, conferences, podcasts, networks and more.




  • Ink on Paper 2: A Handbook of the Graphic Arts (1972), Edmund C. Arnold
  • Problems: Solutions - Visual Thinking for Graphic Communications (1986), Van Nostrand, Richard Wilde
  • The Graphic Language of Neville Brody (1989), Jon Wozencroft (just one designer here, but there are many books exploring the graphic language of many designers)
  • Graphic Design in America: a Visual Language History (1989), Mildred Friedman et al. (Complements Meggs's book)
  • A History of Graphic Design (1992), Philip B. Meggs (the reference for graphic design; a more recent edition has been published, even though Meggs passed away many years ago)
  • 50 Trade Secrets of Great Design: Packaging (2002), Stafford Cliff
  • Information Graphics and Visual Clues: Communicating Information through Graphic Design (2002), Ronnie Lipton
  • Size Matters: Effective Graphic Design for Large Amounts of Information (2004), Lakshmi
  • 100 Habits of Successful Graphic Designers: Insider Secrets on Working Smart and Staying Creative (2005), Josh Berger Plazm, Sarah Dougher
  • Signage Systems & Information Graphics: A Professional Sourcebook (2007), Thames & Hudson, Andreas Uebele

Cartography


In the Classics section above, Jacques Bertin is probably the main reference to get.

However, I have quite a few books on the subject that many readers will enjoy. I've excluded overly technical books (one that is 100% on the mathematics of geography, for example).

This is just enough to be a data scientist proficient at using maps the proper way.

  • The Story of Maps (1980), Dover, Lloyd A. Brown (if you are curious about the history of maps)
  • Mapping Information: The Graphic Display of Quantitative Information (1982), Abt Books, Howard T. Fisher (published 3 years after the death of Fisher, not as exhaustive as Bertin's Semiology of Graphics, but completely focused on mapping to coordinate systems)
  • The Map Catalog, 1st edition (1986), Vintage, Joel Makower (there are a few more recent editions)
  • The Harper Atlas of World History (1987), Harper (translated from Hachette's Atlas and edited by Jacques Bertin - alternatively, see the Times Atlas of World History from 1979)
  • Applied Cartography: Source Materials for Mapmaking (1989), Thomas D. Rabenhorst (if you are a map geek like me, you'll enjoy, else you can probably skip)
  • Pictorial Maps (1991), Watson-Guptill, Nigel Holmes (Holmes was a designer at Time Magazine; I'm including this as the counter to Tufte's minimalism)
  • Introduction to Thematic Cartography (1992), Judith Tyner (you have to start somewhere, and this introductory book does the job)
  • How to Lie with Maps (1996), Mark Monmonier (see also the next part of this series on communication)
  • Thematic Cartography and Visualization (1999), Terry A. Slocum (I really like this one)
  • Cartographies of Disease: Maps, Mapping, and Medicine (2004), ESRI, Tom Koch
  • Maphead: Charting the Wide, Weird World of Geography Wonks (2011), Simon & Schuster, Ken Jennings
  • Atlas of Design, Vol I-IV (2012-2018), published by NACIS (the Atlas of Design is dedicated to showing off some of the world’s most beautiful and intriguing cartographic design. Every two years, a new volume is published filled with full-color maps, selected from a worldwide competition and judged by an expert panel.)

Information Design


Another section full of books that could have easily been listed under Graphic Design.

  • Information Graphics: a survey of typographic and cartographic communication (1989), Van Nostrand, Peter Wildbur
  • Information Architects (1995), Graphis Press, Richard S. Wurman (this needs to be on your bookshelf. Seriously)
  • Visual Function: An Introduction to Information Design (1997), Princeton Architectural Press, Paul Mijksenaar (an introduction to the subject, short enough to read on the subway on your way to work)
  • Information Design (1999), MIT Press, Robert Jacobson (the best introduction to the subject I've read, it also hints at so many other fields)
  • Information Graphics: Innovative Solutions in Contemporary Design (2000), Thames and Hudson, Peter Wildbur, Michael Burke
  • Information Anxiety 2 (2000), Richard S. Wurman (not as groundbreaking as the first book published in 1989, but slightly less dated - read both if you have the time)
  • Atlas of Cyberspace (2002), Pearson, Martin Dodge, Rob Kitchin (Rob made the book available online for free)
  • Visual Complexity: Mapping Patterns of Information (2011), Manuel Lima (similar to the above, but more recent and covering different examples)
  • Design for Information: An Introduction to the Histories, Theories, and Best Practices Behind Effective Information Visualizations (2013), Rockport Publishers, Isabel Meirelles (Could have been in the next section)

Information Visualization


This section is probably what most people have in mind when they are talking about visualization. Or maybe even a subset of statistical data visualization. I will not cover dashboards, as this would require covering UI, UX and many other themes.

  • Designer's Guide to Creating Charts & Diagrams (1991), Watson-Guptill, Nigel Holmes (see the note on Holmes in the Cartography section - this is again a counter to Tufte's minimalism)
  • Visual Explanations (1997), Edward Tufte (the third book by Tufte, focusing on dynamic data)
  • Graphical Analysis of Multiresponse Data (1999), CRC Press, K. E. Basford and J. W. Tukey (specific to plant breeding trials, the techniques can be applied to many fields)
  • Ma10fiej: 10 de 2002 (2002) (Or any other year of Malofiej. You have an obligation to track down at least one of these and page through and appreciate the work put into these visualizations)
  • Show Me the Numbers: Designing Tables and Graphs to Enlighten, 1st edition (2004), Analytics Press, Stephen Few (There is a 2012 edition available. All about data presentation. All of Stephen Few should be on your to-read list)
  • Introduction to Information Visualization (2009), Springer, Riccardo Mazza (if you are not sure where to start, this book gently eases into the subject)
  • Now You See It: Simple Visualization Techniques for Quantitative Analysis (2009), Analytics Press, Stephen Few (Another excellent book by Few, this one on data analysis techniques)
  • Visualize This: The FlowingData Guide to Design, Visualization, and Statistics (2011), Nathan Yau
  • The Functional Art (2012), New Riders, Alberto Cairo (Cairo's original visualization book, a very good reference)
  • The Book of Trees (2013), Manuel Lima (eye candy visual survey of tree charts, great coffee table book)
  • Visualization Analysis & Design (2014), CRC, Tamara Munzner (the modern equivalent to Jacques Bertin's Semiology, Munzner spent years writing this book and it shows. A must have)
  • The Truthful Art (2016), New Riders, Alberto Cairo (the follow-up to The Functional Art, raises the question of what should be charted - the material was tested in Cairo's class before the publication of the book)
  • The Book of Circles (2017), Manuel Lima (same concept as the Book of Trees but for Circular representations, another great coffee table book)

To learn even more, see Awesome DataViz and Awesome Visualization Research (both on GitHub).

This is a wrap for books related to visualization. Sorry if I did not include your favorite visualization book. I was aiming for a maximum of 60 books (a tiny fraction) and I ended up with 66, even without papers and without books on dashboards, UX, UI or more specialized scientific visualizations.

The last article (part VI) in the series will be Communication (and it will be followed by a bonus list of books for managers). You can also check out a few PDFs available at http://artchiv.es/

Francois Dion
@f_dion
