"ex-libris" of a Data Scientist, part VI: Communication


"Poste de telegraphie aerienne" (Chappe visual telegraph or semaphore), Furne, Jouvet et Cie, mid 19th century

abstract: I will cover some of the essential books for data science in a 6 part series. This part VI covers communication (see part I: introduction, data and databases, part II: model, part III: technology, part IV: code, and part V: visualization).


"Il y a dans toute foule, pensait Rivière, des hommes que l’on ne distingue pas, et qui sont de prodigieux messagers. Et sans le savoir eux-mêmes." - Antoine de Saint-Exupery

In "Vol de Nuit" (Night Flight), Antoine de Saint-Exupery, through the fictional character Rivière, states: there are people who are unaware that they are great "messengers". Saint-Exupery is writing here about air mail pilots delivering letters, a very difficult task in the 1920s.

One major obstacle they faced in delivering their communications was the size of a mountain: they had to cross the Andes range at the Argentina-Chile border, near the Tupungato volcano! (Photo: Tupungato, 6,570m as seen from the Leonera–Pintor ridge, CC-by-SA-3.0 Tijs Michels 2010)

In any data science project, there are several obstacles. The most challenging come down to proper and precise communication. Even if you do not see yourself as a great "messenger" or communicator, you can tackle this. This article covers books related to communication.

From question to answer

As a reminder, a data science project starts with a question (obstacle number one) and ends with an answer (obstacle number two). Both of these things are communication processes. We went through several articles covering Data, Model, Technology, Code, and Visualization to help us come up with an answer.


But how do we communicate at the time of formulating the question? How about at the time of providing an answer? By any means necessary.

Do we need a video to make a point, or specific color choices to help in understanding the data? We have to spend the time thinking about these things. What about selecting the best font for the task at hand? Right placement? How big is too big and how small is too small? How can I balance the different elements on a page?

The books below will answer these questions, or provide guidance on them, and much more. They are a complement to part V on visualization (especially the section on Graphic Design and Information Design).


Historical

Per my usual, I will start the book list with some historical background. This is specific to the subject of color, of fonts (or type and typography) and of writing.

On the left is a detail from a Gutenberg Bible. The movable type printing press is one of the most important inventions of the past one thousand years. Without it, there would be no concept of fonts and there would be no "ex-libris" of a data scientist.

If you are curious about things like the origin of the words lowercase and uppercase, read on (see Henry's book).

And once more, if you have a severe allergy to history, feel free to skip to the next section.



  • The Principles of Harmony and Contrast of Colors and their Applications to the Arts (1890), George Bell and Sons, M.E. Chevreul, translated from the French by Charles Martel (English translation of one of the most important works ever written about color: "De la Loi du Contraste Simultané des Couleurs et de l'Assortiment des Objets Colorés" by Eugene Chevreul, 1889 edition)
  • Feste des Lebens und der Kunst (1900), Diederichs Verlag, Darmstadt, Peter Behrens (so, a book in German? Do not worry. It is not so much to read as to see: this is the first book typeset in a sans serif font. You can see it at archive.org)
  • The Philosophy of Color (1904), Clifford & Lawton, C. R. Clifford (much easier to find than Chevreul's book, but nowhere near as significant or as extensive)
  • Essays in the Art of Writing (1905), Chatto & Windus, Robert Louis Stevenson (and you thought you only had to read Treasure Island and Dr. Jekyll and Mr. Hyde)
  • Printing for School and Shop (1917), Wiley, Frank S. Henry (Henry, along with United Typothetae, had come to realize that journeymen could no longer train apprentices and that printed instruction was needed. During the late 1910s and 1920s several books were printed to remedy the situation, on subjects ranging from "How Paper is Made" to design and "Tabular Composition")
  • The Elements of Style (1920), Harcourt, Brace, William Strunk, Jr. (this is to the English language what "Le BLED" is to the French language: dreaded and unavoidable. Further down I have the reference to a more recent edition if you prefer)
  • An Essay on Typography (1931), Eric Gill (I own the second edition)
  • The Practical Art of Color Matching (1953), American Cyanamid Company, W. H. Peacock (it would be hard to have a better author's last name... more seriously, this book touches on the subjectivity of color and many other aspects of color)
  • Type and Typography, the Designer's Type Book (1963), Reinhold, Ben Rosen (I chose one of many such "type catalogs")

Audio/Video/Photo


A single book is not going to transform you into a photographer, a video editor or a recording engineer. Nor would a complete list of all the books I own on these subjects. But, it is a case of "knowing a little goes a long way". If nothing else, it helps to bring the level of professionalism up in communications. This applies to meetings, the boardroom, conferences or in electronic and printed form. (Photo: "Jungle Fog" by Francois Dion - Monochromatic color photography, Winston Salem, 2015).


  • The Photographer's Handbook: A Complete Reference Manual of Photographic Techniques, Procedures, Equipment and Style (1982), Alfred A Knopf, John Hedgecoe (this is for conventional film photography, not digital photography, but the skills transfer to any DSLR with manual controls - many people will prefer a book that is specific to their camera brand and model)
  • Handbook for Sound Engineers (1991), Sams, Glen M. Ballou (more information than you'll ever want to know on the subject as it goes very deep - the 5th edition from 2015 is the latest)
  • Sight, Sound, Motion: Applied Media Aesthetics (2007), Wadsworth, Herbert Zettl (you could possibly get away with reading just this one book, particularly if you outsource all your media work. Either way, you will look at other people's communications, production techniques and variable encoding choices in a totally different light)
  • Television Production Handbook (2012), Wadsworth, Herbert Zettl (I also owned an older edition I had picked up when I worked at the Canadian Broadcasting Corporation. This covers everything from writing to selecting the right lens, to post-production)

Color

The first serious scientific work on color would probably have to be Sir Isaac Newton's:

A Letter of Mr. Isaac Newton, Professor of the Mathematicks in the University of Cambridge; containing his New Theory about Light and Colours: sent by the Author to the Publisher from Cambridge, Febr. 6. 1671; in order to be communicated to the R. Society.

He would later publish "Opticks: or, a Treatise of the Reflexions, Refractions, Inflexions and Colours of Light" in 1704. Another big breakthrough would come through French chemist Eugene Chevreul, whose law of simultaneous contrast was published in 1839. (Image: cover of BYTE issue number 10, June 1976, "The Game of LIFE Played in Color", on Dazzler-LIFE implementation of Conway's cellular automaton)

Color is a fascinating thing because it is something perceived by our brain. Not as an absolute, but in relation to other known aspects of our field of vision, or to what we know to be true. Our eyes are not a camera obscura, and the retina is not the final site of the perception of vision. It is in the brain. Most everybody by now has heard of the "color of the dress" debate. This has to do with something called "subjective color constancy". In December 1977, Scientific American published an article by Edwin Land. In it are the details of Retinex (Retina + Cortex), a theory the inventor of the Polaroid Land Camera had first presented in 1959. Yet, to this day, color is quite misunderstood.

Some additional useful books on the subject I've collected over the years include:
  • Electronic Color: The Art of Color Applied to Graphic Computing (1990), Van Nostrand Reinhold, Richard B. Norman
  • Color and the Computer (1997), Academic Press, H. John Durrett
  • Color: A Natural History of the Palette (2003), Penguin, Victoria Finlay
  • Graphic Designer's Color Handbook: Choosing and Using Color from Concept to Final Output (2003), Rockport, Rick Sutherland, Barb Karg
  • Color Matching Handbook: A Comprehensive Guide to the Art of Using Color (2004), Readerlink, Viv Foster
  • The Digital Color Printing Handbook (2005), Watson-Guptill, Tim Daly
  • Color Management: A Comprehensive Guide for Graphic Designers (2008), RotoVision, John T. Drew, Sarah A Meyer

Software

When it comes to software, everybody will have a different approach. I tend to do the bulk of the work with Python (as an automation language or directly). I also use some open source software.

Here are manuals or user guides to a few of these programs. For a more elaborate list of software, see my publication the Hitchhiker's Guide to the Open Source Data Science Galaxy. (Photo: "5-1/4 R. G. Biv" by Francois Dion)



  • Crafting Digital Media, Audacity, Blender, Drupal, GIMP, Scribus, and other Open Source Tools (2009), Apress, Daniel James
  • Mastering Blender (2009), Sybex, Tony Mullen (Blender is principally a 3D editor, but also a game engine and video editor)
  • Blender Studio Projects: Digital Movie-Making (2010), Sybex, Tony Mullen, Claudio Andaur
  • Inkscape, Guide to a Vector Drawing Program (2011), Prentice Hall, Tavmjong Bah (the author makes the 5th edition available for free)
  • The Book of GIMP: A Complete Guide to Nearly Everything (2013), No Starch Press, Olivier Lecarme, Karine Delvare
I also have a separate list for a whole "parallel universe" in terms of software for publishing documents, that of LaTeX. Here are some books on the subject:
  • Computers and Typesetting / A The TeXbook (1986), Addison Wesley, Donald E. Knuth (not needed unless you run the original TeX)
  • Computers and Typesetting / B TeX: the Program (1986), Addison Wesley, Donald E. Knuth (not needed unless you run the original TeX)
  • A Document Preparation System, LaTeX User's Guide and Reference Manual (1994), Addison Wesley, Leslie Lamport
  • The LaTeX Graphics Companion (1997), Addison Wesley, Michel Goossens, Sebastian Rahtz, Frank Mittelbach (it's not just for text)
  • A Guide to LaTeX (1999), Pearson, Helmut Kopka, Patrick W. Daly
  • The LaTeX Web Companion, Integrating TeX, HTML, and XML (1999), Addison Wesley, Michel Goossens and Sebastian Rahtz
There are many more recent books on LaTeX, but the fundamentals haven't changed much. A complete recent LaTeX book is available for free from Wikibooks as PDF or on the web.

While on the subject of LaTeX, this is a great time to introduce typography. Knuth, who wrote TeX, knew that fonts, spacing and device independence were as important as the layout of a document. To complement TeX, he wrote METAFONT. The original books on the subject are:

  • Computers and Typesetting / C The METAFONTbook (1986), Addison Wesley, Donald E. Knuth
  • Computers and Typesetting / D METAFONT: the Program (1986), Addison Wesley, Donald E. Knuth
  • Computers and Typesetting / E Computer Modern Typefaces (1986), Addison Wesley, Donald E. Knuth

Typography

We are now ready to jump right into typography. So that we are on the same page (sorry), typography is:

"the art and technique of arranging type to make written language legible, readable, and appealing when displayed." (according to Wikipedia)

Even when making charts or tables, it is important to be mindful of the impact of our choices. A font can affect the legibility and readability of the information. (Picture: typewriter art print by Dutch artist Hendrik Nicolaas Werkman, 1920s)





  • Typographic Design: Form and Communication (1985), Van Nostrand Reinhold, Rob Carter, Ben Day, Philip Meggs (this book has been used as a textbook in a wide range of domains, even showing up as a textbook for a computer media course at MIT)
  • Type & Image: The Language of Graphic Design (1992), Van Nostrand Reinhold, Philip Meggs (chapter 2 is particularly relevant with regard to the combination of words and images, as is chapter 4 on graphic resonance, particularly through typography)
  • Type & Typography (2002), Watson Guptill, Andrew Haslam and Phil Baines
  • Thinking with Type, 2nd revised and expanded edition: A Critical Guide for Designers, Writers, Editors, & Students (2010), Princeton Architectural Press, Ellen Lupton (1st edition is from 2004)
  • Just My Type (2011), Gotham, Simon Garfield

Various

A few books that didn't fit anywhere else, and didn't warrant adding a whole new section for each of them:

  • An Anthropologist on Mars (1995), Alfred A Knopf, Oliver Sacks ("The case of the colorblind painter" and "To see and not to see" are particularly relevant)
  • Understanding the Language of Printers and Graphic Arts Professionals (1999), Graphic Arts Educational Enterprises, Chris Sposi (this is normally a paperback, but thanks to a librarian, my copy is a hardcover - if you know what kiss register and imposition are, you probably don't need this)
  • Media Ethics, Cases and Moral Reasoning (2005), Pearson, Christians, Rotzoll, Fackler, McKee, Woods
  • The Back of the Napkin: Solving Problems and Selling Ideas with Pictures (2008), Portfolio, Dan Roam (this usually makes the list of books for startups, but I find this especially important for data science and communicating ideas)
  • Now You See It: How the brain science of attention will transform the way we live, work and learn (2011), Viking Penguin, Cathy N. Davidson
  • Information Doesn't Want to Be Free: Laws for the Internet Age (2014), McSweeney's, Cory Doctorow
  • How We Talk, the Inner Workings of Conversation (2017), Basic Books, N. J. Enfield

Writing

I enjoy writing and drawing on paper.

There are usually a half dozen ink bottles on my desk, along with many fountain pens. I also own all kinds of writing instruments: stencils, compass, mechanical pencils, markers.

Yet, I find it painful to write for the web. It is my Kryptonite, to a certain degree.

In that regard, feel free to ignore my readings on the subject, because they have not changed my perspective on writing for the web one bit...








  • The Elements of Style (1959), Macmillan, William Strunk & E.B. White (a classic: the original edition from 1920 is found in the historical section, and there's an illustrated version that was published in 2005)
  • Writing for the Technical Professions (1987), Wadsworth, Thomas N. Trzyna & Margaret W. Batschelet
  • Maps with the News (1989), University of Chicago Press, Mark Monmonier
  • Style Guide for Business and Technical Communication (1999), Franklin Covey, Larry H Freeman (3rd edition)
  • Effective Writing for the Information Age: Elements Of Style For The 21st Century (2002), Norton, Bruce Ross-Larson
  • Content and Complexity: Information Design in Technical Communication (2003), LEA, Michael J. Albers, Beth Mazur
  • The Associated Press Stylebook (2011), Basic Books, Associated Press

I also have a Larousse Unabridged English-French dictionary at the ready, given that English is not my native tongue.

Conclusion

In the introduction, I mentioned two obstacles. The first, at the beginning of the data science process, was figuring out the proper question. In part I of this series, I highlighted the reason this was so important, quoting John Tukey:

"Far better an approximate answer to the right question, which is often vague, than an exact answer to the wrong question, which can always be made precise"

Given we do have the right question, what is the second obstacle? It is not how to communicate the answer. Just consider the amount of literature available on the subject. No, the second obstacle is one illustrated by Arthur Conan Doyle in "The Red-Headed League", where Sherlock Holmes confides in Watson:

"I begin to think, Watson,” said Holmes, “that I make a mistake in explaining. ‘Omne ignotum pro magnifico,’ you know, and my poor little reputation, such as it is, will suffer shipwreck if I am so candid."

This Latin expression means 'everything unknown [taken] for magnificent'. Notwithstanding Sherlock's suggestion to leave the details in obscurity, do not be afraid to explain your results in a way that is easy to follow and understand, even if it makes your results seem less superhuman... Only then will you have communicated an answer to your customer's question.

Francois Dion

Chief Data Scientist, Dion Research LLC

@f_dion

NB: Also published on LinkedIn at: https://www.linkedin.com/pulse/ex-libris-data-scientist-part-vi-communication-francois-dion/

P.S. This wraps up my 6 part series, but there will be a follow-up article covering books for data science managers, directors and consultants.
