(Summary: The study diary entries continue. Chapter 2 was much shorter than I imagined (I am reading an e-book that does not show page numbers), so this post mostly focuses on observations and thoughts that came up in the book club discussion.)
Context: This is the second part in a series of reading / study diary notes, where I track my thoughts, comments, and (hopefully) learning process while I am reading How to Measure Anything by D. W. Hubbard together with some friends in a book-club-like setting. The first part in the series (vol. 0) can be found here. All installments in the series can be identified by the tag “Measure Anything” series on this blog. (Edit 2021-02-24: vol. 2.)
Summary of lessons from Chapters 1 & 2
Chapter 2 was much shorter than I anticipated! If you don’t recall, in Chapter 1 Hubbard outlines his basic thesis, which is (very briefly stated) that anything is measurable. Specifically, any and all quantities related to vague, often socially embedded stuff in business and government (“how much value do we get from certain quality standards?”, “how much value does buying product X bring us?”, “what are the effects of decision Y?”) that are often thought impossible to measure. Measuring them can be done by applying correct, rigorous methods and putting enough effort into it. (Or that is the elevator speech for selling the book.)
In more detail (which I briefly touched on in vol. 0), the primary purpose of measuring anything (in Hubbard’s words) is to reduce uncertainty about decisions. This can be done in many ways, starting from simple data acquisition and collecting the data in Excel sheets, and ending with quantitative models, and (Hubbard argues) this generally trumps mere intuition alone (because people tend to overvalue the soundness of their intuition). While perfect certainty is seldom obtained, the argument is that both the reduced uncertainty gained from the measured data and the act of thinking about what one can and should measure help in making better decisions. Hubbard dubs the process presented in the later chapters of the book “Applied Information Economics”, with the following simple steps (I sketch the loop in code after the quote):
1. Define the decision.
2. Determine what you know now.
3. Compute the value of additional information. (If none, go to step 5.)
4. Measure where information value is high. (Return to steps 2 and 3 until further measurement is not needed.)
5. Make a decision and act on it. (Return to step 1 and repeat as each action creates new decisions.)
– How to Measure Anything, ch. 1.
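To internalize the loop, here is a minimal Python sketch of the five steps as I read them. The toy “launch product X” decision, all the numbers, and the crude stand-in for the value-of-information calculation are my own illustration, not anything from the book:

```python
import random
import statistics

random.seed(0)

# Step 1: define the decision. Toy example: "launch product X if its
# expected annual value exceeds 1.0 (in some money unit)".
THRESHOLD = 1.0

# Step 2: determine what we know now, here as samples from a wide prior.
beliefs = [random.gauss(1.1, 0.8) for _ in range(10_000)]

def value_of_information(samples):
    """Crude stand-in for step 3. A real expected-value-of-information
    calculation compares payoffs with and without the measurement; here
    we just ask how likely more data is to flip the decision."""
    p_above = sum(s > THRESHOLD for s in samples) / len(samples)
    return min(p_above, 1 - p_above)   # near 0 when the decision is clear

def measure(samples):
    """Step 4: a pretend measurement that narrows the estimate by
    filtering beliefs against one noisy observation (a made-up survey)."""
    observation = random.gauss(1.4, 0.3)
    kept = [s for s in samples if abs(s - observation) < 0.6]
    return kept or samples             # never empty the belief set

# Steps 3-4: keep measuring only while the information is still valuable.
for _ in range(10):                    # cap the iterations, just in case
    if value_of_information(beliefs) <= 0.05:
        break
    beliefs = measure(beliefs)

# Step 5: make the decision and act on it.
estimate = statistics.mean(beliefs)
print("launch" if estimate > THRESHOLD else "skip", round(estimate, 2))
```

The part worth noticing is the loop structure: measurement is performed only while it can still plausibly change the decision.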
However, the author grants that the audience may remain skeptical! So, in Chapter 2 of How to Measure Anything, we get some examples that illustrate how a measurement solution may exist. The first two are examples many are familiar with; the third is a bit less known.
First is Eratosthenes’ measurement of Earth’s circumference. The story and the measurement itself are quite well known, so instead of repeating them here let me link to Wikipedia’s explanation. (Did you know Eratosthenes was also famous for a prime number algorithm and was an acquaintance of Archimedes?) The second set of examples consists of a series of puzzles in the style made famous by Enrico Fermi, thus popularly known as Fermi problems. (How many piano tuners are there in Chicago?) The third example presents a school kid who set up a legitimate experiment for a school science fair to measure the claimed effects of a parapsychology-adjacent therapy and consequently debunked it (with assistance from James Randi; the details are a bit long-winded to repeat here, but they don’t matter for the point being made).
The common property of all the above-mentioned examples is that they demonstrate how measuring something is often possible with only minimal resources and careful thinking about what you already know. (Eratosthenes achieved remarkable accuracy with simple methods and measurements that were practical in his time. Fermi’s method for finding the number of piano tuners does not involve any new measurements at all.) The science fair example is also similar in spirit: even though the (claimed) phenomenon under investigation is supposed to be complex and mystical, it is often enough to identify a core mechanism that would produce an easily understood, measurable effect, which can then be measured with simple methods (available to a schoolgirl preparing a science fair presentation).
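To make the Fermi example concrete, here is the piano-tuner estimate written out as a small Python calculation. Every input number below is my own rough guess (which is rather the point of the method), not a figure from the book:

```python
# Classic Fermi estimate: how many piano tuners are there in Chicago?
# All inputs are rough guesses; the aim is the right order of magnitude.

population = 2_500_000          # people in Chicago, roughly
people_per_household = 2.5
pianos_per_household = 1 / 20   # maybe 5% of households own a piano
tunings_per_piano_year = 1      # a piano gets tuned about once a year
tunings_per_tuner_day = 4       # with travel time, maybe 4 tunings a day
workdays_per_year = 250

pianos = population / people_per_household * pianos_per_household
tunings_needed = pianos * tunings_per_piano_year
tuner_capacity = tunings_per_tuner_day * workdays_per_year

print(round(tunings_needed / tuner_capacity))  # on the order of 50 tuners
```

No new data was collected; decomposing the question into quantities we already have a feel for is enough to get an order-of-magnitude answer.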
Points made during the discussion
So, what did we think about it? I am not attempting to write a coherent or all-inclusive recap, but will present some of the more interesting ideas that came up. (Many of them are not mine! But I have worded them from my notes in a way that I find interesting. Also, I forgot to record who said what. I also have the benefit of writing everything here over the course of several days after the discussion session.)
- My previous point about the book having a tone of stereotypical American “business self-help” was met with some agreement. However, in retrospect, maybe others could say that this kind of style is simply effective at conveying information to a certain kind of audience. And anyway, the quintessential self-help book, How to Win Friends and Influence People, contains a lot of sound advice.
- We talked about how useful it is to have these kinds of books that focus on the practical stuff. Many statistical textbooks remain difficult. It is also good to remember that people like Galton, Pearson, Neyman, Spearman, Fisher and “Student”, who are usually best known as names for very abstract, theoretical, dry things in your statistics textbook, actually worked on many practical problems. (“Student”, that is, William Gosset, used a pen name when publishing his paper on the t-distribution because he worked at Guinness studying small samples, and (presumably) Guinness did not want their competitors to know about the applications.)
- Someone pointed out this: there may be at least one potential problem with applying the lessons from this book in practice, which is that sometimes the perceived inability to measure the “intangible” is a feature, not a bug. Many people and many organizations claim they have objectives, targets, values and whatnot that are written in official programs and defended in loudly spoken statements, but those targets are not what they are really trying to achieve. The act of claiming to have some particular objectives serves some other purpose. Giving the behavior a more benign interpretation, such publicly stated objectives may be only a part of a larger set of objectives that are left unsaid. A true, factual instance of greenwashing would be an example of this phenomenon; in such a scenario, an organization that greenwashes its product would not want anyone making a truthful evaluation of the product’s environmental impact.
- However, while I agree it is good to be aware of such things, I am not convinced how to usefully take such considerations into account when deciding whether to engage in an analytical, quantitative decision-making procedure (like the one Hubbard presents). Afterwards, when I tried to come up with more examples of situations where organizations or individuals falsely pretend to be interested in either undervaluing or overvaluing something intangible, I concluded that collecting the information would be useful if you are in a position to make a decision about the matter in the first place. Conversely, if you are a nobody in the hierarchy and can’t affect the $thing, it won’t make much difference if you find out how much value the purportedly intangible $thing has or makes, so maybe researching the matter is not a very good way to use your time. (And then again, maybe it is: if you know that the organization is allocating resources in a sub-optimal way, then even if you can’t affect it, the knowledge may be useful in other decisions. Sell the stock? Vote for a different party? Relocate from the municipality? Brush up your CV?)
- It seems that the book is going to take quite a pro-“quantitative models” stance. We were interested in reading more about how the book tackles the different kinds of uncertainty. Probabilistic models often present many ways to quantify uncertainty in the phenomena being modeled. (That is already what the classic p-value does: it quantifies how surprising a particular set of observed data would be under some specific model.) But how to take into account — or even quantify — the uncertainty about your model being wrong? Some useful search terms for learning more about this are aleatory and epistemic uncertainty. Many other forms of uncertainty have been defined, too.
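As a toy illustration of that distinction (my own example, not from the book): consider flipping a coin with an unknown bias. Collecting data shrinks the epistemic uncertainty about the bias, but the aleatoric randomness of any single flip stays, no matter how much data we have:

```python
import random

random.seed(1)

# Epistemic uncertainty: we do not know the coin's true bias p, so we keep
# a grid of candidate values and weight them by how well they fit the data.
candidates = [i / 100 for i in range(1, 100)]

# Observe some flips of the actual coin (true p = 0.7, hidden from "us").
flips = [random.random() < 0.7 for _ in range(100)]
heads = sum(flips)

def likelihood(p):
    return p ** heads * (1 - p) ** (len(flips) - heads)

weights = [likelihood(p) for p in candidates]
total = sum(weights)
post_mean = sum(p * w for p, w in zip(candidates, weights)) / total
post_var = sum((p - post_mean) ** 2 * w
               for p, w in zip(candidates, weights)) / total

# Aleatoric uncertainty: even if p were known exactly, a single flip would
# still have variance p * (1 - p). No amount of extra data removes this.
print(f"epistemic sd of p (shrinks with more flips): {post_var ** 0.5:.3f}")
print(f"aleatoric sd of one flip at p = 0.7:         {(0.7 * 0.3) ** 0.5:.3f}")
```

Note that neither of these says anything about the model itself (a coin with a fixed bias) being wrong; that is yet another layer of uncertainty.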
- While writing this write-up, I noticed that the author makes quite a strong commitment when arguing how reducing uncertainty is beneficial and important. I started to wonder whether such simple language masks quite many difficulties in what it means. See, for example, the bias-variance tradeoffs often discussed in machine learning textbooks. Suppose your uncertainty about the true value of some parameter can be depicted as a normal distribution. Would you want to tighten your estimate by making the distribution less broad (smaller variance/s.d.)? What if it comes at the cost of your distribution being very tight and narrow but centered around a wrong (that is, biased) estimate, when previously you had more uncertainty mass also covering the true value? If you are making decisions based on the central parameter estimate and on it only, what matters is that the center of your uncertainty distribution is correctly positioned, not its s.d.
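A tiny simulation of that scenario (all the numbers are invented for illustration): an unbiased but noisy estimator versus a tight but biased one, with the decision made purely from the point estimate:

```python
import random

random.seed(2)

TRUE_VALUE = 10.0       # the unknown quantity we care about
DECISION_CUTOFF = 11.0  # act only if our point estimate exceeds this

# Since TRUE_VALUE < DECISION_CUTOFF, the correct decision is "do not act".
def error_rate(mean, sd, n=100_000):
    """Fraction of trials where the point estimate leads us astray."""
    wrong = sum(random.gauss(mean, sd) > DECISION_CUTOFF for _ in range(n))
    return wrong / n

# Estimator A: unbiased but noisy.  Estimator B: very tight but biased.
print(f"A (mean 10.0, sd 3.0): wrong {error_rate(10.0, 3.0):.1%} of the time")
print(f"B (mean 12.0, sd 0.5): wrong {error_rate(12.0, 0.5):.1%} of the time")
```

The narrow estimator B is wrong almost always (roughly 98% of trials), while the broad but correctly centered A errs only about a third of the time.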
- Also vaguely related to the above points, the book presents an example where some limited Fermi estimation is useful in determining whether to start a particular kind of business in some region (whether there is enough of a customer base, and so on). A counterpoint was raised: according to anecdotal stories (and maybe common sense), building a successful start-up business seems to require something more than a relatively simple calculation exercise with numbers. (Common sense, because many start-ups by people who can make rational calculations fail, and births of new business empires are rare.) I am not sure what to think about this objection. Maybe it is the case that easy-sounding examples look too easy because the truly valuable information necessary to build a successful company (or any other enterprise) is difficult to obtain and requires non-trivial insight, analysis, or luck. (But if it can’t be taught in a book, what is the point of reading a book?) I do not know. Let us continue anyway!
- Related to the point about the value of truthful information depending on your position or (in more generic terms) your opportunity to use it, I found the subsequent discussion on power relationships in measurement particularly insightful. (In retrospect, this was probably a half-formed reinvention of ideas coming from Foucault. Sometimes reinventing the wheel is still a useful exercise in learning the art of wheel-making?)
- The argument raised was that the introduction of a formal way to measure something, in practice usually in the form of a metric, is a form of wielding power. If you have an annoying productivity metric imposed on you, you are certainly not in control of the situation. People are often skeptical of metrics given by others, and often like to present objections related to what is measured and how the information is used.
- The issue seems to disappear when you are making measurements for your own benefit. Example: many people like the walking-distance / step-count meters nowadays common in phones and gamify their exercise routines. (Some acquaintances of mine recently simply decided to walk to Mordor.)
- However, while I was in school, every winter the teacher handed out a photocopied piece of paper with a ski-shaped progress bar, where every pupil was supposed to track how many kilometers they skied during that winter. My parents put it on the refrigerator, where my slow progress loomed over me menacingly. It was a very 00s Finland thing. I hated it.
- I also think many people would view it with suspicion if the government required everyone to wear a step meter and set metric targets where every citizen is required to walk to Mordor for the national health benefit. Or if your insurance company required you to wear one to calculate premiums.
- A bad metric may become a target and distract everybody from what is really going on and from the phenomenon that was supposed to be measured in the first place.
- In a more adversarial setup, introducing wrong metrics, misleading metrics, or metrics that simply distract may become a move to intentionally suffocate a rival organization.
- In my opinion, all this sounds like a very good reason to learn how to reason about which metrics should or should not be applied, how to make good measurements, and also how to present that reasoning convincingly to others.
- As someone pointed out, all of this seems obviously related to principal-agent problems.
- And also related to the benefits of free-market-based economies in contrast to central planning. In a country-sized unit, if there is a single authority who has the power to set the metrics for all industries, there is a country-sized amount of opportunities for them to fail in interesting ways. (Metrics becoming detached from the things they are supposed to measure; people lower in the command chain misrepresenting the figures they report; mistakes in designing the metrics and acting on them; the sheer difficulty of solving the too-large optimization problems required to make the optimal decision…) Having smaller units, where the decisions and the measurements they are based on are closer to each other and to the underlying reality, seems to be a way to both sidestep measurement problems and bring the agency closer to the level of the individuals concerned.
- Finally (I think this idea was mine), precommitting to making a decision based on the results of some particular measurement procedure can be a powerful tactic, in at least two ways:
- If you seriously consider the possibility that the decision could go either way based on the information you obtain, it highlights the need for good information. It should motivate one to set up the experiments, measurement methods, and other details very carefully, and encourage thinking about the problem. What exactly are you going to measure? (I believe Taleb would write about “having skin in the game”.)
- It would also help to combat any internal psychological biases that otherwise could enter the decision-making procedure.
- On the other hand, there is always a possibility that you simply made a mistake in the measurement process. I suppose that for that reason, measurements need some ultimate feedback from real life.
Other links:
- https://en.wikipedia.org/wiki/OKR
- More links about aleatoric / epistemic uncertainty:
Conclusion
As the reader may notice, the discussion points I recorded above are quite general, instead of addressing the content of Chapters 1 and 2 in a very detailed and immediate way. While I did not write down everything we talked about in the meeting (only my reflections afterwards, based on my notes), the above illustrates quite well what we talked about. I think this is mostly because both chapters were quite short and introductory, and I assume we will return to the concepts presented in them as we progress.
In the next meeting, we will discuss Chapters 3 and 4; I am too lazy to prepare notes beforehand, so I will review both the contents and the points from the subsequent discussion at the same time (like I ended up doing here).
In the previous post I also made up some homework for myself to think about while reading the book; as I have not yet thought much about how to approach the homework tasks based on these chapters alone, I will wait a little before addressing them. (Sure, the book presents the numbered checklist I quoted above, but I’d like to see if the author has more to say about how to go about steps 1 and 2.)