🧵🧵🧵

my job is very simple, i go around a store and fill carts for people who have put in online orders

let me explain how this minimum wage job is at the bottom of several coordination problem hells at once and why nobody sticks around more than a month in it

🧵🧵🧵
first a review of Goodhart's law

"When a measure becomes a target, it ceases to be a good measure"

and its more optimistic, pithy corollary

"What gets measured gets done"

this is obviously not a binary thing but a thing of degree, which is why it's useful to know
when I got hired I got this nifty phone

It has a barcode scanner on it

All of the orders I'm expected to fulfill are in the app itself, so in theory, all I'm "supposed" to do is grab a fresh phone each day, load up that app, and start doing orders

The gateway to hell
First off let's think about the phone itself

I have to log in to it with my own personal work credentials

Which means that any potential metric that could be tracked on this bespoke lil Android is something I could be measured against

Called in and fired about. As you do.
But I don't actually *know* what metrics are being tracked, just that the *possibility* is there.

Even that is enough to invoke the paranoia behind Goodhart's law. Always-on information technology does this to a degree that human supervision can't come close to.
What kinds of metrics would be the most obvious?

- Number of orders fulfilled in full
- Number of orders partially fulfilled
- Uptime in the app

Now here we run into an issue, because these metrics are not the ones my immediate boss has as metrics.
Phew. Sorry. I'm actually /at/ work right now, and I just hauled 30 2×12-20 footers across half a mile of store. Heavy enough that I had to make sure never to lose momentum while moving it, or else.

It took 2 hours and a favor from a guy with a forklift.

Which reminds me...
... one of the ways you can reframe Goodhart's law is that it's not necessarily a bad thing -- unless you're dealing with shitty metrics.

Did the "obvious" metrics I listed before include a bonus for needing to use a forklift?

Or... For weight, period?
Maybe the metricians do keep tabs on these, of course. It's not impossible.

But these seem like the kinds of modifications the dev team would implement based on user feedback.
Which you're not going to get in a minimum wage job filled with high school dropouts scared of losing their job for speaking up.
Anyway.
Let's turn our attention to one of the three "obvious" metrics: partially-fulfilled orders.

City store space is expensive, so we don't actually carry that much inventory. There's a certain % of shelves that are unfilled each day. Which shelves these are is random, for [reasons]
But that's no issue. We have big warehouses that can refill inventory easily overnight!

Well, if we /know/ we have to refill Shelves X, Y and Z.

Whose job does that usually fall to? Me. The guy running around following orders. I'm the perfect scout for this kind of thing!
Except... Think about the context in which *I* discover that information.

95+% of the time, I only take notice that we're out of an item when I'm trying to fulfill an order and find we don't actually have it in stock.

A.K.A.: when I'm fulfilling a doomed-to-be-partial order.
⭐⭐⭐

Bonus Self-Shill Round! I'm actually considering getting a Master's degree in Operations Research, since my college is Top 5 in the nation for industrial engineering.

Like the way I'm thinking here? Hire me.

⭐⭐⭐
The problem with partial orders is you force customers to make a decision: Do I make two trips, or do I wait until the rest of my stuff comes in?

(I know this is how it works, because our workflow with partial orders is to issue a refund only for the things we couldn't find.)
But I work in a hardware store.

Chances are most people who are ordering plywood and nails can't use the plywood if they don't have the nails.

This causes needless frustration.
https://twitter.com/Virtual1nstinct/status/1304390401202237441?s=19

Let's introduce a term, "Goodhartian uncertainty", to denote making decisions under uncertainty of the utility function you are being judged by.
Goodhartian uncertainty requires you to develop a more robust mental model of the underlying processes, because you're not just unsure how well you're doing against a benchmark -- you don't even know what the benchmarks themselves are.
Number of partial orders fulfilled might be a straight-up negative metric for me. Each one I fulfill marks me as a worse worker, because I didn't get the full order done.

It may start positive and shift negative as a ratio of partials versus completed orders to give some leeway.
But both seem like reasonable possibilities to me, and so under Goodhartian uncertainty, the winning move is to try to minimize the partial orders I personally do.

Wait. Personally? As in, my incentives have become misaligned from the company's?

Oh, fuck.

Game theory time.
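A minimal sketch of that reasoning, assuming two made-up scoring rules that match the two possibilities above (partials always count against me, or partials only count against me past some ratio). The function names, the `leeway` value, and the order counts are all hypothetical; nobody here knows the real rules, which is the point.

```python
# Hypothetical scoring rules. The real ones are unknown to the worker.

def score_partials_always_bad(full, partial):
    # Candidate 1: every partial order counts against the worker.
    return full - partial

def score_partials_ratio(full, partial, leeway=0.25):
    # Candidate 2: partials earn credit until they exceed `leeway`
    # as a share of all orders handled, then they count against the worker.
    total = full + partial
    if total == 0:
        return 0
    sign = 1 if partial / total <= leeway else -1
    return full + sign * partial

def worst_case(full, partial):
    # Under Goodhartian uncertainty, judge a day by the least
    # favorable of the candidate rules.
    return min(score_partials_always_bad(full, partial),
               score_partials_ratio(full, partial))

# Three made-up days, each with ten orders handled.
for full, partial in [(10, 0), (8, 2), (6, 4)]:
    print(full, partial, worst_case(full, partial))
```

With these assumed numbers, the day with zero partials has the best worst-case score, and even a day that stays inside the leeway does worse once you allow for the harsher rule. That is the sense in which avoiding partials is the "winning move" for the individual worker.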
I said before my immediate boss has different incentives to me, and the partial orders thing seems to really underscore it.

If I was directly under a bunch of lil order runner lemmings, what would I want? More or fewer partials? https://twitter.com/Virtual1nstinct/status/1304396494963965955?s=19
Again, I honestly don't know. As little insight as I have into my own metrics, I have even less into what theirs are. I haven't /been/ in their shoes, and I'm not stupid enough to think I'm smart enough to be able to replicate that experience in high fidelity.
Let's abandon that line of reasoning and focus on something more concrete. What does my boss seem to /want/ me to do?
They want me to start from the top of the app I use to fulfill orders, and work my way down.

So we have a pretty good guess that an important metric for them is "median time between an order being placed, and an order being fulfilled".

This is VERY DIFFERENT from my incentives.
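For concreteness, here is roughly what that guessed boss metric would look like if someone computed it. This is only a sketch: the `(placed_at, fulfilled_at)` record shape, the function name, and the timestamps are invented for illustration, and whether the real system tracks anything like this is a guess.

```python
from datetime import datetime
from statistics import median

def median_minutes_to_fulfill(orders):
    # orders: list of (placed_at, fulfilled_at) datetime pairs (hypothetical shape).
    waits = [(done - placed).total_seconds() / 60 for placed, done in orders]
    return median(waits)

# Made-up timestamps for three fulfilled orders.
day = datetime(2020, 9, 11)
orders = [
    (day.replace(hour=8), day.replace(hour=9)),               # waited 60 min
    (day.replace(hour=8, minute=30), day.replace(hour=10)),   # waited 90 min
    (day.replace(hour=9), day.replace(hour=13)),              # waited 240 min
]
print(median_minutes_to_fulfill(orders))  # 90.0
```

Note that a metric like this only moves when old orders get closed out, which is why working the list from the top would matter to whoever owns it.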
Without fail, the first orders listed in the app are there because other fulfillment guys *tried* and *failed* to complete them.

Since orders are listed chronologically, this means the top of the app isn't just filled with the oldest unfilled orders --
it's filled with the oldest, unfilled, *unfillable* orders.
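A toy simulation of how that happens, under loudly stated assumptions: no new orders arrive, every third order is unfillable, workers always take the top few orders, and unfillable orders never leave the queue on their own. The function name and all the numbers are made up.

```python
def work_queue(orders, picks_per_round, rounds):
    # orders: list of (order_id, fillable) pairs, oldest first (hypothetical shape).
    queue = list(orders)
    for _ in range(rounds):
        attempted = queue[:picks_per_round]
        # Fillable orders get completed and leave the queue.
        # Unfillable ones stay where they are, at the top.
        done = {oid for oid, fillable in attempted if fillable}
        queue = [(oid, f) for oid, f in queue if oid not in done]
    return queue

# Twelve orders; every third one (3, 6, 9, 12) cannot be filled.
orders = [(i, i % 3 != 0) for i in range(1, 13)]
left = work_queue(orders, picks_per_round=4, rounds=5)
print([oid for oid, _ in left])  # [3, 6, 9, 12]
```

In this toy run the first pass completes three orders, the second two, and every pass after that at most one, because more and more of each worker's picks are spent on orders that were never going to be filled.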
My personal metrics don't reward running around the store, figuring out that we can't fulfill this-or-that order, and then relaying that back to my boss. I can't /personally/ mark orders as unfillable; /my boss/ is the one who has to clear those.
So if I were an automation engineer watching the data of my own profile, it would be very easy to see I haven't actually been able to do any orders half the day (because I'm running around telling my boss what we *can't* do), and then flag me as a "Low-Impact" worker.
Of course, everyone in my position in the store has come to realize this by one means or another, so we've all developed tactics to just... Avoid talking to our boss, so we don't... Lose our jobs due to corporate.

Fuck, this is getting nerve-wracking.
Semi-relevant. https://twitter.com/Virtual1nstinct/status/1273219160894562304?s=19
"Andrew, if your immediate boss is telling you to do something, you should probably do it. You're overthinking this."

A) Overthinking? Nice to meet you.

B) I agree. I'm actually the most willing person in my department to actually talk to my boss, by a good margin.
I'm sociable and sensitive. That's a double edged sword.

Even though my boss tells me to just try my best and not worry about having to report back partial orders, nine times out of ten when I come back reporting one, they send me back to double check in much closer detail.
And, yes, sometimes I do make mistakes, and I don't find merchandise that actually is there, because as literally this whole thread should be communicating, I'm under time-pressure from an unknown shadow entity.
You can imagine why I don't like being told "Go back and get someone who specializes in that aisle to help you" after I've been looking for 20 minutes, even if it means we /can/ fulfill the order at a mere cost of 2 hours and calling in a favor. https://twitter.com/Virtual1nstinct/status/1304411317940498432?s=19
But like I said -- I agree. I like my boss.

I listen to my boss because I trust them to keep my ass from being fired, because they see qualities about me as a worker that they find valuable.

What happens when that human element is lost?
You can follow @Virtual1nstinct.