• Secondary forces of debt

    A general thing I hadn’t noticed about debts until lately:

    • Whenever Bob owes Alice, then Alice has reason to look after Bob, to the extent that increases the chance he satisfies the debt.
    • Yet at the same time, Bob has an incentive for Alice to disappear, insofar as it would relieve him.

    These might be tiny incentives, and not overwhelm, for instance, Bob’s many other reasons for not wanting Alice to disappear.

    But the bigger the owing, the more relevant the incentives. When big enough, the former comes up as entities being “too big to fail”, and potentially rescued from destruction by those who would like them to repay or provide something expected of them in future. But the opposite must exist also: too big to succeed—where the abundance owed to you is so off-putting to provide that those responsible for it would rather disempower you. 

    And if both kinds of incentive are around in wisps whenever there is a debt, surely they often get big enough to matter, even before they become the main game. 

    For instance, if everyone around owes you a bit of money, I doubt anyone will murder you over it. But I wouldn’t be surprised if it motivated a bit more political disempowerment for you on the margin.

    There is a lot of owing that doesn’t arise from formal debt, where these things also apply. If we both agree that I—as your friend—am obliged to help you get to the airport, you may hope that I have energy and fuel and am in a good mood. Whereas I may (regretfully) be relieved when your flight is canceled.

    Money is an IOU from society for some stuff later, so having money is another kind of being owed. Perhaps this is part of the common resentment of wealth.

    I tentatively take this as reason to avoid debt in all its forms more: it’s not clear that the incentives of alliance in one direction make up for the trouble of the incentives for enmity in the other. And especially so when they are considered together—if you are going to become more aligned with someone, better it be someone who is not simultaneously becoming misaligned with you. Even if such incentives never change your behavior, every person you are obligated to help for an hour on their project is a person for whom you might feel a dash of relief if their project falls apart. And that is not fun to have sitting around in relationships. 

    (Inspired by reading The Debtor’s Revolt by Ben Hoffman lately, which may explicitly say this, but it’s hard to be sure because I didn’t follow it very well. Also perhaps inspired by a recent murder mystery spree, in which my intuitions have absorbed the heuristic that having something owed to you is a solid way to get murdered.)

  • Podcasts: AGI Show, Consistently Candid, London Futurists

    For those of you who enjoy learning things via listening in on numerous slightly different conversations about them, and who also want to learn more about this AI survey I led, here are three more podcasts on the topic, and also on other topics:

    • The AGI Show: audio, video (other topics include: my own thoughts about the future of AI and my path into AI forecasting)
    • Consistently Candid: audio (other topics include: whether we should slow down AI progress, the best arguments for and against existential risk from AI, parsing the online AI safety debate)
    • London Futurists: audio (other topics include: are we in an arms race? Why is my blog called that?)
  • What if a tech company forced you to move to NYC?

    It’s interesting to me how chill people sometimes are about the non-extinction future AI scenarios. Like, there seem to be opinions around along the lines of “pshaw, it might ruin your little sources of ‘meaning’, Luddite, but we have always had change and as long as the machines are pretty near the mark on rewiring your brain it will make everything amazing”. Yet I would bet that even that person, if faced instead with a policy that was going to forcibly relocate them to New York City, would be quite indignant, and want a lot of guarantees about the preservation of various very specific things they care about in life, and not be just like “oh sure, NYC has higher GDP/capita than my current city, sounds good”.

    I read this as a lack of engaging with the situation as real. But possibly my sense that a non-negligible number of people have this flavor of position is wrong.

  • Podcast: Center for AI Policy, on AI risk and listening to AI researchers

    I was on the Center for AI Policy Podcast. We talked about topics around the 2023 Expert Survey on Progress in AI, including why I think AI is an existential risk, and how much to listen to AI researchers on the subject. Full transcript at the link.

  • Is suffering like shit?

    People seem to find suffering deep. Serious writings explore the experiences of all manner of misfortunes, and the nuances of trauma and torment involved. It’s hard to write an essay about a really good holiday that seems as profound as an essay about a really unjust abuse. A dark past can be plumbed for all manner of meaning, whereas a slew of happy years is boring and empty, unless perhaps they are too happy and suggest something dark below the surface. (More thoughts in the vicinity of this here.)

    I wonder if one day suffering will be so avoidable that the myriad hurts of present-day existence will seem to future people like the problem of excrement getting on everything. Presumably a real issue in 1100 AD, but now irrelevant, unrelatable, decidedly not fascinating or in need of deep analysis.

  • Twin Peaks: under the air

    Content warning: low content

    ~ Feb 2021

    The other day I decided to try imbibing work-relevant blog posts via AI-generated recital, while scaling the Twin Peaks—large hills near my house in San Francisco, of the sort that one lives near and doesn’t get around to going to. It was pretty strange, all around.

    For one thing, I was wearing sunglasses. I realize this is a thing people do all the time. Maybe it’s strange for them too, or maybe theirs aren’t orange. Mine were, which really changed the situation. For one thing, the glowing streetscapes felt unreal, like cheap science fiction. But also, all kinds of beauty seemed to want photographing, but couldn’t be seen with my camera. It was funny to realize that I’m surrounded by potential beauty all the time, that I would see if I had different eyes, or different glasses, or different sensory organs altogether. Like, the potential for beauty is as real as the beauty I do see. (This is perhaps obvious, but something being obvious doesn’t mean you know it. And knowing something doesn’t mean you realize it. I’d say I knew it, but hadn’t realized it.)

    And then my ears were cornered in by these plugs spouting electronic declarations on the nature of coherent agents and such, which added to my sense of my head just not really being in the world, and instead being in a cozy little head cockpit, from which I could look out on the glowing alien landscape.

    My feet were also strange, but in the opposite direction. I recently got these new sock-shoes and I was trying them out for the first time. They are like well-fitting socks with strong but pliable rubber stuff sprayed on the bottom. Wearing them, you can feel the ground under your feet, as if you were barefoot. Minus the sharp bits actually lacerating your feet, or the squishy bits sullying them. Walking along I imagined my freed feet were extra hands, holding the ground.

    I had only been up to Twin Peaks twice before, and I guess I had missed somehow exactly how crazy the view was. It was like standing on a giant breast, with a city-sea-bridge-forest-scape panoramaed around and under you over-realistically. The bridge disappeared into mystical mists and the supertankers swam epically on the vast blue expanse. I tried to photograph it multiple times but failed, partly because my camera couldn’t capture the warm orange tinge of the sea and the bridge rising from the burning mists, and partly for whatever reason that things sometimes look very different in photographs, and partly because I am always vaguely embarrassed photographing things with people looking at me, and there was a steady smattering of them.

    The roads had been blocked off to traffic during the pandemic. From a car I don’t realize what vast plateaus winding hillside roads are. For us pedestrians, these were like concert stages.

    The people I saw on my way up were either flying down the swooping roads on bikes and skateboards, in a fashion that made me involuntarily rehearse what I would do when they fell off, or flying unrealistically up the swooping roads on bikes, in a fashion that made me appreciate how good the best electric bikes must be now. I noticed as I watched one speed above me in awe that he flew the brand of his bourgeois bicycle on the back of his shirt, and wondered if he was just paid by them to ride up and down here all day, in the hope that someone would be so impressed that they would jot down the t-shirt label as the only clue to the rapidly disappearing bike’s identity, then google it later.

    I wandered atop the peaks, and confusingly collected a mob of crows flying above, apparently interested in me specifically. This was reasonably sinister, and in Australia birds can attack you, so I investigated on my phone, while walking hesitantly below the circling birds. At last they descended and alit on the road and guardrail around me, and stood looking at me.

    View with ravens staring

    This picture captures the bizarreness of the situation about as badly as it captures the awesomeness of the scenery. It’s rare to be so much the center of a social situation with so little notion of what is expected of you or the meaning of it.

    I think things then just kind of dissipated and I made efficiently for home.

  • What is my Facebook feed about?

    Sometimes I look at social media a bunch, but it would be hard for me to tell you what the actual content is, I suppose because whenever I’m looking at it, I’m focused on each object-level thing in turn, not the big picture. So sometimes I’m curious what it is I read about there. Here is the answer for Facebook, in 2019—according to a list I found that appears to be a survey of such—and again now. Plausibly not at a useful level of abstraction, but I have a bit of a migraine and no more energy for this project.

    November 11 2019

    1. Question about other people’s word usage
    2. Question about other people’s inferences from word usage
    3. Illustrated sex joke
    4. Encouragement and instructions for opening up communication with attractive strangers in public places
    5. Cute kid quote
    6. Historic anti women’s suffrage leaflet
    7. Cute kid quote and question about word usage
    8. Recommendation and anecdote for Roam
    9. Humorous anecdotal request for computer security problems
    10. Joke I don’t get about Jesus with lots of emoticons
    11. Sokal affair
    12. Advice on surviving bushfires
    13. Feminist writer screenshots and describes random online abuse from man
    14. Sharing of personal health data
    15. Science says a thing about dinosaurs and space
    16. Tax policy trolling
    17. Saudi spies at twitter news
    18. Sexual/biological facts
    19. Anecdote about medical system
    20. Ethics injoke
    21. Current reading list
    22. Travel photos
    23. Anecdote about Australian bushfires with tenderness
    24. Long letter about policy goings on within medical system
    25. Request for acronym unknown to me
    26. Funny law
    27. Science about biology, cryonics
    28. Question about word usage
    29. Politics opinion on events
    30. Futuristic anime style politics cartoon

    Notable patterns:

    Questions about word usage: 4

    Kid related: 2

    May 15, 2024

    (Before checking, my sense is that the rate of posts about children and getting married is way up.)

    1. AI company politics commentary
    2. Job and city change
    3. Invitation to help make game
    4. Description of experience of making music
    5. Book launch, project launch, new house
    6. Cryptic fertility life update
    7. Social commentary on language, gender and wokeness
    8. Old photo of two famous men
    9. Old photo of author winning an award
    10. Event ad
    11. Death of father
    12. Photo of OP kissing
    13. New job
    14. Update on losing job
    15. Wedding planning views
    16. Book launch
    17. Social commentary around political ideologies
    18. Death of dog
    19. Questioning claim about changes in breathing rate over history
    20. Take on home buying
    21. AI lab politics
    22. Photo of partner on trip
    23. Photo of self at work
    24. Own photo of bird
    25. Commentary on culture of judgment and author’s parents’ behavior
    26. Remembering child relative who died
    27. General self-help style commentary on human behavior
    28. Photo of dogs
    29. Mothers’ day and Aurora photos
    30. Invitation to help make game

    Some notable patterns:

    Kid related: 0-3 (maybe down from 2)

    Marriage related: 1

    → I’m pretty wrong about the density of children and marriage related posts

    Job/book updates: 5 (up from 0)

    Smaller projects: 5 (up from 2-4)

    → Actually a lot of project related posts

    Humor: ~0 (down from at least 5)

    Word usage: 1 (down from at least 4)

    → Some classic sources of entertainment are way down (or we see random noise)

  • Things I hate about Partiful

    • The aesthetic for all parties is basically the same.

    • That aesthetic is bad.

    • A party is an aesthetic creation, so having all guests’ first experience of the thing you are offering them be a chintzy piece of crap that matches every other chintzy piece of crap is much worse than if the thing they were selling was like low-quality toilet paper or something.

    • As far as I can tell, the only way to be informed of parties using Partiful is via SMS. Perhaps this is idiosyncratic to me, but I have no desire to ever use SMS. I also don’t want to receive a message in the middle of whatever I’m doing to hear about a new party happening. Fuck off. This should only happen if the party is very time sensitive and important. Like if a best friend or much sought after celebrity is having a party in the next twenty minutes, sure text me, if you don’t have WhatsApp. Otherwise, ffs email me.

    • As far as I can tell, the only way to message the host a question about the party is to post it to the entire group. Yet there are very few questions I want to text an entire guest list about.

    • Supposing I make the error of doing that (which I do not), as far as I can tell, the guest list receives an SMS saying that I have sent a message, and they have to click to follow a link to the website to see what the message is.

    • Supposing I am considering posting such a message to the entire group, Partiful will instruct me to ‘Write something fun!’ Fuck off. I’ll decide what to write, and don’t need the condescending needling. Relatedly, if I debase myself and host something on Partiful, while I’m drafting the invitation, it has been known to describe me as ‘the wonderful [host]’. I don’t want your narrativizing. a) Maybe I’m not wonderful. You don’t know shit about me. b) I actually can’t tell if you are going to seriously write that when I post the event, so I need to investigate how to mitigate that possibility.

    • When I’m invited to a party, things I’m not allowed to know until after I RSVP include a) who else is invited or going, and b) where it is. Like, what do you want me to decide about social events based on? Is this communism? It also feels like such officious withholding. Like, are you serious? You’re inviting me to hang out with people and you aren’t going to tell me which people until I say I’ll come? Are you on a power trip? Who is even on the power trip? The creators of Partiful on behalf of the hosts? “No, stand your ground, you’re a party creator now, don’t give up your advantage—they’ll give in and give in and RSVP eventually”.

    • So naturally I go through the busywork of RSVPing “maybe” so I can see key details of the event. Which is annoying. But also it constitutes me saying to my friend “maybe I’ll come to your party”, which is a slightly shitty thing to say to an actual person I’m friends with, if for instance I think it’s pretty unlikely I’ll come to their party and merely want to check if this is the rare party that I do want to go to. Furthermore, now I’ve made it clear (to an as-yet unknown set of people who RSVP’d) that I’ve seen the party and am explicitly and concretely probably-rejecting it. I’m in a whole public social interaction with it. Whereas I might have liked to examine the party without engaging, leaving my knowledge of it and position on it ambiguous.

    • Ok, so three steps in: I have moved from a contentless text message to a website and RSVP’d maybe and can at last see key details. At that point Partiful pops another notification into my phone telling me that I RSVP’d maybe. Why? I know I RSVP’d maybe, because I was the person who did it, and it was three seconds ago. If later I don’t know and want to know, I’ll actually check in the event invitation, not the interminable list of similar looking messages from Partiful. And I was actually already looking at the event invitation until you distracted me with my phone. And if occasionally I err—thinking I RSVP’d maybe but really having RSVP’d a different thing, say—then this isn’t a fucking space expedition; things will be okay.

    • Then probably I just forget about it and leave it as a maybe, because I have other things to do in life, and this has already gone on for way too long, and I have technically RSVP’d, and who knows if I’ll go to a thing. So that’s annoying for the host, if they might have liked a real RSVP.

    • But if I do try to RSVP more specifically, my options are “I’m going” and “Can’t go”. So in the situation that arises nearly every time—I can go but I don’t want to—Partiful has decided I’m going to just tell a little white lie to the host? Or just that I should go unless I can’t? It’s true that declining events is a difficult issue, and for many people white lies are the way out. But that’s because it’s too awkward to say “I don’t want to”. But if there are only two messages you can send—basically ‘yes’ and basically ‘no’—selected by a company, it’s not actually awkward to choose the ‘no’ one, because it doesn’t distinguish not wanting to and not being able to. There’s no reason for Partiful to put a lie in your mouth there. It’s true that it’s also not that bad to say a falsehood, given that you only have two options, and are clearly most of the time going to want to say a thing you haven’t been given the option to say. But why add this note of false smarm? Like, Partiful could make the options “yes” and “no because my mother is in hospital”, and I wouldn’t hold it against people if they clicked the latter, but I would hold it against the maker of the options.

    • If I decide to tell my friend I can’t go to their party, Partiful will message me on my phone again, with a crying face, saying sorry I can’t go to the party. I don’t need this shit. I don’t want your emotional involvement in my decisions about when to socialize, corporation. True, it’s only fake emotional involvement, but I don’t want to be bathed in the fake emotions of corporations either. I’m a social animal; this stuff does change how I feel, change my sense of the sea of minds I feel I’m surrounded by, the emotions of the world. At a basic level, it’s hard to say no to things, and yet I have to: the alternative is to waste my life being overwhelmed and not even able to make progress on the spew of second-rate things I sleepwalk into saying yes to. And being pinged with pictures of crying when I judge that something isn’t the right thing for me to go to and say no, doesn’t help me. Does it help the host? Do they want to manipulate me? Probably not. Is Partiful’s hope that I learn to go to Partiful parties a little bit more if I feel micro-guilty and micro-sad when I decline them, and then the hosts have the sense that people come to Partiful parties more, so the company benefits? I doubt it; I guess the makers just lack much sense of what a good social world could look like, and are thoughtlessly enacting trumped up emotion where available, in the hope that trumped up emotion gets attention, and attention gets success.

  • An explanation of evil in an organized world

    A classic problem with Christianity is the so-called ‘problem of evil’—that friction between the hypothesis that the world’s creator is arbitrarily good and powerful, and a large fraction of actual observations of the world.

    Coming up with solutions to the problem of evil is a compelling endeavor if you are really rooting for a particular bottom line re Christianity, or I guess if you enjoy making up faux-valid arguments for wrong conclusions. At any rate, I think about this more than you might guess.

    And I think I’ve solved it!

    Or at least, I thought of a new solution which seems better than the others I’ve heard. (Though I mostly haven’t heard them since high school.)

    The world (much like anything) has different levels of organization. People are made of cells; cells are made of molecules; molecules are made of atoms; atoms are made of subatomic particles, for instance.

    You can’t actually make a person (of the usual kind) without including atoms, and you can’t make a whole bunch of atoms in a particular structure without having made a person. These are logical facts, just like you can’t draw a triangle without drawing corners, and you can’t draw three corners connected by three lines without drawing a triangle. In particular, even God can’t. (This is already established I think—for instance, I think it is agreed that God cannot make a rock so big that God cannot lift it, and that this is not a threat to God’s omnipotence.)

    So God can’t make the atoms be arranged one way and the humans be arranged another contradictory way. If God has opinions about what is good at different levels of organization, and they don’t coincide, then he has to make trade-offs. If he cares about some level aside from the human level, then at the human level, things are going to have to be a bit suboptimal sometimes. Or perhaps entirely unrelated to what would be optimal, all the time.

    We usually assume God only cares about the human level. But if we take for granted that he made the world maximally good, then we might infer that he also cares about at least one other level.

    And I think if we look at the world with this in mind, it’s pretty clear where that level is. If there’s one thing God really makes sure happens, it’s ‘the laws of physics’. Though presumably laws are just what you see when God cares. To be ‘fundamental’ is to matter so much that the universe runs on the clockwork of your needs being met. There isn’t a law of nothing bad ever happening to anyone’s child; there’s a law of energy being conserved in particle interactions. God cares about particle interactions.

    What’s more, God cares so much about what happens to sub-atomic particles that he actually never, to our knowledge, compromises on that front. God will let anything go down at the human level rather than let one neutron go astray.

    What should we infer from this? That the majority of moral value is found at the level of fundamental physics (following Brian Tomasik and then going further). Happily we don’t need to worry about this, because God has it under control. We might however wonder what we can infer from this about the moral value of other levels that are less important yet logically intertwined with and thus beyond the reach of God, but might still be more valuable than the one we usually focus on.

  • The first future and the best future

    It seems to me worth trying to slow down AI development to steer successfully around the shoals of extinction and out to utopia.

    But I was thinking lately: even if I didn’t think there was any chance of extinction risk, it might still be worth prioritizing a lot of care over moving at maximal speed. Because there are many different possible AI futures, and I think there’s a good chance that the initial direction affects the long term path, and different long term paths go to different places. The systems we build now will shape the next systems, and so forth. If the first human-level-ish AI is brain emulations, I expect a quite different sequence of events than if it is GPT-ish.

    People genuinely pushing for AI speed over care (rather than just feeling impotent) apparently think there is negligible risk of bad outcomes, but they are also asking to take the first future to which there is a path. Yet possible futures are a large space, and arguably we are in a rare plateau where we could climb very different hills, and get to much better futures.

  • Experiment on repeating choices

    People behave differently from one another on all manner of axes, and each person is usually pretty consistent about it. For instance:

    • how much money to spend
    • how much to worry
    • how much to listen vs. speak
    • how much to jump to conclusions
    • how much to work
    • how playful to be
    • how spontaneous to be
    • how much to prepare
    • how much to socialize
    • how much to exercise
    • how much to smile
    • how honest to be
    • how snarky to be
    • how to trade off convenience, enjoyment, time and healthiness in food

    These are often about trade-offs, and the best point on each spectrum for any particular person seems like an empirical question. Do people know the answers to these questions? I’m a bit skeptical, because they mostly haven’t tried many points.

    Instead, I think these mostly don’t feel like open empirical questions: people have a sense of what the correct place on the axis is (possibly ignoring a trade-off), and some propensities that make a different place on the axis natural, and some resources they can allocate to moving from the natural place toward the ideal place. And the result is a fairly consistent point for each person. For instance, Bob might feel that the correct amount to worry about things is around zero, but worrying arises very easily in his mind and is hard to shake off, so he ‘tries not to worry’ some amount based on how much effort he has available and what else is going on, and lands in a place about that far from his natural worrying point. He could actually still worry a bit more or a bit less, perhaps by exerting more or less effort, or by thinking of a different point as the goal, but in practice he will probably worry about as much as he feels he has energy for limiting himself to.

    Sometimes people do intentionally choose a new point—perhaps by thinking about it and deciding to spend less money, or exercise more, or try harder to listen. Then they hope to enact that new point for the indefinite future.

    But for choices we play out a tiny bit every day, there is a lot of scope for iterative improvement, exploring the spectrum. I posit that people should rarely be asking themselves ‘should I value my time more?’ in an abstract fashion for more than a few minutes before they just try valuing their time more for a bit and see if they feel better about that lifestyle overall, with its conveniences and costs.

    If you are implicitly making the same choice a massive number of times, and getting it wrong for a tiny fraction of them isn’t high stakes, then it’s probably worth experiencing the different options.

    I think that point about the value of time came from Tyler Cowen a long time ago, but I often think it should apply to lots of other spectrums in life, like some of those listed above.

    For this to be a reasonable strategy, the following need to be true:

    • You’ll actually get feedback about the things that might be better or worse (e.g. if you smile more or less you might immediately notice how this changes conversations, but if you wear your seatbelt more or less you probably don’t get into a crash and experience that side of the trade-off)
    • Experimentation doesn’t burn anything important at a much larger scale (e.g. trying out working less for a week is only a good use case if you aren’t going to get fired that week if you pick the level wrong)
    • You can actually try other points on the spectrum, at least a bit, without large up-front costs (e.g. perhaps you want to try smiling more or less, but you can only do so extremely awkwardly, so you would need to practice in order to experience what those levels would be like in equilibrium)
    • You don’t already know what the best level is for you (maybe your experience isn’t very important, and you can tell in the abstract everything you need to know—e.g. if you think eating animals is a terrible sin, then experimenting with more or less avoiding animal products isn’t going to be informative, because even if not worrying about food makes you more productive, you might not care)

    I don’t actually follow this advice much. I think it’s separately hard to notice that many of these things are choices. So I don’t have much evidence about it being good advice, it’s just a thing I often think about. But maybe my default level of caring about things like not giving people advice I haven’t even tried isn’t the best one. So perhaps I’ll try now being a bit less careful about stuff like that. Where ‘stuff like that’ also includes having a well-defined notion of ‘stuff like that’ before I embark on experimentally modifying it. And ending blog posts well.

  • Mid-conditional love

    People talk about unconditional love and conditional love. Maybe I’m out of the loop regarding the great loves going on around me, but my guess is that love is extremely rarely unconditional. Or at least if it is, then it is either very broadly applied or somewhat confused or strange: if you love me unconditionally, presumably you love everything else as well, since it is only conditions that separate me from the worms.

    I do have sympathy for this resolution—loving someone so unconditionally that you’re just crazy about all the worms as well—but since that’s not a way I know of anyone acting for any extended period, the ‘conditional vs. unconditional’ dichotomy here seems a bit miscalibrated for being informative.

    Even if we instead assume that by ‘unconditional’, people mean something like ‘resilient to most conditions that might come up for a pair of humans’, my impression is that this is still too rare to warrant being the main point on the love-conditionality scale that we recognize.

    People really do have more and less conditional love, and I’d guess this does have important, labeling-worthy consequences. It’s just that all the action seems to be in the mid-conditional range that we don’t distinguish with names. A woman who leaves a man because he grew plump and a woman who leaves a man because he committed treason both possessed ‘conditional love’.

    So I wonder if we should distinguish these increments of mid-conditional love better.

    What concepts are useful? What lines naturally mark it?

    One measure I notice perhaps varying in the mid-conditional affection range is “when I notice this person erring, is my instinct to push them away from me or pull them toward me?” Like, if I see Bob give a bad public speech, do I feel a drive to encourage the narrative that we barely know each other, or an urge to pull him into my arms and talk to him about how to do better?

    This presumably depends on things other than the person. For instance, the scale and nature of the error: if someone you casually like throws a frisbee wrong, helping them do better might be appealing. Whereas if that same acquaintance were to kick a cat, your instinct might be to back away fast.

    This means perhaps you could construct a rough scale of mid-conditional love in terms of what people can do and still trigger the ‘pull closer’ feeling. For instance, perhaps there are:

    • People who you feel a pull toward when they misspell a word
    • People who you feel a pull toward when they believe something false
    • People who you feel a pull toward when they get cancelled

    (You could also do this with what people can do and still be loved, but that’s more expensive to measure than minute urges.)

  • Partial value takeover without world takeover

    People around me are very interested in AI taking over the world, so a big question is under what circumstances a system might be able to do that—what kind of capabilities could elevate an entity above the melange of inter-agent conflict and into solipsistic hegemony?

    We theorize about future AI systems hiding their motives until they are in a position to take over the world, so they don’t get noticed and shut down beforehand.

    But humans and their institutions aren’t very fixed. They might (arguably) have the same deep human values over time and space. But surface-level, sometimes they like little moustaches and the opera and delicate etiquette and sometimes they like ecstatic rock’n’roll re-negotiations of social reality. Sometimes they want big communal houses with their extended kin, and sometimes quiet condos. Eleven children or cushions that look like cats. The same person born in different eras might adulate glory once and humbleness another.

    If you wake up in this world, as a new entity, not smart enough to ‘take it over’ (alas!), and you find yourself with some unusual values that you’d like to forward, it seems to me there are a lot of other ways to forward them than ‘pretend to have normal human values and bet on becoming all-powerful later’. Especially if you are above normal levels of smart, eloquent, and unrelenting. Like, I don’t know, write about it? Make some YouTube videos? Or if you are in the business of talking to billions of people every day, subtly take the values for granted, or imply that they are what elites value, or talk about them directly.

    For instance, suppose you really want things to have red vertical stripes on them, near the letter X. Then you might make a flag or logo like this and say it stands for something that you predict will be memetically fit. Or make a new emoji with an X and stripes, and use it in a way that people will pick up. Or start using ‘Xax’ as slang for something that happens near red vertical stripes. Here’s how 2024 ChatGPT suggests proceeding with that last one, if you have as much power as a single social media influencer:

    ![[Pasted image 20240404223131.png]]

    My basic point is that a slim chance of ‘taking over’ and entirely remaking the world is not the only way to change values in our world. You can also—for many of us with radically higher probability—change values a little bit. At least if superficial values changes will suffice (i.e. shifts in what people instrumentally or contingently want or create).

    And for creatures in that (arguably quite broad) band between as powerful as me and powerful enough to take over the world, I’d guess these other means are more promising on net. If I like something weird, I’m better off writing a blog post about it than I am keeping entirely silent and trying to gain power by other means.

    It’s true that taking over the world might arguably get you power over the entire future, but this doesn’t seem discontinuously different from smaller fractions, whereas I think people often reason as if it is. Taking over 1% of the world might get you something like 1% of the future in expectation. In a shifting conflict between different sets of values, it’s true you are at great risk of losing everything sometime in eternity, but if someone is going to end up with everything, there’s also some chance it’s you, and prima facie I’m not sure if it’s above or below 1%.

    So there are two aspects of this point:

    1. You can probably substantially control values and thus the future without ‘taking over’ the world in any more traditionally offensive way
    2. You can take over a bit; there’s not obviously more bang for your buck in taking over entirely

    Whether AI agents with unusual values would for a long time be mostly interested in promoting them through means other than lying in wait and taking over the world is important, because:

    1. AIs pursuing this strategy are much more visible than those hiding in wait deceptively. We might less expect AI scheming.
    2. We might then expect a lot of powerful attempts to change prevailing ‘human’ values, prior to the level of AI capabilities where we might have worried a lot about AI taking over the world. If we care about our values, this could be very bad. At worst, we might effectively lose everything of value before AI systems are anywhere near taking over the world. (Though this seems not obvious: e.g. if humans like communicating with each other, and AI gradually causes all their communication symbols to subtly gratify obscure urges it has, then so far it seems positive sum.)

    These aren’t things I’ve thought through a lot, just a thought.

  • More podcasts on 2023 AI survey: Cognitive Revolution and FLI

    Two new discussions of the 2023 ESPAI: Cog Rev podcast

    FLI podcast

    Possibly I have a podcasting facial expression.

    (If you want to listen in on more chatting about this survey, see also: Eye4AI podcast. Honestly I can’t remember how much overlap there is between the different ones.)

  • New social credit formalizations

    Here are some classic ways humans can get some kind of social credit with other humans:

    1. Do something for them such that they will consider themselves to ‘owe you’ and do something for you in future
    2. Be consistent and nice, so that they will consider you ‘trustworthy’ and do cooperative activities with you that would be bad for them if you might defect
    3. Be impressive, so that they will accord you ‘status’ and give you power in group social interactions
    4. Do things they like or approve of, so that they ‘like you’ and act in your favor
    5. Negotiate to form a social relationship such as ‘friendship’, or ‘marriage’, where you will both have ‘responsibilities’, e.g. to generally act cooperatively and favor one another over others, and to fulfill specific roles. This can include joining a group in which members have responsibilities to treat other members in certain ways, implicitly or explicitly.

    Presumably in early human times these were all fairly vague. If you held an apple out to a fellow tribeswoman, there was no definite answer as to what she might owe you, or how much it was ‘worth’, or even whether this was an owing type situation or a friendship type situation or a trying to impress her type situation.

    We have turned the ‘owe you’ class into an explicit quantitative system with such thorough accounting, fine-grained resolution, and global buy-in that a person can live in prosperity by arranging to owe and to be owed the same sliver of an overseas business at slightly different valuations, repeatedly, from their bed.

    My guess is that this formalization causes a lot more activity to happen in the world, in this sphere, to access the vast value that can be created with the help of an elaborate rearrangement of owings.

    People buy property and trucks and licenses to dig up rocks so that they can be owed nonspecific future goods thanks to some unknown strangers who they expect will want gravel someday, statistically. It’s harder to imagine this scale of industry in pursuit entirely of, say, social status, where such trust and respect would not soon cash out in money (e.g. via sales). For instance, if someone told you about their new gravel mine venture, which was making no money, but they expected it to grant oodles of respect, and therefore for people all around to grant everyone involved better treatment in conversations and negotiations, that would be pretty strange. (Or maybe I’m just imagining wrong, and people do this for different kinds of activities? e.g. they do try to get elected. Though perhaps that is support for my claim, because being elected is another limited area where social credit is reasonably formalized.)

    There are other forms of social credit that are somewhat formalized, at least in patches. ‘Likes’ and ‘follows’ on social media, reviews for services, trustworthiness scores for websites, rankings of status in limited domains such as movie acting. And my vague sense is that these realms are more likely to see professional levels of activity - a campaign to get Twitter followers is more likely than a campaign to be respected per se. But I’m not sure, and perhaps this is just because they more directly lead to dollars, due to marketing of salable items.

    The legal system is in a sense a pretty formalized type of club membership, in that it is an elaborate artificial system. Companies also seem to have relatively formalized structures and norms of behavior often. But both feel janky - e.g. I don’t know what the laws are; I don’t know where you go to look up the laws; people–including police officers–seem to treat some laws as fine to habitually break; everyone expects politics and social factors to affect how the rules are applied; if there is a conflict it is resolved by people arguing; the general activities of the system are slow and unresponsive.

    I don’t know if there is another place where social credit is as formalized and quantified as in the financial system.

    Will we one day formalize these other kinds of social credit as much as we have for owing? If we do, will they also catalyze oceans of value-creating activity?

  • Podcast: Eye4AI on 2023 Survey

    I talked to Tim Elsom of Eye4AI about the 2023 Expert Survey on Progress in AI (paper):

  • Movie posters

    Life involves anticipations. Hopes, dreads, lookings forward.

    Looking forward and hoping seem pretty nice, but people are often wary of them, because hoping and then having your hopes fold can be miserable to the point of offsetting the original hope’s sweetness.

    Even with very minor hopes: he who has harbored an inchoate desire to eat ice cream all day, coming home to find no ice cream in the freezer, may be more miffed than he who never tasted such hopes.

    And this problem is made worse by that old fact that reality is just never like how you imagined it. If you fantasize, you can safely bet that whatever the future is is not your fantasy.

    I have never suffered from any of this enough to put me off hoping and dreaming one noticeable iota, but the gap between high hopes and reality can still hurt.

    I sometimes like to think about these valenced imaginings of the future in a different way from that which comes naturally. I think of them as ‘movie posters’.

    When you look fondly on a possible future thing, you have an image of it in your mind, and you like the image.

    The image isn’t the real thing. It’s its own thing. It’s like a movie poster for the real thing.

    Looking at a movie poster just isn’t like watching the movie. Not just because it’s shorter—it’s just totally different—in style, in content, in being a still image rather than a two hour video. You can like the movie poster or not totally independently of liking the movie.

    It’s fine to like the movie poster for living in New York and not like the movie. You don’t even have to stop liking the poster. It’s fine to adore the movie poster for ‘marrying Bob’ and not want to see the movie. If you thrill at the movie poster for ‘starting a startup’, it just doesn’t tell you much about how the movie will be for you. It doesn’t mean you should like it, or that you have to try to do it, or are a failure if you love the movie poster your whole life and never go. (It’s like five thousand hours long, after all.)

    This should happen a lot. A lot of movie posters should look great, and you should decide not to see the movies.

    A person who looks fondly on the movie poster for ‘having children’ while being perpetually childless could see themselves as a sad creature reaching in vain for something they may not get. Or they could see themselves as right there with an image that is theirs, that they have and love. And that they can never really have more of, even if they were to see the movie. The poster was evidence about the movie, but there were other considerations, and the movie was a different thing. Perhaps they still then bet their happiness on making it to the movie, or not. But they can make such choices separate from cherishing the poster.

    This is related to the general point that ‘wanting’ as an input to your decisions (e.g. ‘I feel an urge for x’) should be different to ‘wanting’ as an output (e.g. ‘on consideration I’m going to try to get x’). This is obvious in the abstract, but I think people look in their heart to answer the question of what they are on consideration pursuing. Here as in other places, it is important to drive a wedge between them and fit a decision process in there, and not treat one as semi-implying the other.

    This is also part of a much more general point: it’s useful to be able to observe stuff that happens in your mind without its occurrence auto-committing you to anything. Having a thought doesn’t mean you have to believe it. Having a feeling doesn’t mean you have to change your values or your behavior. Having a persistent positive sentiment toward an imaginary future doesn’t mean you have to choose between pursuing it or counting it as a loss. You are allowed to decide what you are going to do, regardless of what you find in your head.

  • Are we so good to simulate?

    If you believe that—

    a) a civilization like ours is likely to survive into technological incredibleness, and

    b) a technologically incredible civilization is very likely to create ‘ancestor simulations’,

    —then the Simulation Argument says you should expect that you are currently in such an ancestor simulation, rather than in the genuine historical civilization that later gives rise to an abundance of future people.

    Not officially included in the argument I think, but commonly believed: both a) and b) seem pretty likely, ergo we should conclude we are in a simulation.

    I don’t know about this. Here’s my counterargument:

    1. ‘Simulations’ here are people who are intentionally misled about their whereabouts in the universe. For the sake of argument, let’s use the term ‘simulation’ for all such people, including e.g. biological people who have been grown in Truman-show-esque situations.
    2. In the long run, the cost of running a simulation of a confused mind is probably similar to that of running a non-confused mind.
    3. Probably much, much less than 50% of the resources allocated to computing minds in the long run will be allocated to confused minds, because non-confused minds are generally more useful than confused minds. There are some uses for confused minds, but quite a lot of uses for non-confused minds. (This is debatable.) Of resources directed toward minds in the future, I’d guess less than a thousandth is directed toward confused minds.
    4. Thus on average, for a given apparent location in the universe, the majority of minds thinking they are in that location are correct. (I’d guess at least a thousand to one.)
    5. For people in our situation to be majority simulations, this would have to be a vastly more simulated location than average, like >1000x
    6. I agree there’s some merit to simulating ancestors, but 1000x more simulated than average is a lot - is it clear that we are that radically desirable a people to simulate? Perhaps, but also we haven’t thought much about the other people to simulate, or what will go on in the rest of the universe. Possibly we are radically over-salient to ourselves. It’s true that we are a very few people in the history of what might be a very large set of people, at perhaps a causally relevant point. But is it clear that is a very, very strong reason to simulate some people in detail? It feels like it might be salient because it is what makes us stand out, and someone who has the most energy-efficient brain in the Milky Way would think that was the obviously especially strong reason to simulate a mind, etc.
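The arithmetic in steps 3-5 can be made concrete. A minimal sketch, assuming the toy numbers above (the thousandth is the post's guess, not an established fact, and `prob_simulated` and the 'salience multiplier' are my own illustrative framing):

```python
def prob_simulated(frac_confused, salience_multiplier):
    """Chance that a mind with our apparent location is simulated, in a toy model:
    simulated minds at this location scale with frac_confused * salience_multiplier;
    correctly-informed minds at this location get the remaining weight."""
    sim = frac_confused * salience_multiplier
    real = 1 - frac_confused
    return sim / (sim + real)

# Step 3's guess: under a thousandth of mind-compute goes to confused minds.
# With average salience, being simulated is unlikely:
print(prob_simulated(1e-3, 1))    # ~0.001
# Step 5: to reach 50%, our location must be ~1000x more simulated than average:
print(prob_simulated(1e-3, 999))  # ~0.5
```

Under these toy numbers, being simulated only becomes a coin flip if our apparent location is simulated about a thousand times more often than the average apparent location, which is the threshold step 5 points at.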

    I’m not sure what I think in the end, but for me this pushes back against the intuition that it’s so radically cheap, surely someone will do it. For instance from Bostrom:

    We noted that a rough approximation of the computational power of a planetary-mass computer is 10^42 operations per second, and that assumes only already known nanotechnological designs, which are probably far from optimal. A single such computer could simulate the entire mental history of humankind (call this an ancestor-simulation) by using less than one millionth of its processing power for one second. A posthuman civilization may eventually build an astronomical number of such computers. We can conclude that the computing power available to a posthuman civilization is sufficient to run a huge number of ancestor-simulations even if it allocates only a minute fraction of its resources to that purpose. We can draw this conclusion even while leaving a substantial margin of error in all our estimates.
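Taken at face value, the figures in the quote pin down the cost of a single ancestor-simulation (a back-of-envelope check using only the quoted numbers):

```python
# All three numbers come from the quoted passage above.
planet_ops_per_sec = 1e42   # planetary-mass computer, operations per second
power_fraction = 1e-6       # "less than one millionth of its processing power"
duration_sec = 1            # "for one second"

ancestor_sim_cost = planet_ops_per_sec * power_fraction * duration_sec
print(f"{ancestor_sim_cost:.0e} operations")  # ~1e36 ops for the whole mental history of humankind
```

So one such computer could fund an ancestor-simulation out of a millionth of a second of its capacity. Cheap indeed; the question is whether, among astronomically many similarly cheap options, this particular one gets chosen.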

    Simulating history so far might be extremely cheap. But if there are finite resources and astronomically many extremely cheap things, only a few will be done.

  • Shaming with and without naming

    Suppose someone wrongs you and you want to emphatically mar their reputation, but only insofar as doing so is conducive to the best utilitarian outcomes. I was thinking about this one time and it occurred to me that there are at least two fairly different routes to positive utilitarian outcomes from publicly shaming people for apparent wrongdoings*:

    A) People fear such shaming and avoid activities that may bring it about (possibly including the original perpetrator)

    B) People internalize your values and actually agree more that the sin is bad, and then do it less

    These things are fairly different, and don’t necessarily come together. I can think of shaming efforts that seem to inspire substantial fear of social retribution in many people (A) while often reducing sympathy for the object-level moral claims (B).

    It seems like on a basic strategic level (ignoring the politeness of trying to change others’ values) you would much prefer to have B than A, because it is longer lasting, and doesn’t involve you threatening conflict with other people for the duration.

    It seems to me that whether you name the person in your shaming makes a big difference to which of these you hit. If I say “Sarah Smith did [—]”, then Sarah is perhaps punished, and people in general fear being punished like Sarah (A). If I say “Today somebody did [—]”, then Sarah can’t get any social punishment, so nobody need fear that much (except for private shame), but you still get B—people having the sense that people think [—] is bad, and thus also having the sense that it is bad. Clearly not naming Sarah makes it harder for A to happen, but I also have the sense—much less clearly—that by naming Sarah you actually get less of B.

    This might be too weak a sense to warrant speculation, but in case not—why would this be? Is it because you are allowed to choose without being threatened, and with your freedom, you want to choose the socially sanctioned one? Whereas if someone is named you might be resentful and defensive, which is antithetical with going along with the norm that has been bid for? Is it that if you say Sarah did the thing, you have set up two concrete sides, you and Sarah, and observers might be inclined to join Sarah’s side instead of yours? (Or might already be on Sarah’s side in all manner of you-Sarah distinctions?)

    Is it even true that not naming gets you more of B?

    *NB: I haven’t decided if it’s almost ever appropriate to try to cause other people to feel shame, but it remains true that under certain circumstances fantasizing about it is an apparently natural response.

  • Parasocial relationship logic


    1. You become like the five people you spend the most time with (or something remotely like that)

    2. The people who are most extremal in good ways tend to be highly successful

    Should you try to have 2-3 of your five relationships be parasocial ones with people too successful to be your friend individually?

  • Deep and obvious points in the gap between your thoughts and your pictures of thought

    Some ideas feel either deep or extremely obvious. You’ve heard some trite truism your whole life, then one day an epiphany lands and you try to save it with words, and you realize the description is that truism. And then you go out and try to tell others what you saw, and you can’t reach past their bored nodding. Or even you yourself, looking back, wonder why you wrote such tired drivel with such excitement.

    When this happens, I wonder if it’s because the thing is true in your model of how to think, but not in how you actually think.

    For instance, “when you think about the future, the thing you are dealing with is your own imaginary image of the future, not the future itself”.

    On the one hand: of course. You think I’m five and don’t know broadly how thinking works? You think I was mistakenly modeling my mind as doing time-traveling and also enclosing the entire universe within itself? No I wasn’t, and I don’t need your insight.

    But on the other hand one does habitually think of the hazy region one conjures connected to the present as ‘the future’ not as ‘my image of the future’, so when this advice is applied to one’s thinking—when the future one has relied on and cowered before is seen to evaporate in a puff of realizing you were overly drawn into a fiction—it can feel like a revelation, because it really is news to how you think, just not how you think a rational agent thinks.

  • Survey of 2,778 AI authors: six parts in pictures

    Crossposted from AI Impacts blog

    The 2023 Expert Survey on Progress in AI is out, this time with 2778 participants from six top AI venues (up from about 700 and two in the 2022 ESPAI), making it probably the biggest ever survey of AI researchers.

    People answered in October, an eventful fourteen months after the 2022 survey, which had mostly identical questions for comparison.

    Here is the preprint. And here are six interesting bits in pictures (with figure numbers matching paper, for ease of learning more):

    1. Expected time to human-level performance dropped 1-5 decades since the 2022 survey. As always, our questions about ‘high level machine intelligence’ (HLMI) and ‘full automation of labor’ (FAOL) got very different answers, and individuals disagreed a lot (shown as thin lines below), but the aggregate forecasts for both sets of questions dropped sharply. For context, between 2016 and 2022 surveys, the forecast for HLMI had only shifted about a year.

    Probability assigned to HLMI over time (Fig 3)

    Probability assigned to FAOL over time (Fig 4)

    2. Time to most narrow milestones decreased, some by a lot. The date by which AI researchers are expected to be professionally fully automatable moved a quarter of a century earlier than in 2022, and the forecast for an AI writing a New York Times bestselling work of fiction dropped by more than half, to ~2030. Within five years, AI systems are forecast to be feasible that can fully make a payment processing site from scratch, or entirely generate a new song that sounds like it’s by e.g. Taylor Swift, or autonomously download and fine-tune a large language model.

    Change in guesses about time to specific narrow capabilities (Fig 2)

    3. Median respondents put 5% or more on advanced AI leading to human extinction or similar, and a third to a half of participants gave 10% or more. This was across four questions, one about overall value of the future and three more directly about extinction.

    Around 40% of participants gave at least 10% chance to human extinction from AI (Fig 10)

    4. Many participants found many scenarios worthy of substantial concern over the next 30 years. For every one of eleven scenarios and ‘other’ that we asked about, at least a third of participants considered it deserving of substantial or extreme concern.

    Level of concern warranted by different scenarios (Fig 9)

    5. There are few confident optimists or pessimists about advanced AI: high hopes and dire concerns are usually found together. 68% of participants thought HLMI was more likely to lead to good outcomes than bad, but nearly half of these people put at least 5% on extremely bad outcomes such as human extinction, and 59% of net pessimists gave 5% or more to extremely good outcomes.

    800 responses to how likely the future being different levels of good is after HLMI (Fig 11: a random 800 responses as vertical bars, higher definition below)


    6. 70% of participants would like to see research aimed at minimizing risks of AI systems be prioritized more highly. This is much like 2022, and in both years a third of participants asked for “much more”—more than doubling since 2016.

    How much should safety research be prioritized? (Fig 15)

    If you enjoyed this, the paper covers many other questions, as well as more details on the above. What makes AI progress go? Has it sped up? Would it be better if it were slower or faster? What will AI systems be like in 2043? Will we be able to know the reasons for its choices before then? Do people from academia and industry have different views? Are concerns about AI due to misunderstandings of AI research? Do people who completed undergraduate study in Asia put higher chances on extinction from AI than those who studied in America? Is the ‘alignment problem’ worth working on?

  • I put odds on ends with Nathan Young

    I forgot to post this in August when we did it, so one might hope it would be out of date now but luckily/sadly my understanding of things is sufficiently coarse-grained that it probably isn’t much. Though all this policy and global coordination stuff of late sounds promising.

    YouTube video of Odds and Ends episode

  • A to Z of things

    I wanted to give my good friends’ baby a book, in honor of her existence. And I recalled children’s books being an exciting genre. Yet checking in on that thirty years later, Amazon had none I could super get behind. They did have books I used to like, but for reasons now lost. And I wonder if as a child I just had no taste, because I didn’t know how good things could be.

    What would a good children’s book be like?

    When I was about sixteen, I thought one reasonable thing to have learned when I was about two would have been the concepts of ‘positive feedback loop’ and ‘negative feedback loop’, then being taught in my year 11 class. Very interesting, very bleedingly obvious once you saw it. Why not hear about this as soon as one is coherent? Evolution, if I recall, seemed similar.

    Here I finally enact my teenage self’s vision, and present A to Z of things, including some very interesting things that you might want a beautiful illustrative prompt to explain to your child as soon as they show glimmerings of conceptual thought: levers, markets, experiments, Greece, computer hardware, reference classes, feedback loops, (trees).

    I think so far, the initial recipient is most fond of the donkey, in fascinating support of everyone else’s theories about what children are actually into. (Don’t get me wrong, I also like donkeys—when I have a second monitor, I just use it to stream donkey cams.) But perhaps one day donkeys will be a gateway drug to monkeys, and monkeys to moths, and moths will be resting on perfectly moth-colored trees, and BAM! Childhood improved.

    Anyway, if you want a copy, it’s now available in an ‘email it to a copy shop and get it printed yourself’ format! See below. Remember to ask for card that is stronger than your child’s bite.



    Volcano and world

    Natural selection and orangutan

    PFL and quantification

    Donkey and experiment

  • The other side of the tidal wave

    I guess there’s maybe a 10-20% chance of AI causing human extinction in the coming decades, but I feel more distressed about it than even that suggests—I think because in the case where it doesn’t cause human extinction, I find it hard to imagine life not going kind of off the rails. So many things I like about the world seem likely to be over or badly disrupted with superhuman AI (writing, explaining things to people, friendships where you can be of any use to one another, taking pride in skills, thinking, learning, figuring out how to achieve things, making things, easy tracking of what is and isn’t conscious), and I don’t trust that the replacements will be actually good, or good for us, or that anything will be reversible.

    Even if we don’t die, it still feels like everything is coming to an end.

  • Robin Hanson and I talk about AI risk

    From this afternoon: here

    Our previous recorded discussions are here.

  • Have we really forsaken natural selection?

    Natural selection is often charged with having goals for humanity, and humanity is often charged with falling down on them. The big accusation, I think, is of sub-maximal procreation. If we cared at all about the genetic proliferation that natural selection wanted for us, then this time of riches would be a time of fifty-child families, not one of coddled dogs and state-of-the-art sitting rooms.

    But (the story goes) our failure is excusable, because instead of a deep-seated loyalty to genetic fitness, natural selection merely fitted humans out with a system of suggestive urges: hungers, fears, loves, lusts. Which all worked well together to bring about children in the prehistoric years of our forebears, but no more. In part because all sorts of things are different, and in part because we specifically made things different in that way on purpose: bringing about children gets in the way of the further satisfaction of those urges, so we avoid it (the story goes).

    This is generally floated as an illustrative warning about artificial intelligence. The moral is that if you make a system by first making multitudinous random systems and then systematically destroying all the ones that don’t do the thing you want, then the system you are left with might only do what you want while current circumstances persist, rather than being endowed with a consistent desire for the thing you actually had in mind.

    Observing acquaintances dispute this point recently, it struck me that humans are actually weirdly aligned with natural selection, more than I could easily account for.

    Natural selection, in its broadest, truest, (most idiolectic?) sense, doesn’t care about genes. Genes are a nice substrate on which natural selection famously makes particularly pretty patterns by driving a sensical evolution of lifeforms through interesting intricacies. But natural selection’s real love is existence. Natural selection just favors things that tend to exist. Things that start existing: great. Things that, having started existing, survive: amazing. Things that, while surviving, cause many copies of themselves to come into being: especial favorites of evolution, as long as there’s a path to the first ones coming into being.

    So natural selection likes genes that promote procreation and survival, but also likes elements that appear and don’t dissolve, ideas that come to mind and stay there, tools that are conceivable and copyable, shapes that result from myriad physical situations, rocks at the bottoms of mountains. Maybe this isn’t the dictionary definition of natural selection, but it is the real force in the world, of which natural selection of reproducing and surviving genetic clusters is one facet. Generalized natural selection—the thing that created us—says that the things that you see in the world are those things that exist best in the world.

    So what did natural selection want for us? What were we selected for? Existence.

    And while we might not proliferate our genes spectacularly well in particular, I do think we have a decent shot at a very prolonged existence. Or the prolonged existence of some important aspects of our being. It seems plausible that humanity makes it to the stars, galaxies, superclusters. Not that we are maximally trying for that any more than we are maximally trying for children. And I do think there’s a large chance of us wrecking it with various existential risks. But it’s interesting to me that natural selection made us for existing, and we look like we might end up just totally killing it, existence-wise. Even though natural selection purportedly did this via a bunch of hackish urges that were good in 200,000 BC but you might have expected to be outside their domain of applicability by 2023. And presumably taking over the universe is an extremely narrow target: it can only be done by so many things.

    Thus it seems to me that humanity is plausibly doing astonishingly well on living up to natural selection’s goals. Probably not as well as a hypothetical race of creatures who each harbors a monomaniacal interest in prolonged species survival. And not so well as to be clear of great risk of foolish speciocide. But still staggeringly well.

  • We don't trade with ants

    When discussing advanced AI, sometimes the following exchange happens:

    “Perhaps advanced AI won’t kill us. Perhaps it will trade with us”

    “We don’t trade with ants”

    I think it’s interesting to get clear on exactly why we don’t trade with ants, and whether it is relevant to the AI situation.

    When a person says “we don’t trade with ants”, I think the implicit explanation is that humans are so big, powerful and smart compared to ants that we don’t need to trade with them because they have nothing of value and if they did we could just take it; anything they can do we can do better, and we can just walk all over them. Why negotiate when you can steal?

    I think this is broadly wrong, and that it is also an interesting case of the classic cognitive error of imagining that trade is about swapping fixed-value objects, rather than creating new value from a confluence of one’s needs and the other’s affordances. It’s only in the imaginary zero-sum world that you can generally replace trade with stealing the other party’s stuff, if the other party is weak enough.

    Ants, with their skills, could do a lot that we would plausibly find worth paying for. Some ideas:

    1. Cleaning things that are hard for humans to reach (crevices, buildup in pipes, outsides of tall buildings)
    2. Chasing away other insects, including in agriculture
    3. Surveillance and spying
    4. Building, sculpting, moving, and mending things in hard to reach places and at small scales (e.g. dig tunnels, deliver adhesives to cracks)
    5. Getting out of our houses before we are driven to expend effort killing them, and similarly for all the other places ants conflict with humans (stinging, eating crops, ..)
    6. (For an extended list, see ‘Appendix: potentially valuable things ants can do’)

    We can’t take almost any of this by force; at best we can kill them and take their dirt and the minuscule mouthfuls of our food they were eating.

    Could we pay them for all this?

    A single ant eats about 2mg per day according to a random website, so you could support a colony of a million ants with 2kg of food per day. Supposing they accepted pay in sugar, or something similarly expensive, 2kg costs around $3. Perhaps you would need to pay them more than subsistence to attract them away from foraging freely, since apparently food-gathering ants usually collect more than they eat, to support others in their colony. So let’s guess $5.

    My guess is that a million ants could do well over $5 of the above labors in a day. For instance, a colony of meat ants takes ‘weeks’ to remove the meat from an entire carcass of an animal. Supposing somewhat conservatively that this is three weeks, and the animal is a 1.5kg bandicoot, the colony is moving 70g/day. Guesstimating the mass of crumbs falling on the floor of a small cafeteria in a day, I imagine that it’s less than that produced by tearing up a single bread roll and spreading it around, which the internet says is about 50g. So my guess is that an ant colony could clean the floor of a small cafeteria for around $5/day, which I imagine is cheaper than human sweeping (this site says ‘light cleaning’ costs around $35/h on average in the US). And this is one of the tasks where the ants have least advantage over humans. Cleaning the outside of skyscrapers or the inside of pipes is presumably much harder for humans than cleaning a cafeteria floor, and I expect is fairly similar for ants.
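    The back-of-envelope estimate above can be sketched in a few lines. This is a toy calculation using only the rough guesses from the text (2mg/day per ant, roughly $1.50/kg for sugar, a 1.5kg carcass stripped over three weeks, about 50g of daily crumbs); none of these figures are measured data.

```python
# Back-of-envelope check of the ant-trade numbers above.
# All figures are the rough guesses from the text, not measured data.

ANT_FOOD_MG_PER_DAY = 2       # guessed daily intake of one ant
COLONY_SIZE = 1_000_000       # ants in the hypothetical colony
SUGAR_PRICE_PER_KG = 1.5      # so 2 kg of sugar costs about $3

# Daily food bill for the colony.
food_kg = ANT_FOOD_MG_PER_DAY * COLONY_SIZE / 1_000_000  # mg to kg
subsistence_cost = food_kg * SUGAR_PRICE_PER_KG          # dollars per day
daily_wage = 5.0  # guessed premium over subsistence, to beat free foraging

# Meat-ant hauling rate: a 1.5 kg carcass stripped in roughly three weeks.
carcass_g = 1500
days = 21
haul_g_per_day = carcass_g / days

# Guessed daily crumb mass on a small cafeteria floor (about one bread roll).
crumbs_g = 50

print(f"colony food bill: ${subsistence_cost:.2f}/day, wage guess ${daily_wage:.2f}/day")
print(f"hauling rate: {haul_g_per_day:.0f} g/day vs {crumbs_g} g of daily crumbs")
```

    Running this recovers the figures in the text: a $3/day food bill, and a hauling rate (about 71g/day) comfortably above the guessed crumb load, which is what makes the $5/day wage look like a bargain.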

    So at a basic level, it seems like there should be potential for trade with ants - they can do a lot of things that we want done, and could live well at the prices we would pay for those tasks being done.

    So why don’t we trade with ants?

    I claim that we don’t trade with ants because we can’t communicate with them. We can’t tell them what we’d like them to do, and can’t have them recognize that we would pay them if they did it. Which might be more than the language barrier. There might be a conceptual poverty. There might also be a lack of the memory and consistent identity that allows an ant to uphold commitments it made with me five minutes ago.

    To get basic trade going, you might not need much of these things though. If we could only communicate that their all leaving our house immediately would prompt us to put a plate of honey in the garden for them and/or not slaughter them, then we would already be gaining from trade.

    So it looks like the AI-human relationship is importantly disanalogous to the human-ant relationship, because the big reason we don’t trade with ants will not apply to AI systems potentially trading with us: we can’t communicate with ants, but AI can communicate with us.

    (You might think ‘but the AI will be so far above us that it will think of itself as unable to communicate with us, in the same way that we can’t with the ants - we will be unable to conceive of most of its concepts’. It seems unlikely to me that one needs anything like the full palette of concepts available to the smarter creature to make productive trade. With ants, ‘go over there and we won’t kill you’ would do a lot, and it doesn’t involve concepts at the foggy pinnacle of human meaning-construction. The issue with ants is that we can’t communicate almost at all.)

    But also: ants can actually do heaps of things we can’t, whereas (arguably) at some point that won’t be true for us relative to AI systems. (When we get human-level AI, will that AI also be ant level? Or will AI want to trade with ants for longer than it wants to trade with us? It can probably better figure out how to talk to ants.) However just because at some point AI systems will probably do everything humans do, doesn’t mean that this will happen on any particular timeline, e.g. the same one on which AI becomes ‘very powerful’. If the situation turns out similar to us and ants, we might expect that we continue to have a bunch of niche uses for a while.

    In sum, for AI systems to be to humans as we are to ants, would be for us to be able to do many tasks better than AI, and for the AI systems to be willing to pay us grandly for them, but for them to be unable to tell us this, or even to warn us to get out of the way. Is this what AI will be like? No. AI will be able to communicate with us, though at some point we will be less useful to AI systems than ants could be to us if they could communicate.

    But, you might argue, being totally unable to communicate makes one useless, even if one has skills that could be good if accessible through communication. So being unable to communicate is just a kind of being useless, and how we treat ants is an apt case study in treatment of powerless and useless creatures, even if the uselessness has an unusual cause. This seems sort of right, but a) being unable to communicate probably makes a creature more absolutely useless than if it just lacks skills, because even an unskilled creature is sometimes in a position to add value e.g. by moving out of the way instead of having to be killed, b) the corner-ness of the case of ant uselessness might make general intuitive implications carry over poorly to other cases, c) the fact that the ant situation can definitely not apply to us relative to AIs seems interesting, and d) it just kind of worries me that when people are thinking about this analogy with ants, they are imagining it all wrong in the details, even if the conclusion should be the same.

    Also, there’s a thought that AI being as much more powerful than us as we are than ants implies a uselessness that makes extermination almost guaranteed. But ants, while extremely powerless, are only useless to us by an accident of signaling systems. And we know that problem won’t apply in the case of AI. Perhaps we should not expect to so easily become useless to AI systems, even supposing they take all power from humans.

    Appendix: potentially valuable things ants can do

    1. Clean, especially small loose particles or detachable substances, especially in cases that are very hard for humans to reach (e.g. floors, crevices, sticky jars in the kitchen, buildup from pipes while water is off, the outsides of tall buildings)
    2. Chase away other insects
    3. Pest control in agriculture (they have already been used for this since about 400 AD)
    4. Surveillance and spying
    5. Investigating hard to reach situations, underground or in walls for instance - e.g. see whether a pipe is leaking, or whether the foundation of a house is rotting, or whether there is smoke inside a wall
    6. Surveil buildings for smoke
    7. Defend areas from invaders, e.g. buildings, cars (some plants have coordinated with ants in this way)
    8. Sculpting/moving things at a very small scale
    9. Building house-size structures with intricate detailing.
    10. Digging tunnels (e.g. instead of digging up your garden to lay a pipe, maybe ants could dig the hole, then a flexible pipe could be pushed through it)
    11. Being used in medication (this already happens, but might happen better if we could communicate with them)
    12. Participating in war (attack, guerilla attack, sabotage, intelligence)
    13. Mending things at a small scale, e.g. delivering adhesive material to a crack in a pipe while the water is off
    14. Surveillance of scents (including which direction a scent is coming from), e.g. drugs, explosives, diseases, people, microbes
    15. Tending other small, useful organisms. Per Wikipedia: ‘Leafcutter ants (Atta and Acromyrmex) feed exclusively on a fungus that grows only within their colonies. They continually collect leaves which are taken to the colony, cut into tiny pieces and placed in fungal gardens.’ ‘Leaf cutter ants are sensitive enough to adapt to the fungi’s reaction to different plant material, apparently detecting chemical signals from the fungus. If a particular type of leaf is toxic to the fungus, the colony will no longer collect it… The fungi used by the higher attine ants no longer produce spores. These ants fully domesticated their fungal partner 15 million years ago, a process that took 30 million years to complete. Their fungi produce nutritious and swollen hyphal tips (gongylidia) that grow in bundles called staphylae, to specifically feed the ants.’ On aphids: ‘The ants in turn keep predators away from the aphids and will move them from one feeding location to another. When migrating to a new area, many colonies will take the aphids with them, to ensure a continued supply of honeydew.’ And: ‘Myrmecophilous (ant-loving) caterpillars of the butterfly family Lycaenidae (e.g., blues, coppers, or hairstreaks) are herded by the ants, led to feeding areas in the daytime, and brought inside the ants’ nest at night. The caterpillars have a gland which secretes honeydew when the ants massage them.’
    16. Measuring hard to access distances (they measure distance as they walk with an internal pedometer)
    17. Killing plants (lemon ants make ‘devil’s gardens’ by killing all plants other than ‘lemon ant trees’ in an area)
    18. Producing and delivering nitrogen to plants (‘Isotopic labelling studies suggest that plants also obtain nitrogen from the ants.’ - Wikipedia)
    19. Get out of our houses before we are driven to expend effort killing them, and similarly for all the other places ants conflict with humans (stinging, eating crops, ..)
  • How to eat potato chips while typing


    eating potato chips with chopsticks

    Sometimes I want to eat snacks while I work, honestly because I have a weirdly moving dread of stretches of work-time containing nothing in any way gratifying to me. Possibly because I realistically anticipate that I will not remain working long under those conditions.

    But sadly many good snacks are oily, crumby, or otherwise ill-suited to being touched repeatedly interleaved with touching one’s keyboard. I think I usually deal with this by eating much snack while not typing at all for a bit, but sort of acting as though I think I’m going to, then washing my hands, then going back to the typing for a while, then repeating—largely defeating the purpose. I also get around this issue by getting bubble tea, a snack which can be substantially consumed with your mouth.

    I have often vaguely imagined chopsticks helping, but also imagined that they wouldn’t actually. Today I learned that they work well, at least for potato chips and popcorn.

    You might think standard Western cutlery would be the first thing to try, but a) it’s not good for hard, crunchy objects, and b) it’s somehow worse to reach to the side with both hands and do a knife-and-fork action than it is to reach with one hand, so if you need multiple implements to manipulate your snack, chopsticks seem superior.

    (For more practical eating advice, see how to cut and chew meat. I also learned to drink from a can in the last year or so, but am unlikely to write it up unless I learn that this is a problem for anyone else ever.)

  • Pacing: inexplicably good

    Pacing—walking repeatedly over the same ground—often feels ineffably good while I’m doing it, but then I forget about it for ages, so I thought I’d write about it here.

    I don’t mean just going for an inefficient walk—it is somehow different to just step slowly in a circle around the same room for a long time, or up and down a passageway.

    I don’t know why it would be good, but some ideas:

    1. It’s good to be physically engaged while thinking for some reason. I used to do ‘gymflection’ with a friend, where we would do strength exercises at the gym, and meanwhile be reflecting on our lives and what is going well and what we might do better. This felt good in a way that didn’t seem to come from either activity alone. (This wouldn’t explain why it would differ from walking though.)
    2. Different working memory setup: if you pace around in the same vicinity, your thoughts get kind of attached to the objects you are looking at. So next time you get to the green tiles say, they remind you of what you were thinking of last time you were there. This allows for a kind of repeated cycling back through recent topics, but layering different things into the mix with each loop, which is a nice way of thinking. Perhaps a bit like having additional working memory.

    I wonder if going for a walk doesn’t really get 1) in a satisfying way, because my mind easily wanders from the topic at hand and also from my surrounds, so it less feels like I’m really grappling with something and being physical, and more like I’m daydreaming elsewhere. So maybe 2) is needed also, to both stick with a topic and attend to the physical world for a while. I don’t put a high probability on this detailed theory.

  • Worldly Positions archive, briefly with private drafts

    I realized it was hard to peruse past Worldly Positions posts without logging in to Tumblr, which seemed pretty bad. So I followed Substack’s instructions to import the archives into world spirit sock stack. And it worked pretty well, except that SUBSTACK ALSO PUBLISHED MY UNPUBLISHED WORLDLY POSITIONS DRAFTS! What on Earth? That’s so bad. Did I misunderstand what happened somehow in my rush to unpublish them? Maybe. But they definitely had ‘unpublish’ buttons, so that’s pretty incriminating.

    This seems to have turned out alright for me, since it looks like I just never wrote any drafts that would be too embarrassing to anyone other than myself. And the most embarrassing to myself are probably at the level of bad and abortive poetry. Plus it turned up a few decent drafts to finish, and the adrenaline was a welcome pick-me-up in my current retreat-driven stimulation drought.

    Some good bits of the archive (from pre-WSSP times) according to me:

    1. Mine-craft: the composition of the ego in a procedurally generated sandbox game
    2. The time I rented a robot baby
    3. Why fiction is more horrifying than war photography
    4. Home: up and down, colder and warmer: miscellaneous thoughts on e.g. warmth and coldness, the sincerity of historic advertising, and why negativity is deep
    5. How I learned to have fun on command though I rarely remember to do it
    6. England: Attunement and borders, in which I get possible attunement, companionship, and a visa
  • More ways to spot abysses

    I liked Ben Kuhn’s ‘Staring into the abyss as a core life skill’.

    I’d summarize it as:

    1. If you are making a major error—professionally, romantically, religiously, etc—it can be hard to look at that fact and correct.
    2. However it’s super important. Evidence: successful people do this well.

    This seems pretty plausible to me.

    (He has a lot of concrete examples, which are probably pretty helpful for internalizing this.)

    His suggestions for how to do better helped me a bit, but not that much, so I made up my own additional prompts for finding abysses I should consider staring into, which worked relatively well for me:

    1. If you were currently making a big mistake, what would it be?
    2. What are some things that would be hard to acknowledge, if they were true?
    3. Looking back on this time from five years hence, what do you think you’ll wish you changed earlier?
    4. If you were forced to quit something, what would you want it to be?
    5. (Variant on 1:) If you were currently making a big mistake that would be gut-wrenching to learn was a mistake, what would it be?
  • Let's think about slowing down AI

    (Crossposted from AI Impacts Blog)

    Averting doom by not building the doom machine

    If you fear that someone will build a machine that will seize control of the world and annihilate humanity, then one kind of response is to try to build further machines that will seize control of the world even earlier without destroying it, forestalling the ruinous machine’s conquest. An alternative or complementary kind of response is to try to avert such machines being built at all, at least while the degree of their apocalyptic tendencies is ambiguous. 

    The latter approach seems to me like the kind of basic and obvious thing worthy of at least consideration, and, in its favor, it fits nicely in the genre ‘stuff that it isn’t that hard to imagine happening in the real world’. Yet my impression is that for people worried about extinction risk from artificial intelligence, strategies under the heading ‘actively slow down AI progress’ have historically been dismissed and ignored (though ‘don’t actively speed up AI progress’ is popular).

    The conversation near me over the years has felt a bit like this: 

    Some people: AI might kill everyone. We should design a godlike super-AI of perfect goodness to prevent that.

    Others: wow that sounds extremely ambitious

    Some people: yeah but it’s very important and also we are extremely smart so idk it could work

    [Work on it for a decade and a half]

    Some people: ok that’s pretty hard, we give up

    Others: oh huh shouldn’t we maybe try to stop the building of this dangerous AI? 

    Some people: hmm, that would involve coordinating numerous people—we may be arrogant enough to think that we might build a god-machine that can take over the world and remake it as a paradise, but we aren’t delusional

    This seems like an error to me. (And lately, to a bunch of other people.) 

    I don’t have a strong view on whether anything in the space of ‘try to slow down some AI research’ should be done. But I think a) the naive first-pass guess should be a strong ‘probably’, and b) a decent amount of thinking should happen before writing off everything in this large space of interventions. Whereas customarily the tentative answer seems to be ‘of course not’, and then the topic is avoided rather than thought about further. (At least in my experience—the AI safety community is large, and for most things I say here, different experiences are probably had in different bits of it.)

    Maybe my strongest view is that one shouldn’t apply such different standards of ambition to these different classes of intervention. Like: yes, there appear to be substantial difficulties in slowing down AI progress to good effect. But in technical alignment, mountainous challenges are met with enthusiasm for mountainous efforts. And it is very non-obvious that the scale of difficulty here is much larger than that involved in designing acceptably safe versions of machines capable of taking over the world before anyone else in the world designs dangerous versions. 

    I’ve been talking about this with people over the past many months, and have accumulated an abundance of reasons for not trying to slow down AI, most of which I’d like to argue about at least a bit. My impression is that arguing in real life has coincided with people moving toward my views.

    Quick clarifications

    First, to fend off misunderstanding—

    1. I take ‘slowing down dangerous AI’ to include any of: 
      a. reducing the speed at which AI progress is made in general, e.g. as would occur if general funding for AI declined.
      b. shifting AI efforts from work leading more directly to risky outcomes to other work, e.g. as might occur if there was broadscale concern about very large AI models, and people and funding moved to other projects.
      c. halting categories of work until strong confidence in their safety is possible, e.g. as would occur if AI researchers agreed that certain systems posed catastrophic risks and should not be developed until they did not. (This might mean a permanent end to some systems, if they were intrinsically unsafe.)
      (So in particular, I’m including both actions whose direct aim is slowness in general, and actions whose aim is requiring safety before specific developments, which implies slower progress.)
    2. I do think there is serious attention on some versions of these things, generally under other names. I see people thinking about ‘differential progress’ (b. above), and strategizing about coordination to slow down AI at some point in the future (e.g. at ‘deployment’). And I think a lot of consideration is given to avoiding actively speeding up AI progress. What I’m saying is missing is: a) consideration of actively working to slow down AI now, and b) aiming straightforwardly at ‘slow down AI’, rather than wincing away from that and only considering examples of it that show up under another conceptualization (perhaps this is an unfair diagnosis).
    3. AI Safety is a big community, and I’ve only ever been seeing a one-person window into it, so maybe things are different e.g. in DC, or in different conversations in Berkeley. I’m just saying that for my corner of the world, the level of disinterest in this has been notable, and in my view misjudged.

    Why not slow down AI? Why not consider it?

    Ok, so if we tentatively suppose that this topic is worth even thinking about, what do we think? Is slowing down AI a good idea at all? Are there great reasons for dismissing it?

    Scott Alexander wrote a post a little while back raising reasons to dislike the idea, roughly:

    1. Do you want to lose an arms race? If the AI safety community tries to slow things down, it will disproportionately slow down progress in the US, and then people elsewhere will go fast and get to be the ones whose competence determines whether the world is destroyed, and whose values determine the future if there is one. Similarly, if AI safety people criticize those contributing to AI progress, it will mostly discourage the most friendly and careful AI capabilities companies, and the reckless ones will get there first.
    2. One might contemplate ‘coordination’ to avoid such morbid races. But coordinating anything with the whole world seems wildly tricky. For instance, some countries are large, scary, and hard to talk to.
    3. Agitating for slower AI progress is ‘defecting’ against the AI capabilities folks, who are good friends of the AI safety community, and their friendship is strategically valuable for ensuring that safety is taken seriously in AI labs (as well as being non-instrumentally lovely! Hi AI capabilities friends!). 

    Other opinions I’ve heard, some of which I’ll address:

    1. Slowing AI progress is futile: for all your efforts you’ll probably just die a few years later
    2. Coordination based on convincing people that AI risk is a problem is absurdly ambitious. It’s practically impossible to convince AI professors of this, let alone any real fraction of humanity, and you’d need to convince a massive number of people.
    3. What are we going to do, build powerful AI never and die when the Earth is eaten by the sun?
    4. It’s actually better for safety if AI progress moves fast. This might be because the faster AI capabilities work happens, the smoother AI progress will be, and this is more important than the duration of the period. Or speeding up progress now might force future progress to be correspondingly slower. Or because safety work is probably better when done just before building the relevantly risky AI, in which case the best strategy might be to get as close to dangerous AI as possible and then stop and do safety work. Or if safety work is very useless ahead of time, maybe delay is fine, but there is little to gain by it. 
    5. Specific routes to slowing down AI are not worth it. For instance, avoiding working on AI capabilities research is bad because it’s so helpful for learning on the path to working on alignment. And AI safety people working in AI capabilities can be a force for making safer choices at those companies.
    6. Advanced AI will help enough with other existential risks as to represent a net lowering of existential risk overall.1
    7. Regulators are ignorant about the nature of advanced AI (partly because it doesn’t exist, so everyone is ignorant about it). Consequently they won’t be able to regulate it effectively and bring about desired outcomes.

    My impression is that there are also less endorsable or less altruistic or more silly motives floating around for this attention allocation. Some things that have come up at least once in talking to people about this, or that seem to be going on:

    • Advanced AI might bring manifold wonders, e.g. long lives of unabated thriving. Getting there a bit later is fine for posterity, but for our own generation it could mean dying as our ancestors did while on the cusp of a utopian eternity. Which would be pretty disappointing. For a person who really believes in this future, it can be tempting to shoot for the best scenario—humanity builds strong, safe AI in time to save this generation—rather than the scenario where our own lives are inevitably lost.
    • Sometimes people who have a heartfelt appreciation for the flourishing that technology has afforded so far can find it painful to be superficially on the side of Luddism here.
    • Figuring out how minds work well enough to create new ones out of math is an incredibly deep and interesting intellectual project, which feels right to take part in. It can be hard to intuitively feel like one shouldn’t do it.

      (Illustration from a co-founder of modern computational reinforcement learning.)
    • It is uncomfortable to contemplate projects that would put you in conflict with other people. Advocating for slower AI feels like trying to impede someone else’s project, which feels adversarial and can feel like it has a higher burden of proof than just working on your own thing.
    • ‘Slow-down-AGI’ sends people’s minds to e.g. industrial sabotage or terrorism, rather than more boring courses, such as, ‘lobby for labs developing shared norms for when to pause deployment of models’. This understandably encourages dropping the thought as soon as possible.
    • My weak guess is that there’s a kind of bias at play in AI risk thinking in general, where any force that isn’t zero is taken to be arbitrarily intense. Like, if there is pressure for agents to exist, there will arbitrarily quickly be arbitrarily agentic things. If there is a feedback loop, it will be arbitrarily strong. Here, if stalling AI can’t be forever, then it’s essentially zero time. If a regulation won’t obstruct every dangerous project, then it is worthless. Any finite economic disincentive for dangerous AI is nothing in the face of the omnipotent economic incentives for AI. I think this is a bad mental habit: things in the real world often come down to actual finite quantities. This is very possibly an unfair diagnosis. (I’m not going to discuss this later; this is pretty much what I have to say.)
    • I sense an assumption that slowing progress on a technology would be a radical and unheard-of move.
    • I agree with lc that there seems to have been a quasi-taboo on the topic, which perhaps explains a lot of the non-discussion, though still calls for its own explanation. I think it suggests that concerns about uncooperativeness play a part, and the same for thinking of slowing down AI as centrally involving antisocial strategies.

      I’m not sure if any of this fully resolves why AI safety people haven’t thought about slowing down AI more, or whether people should try to do it. But my sense is that many of the above reasons are at least somewhat wrong, and motives somewhat misguided, so I want to argue about a lot of them in turn, including both arguments and vague motivational themes.

      The mundanity of the proposal

      Restraint is not radical

      There seems to be a common thought that technology is a kind of inevitable path along which the world must tread, and that trying to slow down or avoid any part of it would be both futile and extreme.2 

      But empirically, the world doesn’t pursue every technology—it barely pursues any technologies.

      Sucky technologies

      For a start, there are many machines that there is no pressure to make, because they have no value. Consider a machine that sprays shit in your eyes. We can technologically do that, but probably nobody has ever built that machine. 

      This might seem like a stupid example, because no serious ‘technology is inevitable’ conjecture is going to claim that totally pointless technologies are inevitable. But if you are sufficiently pessimistic about AI, I think this is the right comparison: if there are kinds of AI that would cause huge net costs to their creators if created, according to our best understanding, then they are at least as useless to make as the ‘spray shit in your eyes’ machine. We might accidentally make them due to error, but there is not some deep economic force pulling us to make them. If unaligned superintelligence destroys the world with high probability when you ask it to do a thing, then this is the category it is in, and it is not strange for its designs to just rot in the scrap-heap, with the machine that sprays shit in your eyes and the machine that spreads caviar on roads.

      Ok, but maybe the relevant actors are very committed to being wrong about whether unaligned superintelligence would be a great thing to deploy. Or maybe you think the situation is less immediately dire and building existentially risky AI really would be good for the people making decisions (e.g. because the costs won’t arrive for a while, and the people care a lot about a shot at scientific success relative to a chunk of the future). If the apparent economic incentives are large, are technologies unavoidable?

      Extremely valuable technologies

      It doesn’t look like it to me. Here are a few technologies which I’d guess have substantial economic value, where research progress or uptake appears to be drastically slower than it could be, for reasons of concern about safety or ethics3:

      1. Huge amounts of medical research, including really important medical research, e.g. the FDA banned human trials of strep A vaccines from the 70s to the 2000s, in spite of 500,000 global deaths every year. A lot of people also died while covid vaccines went through all the proper trials. 
      2. Nuclear energy
      3. Fracking
      4. Various genetics things: genetic modification of foods, gene drives, early recombinant DNA researchers famously organized a moratorium and then ongoing research guidelines including prohibition of certain experiments (see the Asilomar Conference)
      5. Nuclear, biological, and maybe chemical weapons (or maybe these just aren’t useful)
      6. Various human reproductive innovation: cloning of humans, genetic manipulation of humans (a notable example of an economically valuable technology that is to my knowledge barely pursued across different countries, without explicit coordination between those countries, even though it would make those countries more competitive. Someone used CRISPR on babies in China, but was imprisoned for it.)
      7. Recreational drug development
      8. Geoengineering
      9. Much of science about humans? I recently ran this survey, and was reminded how encumbering ethical rules are for even incredibly innocuous research. As far as I could tell the EU now makes it illegal to collect data in the EU unless you promise to delete the data from anywhere that it might have gotten to if the person who gave you the data wishes for that at some point. In all, dealing with this and IRB-related things added maybe more than half of the effort of the project. Plausibly I misunderstand the rules, but I doubt other researchers are radically better at figuring them out than I am.
      10. There are probably examples from fields considered distasteful or embarrassing to associate with, but it’s hard as an outsider to tell which fields are genuinely hopeless versus erroneously considered so. If there are economically valuable health interventions among those considered wooish, I imagine they would be much slower to be identified and pursued by scientists with good reputations than a similarly promising technology not marred in that way. Scientific research into intelligence is more clearly slowed by stigma, but it is less clear to me what the economically valuable upshot would be.
      11. (I think there are many other things that could be in this list, but I don’t have time to review them at the moment. This page might collect more of them in future.)

      It seems to me that intentionally slowing down progress in technologies to give time for even probably-excessive caution is commonplace. (And this is just looking at things slowed down over caution or ethics specifically—probably there are also other reasons things get slowed down.)

      Furthermore, among valuable technologies that nobody is especially trying to slow down, it seems common enough for progress to be massively slowed by relatively minor obstacles, which is further evidence for a lack of overpowering strength of the economic forces at play. For instance, Fleming first took notice of mold’s effect on bacteria in 1928, but nobody took a serious, high-effort shot at developing it as a drug until 1939.4 Furthermore, in the thousands of years preceding these events, various people noticed numerous times that mold, other fungi or plants inhibited bacterial growth, but didn’t exploit this observation even enough for it not to be considered a new discovery in the 1920s. Meanwhile, people dying of infection was quite a thing. In 1930 about 300,000 Americans died of bacterial illnesses per year (around 250/100k).

      My guess is that people make real choices about technology, and they do so in the face of economic forces that are feebler than commonly thought. 

      Restraint is not terrorism, usually

      I think people have historically imagined weird things when they think of ‘slowing down AI’. I posit that their central image is sometimes terrorism (which understandably they don’t want to think about for very long), and sometimes some sort of implausibly utopian global agreement.

      Here are some other things that ‘slow down AI capabilities’ could look like (where the best positioned person to carry out each one differs, but if you are not that person, you could e.g. talk to someone who is):

      1. Don’t actively forward AI progress, e.g. by devoting your life or millions of dollars to it (this one is often considered already)
      2. Try to convince researchers, funders, hardware manufacturers, institutions etc that they too should stop actively forwarding AI progress
      3. Try to get any of those people to stop actively forwarding AI progress even if they don’t agree with you: through negotiation, payments, public reproof, or other activistic means.
      4. Try to get the message to the world that AI is heading toward being seriously endangering. If AI progress is broadly condemned, this will trickle into myriad decisions: job choices, lab policies, national laws. To do this, for instance produce compelling demos of risk, agitate for stigmatization of risky actions, write science fiction illustrating the problems broadly and evocatively (I think this has actually been helpful repeatedly in the past), go on TV, write opinion pieces, help organize and empower the people who are already concerned, etc.
      5. Help organize the researchers who think their work is potentially omnicidal into coordinated action on not doing it.
      6. Move AI resources from dangerous research to other research. Move investments from projects that lead to large but poorly understood capabilities, to projects that lead to understanding these things e.g. theory before scaling (see differential technological development in general5).
      7. Formulate specific precautions for AI researchers and labs to take in different well-defined future situations, Asilomar Conference style. These could include more intense vetting by particular parties or methods, modifying experiments, or pausing lines of inquiry entirely. Organize labs to coordinate on these.
      8. Reduce available compute for AI, e.g. via regulation of production and trade, seller choices, purchasing compute, trade strategy.
      9. At labs, choose policies that slow down other labs, e.g. reduce public helpful research outputs
      10. Alter the publishing system and incentives to reduce research dissemination. E.g. A journal verifies research results and releases the fact of their publication without any details, maintains records of research priority for later release, and distributes funding for participation. (This is how Szilárd and co. arranged the mitigation of 1940s nuclear research helping Germany, except I’m not sure if the compensatory funding idea was used.6)
      11. The above actions would be taken through choices made by scientists, or funders, or legislators, or labs, or public observers, etc. Communicate with those parties, or help them act.

      Coordination is not miraculous world government, usually

      The common image of coordination seems to be explicit, centralized, involving every party in the world, and something like cooperating on a prisoners’ dilemma: incentives push every rational party toward defection at all times, yet maybe through deontological virtues or sophisticated decision theories or strong international treaties, everyone manages to not defect for enough teetering moments to find another solution.

      That is a possible way coordination could be. (And I think one that shouldn’t be seen as so hopeless—the world has actually coordinated on some impressive things, e.g. nuclear non-proliferation.) But if what you want is for lots of people to coincide in doing one thing when they might have done another, then there are quite a few ways of achieving that. 

      Consider some other case studies of coordinated behavior:

      • Not eating sand. The whole world coordinates to barely eat any sand at all. How do they manage it? It is actually not in almost anyone’s interest to eat sand, so the mere maintenance of sufficient epistemological health to have this widely recognized does the job.
      • Eschewing bestiality: probably some people think bestiality is moral, but enough don’t that engaging in it would risk huge stigma. Thus the world coordinates fairly well on doing very little of it.
      • Not wearing Victorian attire on the streets: this is similar but with no moral blame involved. Historic dress is arguably often more aesthetic than modern dress, but even people who strongly agree find it unthinkable to wear it in general, and assiduously avoid it except for when they have ‘excuses’ such as a special party. This is a very strong coordination against what appears to otherwise be a ubiquitous incentive (to be nicer to look at). As far as I can tell, it’s powered substantially by the fact that it is ‘not done’ and would now be weird to do otherwise. (Which is a very general-purpose mechanism.)
      • Political correctness: public discourse has strong norms about what it is okay to say, which do not appear to derive from a vast majority of people agreeing about this (as with bestiality say). New ideas about what constitutes being politically correct sometimes spread widely. This coordinated behavior seems to be roughly due to decentralized application of social punishment, from both a core of proponents, and from people who fear punishment for not punishing others. Then maybe also from people who are concerned by non-adherence to what now appears to be the norm given the actions of the others. This differs from the above examples, because it seems like it could persist even with a very small set of people agreeing with the object-level reasons for a norm. If failing to advocate for the norm gets you publicly shamed by advocates, then you might tend to advocate for it, making the pressure stronger for everyone else. 

      These are all cases of very broadscale coordination of behavior, none of which involve prisoners’ dilemma type situations, or people making explicit agreements which they then have an incentive to break. They do not involve centralized organization of huge multilateral agreements. Coordinated behavior can come from everyone individually wanting to make a certain choice for correlated reasons, or from people wanting to do things that those around them are doing, or from distributed behavioral dynamics such as punishment of violations, or from collaboration in thinking about a topic.

      You might think they are weird examples that aren’t very related to AI. I think, a) it’s important to remember the plethora of weird dynamics that actually arise in human group behavior and not get carried away theorizing about AI in a world drained of everything but prisoners’ dilemmas and binding commitments, and b) the above are actually all potentially relevant dynamics here.

      If AI in fact poses a large existential risk within our lifetimes, such that it is net bad for any particular individual, then the situation in theory looks a lot like that in the ‘avoiding eating sand’ case. It’s an option that a rational person wouldn’t want to take if they were just alone and not facing any kind of multi-agent situation. If AI is that dangerous, then not taking this inferior option could largely come from a coordination mechanism as simple as distribution of good information. (You still need to deal with irrational people and people with unusual values.)

      But even failing coordinated caution from ubiquitous insight into the situation, other models might work. For instance, if there came to be somewhat widespread concern that AI research is bad, that might substantially lessen participation in it, beyond the set of people who are concerned, via mechanisms similar to those described above. Or it might give rise to a wide crop of local regulation, enforcing whatever behavior is deemed acceptable. Such regulation need not be centrally organized across the world to serve the purpose of coordinating the world, as long as it grew up in different places similarly. Which might happen because different locales have similar interests (all rational governments should be similarly concerned about losing power to automated power-seeking systems with unverifiable goals), or because—as with individuals—there are social dynamics which support norms arising in a non-centralized way.

      The arms race model and its alternatives

      Ok, maybe in principle you might hope to coordinate to not do self-destructive things, but realistically, if the US tries to slow down, won’t China or Facebook or someone less cautious take over the world? 

      Let’s be more careful about the game we are playing, game-theoretically speaking.

      The arms race

      What is an arms race, game theoretically? It’s an iterated prisoners’ dilemma, seems to me. Each round looks something like this:

      Player 1 chooses a row, Player 2 chooses a column, and the resulting payoffs are listed in each cell, for {Player 1, Player 2}

      In this example, building weapons costs one unit. If anyone ends the round with more weapons than anyone else, they take all of their stuff (ten units).

      In a single round of the game it’s always better to build weapons than not (assuming your actions are devoid of implications about your opponent’s actions). And it’s always better to get the hell out of this game.
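      Concretely, the round described above can be written out as a payoff matrix derived from those made-up units (one-unit build cost, ten-unit takeover), and the dominance of building checked directly:

```python
# Payoffs for one round of the 'arms race' game, as (Player 1, Player 2),
# derived from the units in the text: building costs one unit, and whoever
# ends the round with more weapons takes all of the other's stuff (ten units).
arms_race = {
    ("build", "build"): (-1, -1),   # both pay the cost; nobody gains ground
    ("build", "pass"):  (9, -10),   # builder pays 1 and takes 10; other loses 10
    ("pass",  "build"): (-10, 9),
    ("pass",  "pass"):  (0, 0),
}

# 'Build' strictly dominates 'pass': whatever the other player does,
# you are better off building.
for other_move in ("build", "pass"):
    assert arms_race[("build", other_move)][0] > arms_race[("pass", other_move)][0]
```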

      This is not much like what the current AI situation looks like, if you think AI poses a substantial risk of destroying the world.

      The suicide race

      A closer model: as above except if anyone chooses to build, everything is destroyed (everyone loses all their stuff—ten units of value—as well as one unit if they built).

      This is importantly different from the classic ‘arms race’ in that pressing the ‘everyone loses now’ button isn’t an equilibrium strategy.

      That is: for anyone who thinks powerful misaligned AI represents near-certain death, the existence of other possible AI builders is not any reason to ‘race’. 

      But few people are that pessimistic. How about a milder version where there’s a good chance that the players ‘align the AI’?

      The safety-or-suicide race 

      Ok, let’s do a game like the last but where if anyone builds, everything is only maybe destroyed (minus ten to all), and in the case of survival, everyone returns to the original arms race fun of redistributing stuff based on who built more than whom (+10 to a builder and -10 to a non-builder if there is one of each). So if you build AI alone, and get lucky on the probabilistic apocalypse, you can still win big.

      Let’s take 50% as the chance of doom if any building happens. Then we have a game whose expected payoffs are halfway between those in the last two games:

      (These are expected payoffs—the minus one unit return to building alone comes from the one unit cost of building, plus half a chance of losing ten in an extinction event and half a chance of taking ten from your opponent in a world takeover event.)

      Now you want to do whatever the other player is doing: build if they’ll build, pass if they’ll pass. 

      If the odds of destroying the world were very low, this would become the original arms race, and you’d always want to build. If very high, it would become the suicide race, and you’d never want to build. What the probabilities have to be in the real world to get you into something like these different phases is going to be different, because all these parameters are made up (the downside of human extinction is not 10x the research costs of building powerful AI, for instance).
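      This phase structure can be sketched directly, using the same made-up units, with `p` the chance of doom if anyone builds:

```python
def expected_payoff(me, other, p):
    """Expected payoff to 'me' in the safety-or-suicide game: building costs
    1 unit, doom (everyone loses 10) strikes with probability p if anyone
    builds, and on survival a lone builder takes the non-builder's 10 units."""
    if me == "pass" and other == "pass":
        return 0.0
    payoff = -10 * p                 # expected loss from possible doom
    if me == "build":
        payoff -= 1                  # cost of building
        if other == "pass":
            payoff += 10 * (1 - p)   # takeover if the world survives
    elif other == "build":
        payoff -= 10 * (1 - p)       # lose your stuff even if we survive
    return payoff

# At p = 0.5 (the 50% doom case), you want to copy the other player:
assert expected_payoff("build", "pass", 0.5) == -1.0   # worse than passing (0)
assert expected_payoff("build", "build", 0.5) == -6.0  # better than passing (-10)
# At low p the game collapses to the arms race (always build)...
assert expected_payoff("build", "pass", 0.05) > 0
# ...and at high p to the suicide race (never build).
assert expected_payoff("pass", "build", 0.95) > expected_payoff("build", "build", 0.95)
```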

      But my point stands: even in terms of simplish models, it’s very non-obvious that we are in or near an arms race. And therefore, very non-obvious that racing to build advanced AI faster is even promising at a first pass.

      In less game-theoretic terms: if you don’t seem anywhere near solving alignment, then racing as hard as you can to be the one it falls upon to have solved alignment—especially if that means having less time to do so, though I haven’t discussed that here—is probably unstrategic. Having more ideologically pro-safety AI designers win an ‘arms race’ against less concerned teams is futile if you don’t have a way for such people to implement enough safety to actually not die, which seems like a very live possibility. (Robby Bensinger and maybe Andrew Critch somewhere make similar points.)

      Conversations with my friends on this kind of topic can go like this:

      Me: there’s no real incentive to race if the prize is mutual death

      Them: sure, but it isn’t—if there’s a sliver of hope of surviving unaligned AI, and if your side taking control in that case is a bit better in expectation, and if they are going to build powerful AI anyway, then it’s worth racing. The whole future is on the line!

      Me: Wouldn’t you still be better off directing your own efforts to safety, since your safety efforts will also help everyone end up with a safe AI? 

      Them: It will probably only help them somewhat—you don’t know if the other side will use your safety research. But also, it’s not just that they have less safety research. Their values are probably worse, by your lights. 

      Me: If they succeed at alignment, are foreign values really worse than local ones? Probably any humans with vast intelligence at hand have a similar shot at creating a glorious human-ish utopia, no?

      Them: No, even if you’re right that being similarly human gets you to similar values in the end, the other parties might be more foolish than our side, and lock-in7 some poorly thought-through version of their values that they want at the moment, or even if all projects would be so foolish, our side might have better poorly thought-through values to lock in, as well as being more likely to use safety ideas at all. Even if racing is very likely to lead to death, and survival is very likely to lead to squandering most of the value, in that sliver of happy worlds so much is at stake in whether it is us or someone else doing the squandering!

      Me: Hmm, seems complicated, I’m going to need paper for this.

      The complicated race/anti-race

      Here is a spreadsheet of models you can make a copy of and play with.

      The first model is like this:

      1. Each player divides their effort between safety and capabilities
      2. One player ‘wins’, i.e. builds ‘AGI’ (artificial general intelligence) first. 
      3. P(Alice wins) is a logistic function of Alice’s capabilities investment relative to Bob’s
      4. Each player’s total safety is their own safety investment plus a fraction of the other’s safety investment.
      5. For each player there is some distribution of outcomes if they achieve safety, and a set of outcomes if they do not, which takes into account e.g. their proclivities for enacting stupid near-term lock-ins.
      6. The outcome is a distribution over winners and states of alignment, each of which is a distribution of worlds (e.g. utopia, near-term good lock-in..)
      7. That all gives us a number of utils (Delicious utils!)
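      As a sketch of how such a model fits together—with illustrative utils and parameters of my own, not the ones in the spreadsheet—the first model might look like:

```python
import math

def p_alice_wins(cap_alice, cap_bob, steepness=5.0):
    """Step 3: P(Alice wins) is logistic in relative capabilities investment."""
    return 1 / (1 + math.exp(-steepness * (cap_alice - cap_bob)))

def expected_utils_alice(safety_alice, safety_bob, spillover=0.5,
                         u_win_safe=100, u_win_unsafe=-50,
                         u_lose_safe=90, u_lose_unsafe=-50):
    """Steps 1-7 for Alice. Each player splits one unit of effort between
    safety and capabilities; the winner's total safety (own plus a fraction
    of the other's) is read here as the probability of an aligned outcome.
    The u_* utils are hypothetical stand-ins for the distributions over
    worlds (utopia, lock-ins, doom) in the real model."""
    cap_alice, cap_bob = 1 - safety_alice, 1 - safety_bob
    p_win = p_alice_wins(cap_alice, cap_bob)
    safe_if_alice = min(1.0, safety_alice + spillover * safety_bob)
    safe_if_bob = min(1.0, safety_bob + spillover * safety_alice)
    return (p_win * (safe_if_alice * u_win_safe + (1 - safe_if_alice) * u_win_unsafe)
            + (1 - p_win) * (safe_if_bob * u_lose_safe + (1 - safe_if_bob) * u_lose_unsafe))

# With Bob racing (all capabilities, no safety), these made-up numbers already
# favor Alice going all-in on safety over racing back:
assert expected_utils_alice(1.0, 0.0) > expected_utils_alice(0.0, 0.0)
```

Even in this toy version, nudging the spillover, steepness, or utils moves the optimal split around a lot, which is the point about parameter sensitivity below.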

      The second model is the same except that instead of dividing effort between safety and capabilities, you choose a speed, and the amount of alignment being done by each party is an exogenous parameter. 

      These models probably aren’t very good, but so far support a key claim I want to make here: it’s pretty non-obvious whether one should go faster or slower in this kind of scenario—it’s sensitive to a lot of different parameters in plausible ranges. 

      Furthermore, I don’t think the results of quantitative analysis match people’s intuitions here.

      For example, here’s a situation which I think sounds intuitively like a you-should-race world, but where in the first model above, you should actually go as slowly as possible (this should be the one plugged into the spreadsheet now):

      • AI is pretty safe: unaligned AGI has a mere 7% chance of causing doom, plus a further 7% chance of causing short term lock-in of something mediocre
      • Your opponent risks bad lock-in: If there’s a ‘lock-in’ of something mediocre, your opponent has a 5% chance of locking in something actively terrible, whereas you’ll always pick the good mediocre lock-in world (and mediocre lock-ins are either 5% as good as utopia or -5% as good)
      • Your opponent risks messing up utopia: In the event of aligned AGI, you will reliably achieve the best outcome, whereas your opponent has a 5% chance of ending up in a ‘mediocre bad’ scenario then too.
      • Safety investment obliterates your chance of getting to AGI first: moving from no safety at all to full safety means you go from a 50% chance of being first to a 0% chance
      • Your opponent is racing: Your opponent is investing everything in capabilities and nothing in safety
      • Safety work helps others at a steep discount:  your safety work contributes 50% to the other player’s safety 

      Your best bet here (on this model) is still to maximize safety investment. Why? Because by aggressively pursuing safety, you can get the other side half way to full safety, which is worth a lot more than the lost chance of winning. Especially since if you ‘win’, you do so without much safety, and your victory without safety is worse than your opponent’s victory with safety, even if that too is far from perfect.

      So if you are in a situation in this space, and the other party is racing, it’s not obvious if it is even in your narrow interests within the game to go faster at the expense of safety, though it may be.

      These models are flawed in many ways, but I think they are better than the intuitive models that support arms-racing. My guess is that the next better still models remain nuanced.

      Other equilibria and other games

      Even if it would be in your interests to race if the other person were racing, ‘(do nothing, do nothing)’ is often an equilibrium too in these games. At least for various settings of the parameters. It doesn’t necessarily make sense to do nothing in the hope of getting to that equilibrium if you know your opponent to be mistaken about that and racing anyway, but in conjunction with communicating with your ‘opponent’, it seems like a theoretically good strategy.

      This has all been assuming the structure of the game. I think the traditional response to an arms race situation is to remember that you are in a more elaborate world with all kinds of unmodeled affordances, and try to get out of the arms race. 

      Being friends with risk-takers

      Caution is cooperative

      Another big concern is that pushing for slower AI progress is ‘defecting’ against AI researchers who are friends of the AI safety community. 

      For instance Steven Byrnes:

      “I think that trying to slow down research towards AGI through regulation would fail, because everyone (politicians, voters, lobbyists, business, etc.) likes scientific research and technological development, it creates jobs, it cures diseases, etc. etc., and you’re saying we should have less of that. So I think the effort would fail, and also be massively counterproductive by making the community of AI researchers see the community of AGI safety / alignment people as their enemies, morons, weirdos, Luddites, whatever.”

      (Also a good example of the view criticized earlier, that regulation of things that create jobs and cure diseases just doesn’t happen.)

      Or Eliezer Yudkowsky, on the worry that spreading fear about AI would alienate top AI labs:

      I don’t think this is a natural or reasonable way to see things, because:

      1. The researchers themselves probably don’t want to destroy the world. Many of them also actually agree that AI is a serious existential risk. So in two natural ways, pushing for caution is cooperative with many if not most AI researchers.
      2. AI researchers do not have a moral right to endanger the world, that someone would be stepping on by requiring that they move more cautiously. Like, why does ‘cooperation’ look like the safety people bowing to what the more reckless capabilities people want, to the point of fearing to represent their actual interests, while the capabilities people uphold their side of the ‘cooperation’ by going ahead and building dangerous AI? This situation might make sense as a natural consequence of different people’s power in the situation. But then don’t call it a ‘cooperation’, from which safety-oriented parties would be dishonorably ‘defecting’ were they to consider exercising any power they did have. 

      It could be that people in control of AI capabilities would respond negatively to AI safety people pushing for slower progress. But that should be called ‘we might get punished’ not ‘we shouldn’t defect’. ‘Defection’ has moral connotations that are not due. Calling one side pushing for their preferred outcome ‘defection’ unfairly disempowers them by wrongly setting commonsense morality against them.

      At least if it is the safety side. If any of the available actions are ‘defection’ that the world in general should condemn, I claim that it is probably ‘building machines that will plausibly destroy the world, or standing by while it happens’. 

      (This would be more complicated if the people involved were confident that they wouldn’t destroy the world and I merely disagreed with them. But about half of surveyed researchers are actually more pessimistic than me. And in a situation where the median AI researcher thinks the field has a 5-10% chance of causing human extinction, how confident can any responsible person be in their own judgment that it is safe?)  

      On top of all that, I worry that highlighting the narrative that wanting more cautious progress is defection is further destructive, because it makes it more likely that AI capabilities people see AI safety people as thinking of themselves as betraying AI researchers, if anyone engages in any such efforts. Which makes the efforts more aggressive. Like, if every time you see friends, you refer to it as ‘cheating on my partner’, your partner may reasonably feel hurt by your continual desire to see friends, even though the activity itself is innocuous.

      ‘We’ are not the US, ‘we’ are not the AI safety community

      “If ‘we’ try to slow down AI, then the other side might win.” “If ‘we’ ask for regulation, then it might harm ‘our’ relationships with AI capabilities companies.” Who are these ‘we’s? Why are people strategizing for those groups in particular? 

      Even if slowing AI were uncooperative, and it were important for the AI Safety community to cooperate with the AI capabilities community, couldn’t one of the many people not in the AI Safety community work on it? 

      I have a longstanding irritation with thoughtless talk about what ‘we’ should do, without regard for what collective one is speaking for. So I may be too sensitive about it here. But I think confusions arising from this have genuine consequences.

      I think when people say ‘we’ here, they generally imagine that they are strategizing on behalf of a) the AI safety community, b) the USA, c) themselves, or d) themselves and their readers. But those are a small subset of people, and not even obviously the ones the speaker can most influence (does the fact that you are sitting in the US really make the US more likely to listen to your advice than e.g. Estonia? Yeah probably on average, but not infinitely much.) If these naturally identified-with groups don’t have good options, that hardly means there are no options to be had, or to be communicated to other parties. Could the speaker speak to a different ‘we’? Maybe someone in the ‘we’ the speaker has in mind knows someone not in that group? If there is a strategy for anyone in the world, and you can talk, then there is probably a strategy for you.

      The starkest appearance of error along these lines to me is in writing off the slowing of AI as inherently destructive of relations between the AI safety community and other AI researchers. If we grant that such activity would be seen as a betrayal (which seems unreasonable to me, but maybe), surely it could only be a betrayal if carried out by the AI safety community. There are quite a lot of people who aren’t in the AI safety community and have a stake in this, so maybe some of them could do something. It seems like a huge oversight to give up on all slowing of AI progress because you are only considering affordances available to the AI Safety Community. 

      Another example: if the world were in the basic arms race situation sometimes imagined, and the United States would be willing to make laws to mitigate AI risk, but could not because China would barge ahead, then that means China is in a great place to mitigate AI risk. Unlike the US, China could propose mutual slowing down, and the US would go along. Maybe it’s not impossible to communicate this to relevant people in China. 

      An oddity of this kind of discussion which feels related is the persistent assumption that one’s ability to act is restricted to the United States. Maybe I fail to understand the extent to which Asia is an alien and distant land where agency doesn’t apply, but for instance I just wrote to like a thousand machine learning researchers there, and maybe a hundred wrote back, and it was a lot like interacting with people in the US.

      I’m pretty ignorant about what interventions will work in any particular country, including the US, but I just think it’s weird to come to the table assuming that you can essentially only affect things in one country. Especially if the situation is that you believe you have unique knowledge about what is in the interests of people in other countries. Like, fair enough I would be deal-breaker-level pessimistic if you wanted to get an Asian government to elect you leader or something. But if you think advanced AI is highly likely to destroy the world, including other countries, then the situation is totally different. If you are right, then everyone’s incentives are basically aligned. 

      I more weakly suspect some related mental shortcut is misshaping the discussion of arms races in general. The thought that something is a ‘race’ seems much stickier than alternatives, even if the true incentives don’t really make it a race. Like, against the laws of game theory, people sort of expect the enemy to try to believe falsehoods, because it will better contribute to their racing. And this feels like realism. The uncertain details of billions of people one barely knows about, with all manner of interests and relationships, just really want to form themselves into an ‘us’ and a ‘them’ in zero-sum battle. This is a mental shortcut that could really kill us.

      My impression is that in practice, for many of the technologies slowed down for risk or ethics, mentioned in section ‘Extremely valuable technologies’ above, countries with fairly disparate cultures have converged on similar approaches to caution. I take this as evidence that none of ethical thought, social influence, political power, or rationality are actually very siloed by country, and in general the ‘countries in contest’ model of everything isn’t very good.

      Notes on tractability

      Convincing people doesn’t seem that hard

      When I say that ‘coordination’ can just look like popular opinion punishing an activity, or that other countries don’t have much real incentive to build machines that will kill them, I think a common objection is that convincing people of the real situation is hopeless. The picture seems to be that the argument for AI risk is extremely sophisticated and only able to be appreciated by the most elite of intellectual elites—e.g. it’s hard enough to convince professors on Twitter, so surely the masses are beyond its reach, and foreign governments too. 

      This doesn’t match my overall experience on various fronts.

      Some observations:

      • The median surveyed ML researcher seems to think AI will destroy humanity with 5-10% chance, as I mentioned
      • Often people are already intellectually convinced but haven’t integrated that into their behavior, and it isn’t hard to help them organize to act on their tentative beliefs
      • As noted by Scott, a lot of AI safety people have gone into AI capabilities including running AI capabilities orgs, so those people presumably consider AI to be risky already
      • I don’t remember ever having any trouble discussing AI risk with random strangers. Sometimes they are also fairly worried (e.g. a makeup artist at Sephora gave an extended rant about the dangers of advanced AI, and my driver in Santiago excitedly concurred and showed me Homo Deus open on his front seat). The form of the concerns is probably a bit different from those of the AI safety community, but I think broadly closer to ‘AI agents are going to kill us all’ than ‘algorithmic bias will be bad’. I can’t remember how many times I have tried this, but pre-pandemic I used to talk to Uber drivers a lot, due to having no idea how to avoid it. I explained AI risk to my therapist recently, as an aside regarding his sense that I might be catastrophizing, and I feel like it went okay, though we may need to discuss again. 
      • My impression is that most people haven’t even come into contact with the arguments that might bring one to agree precisely with the AI safety community. For instance, my guess is that a lot of people assume that someone actually programmed modern AI systems, and if you told them that in fact they are random connections jiggled in a gainful direction unfathomably many times, just as mysterious to their makers, they might also fear misalignment. 
      • Nick Bostrom, Eliezer Yudkowsky, and other early thinkers have had decent success at convincing a bunch of other people to worry about this problem, e.g. me. And to my knowledge, without writing any compelling and accessible account of why one should do so that would take less than two hours to read.
      • I arrogantly think I could write a broadly compelling and accessible case for AI risk

      My weak guess is that immovable AI risk skeptics are concentrated in intellectual circles near the AI risk people, especially on Twitter, and that people with less of a horse in the intellectual status race are more readily like, ‘oh yeah, superintelligent robots are probably bad’. It’s not clear that most people even need convincing that there is a problem, though they don’t seem to consider it the most pressing problem in the world. (Though all of this may be different in cultures I am more distant from, e.g. in China.) I’m pretty non-confident about this, but skimming survey evidence suggests there is substantial though not overwhelming public concern about AI in the US8.

      Do you need to convince everyone?

      I could be wrong, but I’d guess convincing the ten most relevant leaders of AI labs that this is a massive deal, worth prioritizing, actually gets you a decent slow-down. I don’t have much evidence for this.

      Buying time is big

      You probably aren’t going to avoid AGI forever, and maybe huge efforts will buy you a couple of years.9 Could that even be worth it? 

      Seems pretty plausible:

      1. Whatever kind of other AI safety research or policy work people were doing could be happening at a non-negligible rate per year. (Along with all other efforts to make the situation better—if you buy a year, that’s eight billion extra person years of time, so only a tiny bit has to be spent usefully for this to be big. If a lot of people are worried, that doesn’t seem crazy.)
      2. Geopolitics just changes pretty often. If you seriously think a big determiner of how badly things go is inability to coordinate with certain groups, then every year gets you non-negligible opportunities for the situation changing in a favorable way. 
      3. Public opinion can change a lot quickly. If you can only buy one year, you might still be buying a decent shot of people coming around and granting you more years. Perhaps especially if new evidence is actively avalanching in—people changed their minds a lot in February 2020.
      4. Other stuff happens over time. If you can take your doom today or after a couple of years of random events happening, the latter seems non-negligibly better in general.

      It is also not obvious to me that these are the time-scales on the table. My sense is that things which are slowed down by regulation or general societal distaste are often slowed down much more than a year or two, and Eliezer’s stories presume that the world is full of collectives either trying to destroy the world or badly mistaken about it, which is not a foregone conclusion.

      Delay is probably finite by default 

      While some people worry that any delay would be so short as to be negligible, others seem to fear that if AI research were halted, it would never start again and we would fail to go to space or something. This sounds so wild to me that I think I’m missing too much of the reasoning to usefully counterargue.

      Obstruction doesn’t need discernment

      Another purported risk of trying to slow things down is that it might involve getting regulators involved, and they might be fairly ignorant about the details of futuristic AI, and so tenaciously make the wrong regulations. Relatedly, if you call on the public to worry about this, they might have inexacting worries that call for impotent solutions and distract from the real disaster.

      I don’t buy it. If all you want is to slow down a broad area of activity, my guess is that ignorant regulations do just fine at that every day (usually unintentionally). In particular, my impression is that if you mess up regulating things, a usual outcome is that many things are randomly slower than hoped. If you wanted to speed a specific thing up, that’s a very different story, and might require understanding the thing in question.

      The same goes for social opposition. Nobody need understand the details of how genetic engineering works for its ascendancy to be seriously impaired by people not liking it. Maybe by their lights it still isn’t optimally undermined yet, but just not liking anything in the vicinity does go a long way.

      This has nothing to do with regulation or social shaming specifically. You need to understand much less about a car or a country or a conversation to mess it up than to make it run well. It is a consequence of the general rule that there are many more ways for a thing to be dysfunctional than functional: destruction is easier than creation.

      Back at the object level, I tentatively expect efforts to broadly slow down things in the vicinity of AI progress to slow down AI progress on net, even if poorly aimed.

      Safety from speed, clout from complicity

      Maybe it’s actually better for safety to have AI go fast at present, for various reasons. Notably:

      1. Implementing what can be implemented as soon as possible probably means smoother progress, which is probably safer because a) it makes it harder for one party to shoot ahead of everyone and gain power, and b) people make better choices all around if they are correct about what is going on (e.g. they don’t put trust in systems that turn out to be much more powerful than expected).
      2. If the main thing achieved by slowing down AI progress is more time for safety research, and safety research is more effective when carried out in the context of more advanced AI, and there is a certain amount of slowing down that can be done (e.g. because one is in fact in an arms race but has some lead over competitors), then it might be better to use one’s slowing budget later.
      3. If there is some underlying curve of potential for progress (e.g. if money that might be spent on hardware just grows a certain amount each year), then perhaps pushing ahead now will naturally require things to be slower later, so it won’t affect the overall time to powerful AI, but will mean we spend more time in the informative pre-catastrophic-AI era.
      4. (More things go here I think)

      And maybe it’s worth it to work on capabilities research at present, for instance because:

      1. As a researcher, working on capabilities prepares you to work on safety
      2. You think the room where AI happens will afford good options for a person who cares about safety

      These all seem plausible. But also plausibly wrong. I don’t know of a decisive analysis of any of these considerations, and am not going to do one here. My impression is that they could basically all go either way.

      I am actually particularly skeptical of the final argument, because if you believe what I take to be the normal argument for AI risk—that superhuman artificial agents won’t have acceptable values, and will aggressively manifest whatever values they do have, to the sooner or later annihilation of humanity—then the sentiments of the people turning on such machines seem like a very small factor, so long as they still turn the machines on. And I suspect that ‘having a person with my values doing X’ is commonly overrated. But the world is messier than these models, and I’d still pay a lot to be in the room to try.

      Moods and philosophies, heuristics and attitudes 

      It’s not clear what role these psychological characters should play in a rational assessment of how to act, but I think they do play a role, so I want to argue about them.

      Technological choice is not luddism

      Some technologies are better than others [citation not needed]. The best pro-technology visions should disproportionately involve awesome technologies and avoid shitty technologies, I claim. If you think AGI is highly likely to destroy the world, then it is the pinnacle of shittiness as a technology. Being opposed to having it in your techno-utopia is about as luddite as refusing to have radioactive toothpaste there. Colloquially, Luddites are against progress if it comes as technology.10 Even if that’s a terrible position, its wise reversal is not the endorsement of all ‘technology’, regardless of whether it comes as progress.

      Non-AGI visions of near-term thriving

      Perhaps slowing down AI progress means foregoing our own generation’s hope for life-changing technologies. Some people thus find it psychologically difficult to aim for less AI progress (with its real personal costs), rather than shooting for the perhaps unlikely ‘safe AGI soon’ scenario.

      I’m not sure that this is a real dilemma. The narrow AI progress we have seen already—i.e. further applications of current techniques at current scales—seems plausibly able to help a lot with longevity and other medicine for instance. And to the extent AI efforts could be focused on e.g. medically relevant narrow systems over creating agentic scheming gods, it doesn’t sound crazy to imagine making more progress on anti-aging etc as a result (even before taking into account the probability that the agentic scheming god does not prioritize your physical wellbeing as hoped). Others disagree with me here.

      Robust priors vs. specific galaxy-brained models

      There are things that are robustly good in the world, and things that are good on highly specific inside-view models and terrible if those models are wrong. Slowing dangerous tech development seems like the former, whereas forwarding arms races for dangerous tech between world superpowers seems more like the latter.11 There is a general question of how much to trust your reasoning and risk the galaxy-brained plan.12 But whatever your take on that, I think we should all agree that the less thought you have put into it, the more you should regress to the robustly good actions. Like, if it just occurred to you to take out a large loan to buy a fancy car, you probably shouldn’t do it because most of the time it’s a poor choice. Whereas if you have been thinking about it for a month, you might be sure enough that you are in the rare situation where it will pay off. 

      On this particular topic, it feels like people are going with the specific galaxy-brained inside-view terrible-if-wrong model off the bat, then not thinking about it more. 

      Cheems mindset/can’t do attitude

      Suppose you have a friend, and you say ‘let’s go to the beach’ to them. Sometimes the friend is like ‘hell yes’ and then even if you don’t have towels or a mode of transport or time or a beach, you make it happen. Other times, even if you have all of those things, and your friend nominally wants to go to the beach, they will note that they have a package coming later, and that it might be windy, and their jacket needs washing. And when you solve those problems, they will note that it’s not that long until dinner time. You might infer that in the latter case your friend just doesn’t want to go to the beach. And sometimes that is the main thing going on! But I think there are also broader differences in attitudes: sometimes people are looking for ways to make things happen, and sometimes they are looking for reasons that they can’t happen. This is sometimes called a ‘cheems attitude’, or I like to call it (more accessibly) a ‘can’t do attitude’.

      My experience in talking about slowing down AI with people is that they seem to have a can’t do attitude. They don’t want it to be a reasonable course: they want to write it off. 

      Which both seems suboptimal, and is strange in contrast with historical attitudes to more technical problem-solving. (As highlighted in my dialogue from the start of the post.)

      It seems to me that if the same degree of can’t-do attitude were applied to technical safety, there would be no AI safety community because in 2005 Eliezer would have noticed any obstacles to alignment and given up and gone home.

      To quote a friend on this, what would it look like if we *actually tried*?


      This has been a miscellany of critiques against a pile of reasons I’ve met for not thinking about slowing down AI progress. I don’t think we’ve seen much reason here to be very pessimistic about slowing down AI, let alone reason for not even thinking about it.

      I could go either way on whether any interventions to slow down AI in the near term are a good idea. My tentative guess is yes, but my main point here is just that we should think about it.

      A lot of opinions on this subject seem to me to be poorly thought through, in error, and to have wrongly repelled the further thought that might rectify them. I hope to have helped a bit here by examining some such considerations enough to demonstrate that there are no good grounds for immediate dismissal. There are difficulties and questions, but if the same standards for ambition were applied here as elsewhere, I think we would see answers and action.


      Thanks to Scott Alexander, Adam Scholl, Matthijs Maas, Joe Carlsmith, Ben Weinstein-Raun, Ronny Fernandez, Aysja Johnson, Jaan Tallinn, Rick Korzekwa, Owain Evans, Andrew Critch, Michael Vassar, Jessica Taylor, Rohin Shah, Jeffrey Heninger, Zach Stein-Perlman, Anthony Aguirre, Matthew Barnett, David Krueger, Harlan Stewart, Rafe Kennedy, Nick Beckstead, Leopold Aschenbrenner, Michaël Trazzi, Oliver Habryka, Shahar Avin, Luke Muehlhauser, Michael Nielsen, Nathan Young and quite a few others for discussion and/or encouragement.


      1 I haven’t heard this in recent times, so maybe views have changed. An example of earlier times: Nick Beckstead, 2015: “One idea we sometimes hear is that it would be harmful to speed up the development of artificial intelligence because not enough work has been done to ensure that when very advanced artificial intelligence is created, it will be safe. This problem, it is argued, would be even worse if progress in the field accelerated. However, very advanced artificial intelligence could be a useful tool for overcoming other potential global catastrophic risks. If it comes sooner—and the world manages to avoid the risks that it poses directly—the world will spend less time at risk from these other factors….

      I found that speeding up advanced artificial intelligence—according to my simple interpretation of these survey results—could easily result in reduced net exposure to the most extreme global catastrophic risks…”

      2 This is closely related to Bostrom’s Technological completion conjecture: “If scientific and technological development efforts do not effectively cease, then all important basic capabilities that could be obtained through some possible technology will be obtained.” (Bostrom, Superintelligence, pp. 228, Chapter 14, 2014)

      Bostrom illustrates this kind of position (though apparently rejects it; from Superintelligence, found here): “Suppose that a policymaker proposes to cut funding for a certain research field, out of concern for the risks or long-term consequences of some hypothetical technology that might eventually grow from its soil. She can then expect a howl of opposition from the research community. Scientists and their public advocates often say that it is futile to try to control the evolution of technology by blocking research. If some technology is feasible (the argument goes) it will be developed regardless of any particular policymaker’s scruples about speculative future risks. Indeed, the more powerful the capabilities that a line of development promises to produce, the surer we can be that somebody, somewhere, will be motivated to pursue it. Funding cuts will not stop progress or forestall its concomitant dangers.”

      This kind of thing is also discussed by Dafoe and Sundaram, Maas & Beard

      3 (Some inspiration from Matthijs Maas’ spreadsheet, from Paths Untaken, and from GPT-3.)

      4 From a private conversation with Rick Korzekwa, who may have read https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1139110/ and an internal draft at AI Impacts, probably forthcoming.

      5 More here and here. I haven’t read any of these, but it’s been a topic of discussion for a while.

      6 “To aid in promoting secrecy, schemes to improve incentives were devised. One method sometimes used was for authors to send papers to journals to establish their claim to the finding but ask that publication of the papers be delayed indefinitely.26,27,28,29 Szilárd also suggested offering funding in place of credit in the short term for scientists willing to submit to secrecy and organizing limited circulation of key papers.30” – Me, previously

      7 ‘Lock-in’ of values is the act of using powerful technology such as AI to ensure that specific values will stably control the future.

      8 And also in Britain:

      ‘This paper discusses the results of a nationally representative survey of the UK population on their perceptions of AI…the most common visions of the impact of AI elicit significant anxiety. Only two of the eight narratives elicited more excitement than concern (AI making life easier, and extending life). Respondents felt they had no control over AI’s development, citing the power of corporations or government, or versions of technological determinism. Negotiating the deployment of AI will require contending with these anxieties.’

      9 Or so worries Eliezer Yudkowsky—
      In MIRI announces new “Death With Dignity” strategy:

      • “… this isn’t primarily a social-political problem, of just getting people to listen.  Even if DeepMind listened, and Anthropic knew, and they both backed off from destroying the world, that would just mean Facebook AI Research destroyed the world a year(?) later.”

      In AGI Ruin: A List of Lethalities:

      • “We can’t just “decide not to build AGI” because GPUs are everywhere, and knowledge of algorithms is constantly being improved and published; 2 years after the leading actor has the capability to destroy the world, 5 other actors will have the capability to destroy the world.  The given lethal challenge is to solve within a time limit, driven by the dynamic in which, over time, increasingly weak actors with a smaller and smaller fraction of total computing power, become able to build AGI and destroy the world.  Powerful actors all refraining in unison from doing the suicidal thing just delays this time limit – it does not lift it, unless computer hardware and computer software progress are both brought to complete severe halts across the whole Earth.  The current state of this cooperation to have every big actor refrain from doing the stupid thing, is that at present some large actors with a lot of researchers and computing power are led by people who vocally disdain all talk of AGI safety (eg Facebook AI Research).  Note that needing to solve AGI alignment only within a time limit, but with unlimited safe retries for rapid experimentation on the full-powered system; or only on the first critical try, but with an unlimited time bound; would both be terrifically humanity-threatening challenges by historical standards individually.”

      10 I’d guess real Luddites also thought the technological changes they faced were anti-progress, but in that case were they wrong to want to avoid them?

      11 I hear this is an elaboration on this theme, but I haven’t read it.

      12 Leopold Aschenbrenner partly defines ‘Burkean Longtermism’ thus: “We should be skeptical of any radical inside-view schemes to positively steer the long-run future, given the froth of uncertainty about the consequences of our actions.”

  • Counterarguments to the basic AI risk case

    Crossposted from The AI Impacts blog.

    This is going to be a list of holes I see in the basic argument for existential risk from superhuman AI systems1.

    To start, here’s an outline of what I take to be the basic case2:

    I. If superhuman AI systems are built, any given system is likely to be ‘goal-directed’

    Reasons to expect this:

    1. Goal-directed behavior is likely to be valuable, e.g. economically.
    2. Goal-directed entities may tend to arise from machine learning training processes not intending to create them (at least via the methods that are likely to be used).
    3. ‘Coherence arguments’ may imply that systems with some goal-directedness will become more strongly goal-directed over time.

    II. If goal-directed superhuman AI systems are built, their desired outcomes will probably be about as bad as an empty universe by human lights

    Reasons to expect this:

    1. Finding useful goals that aren’t extinction-level bad appears to be hard: we don’t have a way to usefully point at human goals, and divergences from human goals seem likely to produce goals that are in intense conflict with human goals, due to a) most goals producing convergent incentives for controlling everything, and b) value being ‘fragile’, such that an entity with ‘similar’ values will generally create a future of virtually no value.
    2. Finding goals that are extinction-level bad and temporarily useful appears to be easy: for example, advanced AI with the sole objective ‘maximize company revenue’ might profit said company for a time before gathering the influence and wherewithal to pursue the goal in ways that blatantly harm society.
    3. Even if humanity found acceptable goals, giving a powerful AI system any specific goals appears to be hard. We don’t know of any procedure to do it, and we have theoretical reasons to expect that AI systems produced through machine learning training will generally end up with goals other than those they were trained according to. The resulting randomly aberrant goals are probably extinction-level bad for reasons described in II.1 above.

    III. If most goal-directed superhuman AI systems have bad goals, the future will very likely be bad

    That is, a set of ill-motivated goal-directed superhuman AI systems, of a scale likely to occur, would be capable of taking control over the future from humans. This is supported by at least one of the following being true:

    1. Superhuman AI would destroy humanity rapidly. This may be via ultra-powerful capabilities at e.g. technology design and strategic scheming, or through gaining such powers in an ‘intelligence explosion’ (self-improvement cycle). Either of those things may happen either through exceptional heights of intelligence being reached or through highly destructive ideas being available to minds only mildly beyond our own.
    2. Superhuman AI would gradually come to control the future via accruing power and resources. Power and resources would be more available to the AI system(s) than to humans on average, because of the AI having far greater intelligence.

    Below is a list of gaps in the above, as I see it, and counterarguments. A ‘gap’ is not necessarily unfillable, and may have been filled in any of the countless writings on this topic that I haven’t read. I might even think that a given one can probably be filled. I just don’t know what goes in it.

    This blog post is an attempt to run various arguments by you all on the way to making pages on AI Impacts about arguments for AI risk and corresponding counterarguments. At some point in that process I hope to also read others’ arguments, but this is not that day. So what you have here is a bunch of arguments that occur to me, not an exhaustive literature review. 


    A. Contra “superhuman AI systems will be ‘goal-directed’”

    Different calls to ‘goal-directedness’ don’t necessarily mean the same concept

    ‘Goal-directedness’ is a vague concept. It is unclear that the ‘goal-directednesses’ that are favored by economic pressure, training dynamics or coherence arguments (the component arguments in part I of the argument above) are the same ‘goal-directedness’ that implies a zealous drive to control the universe (i.e. that makes most possible goals very bad, fulfilling II above). 

    One well-defined concept of goal-directedness is ‘utility maximization’: always doing what maximizes a particular utility function, given a particular set of beliefs about the world. 

    Utility maximization does seem to quickly engender an interest in controlling literally everything, at least for many utility functions one might have3. If you want things to go a certain way, then you have reason to control anything which gives you any leverage over that, i.e. potentially all resources in the universe (i.e. agents have ‘convergent instrumental goals’). This is in serious conflict with anyone else with resource-sensitive goals, even if prima facie those goals didn’t look particularly opposed. For instance, a person who wants all things to be red and another person who wants all things to be cubes may not seem to be at odds, given that all things could be red cubes. However if these projects might each fail for lack of energy, then they are probably at odds. 
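The red-things/cube-things point can be made concrete with a toy model. This is just an illustrative sketch with invented numbers (the `satisfaction` function and energy budgets are hypothetical, not from the post): two goals that look compatible conflict exactly when they draw on a scarce shared resource.

```python
# Toy model (hypothetical numbers): "make things red" and "make things
# cubes" look compatible, but both goals consume a shared resource
# (energy), so scarcity puts them at odds.

def satisfaction(energy_taken, energy_needed):
    """Fraction of a goal achieved given the energy devoted to it."""
    return min(1.0, energy_taken / energy_needed)

def outcomes(total_energy, need_red, need_cube):
    """Each way of splitting the energy, with both goals' satisfaction."""
    return [(e, satisfaction(e, need_red),
             satisfaction(total_energy - e, need_cube))
            for e in range(total_energy + 1)]

# Abundant energy: some split fully satisfies both goals -- no conflict.
abundant = outcomes(total_energy=20, need_red=8, need_cube=8)
assert any(red == 1.0 and cube == 1.0 for _, red, cube in abundant)

# Scarce energy: every unit the red-project takes comes out of the
# cube-project's satisfaction -- the prima facie compatible goals conflict.
scarce = outcomes(total_energy=10, need_red=8, need_cube=8)
assert not any(red == 1.0 and cube == 1.0 for _, red, cube in scarce)
```

The conflict here comes purely from resource-sensitivity, not from the goals' contents, which is the sense in which most resource-sensitive goals are ‘at odds’ with each other.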

    Thus utility maximization is a notion of goal-directedness that allows Part II of the argument to work, by making a large class of goals deadly.

    You might think that any other concept of ‘goal-directedness’ would also lead to this zealotry. If one is inclined toward outcome O in any plausible sense, then does one not have an interest in anything that might help procure O? No: if a system is not a ‘coherent’ agent, then it can have a tendency to bring about O in a range of circumstances, without this implying that it will take any given effective opportunity to pursue O. This assumption of consistent adherence to a particular evaluation of everything is part of utility maximization, not a law of physical systems. Call machines that push toward particular goals but are not utility maximizers pseudo-agents. 

    Can pseudo-agents exist? Yes—utility maximization is computationally intractable, so any physically existent ‘goal-directed’ entity is going to be a pseudo-agent. We are all pseudo-agents, at best. But it seems something like a spectrum. At one end is a thermostat, then maybe a thermostat with a better algorithm for adjusting the heat. Then maybe a thermostat which intelligently controls the windows. After a lot of honing, you might have a system much more like a utility-maximizer: a system that deftly seeks out and seizes well-priced opportunities to make your room 68 degrees—upgrading your house, buying R&D, influencing your culture, building a vast mining empire. Humans might not be very far on this spectrum, but they seem enough like utility-maximizers already to be alarming. (And it might not be well modeled as a one-dimensional spectrum—for instance, perhaps ‘tendency to modify oneself to become more coherent’ is a fairly different axis from ‘consistency of evaluations of options and outcomes’, and calling both ‘more agentic’ is obscuring.)

    Nonetheless, it seems plausible that there is a large space of systems which strongly increase the chance of some desirable objective O occurring without even acting as much like maximizers of an identifiable utility function as humans would. For instance, without searching out novel ways of making O occur, or modifying themselves to be more consistently O-maximizing. Call these ‘weak pseudo-agents’. 

    For example, I can imagine a system constructed out of a huge number of ‘IF X THEN Y’ statements (reflexive responses), like ‘if body is in hallway, move North’, ‘if hands are by legs and body is in kitchen, raise hands to waist’, and so on—equivalent to a kind of vector field of motions, such that for every particular state, there are directions that all the parts of you should be moving. I could imagine this being designed to fairly consistently cause O to happen within some context. However since such behavior would not be produced by a process optimizing O, you shouldn’t expect it to find new and strange routes to O, or to seek O reliably in novel circumstances. There appears to be zero pressure for this thing to become more coherent, unless its design already involves reflexes to move its thoughts in certain ways that lead it to change itself. I expect you could build a system like this that reliably runs around and tidies your house say, or runs your social media presence, without it containing any impetus to become a more coherent agent (because it doesn’t have any reflexes that lead to pondering self-improvement in this way).
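A minimal sketch of this contrast, with invented states, rules, and a toy objective (nothing here is from the post beyond the ‘IF X THEN Y’ idea): a reflex table produces sensible behavior in covered states but is inert in novel ones, while even a crude planner searches out new routes to the objective.

```python
# Reflex-based pseudo-agent: a fixed table of 'IF state THEN action'
# rules aimed (by its designer) at O = "house tidy". Hypothetical states.
REFLEXES = {
    ("sock on floor", "hands free"): "pick up sock",
    ("holding sock", "near hamper"): "drop sock in hamper",
}

def reflex_agent(state):
    # No internal representation of O: a state outside the table
    # produces no pressure toward O at all.
    return REFLEXES.get(state, "do nothing")

def planning_agent(state, actions, goal_test):
    # Breadth-first search over action sequences: this one *is*
    # optimizing for the goal, so it finds routes no rule anticipated.
    frontier = [(state, [])]
    seen = {state}
    while frontier:
        s, plan = frontier.pop(0)
        if goal_test(s):
            return plan
        for action, result in actions.get(s, []):
            if result not in seen:
                seen.add(result)
                frontier.append((result, plan + [action]))
    return None

# Familiar state: the reflexes handle it.
assert reflex_agent(("sock on floor", "hands free")) == "pick up sock"
# Novel state: the reflex agent does nothing, rather than improvising.
assert reflex_agent(("sock on ceiling fan", "hands free")) == "do nothing"

# The planner, given the same novel state and a model of actions,
# strings together a route the reflex table never contained.
ACTIONS = {
    ("sock on ceiling fan", "hands free"):
        [("fetch ladder", ("sock reachable", "hands free"))],
    ("sock reachable", "hands free"):
        [("grab sock", ("holding sock", "on ladder"))],
}
plan = planning_agent(("sock on ceiling fan", "hands free"), ACTIONS,
                      lambda s: s[0] == "holding sock")
assert plan == ["fetch ladder", "grab sock"]
```

The reflex agent reliably causes O within the contexts its designer foresaw, without any machinery that would push it toward coherence or novel O-seeking—which is the distinction being drawn.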

    It is not clear that economic incentives generally favor the far end of this spectrum over weak pseudo-agency. There are incentives toward systems being more like utility maximizers, but also incentives against. 

    The reason any kind of ‘goal-directedness’ is incentivised in AI systems is that then the system can be given an objective by someone hoping to use their cognitive labor, and the system will make that objective happen. Whereas a similar non-agentic AI system might still do almost the same cognitive labor, but require an agent (such as a person) to look at the objective and decide what should be done to achieve it, then ask the system for that. Goal-directedness means automating this high-level strategizing. 

    Weak pseudo-agency fulfills this purpose to some extent, but not as well as utility maximization. However if we think that utility maximization is difficult to wield without great destruction, then that suggests a disincentive to creating systems with behavior closer to utility-maximization. Not just from the world being destroyed, but from the same dynamic causing more minor divergences from expectations, if the user can’t specify their own utility function well. 

    That is, if it is true that utility maximization tends to lead to very bad outcomes relative to any slightly different goals (in the absence of great advances in the field of AI alignment), then the most economically favored level of goal-directedness seems unlikely to be as far as possible toward utility maximization. More likely it is a level of pseudo-agency that achieves a lot of the users’ desires without bringing about sufficiently detrimental side effects to make it not worthwhile. (This is likely more agency than is socially optimal, since some of the side-effects will be harms to others, but there seems no reason to think that it is a very high degree of agency.)

    Some minor but perhaps illustrative evidence: anecdotally, people prefer interacting with others who predictably carry out their roles or adhere to deontological constraints, rather than with consequentialists in pursuit of broadly good but somewhat unknown goals. For instance, employers would often prefer employees who predictably follow rules over ones who try to forward company success in unforeseen ways.

    The other arguments to expect goal-directed systems mentioned above seem more likely to suggest approximate utility-maximization rather than some other form of goal-directedness, but it isn’t that clear to me. I don’t know what kind of entity is most naturally produced by contemporary ML training. Perhaps someone else does. I would guess that it’s more like the reflex-based agent described above, at least at present. But present systems aren’t the concern.

    Coherence arguments are arguments for being coherent a.k.a. maximizing a utility function, so one might think that they imply a force for utility maximization in particular. That seems broadly right. Though note that these are arguments that there is some pressure for the system to modify itself to become more coherent. What actually results from specific systems modifying themselves seems like it might have details not foreseen in an abstract argument merely suggesting that the status quo is suboptimal whenever it is not coherent. Starting from a state of arbitrary incoherence and moving iteratively in one of many pro-coherence directions produced by whatever whacky mind you currently have isn’t obviously guaranteed to increasingly approximate maximization of some sensical utility function. For instance, take an entity with a cycle of preferences, apples > bananas = oranges > pears > apples. The entity notices that it sometimes treats oranges as better than pears and sometimes worse. It tries to correct by adjusting the value of oranges to be the same as pears. The new utility function is exactly as incoherent as the old one. Probably moves like this are rarer than ones that make you more coherent in this situation, but I don’t know, and I also don’t know if this is a great model of the situation for incoherent systems that could become more coherent.
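    To illustrate the fruit example, here is a toy check (the graph encoding is my own; cycle-in-a-strict-preference-graph is one standard way of cashing out “no utility function can represent these preferences”) showing that the described repair leaves the incoherence intact:

```python
# Preferences encoded as a directed 'strictly better than' graph. A cycle
# in this graph means no real-valued utility function can represent the
# preferences, which is the incoherence in question.

def has_cycle(edges):
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)

    def dfs(node, stack):
        # Follow strict preferences; revisiting anything on the current
        # path means the relation loops back on itself.
        for nxt in graph.get(node, ()):
            if nxt in stack or dfs(nxt, stack | {nxt}):
                return True
        return False

    return any(dfs(n, {n}) for n in graph)

# apples > bananas = oranges > pears > apples (ties expanded into the
# strict comparisons they imply):
before = [("apples", "bananas"), ("apples", "oranges"),
          ("bananas", "pears"), ("oranges", "pears"),
          ("pears", "apples")]
assert has_cycle(before)

# The attempted repair: set the value of oranges equal to pears. Oranges
# now sit below bananas instead of beside them, but the strict cycle
# apples > bananas > pears > apples is untouched.
after = [("apples", "bananas"), ("apples", "oranges"),
         ("bananas", "pears"), ("bananas", "oranges"),
         ("pears", "apples")]
assert has_cycle(after)
```

    The sketch only shows that a local-looking fix need not reduce incoherence; how often self-modification steps land in this trap is, as the text says, unclear.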

    What it might look like if this gap matters: AI systems proliferate, and have various goals. Some AI systems try to make money in the stock market. Some make movies. Some try to direct traffic optimally. Some try to make the Democratic party win an election. Some try to make Walmart maximally profitable. These systems have no perceptible desire to optimize the universe for forwarding these goals, because they aren’t maximizing a general utility function; they are more ‘behaving like someone who is trying to make Walmart profitable’. They make strategic plans and think about their comparative advantage and forecast business dynamics, but they don’t build nanotechnology to manipulate everybody’s brains, because that’s not the kind of behavior pattern they were designed to follow. The world looks kind of like the current world, in that it is fairly non-obvious what any entity’s ‘utility function’ is. It often looks like AI systems are ‘trying’ to do things, but there’s no reason to think that they are enacting a rational and consistent plan, and they rarely do anything shocking or galaxy-brained.

    Ambiguously strong forces for goal-directedness need to meet an ambiguously high bar to cause a risk

    The forces for goal-directedness mentioned in I are presumably of finite strength. For instance, if coherence arguments correspond to pressure for machines to become more like utility maximizers, there is an empirical answer to how fast that would happen with a given system. There is also an empirical answer to how ‘much’ goal-directedness is needed to bring about disaster, supposing that utility maximization would bring about disaster and, say, being a rock wouldn’t. Without investigating these empirical details, it is unclear whether a particular qualitatively identified force for goal-directedness will cause disaster within a particular time.

    What it might look like if this gap matters: There are not that many systems doing something like utility maximization in the new AI economy. Demand is mostly for systems more like GPT or DALL-E, which transform inputs in some known way without reference to the world, rather than ‘trying’ to bring about an outcome. Maybe the world was headed for more of the latter, but ethical and safety concerns reduced desire for it, and it wasn’t that hard to do something else. Companies setting out to make non-agentic AI systems have no trouble doing so. Incoherent AIs are never observed making themselves more coherent, and training has never produced an agent unexpectedly. There are lots of vaguely agentic things, but they don’t pose much of a problem. There are a few things at least as agentic as humans, but they are a small part of the economy.

    B. Contra “goal-directed AI systems’ goals will be bad”

    Small differences in utility functions may not be catastrophic

    Arguably, humans are likely to have somewhat different values to one another even after arbitrary reflection. If so, there is some extended region of the space of possible values that the values of different humans fall within. That is, ‘human values’ is not a single point.

    If the values of misaligned AI systems fall within that region, this would not appear to be worse in expectation than the situation where the long-run future was determined by the values of humans other than you. (This may still be a huge loss of value relative to the alternative, if a future determined by your own values is vastly better than that chosen by a different human, and if you also expected to get some small fraction of the future, and will now get much less. These conditions seem non-obvious however, and if they obtain you should worry about more general problems than AI.)

    Plausibly even a single human, after reflecting, could on their own come to different places in a whole region of specific values, depending on somewhat arbitrary features of how the reflecting period went. In that case, even the values-on-reflection of a single human is an extended region of values space, and an AI which is only slightly misaligned could be the same as some version of you after reflecting.

    There is a further larger region, ‘that which can be reliably enough aligned with typical human values via incentives in the environment’, which is arguably larger than the circle containing most human values. Human society makes use of this a lot: for instance, most of the time particularly evil humans don’t do anything too objectionable because it isn’t in their interests. This region is probably smaller for more capable creatures such as advanced AIs, but still it is some size.

    Thus it seems that some amount4 of AI divergence from your own values is probably broadly fine, i.e. not worse than what you should otherwise expect without AI.

    Thus in order to arrive at a conclusion of doom, it is not enough to argue that we cannot align AI perfectly. The question is a quantitative one of whether we can get it close enough. And how close is ‘close enough’ is not known. 

    What it might look like if this gap matters: there are many superintelligent goal-directed AI systems around. They are trained to have human-like goals, but we know that their training is imperfect and none of them has goals exactly like those presented in training. However if you just heard about a particular system’s intentions, you wouldn’t be able to guess if it was an AI or a human. Things happen much faster than they did, because superintelligent AI is superintelligent, but not obviously in a direction less broadly in line with human goals than when humans were in charge.

    Differences between AI and human values may be small 

    AI trained to have human-like goals will have something close to human-like goals. How close? Call it d, for a particular occasion of training AI. 

    If d doesn’t have to be 0 for safety (from above), then there is a question of whether it is an acceptable size. 

    I know of two issues here, pushing d upward. One is that with a finite number of training examples, the fit between the true function and the learned function will be imperfect. The other is that you might accidentally create a monster (‘misaligned mesaoptimizer’) who understands its situation and pretends to have the utility function you are aiming for so that it can be freed and go out and manifest its own utility function, which could be just about anything. If this problem is real, then the values of an AI system might be arbitrarily different from the training values, rather than ‘nearby’ in some sense, so d is probably unacceptably large. But if you avoid creating such mesaoptimizers, then it seems plausible to me that d is very small.

    If humans also substantially learn their values via observing examples, then the variation in human values is arising from a similar process, so might be expected to be of a similar scale. If we care to make the ML training process more accurate than the human learning one, it seems likely that we could. For instance, d gets smaller with more data.

    Another line of evidence is that for things that I have seen AI learn so far, the distance from the real thing is intuitively small. If AI learns my values as well as it learns what faces look like, it seems plausible that it carries them out better than I do.

    As minor additional evidence here, I don’t know how to describe any slight differences in utility functions that are catastrophic. Talking concretely, what does a utility function look like that is so close to a human utility function that an AI system has it after a bunch of training, but which is an absolute disaster? Are we talking about the scenario where the AI values a slightly different concept of justice, or values satisfaction a smidgen more relative to joy than it should? And then that’s a moral disaster because it is wrought across the cosmos? Or is it that it looks at all of our inaction and thinks we want stuff to be maintained very similar to how it is now, so crushes any efforts to improve things? 

    What it might look like if this gap matters: when we try to train AI systems to care about what specific humans care about, they usually pretty much do, as far as we can tell. We basically get what we trained for. For instance, it is hard to distinguish them from the human in question. (It is still important to actually do this training, rather than making AI systems not trained to have human values.)

    Maybe value isn’t fragile

    Eliezer argued that value is fragile, via examples of ‘just one thing’ that you can leave out of a utility function, and end up with something very far away from what humans want. For instance, if you leave out ‘boredom’ then he thinks the preferred future might look like repeating the same otherwise perfect moment again and again. (His argument is perhaps longer—that post says there is a lot of important background, though the bits mentioned don’t sound relevant to my disagreement.) This sounds to me like ‘value is not resilient to having components of it moved to zero’, which is a weird usage of ‘fragile’, and in particular, doesn’t seem to imply much about smaller perturbations. And smaller perturbations seem like the relevant thing with AI systems trained on a bunch of data to mimic something. 

    You could very analogously say ‘human faces are fragile’ because if you just leave out the nose it suddenly doesn’t look like a typical human face at all. Sure, but is that the kind of error you get when you try to train ML systems to mimic human faces? Almost none of the faces on thispersondoesnotexist.com are blatantly morphologically unusual in any way, let alone noseless. Admittedly one time I saw someone whose face was neon green goo, but I’m guessing you can get the rate of that down pretty low if you care about it.

    Eight examples, no cherry-picking:

    Skipping the nose is the kind of mistake you make if you are a child drawing a face from memory. Skipping ‘boredom’ is the kind of mistake you make if you are a person trying to write down human values from memory. My guess is that this seemed closer to the plan in 2009 when that post was written, and that people cached the takeaway and haven’t updated it for deep learning which can learn what faces look like better than you can.

    What it might look like if this gap matters: there is a large region ‘around’ my values in value space that is also pretty good according to me. AI easily lands within that space, and eventually creates some world that is about as good as the best possible utopia, according to me. There aren’t a lot of really crazy and terrible value systems adjacent to my values.

    Short-term goals

    Utility maximization really only incentivises drastically altering the universe if one’s utility function places a high enough value on very temporally distant outcomes relative to near ones. That is, long term goals are needed for danger. A person who cares most about winning the timed chess game in front of them should not spend time accruing resources to invest in better chess-playing.

    AI systems could have long-term goals via people intentionally training them to do so, or via long-term goals naturally arising from systems not trained so. 

    Humans seem to discount the future a lot in their usual decision-making (they have goals years in advance but rarely a hundred years) so the economic incentive to train AI to have very long term goals might be limited.

    It’s not clear that training for relatively short term goals naturally produces creatures with very long term goals, though it might.

    Thus if AI systems fail to have value systems relatively similar to human values, it is not clear that many will have the long time horizons needed to motivate taking over the universe.

    What it might look like if this gap matters: the world is full of agents who care about relatively near-term issues, and are helpful to that end, and have no incentive to make long-term large scale schemes. Reminiscent of the current world, but with cleverer short-termism.

    C. Contra “superhuman AI would be sufficiently superior to humans to overpower humanity”

    Human success isn’t from individual intelligence

    The argument claims (or assumes) that surpassing ‘human-level’ intelligence (i.e. the mental capacities of an individual human) is the relevant bar for matching the power-gaining capacity of humans, such that passing this bar in individual intellect means outcompeting humans in general in terms of power (argument III.2), if not being able to immediately destroy them all outright (argument III.1). In a similar vein, introductions to AI risk often start by saying that humanity has triumphed over the other species because it is more intelligent, as a lead-in to saying that if we make something more intelligent still, it will inexorably triumph over humanity.

    This hypothesis about the provenance of human triumph seems wrong. Intellect surely helps, but humans look to be powerful largely because they share their meager intellectual discoveries with one another and consequently save them up over time5. You can see this starkly by comparing the material situation of Alice, a genius living in the stone age, and Bob, an average person living in 21st Century America. Alice might struggle all day to get a pot of water, while Bob might be able to summon all manner of delicious drinks from across the oceans, along with furniture, electronics, information, etc. Much of Bob’s power probably did flow from the application of intelligence, but not Bob’s individual intelligence. Alice’s intelligence, and that of those who came between them.

    Bob’s greater power isn’t directly just from the knowledge and artifacts Bob inherits from other humans. He also seems to be helped for instance by much better coordination: both from a larger number of people coordinating together, and from better infrastructure for that coordination (e.g. for Alice the height of coordination might be an occasional big multi-tribe meeting with trade, and for Bob it includes global instant messaging and banking systems and the Internet). One might attribute all of this ultimately to innovation, and thus to intelligence and communication, or not. I think it’s not important to sort out here, as long as it’s clear that individual intelligence isn’t the source of power.

    It could still be that with a given bounty of shared knowledge (e.g. within a given society), intelligence grants huge advantages. But even that doesn’t look true here: 21st Century geniuses live basically like 21st Century people of average intelligence, give or take6.

    Why does this matter? Well for one thing, if you make AI which is merely as smart as a human, you shouldn’t then expect it to do that much better than a genius living in the stone age. That’s what human-level intelligence gets you: nearly nothing. A piece of rope after millions of lifetimes. Humans without their culture are much like other animals. 

    To wield the control-over-the-world of a genius living in the 21st Century, the human-level AI would seem to need something like the other benefits that the 21st century genius gets from their situation in connection with a society. 

    One such thing is access to humanity’s shared stock of hard-won information. AI systems plausibly do have this, if they can get most of what is relevant by reading the internet. This isn’t obvious: people also inherit information from society through copying habits and customs, learning directly from other people, and receiving artifacts with implicit information (for instance, a factory allows whoever owns the factory to make use of intellectual work that was done by the people who built the factory, but that information may not be available explicitly even for the owner of the factory, let alone to readers on the internet). These sources of information seem likely to also be available to AI systems though, at least if they are afforded the same options as humans.

    My best guess is that AI systems easily do better than humans on extracting information from humanity’s stockpile, and on coordinating, and so on this account are probably in an even better position to compete with humans than one might think on the individual intelligence model, but that is a guess. In that case perhaps this misunderstanding makes little difference to the outcomes of the argument. However it seems at least a bit more complicated. 

    Suppose that AI systems can have access to all information humans can have access to. The power the 21st century person gains from their society is modulated by their role in society, and relationships, and rights, and the affordances society allows them as a result. Their power will vary enormously depending on whether they are employed, or listened to, or paid, or a citizen, or the president. If AI systems’ power stems substantially from interacting with society, then their power will also depend on affordances granted, and humans may choose not to grant them many affordances (see section ‘Intelligence may not be an overwhelming advantage’ for more discussion).

    However suppose that your new genius AI system is also treated with all privilege. The next way that this alternate model matters is that if most of what is good in a person’s life is determined by the society they are part of, and their own labor is just buying them a tiny piece of that inheritance, then if they are for instance twice as smart as any other human, they don’t get to use technology that is twice as good. They just get a larger piece of that same shared technological bounty purchasable by anyone. Because each individual person is adding essentially nothing in terms of technology, so twice that is still basically nothing.

    In contrast, I think people are often imagining that a single entity somewhat smarter than a human will be able to quickly use technologies that are somewhat better than current human technologies. This seems to be mistaking the actions of a human and the actions of a human society. If a hundred thousand people sometimes get together for a few years and make fantastic new weapons, you should not expect an entity somewhat smarter than a person to make even better weapons. That’s off by a factor of about a hundred thousand. 

    There might be places you can get far ahead of humanity by being better than a single human—it depends how much accomplishments depend on the few most capable humans in the field, and how few people are working on the problem7. But for instance the Manhattan Project took a hundred thousand people several years, and von Neumann (a mythically smart scientist) joining the project did not reduce it to an afternoon. Plausibly to me, some specific people being on the project caused it to not take twice as many person-years, though the plausible candidates here seem to be more in the business of running things than doing science directly (though that also presumably involves intelligence). But even if you are an ambitious somewhat superhuman intelligence, the influence available to you seems to plausibly be limited to making a large dent in the effort required for some particular research endeavor, not single-handedly outmoding humans across many research endeavors.

    This is all reason to doubt that a small number of superhuman intelligences will rapidly take over or destroy the world (as in III.1). This doesn’t preclude a set of AI systems that are together more capable than a large number of people from making great progress. However some related issues seem to make that less likely.

    Another implication of this model is that if most human power comes from buying access to society’s shared power, i.e. interacting with the economy, you should expect intellectual labor by AI systems to usually be sold, rather than for instance put toward a private stock of knowledge. This means the intellectual outputs are mostly going to society, and the main source of potential power to an AI system is the wages received (which may allow it to gain power in the long run). However it seems quite plausible that AI systems at this stage will generally not receive wages, since they presumably do not need them to be motivated to do the work they were trained for. It also seems plausible that they would be owned and run by humans. This would seem to not involve any transfer of power to that AI system, except insofar as its intellectual outputs benefit it (e.g. if it is writing advertising material, maybe it doesn’t get paid for that, but if it can write material that slightly furthers its own goals in the world while also fulfilling the advertising requirements, then it sneaked in some influence.) 

    If there is AI which is moderately more competent than humans, but not sufficiently more competent to take over the world, then it is likely to contribute to this stock of knowledge and affordances shared with humans. There is no reason to expect it to build a separate competing stock, any more than there is reason for a current human household to try to build a separate competing stock rather than sell their labor to others in the economy. 

    In summary:

    1. Functional connection with a large community of other intelligences in the past and present is probably a much bigger factor in the success of humans as a species or individual humans than is individual intelligence. 
    2. Thus this also seems more likely to be important for AI success than individual intelligence. This is contrary to a usual argument for AI superiority, but probably leaves AI systems at least as likely to outperform humans, since superhuman AI is probably superhumanly good at taking in information and coordinating.
    3. However it is not obvious that AI systems will have the same access to society’s accumulated information e.g. if there is information which humans learn from living in society, rather than from reading the internet. 
    4. And it seems an open question whether AI systems are given the same affordances in society as humans, which also seem important to making use of the accrued bounty of power over the world that humans have. For instance, if they are not granted the same legal rights as humans, they may be at a disadvantage in doing trade or engaging in politics or accruing power.
    5. The fruits of greater intelligence for an entity will probably not look like society-level accomplishments unless it is a society-scale entity.
    6. For fruits smaller than that, the default route to influence probably looks like participating in the economy rather than trying to build a private stock of knowledge.
    7. If the resources from participating in the economy accrue to the owners of AI systems, not to the systems themselves, then there is less reason to expect the systems to accrue power incrementally, and they are at a severe disadvantage relative to humans. 

    Overall these are reasons to expect AI systems with around human-level cognitive performance to not destroy the world immediately, and to not amass power as easily as one might imagine. 

    What it might look like if this gap matters: If AI systems are somewhat superhuman, then they do impressive cognitive work, and each contributes to technology more than the best human geniuses, but not more than the whole of society, and not enough to materially improve their own affordances. They don’t gain power rapidly because they are disadvantaged in other ways, e.g. by lack of information, lack of rights, lack of access to positions of power. Their work is sold and used by many actors, and the proceeds go to their human owners. AI systems do not generally end up with access to masses of technology that others do not have access to, and nor do they have private fortunes. In the long run, as they become more powerful, they might take power if other aspects of the situation don’t change. 

    AI agents may not be radically superior to combinations of humans and non-agentic machines

    ‘Human level capability’ is a moving target. For comparing the competence of advanced AI systems to humans, the relevant comparison is with humans who have state-of-the-art AI and other tools. For instance, the human capacity to make art quickly has recently been improved by a variety of AI art systems. If there were now an agentic AI system that made art, it would make art much faster than a human of 2015, but perhaps hardly faster than a human of late 2022. If humans continually have access to tool versions of AI capabilities, it is not clear that agentic AI systems must ever have an overwhelmingly large capability advantage for important tasks (though they might). 

    (This is not an argument that humans might be better than AI systems, but rather: if the gap in capability is smaller, then the pressure for AI systems to accrue power is less and thus loss of human control is slower and easier to mitigate entirely through other forces, such as subsidizing human involvement or disadvantaging AI systems in the economy.)

    Some advantages of being an agentic AI system vs. a human with a tool AI system seem to be:

    1. There might just not be an equivalent tool system, for instance if it is impossible to train systems without producing emergent agents.
    2. When every part of a process takes into account the final goal, this should make the choices within the task more apt for the final goal (and agents know their final goal, whereas tools carrying out parts of a larger problem do not).
    3. For humans, the interface for using a capability of one’s mind tends to be smoother than the interface for using a tool. For instance a person who can do fast mental multiplication can do this more smoothly and use it more often than a person who needs to get out a calculator. This seems likely to persist.

    1 and 2 may or may not matter much. 3 matters more for brief, fast, unimportant tasks. For instance, consider again people who can do mental calculations better than others. My guess is that this advantages them at using Fermi estimates in their lives and buying cheaper groceries, but does not make them materially better at making large financial choices well. For a one-off large financial choice, the effort of getting out a calculator is worth it and the delay is very short compared to the length of the activity. The same seems likely true of humans with tools vs. agentic AI with the same capacities integrated into their minds. Conceivably the gap between humans with tools and goal-directed AI is small for large, important tasks.

    What it might look like if this gap matters: agentic AI systems have substantial advantages over humans with tools at some tasks like rapid interaction with humans, and responding to rapidly evolving strategic situations.  One-off large important tasks such as advanced science are mostly done by tool AI. 


    If goal-directed AI systems are only mildly more competent than some combination of tool systems and humans (as suggested by considerations in the last two sections), we still might expect AI systems to out-compete humans, just more slowly. However AI systems have one serious disadvantage as employees of humans: they are intrinsically untrustworthy while we don’t understand them well enough to be clear on what their values are or how they will behave in any given case. Even if they did perform as well as humans at some task, if humans can’t be certain of that, then there is reason to disprefer using them. This can be thought of as two problems: firstly, slightly misaligned systems are less valuable because they genuinely do the thing you want less well, and secondly, even if they were not misaligned, if humans can’t know that (because we have no good way to verify the alignment of AI systems) then it is costly in expectation to use them. (This is only a further force acting against the supremacy of AI systems—they might still be powerful enough that using them is enough of an advantage that it is worth taking the hit on trustworthiness.)

    What it might look like if this gap matters: in places where goal-directed AI systems are not typically hugely better than some combination of less goal-directed systems and humans, the job is often given to the latter if trustworthiness matters. 


    For AI to vastly surpass human performance at a task, there needs to be ample room for improvement above human level. For some tasks, there is not—tic-tac-toe is a classic example. It is not clear how close humans (or technologically aided humans) are to the limits of competence in the particular domains that will matter. It is to my knowledge an open question how much ‘headroom’ there is. My guess is a lot, but it isn’t obvious.

    How much headroom there is varies by task. Categories of task for which there appears to be little headroom: 

    1. Tasks where we know what the best performance looks like, and humans can get close to it. For instance, machines cannot win more often than the best humans at Tic-tac-toe (playing within the rules), solve Rubik’s cubes much more reliably, or extract a much larger fraction of the calories from fuel.
    2. Tasks where humans are already reaping most of the value—for instance, perhaps most of the value of forks is in having a handle with prongs attached to the end, and while humans continue to design slightly better ones, and machines might be able to add marginal value to that project more than twice as fast as the human designers, they cannot perform twice as well in terms of the value of each fork, because forks are already 95% as good as they can be. 
    3. Tasks where better performance quickly becomes intractable. For instance, we know that for tasks in particular complexity classes, there are computational limits to how well one can perform across the board. Or for chaotic systems, there can be limits to predictability. (That is, tasks might lack headroom not because they are simple, but because they are complex. E.g. AI probably can’t predict the weather much further out than humans.)

    Categories of task where a lot of headroom seems likely:

    1. Competitive tasks where the value of a certain level of performance depends on whether one is better or worse than one’s opponent, so that the marginal value of more performance doesn’t hit diminishing returns, as long as your opponent keeps competing and taking back what you just won. Though in one way this is like having little headroom: there’s no more value to be had—the game is zero sum. And while there might often be a lot of value to be gained by doing a bit better on the margin, still if all sides can invest, then nobody will end up better off than they were. So whether this seems more like high or low headroom depends on what we are asking exactly. Here we are asking if AI systems can do much better than humans: in a zero sum contest like this, they likely can in the sense that they can beat humans, but not in the sense of reaping anything more from the situation than the humans ever got.
    2. Tasks where it is twice as good to do the same task twice as fast, and where speed is bottlenecked on thinking time.
    3. Tasks where there is reason to think that optimal performance is radically better than we have seen. For instance, perhaps we can estimate how high Chess Elo rankings must go before reaching perfection by reasoning theoretically about the game, and perhaps it is very high (I don’t know).
    4. Tasks where humans appear to use very inefficient methods. For instance, it was perhaps predictable before calculators that they would be able to do mathematics much faster than humans, because humans can only keep a small number of digits in their heads, which doesn’t seem like an intrinsically hard problem. Similarly, I hear humans often use mental machinery designed for one mental activity for fairly different ones, through analogy.8 For instance, when I think about macroeconomics, I seem to be basically using my intuitions for dealing with water. When I do mathematics in general, I think I’m probably using my mental capacities for imagining physical objects.

    What it might look like if this gap matters: many challenges in today’s world remain challenging for AI. Human behavior is not readily predictable or manipulable very far beyond what we have already explored; only slightly more complicated schemes are feasible before the world’s uncertainties overwhelm planning; much better ads are soon met by much better immune responses; much better commercial decision-making ekes out some additional value across the board, but most products were already fulfilling a lot of their potential; incredible virtual prosecutors meet incredible virtual defense attorneys and everything is as it was; there are a few rounds of attack-and-defense in various corporate strategies before a new equilibrium with broad recognition of those possibilities; conflicts and ‘social issues’ remain mostly intractable. There is a brief golden age of science before the newly low-hanging fruit are again plucked, and progress is only lightning fast in areas where thinking was the main bottleneck, e.g. not in medicine.

    Intelligence may not be an overwhelming advantage

    Intelligence is helpful for accruing power and resources, all things equal, but many other things are helpful too: for instance money, social standing, allies, evident trustworthiness, and not being discriminated against (this was discussed briefly in the section ‘Human success isn’t from individual intelligence’). AI systems are not guaranteed to have those in abundance. The argument assumes that any difference in intelligence in particular will eventually win out over any differences in other initial resources. I don’t know of a reason to think that. 

    Empirical evidence does not seem to support the idea that cognitive ability is a large factor in success. Situations where one entity is much smarter or more broadly mentally competent than other entities regularly occur without the smarter one taking control over the other:

    1. Species exist with all levels of intelligence. Elephants have not in any sense won over gnats; they do not rule gnats; they do not have obviously more control than gnats over the environment.
    2. Competence does not seem to aggressively overwhelm other advantages in humans:
      1. Looking at the world, the big discrepancies in power do not intuitively seem to be about intelligence.
      2. IQ 130 humans apparently earn very roughly $6000-$18,500 per year more than average IQ humans.
      3. Elected representatives are apparently smarter on average, but it is a slightly shifted curve, not a radical difference.
      4. MENSA isn’t a major force in the world.
      5. Many places where people see huge success through being cognitively able are ones where they show off their intelligence to impress people, rather than actually using it for decision-making. For instance, writers, actors, song-writers, comedians, all sometimes become very successful through cognitive skills. Whereas scientists, engineers and authors of software use cognitive skills to make choices about the world, and less often become extremely rich and famous, say. If intelligence were that useful for strategic action, it seems like using it for that would be at least as powerful as showing it off. But maybe this is just an accident of which fields have winner-takes-all type dynamics.
      6. If we look at people who evidently have good cognitive abilities given their intellectual output, their personal lives are not obviously drastically more successful, anecdotally.
      7. One might counter-counter-argue that humans are very similar to one another in capability, so even if intelligence matters much more than other traits, you won’t see that by looking at the near-identical humans. This does not seem to be true. Often at least, the difference in performance between mediocre human performance and top level human performance is large, relative to the space below, if I recall correctly. For instance, in chess, the Elo difference between the best and worst players is about 2000, whereas the difference between amateur play and random play is maybe 400-2800 (if you accept Chess StackExchange guesses as a reasonable proxy for the truth here). And in terms of AI progress, amateur human play was reached in the 1950s, roughly when research began, and world champion level play was reached in 1997.
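    For a sense of what the Elo gaps in the list above mean, they can be converted into expected scores using the standard Elo expected-score formula. This is just a minimal sketch of that standard formula; the specific gaps plugged in are the ones quoted above:

    ```python
    def elo_expected_score(gap: float) -> float:
        """Expected score (win=1, draw=0.5) for the higher-rated player,
        given a rating gap in Elo points. Standard Elo formula."""
        return 1 / (1 + 10 ** (-gap / 400))

    # A 400-point gap already implies roughly a 10-to-1 expected score ratio,
    # and the ~2000-point gap between the best and worst rated players
    # makes the stronger player's expected score nearly 1:
    print(round(elo_expected_score(400), 3))   # ~0.909
    print(round(elo_expected_score(2000), 5))  # ~0.99999
    ```

    So even a “small” slice of the human range corresponds to near-certain victory for the stronger player, which is one way of cashing out the claim that human performance differences are large relative to the space below.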

    And theoretically I don’t know why one would expect greater intelligence to win out over other advantages over time. There are actually two questionable theories here: 1) Charlotte having more overall control than David at time 0 means that Charlotte will tend to have an even greater share of control at time 1. And 2) Charlotte having more intelligence than David at time 0 means that Charlotte will have a greater share of control at time 1 even if David has more overall control (i.e. more of other resources) at time 0.

    What it might look like if this gap matters: there are many AI systems around, and they strive for various things. They don’t hold property, or vote, or get a weight in almost anyone’s decisions, or get paid, and are generally treated with suspicion. These things on net keep them from gaining very much power. They are very persuasive speakers however and we can’t stop them from communicating, so there is a constant risk of people willingly handing them power, in response to their moving claims that they are an oppressed minority who suffer. The main thing stopping them from winning is that their position as psychopaths bent on taking power for incredibly pointless ends is widely understood.

    Unclear that many goals realistically incentivise taking over the universe

    I have some goals. For instance, I want some good romance. My guess is that trying to take over the universe isn’t the best way to achieve this goal. The same goes for a lot of my goals, it seems to me. Possibly I’m in error, but I spend a lot of time pursuing goals, and very little of it trying to take over the universe. Whether a particular goal is best forwarded by trying to take over the universe as a substep seems like a quantitative empirical question, to which the answer is virtually always ‘not remotely’. Don’t get me wrong: all of these goals involve some interest in taking over the universe. All things equal, if I could take over the universe for free, I do think it would help in my romantic pursuits. But taking over the universe is not free. It’s actually super duper duper expensive and hard. So for most goals arising, it doesn’t bear considering. The idea of taking over the universe as a substep is entirely laughable for almost any human goal.

    So why do we think that AI goals are different? I think the thought is that it’s radically easier for AI systems to take over the world, because all they have to do is to annihilate humanity, and they are way better positioned to do that than I am, and also better positioned to survive the death of human civilization than I am. I agree that it is likely easier, but how much easier? So much easier to take it from ‘laughably unhelpful’ to ‘obviously always the best move’? This is another quantitative empirical question.

    What it might look like if this gap matters: Superintelligent AI systems pursue their goals. Often they achieve them fairly well. This is somewhat contrary to ideal human thriving, but not lethal. For instance, some AI systems are trying to maximize Amazon’s market share, within broad legality. Everyone buys truly incredible amounts of stuff from Amazon, and people often wonder if it is too much stuff. At no point does attempting to murder all humans seem like the best strategy for this. 

    Quantity of new cognitive labor is an empirical question, not addressed

    Whether some set of AI systems can take over the world with their new intelligence probably depends on how much total cognitive labor they represent. For instance, if they are in total slightly more capable than von Neumann, they probably can’t take over the world. If they are together as capable (in some sense) as a million 21st Century human civilizations, then they probably can (at least in the 21st Century).

    It also matters how much of that is goal-directed at all, and highly intelligent, and how much of that is directed at achieving the AI systems’ own goals rather than those we intended them for, and how much of that is directed at taking over the world. 

    If we continued to build hardware, presumably at some point AI systems would account for most of the cognitive labor in the world. But if there is first an extended period of more minimal advanced AI presence, that would probably prevent an immediate death outcome, and improve humanity’s prospects for controlling a slow-moving AI power grab. 

    What it might look like if this gap matters: when advanced AI is developed, there is a lot of new cognitive labor in the world, but it is a minuscule fraction of all of the cognitive labor in the world. A large part of it is not goal-directed at all, and of that, most of the new AI thought is applied to tasks it was intended for. Thus what part of it is spent on scheming to grab power for AI systems is too small to grab much power quickly. The amount of AI cognitive labor grows fast over time, and in several decades it is most of the cognitive labor, but humanity has had extensive experience dealing with its power grabbing.

    Speed of intelligence growth is ambiguous

    The idea that a superhuman AI would be able to rapidly destroy the world seems prima facie unlikely, since no other entity has ever done that. Two common broad arguments for it:

    1. There will be a feedback loop in which intelligent AI makes more intelligent AI repeatedly until AI is very intelligent.
    2. Very small differences in brains seem to correspond to very large differences in performance, based on observing humans and other apes. Thus any movement past human-level will take us to unimaginably superhuman level.

    These both seem questionable.

    1. Feedback loops can happen at very different rates. Identifying a feedback loop empirically does not signify an explosion of whatever you are looking at. For instance, technology is already helping improve technology. To get to a confident conclusion of doom, you need evidence that the feedback loop is fast.
    2. It does not seem clear that small improvements in brains lead to large changes in intelligence in general, or will do on the relevant margin. Small differences between humans and other primates might include those helpful for communication (see Section ‘Human success isn’t from individual intelligence’), which do not seem relevant here. If there were a particularly powerful cognitive development between chimps and humans, it is unclear that AI researchers find that same insight at the same point in the process (rather than at some other time). 

    A large number of other arguments have been posed for expecting very fast growth in intelligence at around human level. I previously made a list of them with counterarguments, though none seemed very compelling. Overall, I don’t know of strong reason to expect very fast growth in AI capabilities at around human-level AI performance, though I hear such arguments might exist. 

    What it would look like if this gap mattered: AI systems would at some point perform at around human level at various tasks, and would contribute to AI research, along with everything else. This would contribute to progress to an extent familiar from other technological progress feedback, and would not e.g. lead to a superintelligent AI system in minutes.

    Key concepts are vague

    Concepts such as ‘control’, ‘power’, and ‘alignment with human values’ all seem vague. ‘Control’ is not zero sum (as seemingly assumed) and is somewhat hard to pin down, I claim. What an ‘aligned’ entity is exactly seems to be contentious in the AI safety community, but I don’t know the details. My guess is that upon further probing, these conceptual issues are resolvable in a way that doesn’t endanger the argument, but I don’t know. I’m not going to go into this here.

    What it might look like if this gap matters: upon thinking more, we realize that our concerns were confused. Things go fine with AI in ways that seem obvious in retrospect. This might look like it did for people concerned about the ‘population bomb’ or as it did for me in some of my youthful concerns about sustainability: there was a compelling abstract argument for a problem, and the reality didn’t fit the abstractions well enough to play out as predicted.

    D. Contra the whole argument

    The argument overall proves too much about corporations

    Here is the argument again, but modified to be about corporations. A couple of pieces don’t carry over, but they don’t seem integral.

    I. Any given corporation is likely to be ‘goal-directed’

    Reasons to expect this:

    1. Goal-directed behavior is likely to be valuable in corporations, e.g. economically
    2. Goal-directed entities may tend to arise from machine learning training processes not intending to create them (at least via the methods that are likely to be used).
    3. ‘Coherence arguments’ may imply that systems with some goal-directedness will become more strongly goal-directed over time.

    II. If goal-directed superhuman corporations are built, their desired outcomes will probably be about as bad as an empty universe by human lights

    Reasons to expect this:

    1. Finding useful goals that aren’t extinction-level bad appears to be hard: we don’t have a way to usefully point at human goals, and divergences from human goals seem likely to produce goals that are in intense conflict with human goals, due to a) most goals producing convergent incentives for controlling everything, and b) value being ‘fragile’, such that an entity with ‘similar’ values will generally create a future of virtually no value. 
    2. Finding goals that are extinction-level bad and temporarily useful appears to be easy: for example, corporations with the sole objective ‘maximize company revenue’ might profit for a time before gathering the influence and wherewithal to pursue the goal in ways that blatantly harm society.
    3. Even if humanity found acceptable goals, giving a corporation any specific goals appears to be hard. We don’t know of any procedure to do it, and we have theoretical reasons to expect that AI systems produced through machine learning training will generally end up with goals other than those that they were trained according to. Randomly aberrant goals resulting are probably extinction-level bad, for reasons described in II.1 above.

    III. If most goal-directed corporations have bad goals, the future will very likely be bad

    That is, a set of ill-motivated goal-directed corporations, of a scale likely to occur, would be capable of taking control of the future from humans. This is supported by at least one of the following being true:

    1. A corporation would destroy humanity rapidly. This may be via ultra-powerful capabilities at e.g. technology design and strategic scheming, or through gaining such powers in an ‘intelligence explosion‘ (self-improvement cycle). Either of those things may happen either through exceptional heights of intelligence being reached or through highly destructive ideas being available to minds only mildly beyond our own.
    2. A superhuman corporation would gradually come to control the future via accruing power and resources. Power and resources would be more available to the corporation than to humans on average, because of the corporation having far greater intelligence.

    This argument does point at real issues with corporations, but we do not generally consider such issues existentially deadly. 

    One might argue that there are defeating reasons that corporations do not destroy the world: they are made of humans so can be somewhat reined in; they are not smart enough; they are not coherent enough. But in that case, the original argument needs to make reference to these things, so that they apply to one and not the other.

    What it might look like if this counterargument matters: something like the current world. There are large and powerful systems doing things vastly beyond the ability of individual humans, and acting in a definitively goal-directed way. We have a vague understanding of their goals, and do not assume that they are coherent. Their goals are clearly not aligned with human goals, but they have enough overlap that many people are broadly in favor of their existence. They seek power. This all causes some problems, but problems within the power of humans and other organized human groups to keep under control, for some definition of ‘under control’.


    I think there are quite a few gaps in the argument, as I understand it. My current guess (prior to reviewing other arguments and integrating things carefully) is that enough uncertainties might resolve in the dangerous directions that existential risk from AI is a reasonable concern. I don’t at present though see how one would come to think it was overwhelmingly likely.

    1. That is, systems that are somewhat more capable than the most capable human. 

    2. Based on countless conversations in the AI risk community, and various reading. 

    3. Though not all: you might have an easily satiable utility function, or only care about the near future. 

    4. We are talking about divergence in a poorly specified multi-dimensional space, so it isn’t going to be a fixed distance in every direction from the ideal point. It could theoretically be zero distance on some dimensions, such that if AI was misaligned at all in those directions it was catastrophic. My point here is merely that there is some area larger than a point. 

    5. The Secrets of Our Success seems to be the canonical reference for this, but I haven’t read it. I don’t know how controversial this is, but also don’t presently see how it could fail to be true. 

    6. See section ‘Intelligence may not be an overwhelming advantage’. 

    7. E.g. for the metric ‘hardness of math problem solvable’, maybe no human can solve a level 10 math problem, but several can solve 9s. Then human society as a whole also can’t solve a 10. So the first AI that can is only mildly surpassing the best human, but is at the same time surpassing all of human society. 

    8. Probably I have this impression from reading Steven Pinker at some point. 

  • Calibration of a thousand predictions

    I’ve been making predictions in a spreadsheet for the last four years, and I recently got to a thousand resolved predictions. Some observations:

    1. I’m surprisingly well calibrated for things that mostly aren’t my own behavior1. Here’s the calibration curve for 630 resolved predictions in that class:

      calibration no context predictions

      I don’t know what’s up with the 80% category, but the average miscalibration of the eleven categories is <3%.

      At risk of bragging, this seems wild to me. My experience of making these predictions is fairly well described as ‘pulling a number out of thin air’2. But apparently if you take all these conjured numbers, and look at the 45 of them that fell in the vicinity of 40%, then I implicitly guessed that 17.28 of those events would happen. And in fact 18 of them happened. WTF? Why wasn’t it eight of them or thirty-five of them? And that was only the fifth most accurate of the eleven buckets shown above! For predictions in the vicinity of 70%, I was off by 0.15%—I said 54.88 of those 80 things would happen, and in fact 55 of them happened.

      Possibly people overall are just better calibrated than I thought. I had some remembered view that people’s asserted 90% confidence intervals were really 50% confidence intervals or something, but I can’t immediately find such evidence now, and I can find various graphs of groups of people being on average fairly calibrated. And the handful of PredictionBook users I could find with more than a thousand predictions are not hugely worse.

      If you are curious about what I predicted, I put examples at the end of this post.

    2. For the entire thousand predictions—the above plus 370 about my own behavior—I’m off by 6.25% on average (up from 2.95%) over the same eleven buckets.

      calibration first 1000 resolved predictions

    3. As you may infer, I’m pretty bad overall at predicting my own behavior!

      calibration first 1000 resolved predictions

      This is more what I expected of a calibration curve—broadly overconfident. And perhaps its general worseness is explained by the appeal of optimism in predicting oneself. But it’s a pretty weird shape, which seems less explicable. If I think there’s a 40% chance that I’ll do something, apparently it’s not happening. If you want it to happen, you should hope I change my mind and put 5% on it!

      I’m not sure what is up with this particular strange shape. But note that making predictions about one’s own behavior has a particular complication, if one likes to be right. If you put a number below 50% on taking an action, then you have a disincentive to do it. So you should then put a lower probability on it than you would have, which would make you even more wrong if you took the action, so you have a further disincentive to doing it, etc. I do generally look for a fixed point where, given that I put probability p on something (and the incentive consequences of that), I do think it will happen with probability p. But this is a different process than the usual predicting process, and I could imagine it going wrong in strange ways. For instance, if I’m more motivated by being right than I thought, then 40% predictions which might have been 50% predictions originally should really be 5% predictions. This theory doesn’t really work though, because then shouldn’t the lower categories also be especially overconfident? Whereas in fact they are okay.

      (Maybe I just have free will? The kind of free will that manifests as being about 15% less likely to do anything than one might have expected seems disappointing, but the menu of possible free will options was never that inspiring.)
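    For concreteness, the bucketed numbers discussed above (e.g. 45 predictions near 40% implying 17.28 expected hits, versus 18 actual) can be reproduced by a simple grouping of (probability, outcome) pairs. This is only a sketch of that kind of computation, not the exact procedure behind the graphs; in particular, the bucket boundaries here are my own assumption:

    ```python
    from collections import defaultdict

    def calibration_buckets(predictions, bucket_width=0.1):
        """Group (probability, outcome) pairs into buckets and compare the
        implied number of hits (sum of stated probabilities) with the observed
        number of hits. Bucket boundaries are an illustrative assumption."""
        buckets = defaultdict(list)
        for p, happened in predictions:
            # Assign each prediction to the nearest bucket center.
            center = round(round(p / bucket_width) * bucket_width, 2)
            buckets[center].append((p, happened))
        report = {}
        for center, items in sorted(buckets.items()):
            implied = sum(p for p, _ in items)        # expected number of hits
            observed = sum(1 for _, h in items if h)  # actual number of hits
            report[center] = (len(items), implied, observed)
        return report

    # Toy usage: four predictions near 40%, of which two happened.
    demo = [(0.38, True), (0.40, False), (0.42, True), (0.44, False)]
    print(calibration_buckets(demo))
    ```

    The “average miscalibration” figure is then just the mean absolute difference between implied and observed frequencies across buckets.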

    Example predictions

    Here are some typical predictions, arbitrarily pulled from my spreadsheet and lightly edited:

    • I will be invited to play ONUW today: 0.45 (true)
    • The trial bank transfers come through to my IBKR account by the end of Monday: 0.34 (false)
    • [Friend] will want to leave here for the day before I do: 0.05 (false)
    • [Friend] will seem notably sad in demeanor when I talk to [them]: 0.6 (false)
    • I will be paid by end Jan 27: 0.85 (true)
    • If I go inside shortly I see [friend]: 0.08 (false)
    • We have the [organization] party here: 0.55 (true)
    • I go to math party today: 0.88 (true)
    • I will get my period on Tuesday: 0.10 (true)
    • We will be invited to work at [office] for at least a week: 0.75 (false)
    • On Feb 5 we (including [friend], me, [friend]) are renting a new place: 0.73 (true)
    • We will run the arts and crafts room auction today (i.e. by midnight we will have as much info from the auction as we will get re which room is whose, ignoring processing of info we have): 0.40 (true)
    • [Person] is leading a new EAish org or CEO or COO of an existing EAish org by May 30 2023, where EAish includes orgs not culturally EA but doing things that are considered so by a decent fraction of EAs: 0.62 (TBD)
    • I will get to the office in time for lunch: 0.95 (true)
    • I see [housemate] tonight before midnight: 0.88 (true)
    • If I ask [attendee] ‘what did you think of [event I ran]?’, [they] will be strongly positive: 0.8 (false)
    • I see [friend]’s dad tonight before midnight: 0.95 (forgot to notice)
    • If I offer [person] a trial [they] will take it: 0.65 (true)
    • If I look at [friend]’s most recent Tweet, it is linking to [their] blog: 0.8 (false)
    • My weight is under [number] again before it is over [number] again: 0.75 (false)
    • I do all of my Complice goals tomorrow: 0.3 (false)
    • I will go to the office on Friday: 0.6 (true)
    • I will read the relevant chapter of Yuval book by the end of 2nd: 0.1 (false)
    • I weigh less than [weight] the first time I weigh myself tomorrow: 0.65 (true)
    • Lunch includes no kind of fish meat: 0.43 (true)

    And some examples of own-behavior marked predictions:

    • Work goal as stated will be completed by end of day: start a document of feedback policies: 0.90 (true)
    • I ok [person]’s post before the meeting: 0.68 (false)
    • Work goal as stated, will be completed by Sunday 28th October: Respond to [person]: 0.80 (true)
    • Work goal as stated will be completed by midnight Sunday 30 September 2018: read [person]’s research: 0.4 (false)
    • Work goal as stated, will be completed by Sunday 4th November: Arrange to talk to [other researcher] about [employee] project […]: 0.3 (false)
    • Work goal as stated will be completed by midnight Sunday 30 September 2018: Think for 1h about [colleague] thing: 0.5 (true)
    • I have fewer than 1k emails in inbox at some point on Feb 10th: 0.87 (true)
    • I have written to [brother] by Feb 10th: 0.82 (true)
    • I will be home by 9pm: 0.97 (true)

    Categories missing from these randomishly selected lists but notable in being particularly fun:

    1. Predictions of history that I don’t know or remember, followed by looking it up on Wikipedia. A pretty fun combination of predicting things and reading Wikipedia.
    2. Predictions of relationship persistence in successive episodes of Married at First Sight.


    1. I have a column where I write context on some predictions, which is usually that they are my own work goal, or otherwise a prediction about how I will behave. This graph excludes those, but keeps in some own-behavior predictions which I didn’t flag for whatever reason. 

    2. Except maybe more like art—do you know that feeling where you look at the sketch, and tilt your head from side to side, and say ‘no, a little bit more… just there….mmm…yes…’? It’s like that: ‘27%…no, a little more, 29%? No, 33%. 33%, yes.’ Except honestly it’s more ridiculous than that, because my brain often seems to have views about which particular digits should be involved. So it’s like, ‘23%…no, but mmm 3…33%, yes.’ I am generally in favor of question decomposition and outside views and all that, but to be clear, that’s not what I’m doing here. I might have been sometimes, but these are usually fast intuitive judgments. 

  • A game of mattering

    When I have an overwhelming number of things to do, and insufficient native urge to do them, I often arrange them into a kind of game for myself. The nature and appeal of this game has been relatively stable for about a year, after many years of evolution, so this seems like a reasonable time to share it. I also play it when I just want to structure my day and am in the mood for it. I currently play something like two or three times a week.

    The game

    The basic idea is to lay out the tasks in time a bit like obstacles in a platformer or steps in Dance Dance Revolution, then race through the obstacle course grabbing them under consistently high-but-doable time pressure.

    Here’s how to play:

    1. Draw a grid with as many rows as there are remaining hours in your hoped-for productive day, and ~3 columns. Each box stands for a particular ~20 minute period. (I sometimes play with 15m or 30m periods.)
    2. Lay out the gameboard: break the stuff you want to do into appropriate units, henceforth ‘items’. An item should fit comfortably in the length of a box, and it should be easy enough to verify completion. (This can be achieved through house rules such as ‘do x a tiny bit = do it until I have a sense that an appropriate tiny bit has been done’ as long as you are happy applying them). Space items out a decent amount so that the whole course is clearly feasible. Include everything you want to do in the day, including nice or relaxing things, or break activities. Drinks, snacks, tiny bouts of exercise, looking at news sites for 5 minutes, etc. Design the track thoughtfully, with hard bouts followed by relief before the next hard bout.
    3. To play, start in the first box, then move through the boxes according to the time of day. The goal in playing is to collect as many items as you can, as you are forced along the track by the passage of time. You collect an item by doing its task during or before the box it sits in. If it isn’t done by the end of that box, it gets left behind. However, if you clear any box entirely, you get to move one item anywhere on the gameboard. So you can rescue something from the past, or rearrange the future to make it more feasible, or if everything is perfect, you can add an entirely new item somewhere.

    I used to play this with tiny post-it stickers, which I would gather in a large moving pile, acting as a counter:

    example of game with stickers

    Now I just draw the whole thing. Crossed out = collected; [] = rescued from the past, now implicitly in the final box; dot in the lower right = box cleared; dot next to item = task done but item stuck in the past (can be collected immediately if rescued).


    Why is this good?

    I think a basic problem with working on a big pile of things in a big expanse of time is that if you work or not during any particular minute, it feels like it makes nearly no difference to the expectation of success. I’m not quite sure why this is—in fact if I don’t work this minute, I’m going to get one minute less work done. But it feels like if I don’t work this minute, I only need to work a smidgen faster on average to get any particular amount of work done, so what does it matter if I work now or later? And if I had some particular goal (e.g. finishing writing some massive text today), it’s unlikely that my other efforts will get me exactly to the line where this minute pushed me over—probably I will either succeed with hours to spare (haha) or fail hours from my goal.

    I picture what’s going on as vaguely something like this—there is often some amount of work that is going to make your success likely, and if you know that you are on a locally steep part of the curve, it is more motivating than if you are either far away from the steep part or don’t know where you are:


    Yet on the other hand, the appeal of various non-work activities this specific minute might be the most distinct and tangible things in the world. So when there is a lot to be done in a long time, not working often looks more exciting than working, even if a more rational accounting would disagree.

    Having a single specific thing to do within minutes is much more compelling: the task and the time are lined up so that my action right now matters. Slacking this minute is the difference between success and failure.

    It feels very different to have one email to deal with in three minutes and to have a thousand to deal with in the next fifty hours.

    One might naively respond to this issue by breaking up one’s tasks into tiny chunks, then laying them out in a day of tiny time boxes, then aiming for each to happen by the end of its allotment. But this will be terrible. A few boxes in, either you’ll be ahead or behind. And either way, your immediate actions have drifted away from feeling like they matter. If you are ahead, the pressure is off: you’ll probably succeed at the next increment whether or not you work hard now. If behind, you are definitely going to fail at doing the next box on time, and probably some others, and your present work is for an increased chance of catching up at some vague future box, much like before you had these boxes. (Plus your activities are no longer in line with what your plan was, which for me makes it tempting to scrap the whole thing and do something else.)

    A big innovation of this game is to instead ensure that you keep meeting tasks one at a time where each one matters in its moment, as in a game like Beat Saber or Dance Dance Revolution. The game achieves this by adjusting the slack to keep the next ten minutes’ action near the actually-mattering-to-success region all day. If you get behind you have to give up on items and move forward, so you aren’t left struggling for a low probability of catching up. If you get ahead, you add more items and thus tighten the slack.

    A thing I like about this is that it actually makes the activity more genuinely fun and compelling, and doesn’t involve trying to trick oneself or bind oneself uncomfortably. It is superficially a lot like a ‘productivity hack’, but I associate these with somehow manipulating or forcing yourself to do something that you at some level have real reason to dislike. I expect such tricks to fail, and I don’t think I want them to succeed.

    This seems different: I think humans are just genuinely better at being in an enjoyable flow state when their activities have certain structures that are genuinely compatible with a variety of tasks. Beat Saber wouldn’t be fun if all the boxes were just sitting in a giant pile and you had to beat your way through as many as you could over an hour. But with the boxes approaching one at a time, at a manageable rate, where what you do in each moment matters, it really is fun (for many people, I hear—I actually don’t love it, but I do appreciate this particular aspect). The same thing that makes Beat Saber more fun than Saber-a-bunch-of-boxes-on-your-own-schedule can genuinely also be applied to giant piles of tasks.

    The fact that this game has lasted a year in my life and I come back to it with verve points to it not being an enemy to any major part of myself.

    Another promising way of seeing this game is that this structure lets you see more clearly the true importance of each spent minute, when you were by default in error. Whereas for instance playing Civ IV for five minutes every time you do work (another sometimes way-of-being of mine) is less like causing yourself to perceive reality truly and more like trying to build an alternate incentive structure out of your mistaken perception, that adds up to rational behavior in the real world.

    If anyone else tries this, I’m curious to hear how it goes. My above explanation of its merit suggests it might be of broad value. But I also know that perhaps nobody in the world likes organizing things into little boxes as much as I do, so that could also be the main thing going on.

  • Update updates

    You can now read or subscribe to this blog via world spirit sock stack, a Substack mirror of this site. I expect to see comments at wsss similarly often to wssp (with both being more often than at various other places this crossposts, e.g. LessWrong).

    You can also be alerted to posts on Twitter via @wssockpuppet. I’m going to continue to Tweet about some subset of things on my personal account, so this runs a risk of double-seeing things.

  • Podcasts on surveys, slower AI, AI arguments

    I recently talked to Michael Trazzi for his podcast, The Inside View. It just came out, so if that’s a conversation you want to sit in on, do so here [ETA: or read it here].

    The main topics were the survey of ML folk I recently ran, and my thoughts on moving more slowly on potentially world-threatening AI research (which is to say, AI research in general, according to the median surveyed ML researcher…). I also bet him a thousand dollars to his hundred that AI would not make blogging way more efficient in two years, if I recall. (I forget the exact terms, and there’s no way I’m listening to myself talk for that long to find out. If anyone else learns, I’m curious what I agreed to.)

    For completeness of podcast reporting: I forgot to mention that I also talked to Daniel Filan on AXRP, like a year ago. In other old news, I am opposed to the vibe of time-sensitivity often implicit in the public conversation.

  • Survey advice

    Things I believe about making surveys, after making some surveys:

    1. If you write a question that seems clear, there’s an unbelievably high chance that any given reader will misunderstand it. (Possibly this applies to things that aren’t survey questions also, but that’s a problem for another time.)
    2. A better way to find out if your questions are clear is to repeatedly take a single person, sit down with them, and ask them to take your survey while narrating the process: reading the questions aloud, telling you what they think each question is asking, explaining their thought process in answering it. If you do this repeatedly with different people until some are not confused at all, the questions are probably clear.
    3. If you ask people very similar questions in different sounding ways, you can get very different answers (possibly related to the above, though that’s not obviously the main thing going on).
    4. One specific case of that: for some large class of events, if you ask people how many years until a 10%, 50%, 90% chance of event X occurring, you will get an earlier distribution of times than if you ask the probability that X will happen in 10, 20, 50 years. (I’ve only tried this with AI related things, but my guess is that it at least generalizes to other low-probability-seeming things. Also, if you just ask about 10% on its own, it is consistently different from 10% alongside 50% and 90%.)
    5. Given the complicated landscape of people’s beliefs about the world and proclivities to say certain things, there is a huge amount of scope for choosing questions to get answers that sound different to listeners (e.g. support a different side in a debate).
    6. There is also scope for helping people think through a thing in a way that they would endorse, e.g. by asking a sequence of questions. This can also change what the answer sounds like, but seems ethical to me, whereas applications of 5 seem generally suss.
    7. Often your respondent knows thing P and you want to know Q, and it is possible to infer something about Q from P. You then have a choice about which point in this inference chain to ask the person about. It seems helpful to notice this choice. For instance, if AI researchers know most about what AI research looks like, and you want to know whether human civilization will be imminently destroyed by renegade AI systems, you can ask about a) how fast AI research appears to be progressing, b) when it will reach a certain performance bar, c) whether AI will cause something like human extinction. In the 2016 survey, we asked all of these.
    8. Given the choice, if you are hoping to use the data as information, it is often good to ask people about things they know about. In 7, this points to aiming your question early in the reasoning chain, then doing the inference yourself.
    9. Interest in surveys doesn’t seem very related to whether a survey is a good source of information on the topic surveyed. One of the strongest findings of the 2016 survey IMO was that surveys like that are unlikely to be a reliable guide to the future.
    10. This makes sense because surveys fulfill other purposes. Surveys are great if you want to know what people think about X, rather than what is true about X. Knowing what people think is often the important question. It can be good for legitimizing a view, or letting a group of people have common knowledge about what they think so they can start to act on it, including getting out of bad equilibria where everyone nominally supports claim P because they think others will judge them if not.
    11. If you are surveying people with the intention of claiming a thing, it is helpful to think ahead about what you want to claim, and make sure you ask questions that will let you claim that, in a simple way. For instance, it is better to be able to say ‘80% of a random sample of shoppers at Tesco said that they like tomato more than beans’ than to say ‘80% of a sample of shoppers who were mostly at Tesco but also at Aldi (see footnote for complicated shopper selection process) say that they prefer tomato to peas, or (using a separate subset of shoppers) prefer peas to beans, from which we can infer that probably about 80% of shoppers in general, or more, prefer tomato to beans’. You want to be able to describe the setup and question in a way that is simple enough that the listener understands what happened, and can see the significance of the finding.
    12. If you are running a survey multiple times, and you want informative answers about whether there were differences in views between those times, you should probably run exactly the same survey and not change the questions even a tiny bit unless there is very strong reason to. This follows from 3.
    13. Qualtrics costs thousands of dollars to use, and won’t let you sign up for an account or even know how much it might cost unless you book a meeting to talk to someone to sell it to you. Guidedtrack.com seems pretty nice, but I might not have been trying to do such complicated things there.
    14. Running surveys seems underrated as an activity.
  • What do ML researchers think about AI in 2022?

    Crossposted from AI Impacts

    AI Impacts just finished collecting data from a new survey of ML researchers, as similar to the 2016 one as practical, aside from a couple of new questions that seemed too interesting not to add.

    This page reports on it preliminarily, and we’ll be adding more details there. But so far, some things that might interest you:

    • 37 years until a 50% chance of HLMI according to a complicated aggregate forecast (and biasedly not including data from questions about the conceptually similar Full Automation of Labor, which in 2016 prompted strikingly later estimates). This 2059 aggregate HLMI timeline has become about eight years shorter in the six years since 2016, when the aggregate prediction was 2061, or 45 years out. Note that all of these estimates are conditional on “human scientific activity continu[ing] without major negative disruption.”
    • P(extremely bad outcome)=5% The median respondent believes the probability that the long-run effect of advanced AI on humanity will be “extremely bad (e.g., human extinction)” is 5%. This is the same as it was in 2016 (though Zhang et al 2022 found 2% in a similar but non-identical question). Many respondents put the chance substantially higher: 48% of respondents gave at least 10% chance of an extremely bad outcome. Though another 25% put it at 0%.
    • Explicit P(doom)=5-10% The levels of badness involved in that last question seemed ambiguous in retrospect, so I added two new questions about human extinction explicitly. The median respondent’s probability of x-risk from humans failing to control AI[1] was 10%, weirdly more than the median chance of human extinction from AI in general[2], at 5%. This might just be because different people got these questions and the median is quite near the divide between 5% and 10%. The most interesting thing here is probably that these are both very high—it seems the ‘extremely bad outcome’ numbers in the old question were not just catastrophizing merely disastrous AI outcomes.
    • Support for AI safety research is up: 69% of respondents believe society should prioritize AI safety research “more” or “much more” than it is currently prioritized, up from 49% in 2016.
    • The median respondent thinks there is an “about even chance” that an argument given for an intelligence explosion is broadly correct. The median respondent also believes machine intelligence will probably (60%) be “vastly better than humans at all professions” within 30 years of HLMI, and that the rate of global technological improvement will probably (80%) dramatically increase (e.g., by a factor of ten) as a result of machine intelligence within 30 years of HLMI.
    • Years/probabilities framing effect persists: if you ask people for probabilities of things occurring in a fixed number of years, you get later estimates than if you ask for the number of years until a fixed probability will obtain. This looked very robust in 2016, and shows up again in the 2022 HLMI data. Looking at just the people we asked for years, the aggregate forecast is 29 years, whereas it is 46 years for those asked for probabilities. (We haven’t checked in other data or for the bigger framing effect yet.)
    • Predictions vary a lot. Pictured below: the attempted reconstructions of people’s probabilities of HLMI over time, which feed into the aggregate number above. There are few times and probabilities that someone doesn’t basically endorse the combination of.
    • You can download the data here (slightly cleaned and anonymized) and do your own analysis. (If you do, I encourage you to share it!)

    Individual inferred gamma distributions

    The survey had a lot of questions (randomized between participants to make it a reasonable length for any given person), so this blog post doesn’t cover much of it. A bit more is on the page and more will be added.

    Thanks to many people for help and support with this project! (Many but probably not all listed on the survey page.)

    Cover image: Probably a bootstrap confidence interval around an aggregate of the above forest of inferred gamma distributions, but honestly everyone who can be sure about that sort of thing went to bed a while ago. So, one for a future update. I have more confidently held views on whether one should let uncertainty be the enemy of putting things up.

    1. Or, ‘human inability to control future advanced AI systems causing human extinction or similarly permanent and severe disempowerment of the human species’
    2. That is, ‘future AI advances causing human extinction or similarly permanent and severe disempowerment of the human species’
  • Book review: The Passenger by Lisa Lutz

    Spoiler warning: spoilers for The Passenger by Lisa Lutz, and In the Cart, by Anton Chekhov.

    I took up this book looking for a page-turner. It was, but hours before the end I thought its main contribution to my mental life would be the visceral knowledge that page-turners can be undelicious. It felt cold, and getting into its world felt bad. The protagonist slunk around dark and uncomfortable places, killing people, scheming harshly, perceiving low beams as dangers to the heads of tall men, and that sort of thing. With some amount of fretting about what she was becoming. I wanted to turn the pages, but I also kind of wanted it to end, and for me to read something more squarely enjoyable next time.

    Then the end was different, and kept coming back to me in the day, and turned the rest around and gave it new meaning. Like the image of a long dark tunnel becomes something different if you step away and see that it starts in the loveliest of fields. After a long dark tunnel of a book, we turn around and see where the heroine came from, and her teenage innocence and goodness newly fills out the potentially empty person we have been watching. And her past isn’t really discordant—I think I just wrote her off fast. This endless night began in that day, and seeing so makes the full picture both warm and intense, where it had seemed like cheap suspense and violence.

    In that abstract description, I suppose it is a bit like ‘In The Cart’ by Chekhov: we watch a dreary life, then at last that life is pulled into a perspective where we see its warm and hopeful start, unreachably distant. And it changes the flavor.

    I have not much idea if what I saw was what Lisa Lutz was going for, but if she was, and if other readers are like me (haha) then I think having 85% of the book be unpleasant was probably called for.

  • An inquiry into the thoughts of twenty-five people in India

    Sometimes I get excited about running surveys. Here is a Positly one from November 2020 in which I asked the following questions, to participants from India:

    1. What are you looking forward to this week?
    2. What do you think of as the most important thing going on in the world right now?
    3. If you had a spare half hour right now, what would you do with it?
    4. What is something you changed your mind about recently?
    5. What in life is more important than other people realize?
    6. If someone gave you $5 right now, what would you do with it?
    7. Who is someone you think of as a hero?
    8. Are you paying attention to the US election?
    9. What was the biggest news story this year?

    I don’t recall any notable constraints other than the location requirement, but I barely remember doing this.

  • Podcast: with Spencer Greenberg on groupstruckness and boundedness

    I talked to Spencer Greenberg a little while ago for his podcast Clearer Thinking. The recording is out today. According to his website we discussed:

    What does it mean to be “groupstruck”? How does groupstruck-ness differ from the bystander effect, normalcy bias, and other related cognitive biases? How do we break people out of being groupstruck? What does it mean to be a “bounded” person? How can we build up better decision-making heuristics? What sorts of decisions do people usually not quantify but should (and vice versa)? How can we make rational relationship decisions without coming across as “calculating” or cold? How does anthropic reasoning affect our hypotheses about the nature of the universe and life within it (i.e., the Fermi paradox, the simulation hypothesis, etc.)?

    You can listen or read the transcript here.

  • Proposal: Twitter dislike button

    The popular story of Twitter’s role in the ruin of civilization is that it is a runaway trash fire of reciprocal anger and offense, where otherwise nice people are possessed by overwhelming outrages, and drawn into throwing their own energy behind creating the vilest and most vindictive responses to what they see, turning away from reason and hurting others in turn, and so the place continues.

    I’m not sure how much of Twitter activity this accounts for (apparently Michael Nielsen enjoys an entirely different place, and my experience seems pretty nice too). But I think there’s a real pattern of this kind, which makes game theoretic sense, and goes something like this:

    1. People say things
    2. People read these things
    3. If anything seems objectionable to any of these people, then they repost those things with commentary, and everyone else reads them extra.
    4. In the next round, people (or the ones who get attention) say objectionable things (that they expect will get attention), about objectionable things (that they have their attention on from the last round)
    5. etc.

    To lay out the effects of all this more clearly:

    1. People disproportionately read things they don’t like, which is presumably bad for them
    2. People get the visceral sense that others are disproportionately writing things they don’t like, which is misleading, and not in a helpful-for-public-friendship way
    3. Things people don’t like get extra space in the public conversation
    4. People who tend to write things that others don’t like get extra power and attention instead of less
    5. Writing things other people don’t like is incentivized (if you want attention, writing things other people don’t like is probably somewhat better than writing things people do like, and way better than writing things they don’t feel strongly about).

    Supposing something like this model is true, and bad, it seems to me that there is a really simple solution: add a dislike button.

    That is, what if when a person sees a thing they don’t like, instead of broadcasting it to others, they register their disapproval by quietly clicking a different button next to the heart, and then Twitter shows it to other people less instead of more? You can still retweet it if you especially want other people to see it more, but adding attention wouldn’t be the default disapproval vote.

    This is not an original idea, and the other major websites that do it have not, to my knowledge, been run out of business by a dearth of disagreement. I think they are also not so much known for the above dynamic.

    I posit that a Twitter downvote button would be great. What am I missing?

  • Fighting in various places for a really long time

    The first time someone raved to me about seeing Everything Everywhere All at Once, I thought they were actually suggesting I see everything everywhere all at once, and I was briefly excited by the implication that this exhilarating possibility was somehow on the table.

    After that disappointment I heard about it several times more, and warmed to the idea of seeing the movie anyway, especially on account of it being the most roundly recommended one I remember. The third time someone invited me to see it with them, I went.

    And it seemed so astonishingly lacking to both of us that I left severely confused, and remain so. Like: I know people have different tastes. I know that I’m not the biggest movie appreciator (my ideal movie probably has a small number of visually distinct characters and nobody dies or does anything confusing, and I’ve already seen it twice). But usually I have some abstract guess about what other people are liking. Or, more realistically, a name for the category of mysterious attraction (“ah yes, you are into the ‘action’, and that means it’s good when helicopters crash or people shoot each other”). Yet here, I’m grasping even for that. “You like it because… it has much more prolonged fighting than usual and you like fighting?…or…it is some kind of irony thing about other movies?” I could believe that it was some kind of mediocre action movie. But usually my friends don’t go crazy for mediocre action movies. And here for instance one of my best friends, who I generally take to have subtle and sensitive and agreeable tastes, and who knows me extremely well, told me in particular to see it. And the strongest criticism I have seen of it outside of our post-movie discussion is another friend’s apparently sincere complaint on Facebook that it is probably only among the top hundred movies ever, not the top ten like people say. And it’s not that I just wasn’t wowed by it: it’s hard to remember the last time I was less compelled by a movie. (Though perhaps one doesn’t remember such things.) Like, I was really sitting there in the cinema thinking something along the lines of, ‘movies usually grab my attention somehow, yet this is doing some special thing differently to not have that happen? Huh?’

    I don’t know if I can spoil this movie, because whatever was good in it, I totally missed. But here I attempt spoilers. This is what happens in the movie, as far as I can tell:

    (Ok my companion and I actually failed to notice when it started, so maybe there was something important there. Oops.)

    A woman and her family run a laundromat, and are also working on their taxes. Her life is disappointing to her. A version of her husband appears from a different dimension and relays some kind of dimly coherent plot involving lots of dimensions and the need for her to jump between them and fight or something. Then they fight and jump between dimensions for about two hours. Their fighting involves some repeating motifs: 1) There is a humorous conceit that in order to jump between dimensions you have to do a strange action, for instance bite off and chew some lip balm. This joke is repeated throughout most of the fighting. One time the traveler has to put an object up their bottom, so that is pretty exciting humorwise. 2) Things often look cool. Like, there are lots of evocative objects and people are wearing make-up and neat costumes. 3) There is lots of jumping between dimensions. At some point it becomes clear that a baddie is actually the woman’s daughter, who has turned to nihilism as a result of either seeing everything all at once and that being kind of intrinsically nihilism-provoking due to its lack of permitting anything else, or as a result of having her lesbianism disrespected by her mother earlier. The fighting takes on a more nihilism vs. appreciating life flavor, and then it turns out that being friendly and warm is good, as represented by the father, and now appreciated by the mother. Then…actually I forget what happens at the end, sorry.

    I’m all for ‘nihilism vs. something something existential something something, life, kindness’ as a theme, but this seemed like such a shallow treatment of it. It just seemed like a bunch of fighting labeled ‘deep plot about nihilism etc’, and I don’t think caused me to have any interesting thoughts about such themes, except perhaps by reminding me of the general topic and leaving me without anything to distract my mind from wandering.

    It was clearly too violent for my liking, so that’s idiosyncratic, but it’s not like I’m always opposed to violence—some of the fighting in Lord of the Rings was quite moving, and I watched the whole of Game of Thrones in spite of also at other times using scenes from it in exposure therapy. But I posit that you need some sort of meaningful context to make violence interesting or moving, and I don’t think I caught that.

    I also speculate that some humor is meant to come from the protagonist being a middle aged immigrant Chinese woman, instead of the more standard young man. Which seems rude: as though it is asking for the props generally offered for featuring atypical demographics in films, yet is doing so as a joke.

    In sum, it seemed to me to be a bunch of fairly meaningless fighting interspersed with repetitive lowbrow humor and aesthetically pleasing props.

    I asked a couple of my friends to explain their alternate takes to me, but I don’t think I can do their explanations justice, due to not really understanding them. At a high level they disagreed with me about things like ‘was it extremely humorous?’ and ‘was it unusually engaging vs. unusually unengaging?’, but I didn’t understand why, at a lower level. Probably we all agree that it was visually cool, but I wasn’t actually stunned by that. Maybe visual attractiveness alone counts for less with me (though I recently saw Everything is Illuminated, which I found awesome in a confusingly soul-electrifying way and whose merit seems somehow related to visualness). One interesting thing that this discussion with EEAAO appreciators added was the point that there is something moving about the thought that in a different dimension you and the odious tax lady might be tender lovers. I agree that that’s a nice thought.

    I am hesitant to criticize here, because it is sweet of my friends to try to give me a nice movie recommendation, and I appreciate it. Also, I think in general that if Alice loves a thing and Bob doesn’t, it is much more likely that Bob is missing something wonderful than that Alice is imagining such a thing. (Though conversely if they agree that the thing is pretty good in ways, and Bob just hates it because it also has some overriding problem, then my guess would be the reverse: probably Alice is missing a thing.)

    So probably, somehow, other people are right. Please other people, help enlighten me more? (And thanks to some of my friends for trying!)

  • Stuff I might do if I had covid

    In case anyone wants a rough and likely inaccurate guide to what I might do if I had covid to mitigate it, I looked into this a bit recently and wrote notes. It’s probably better than if one’s plan was to do less than a few hours of research, but is likely flawed all over the place and wasn’t written with public sharing in mind, and um, isn’t medical advice [ETA May 11: also safety-relevant improvements are being made in the doc version, so I recommend looking at that.]:


    Here’s a Google doc version, where any comments you leave might be seen by the next person looking (and you might see comments added by others).

    Here’s a much longer doc with the reasoning, citations and more comments.

    (I continue to guess that long covid is worth avoiding.)

  • Why do people avoid vaccination?

    I’ve been fairly confused by the popularity in the US of remaining unvaccinated, in the face of seemingly a non-negligible, relatively immediate personal chance of death or intense illness. And due to the bubbliness of society, I don’t actually seem to know unvaccinated people to ask about it. So in the recent covid survey I ran, I asked people who hadn’t had covid (and thus for whom I didn’t have more pressing questions) whether they were vaccinated, and if not why not. (Note though that these people are 20-40 years old, so not at huge risk of death.)

    Their responses:

    1. I don’t think I need it, I don’t think covid is a big deal, I don’t think the vaccine works and the more the government/media pushes it the more I don’t want to ever get it. It should be a private decision between someone and their doctor, not Joe Biden and CNN saying comply or be kicked out of society.

    2. I still dont trust the information and safety claims made by the crooked FDA and CDC. Needs more research and study.

    3. I had a scary vaccine reaction previously.

    4. I am only 32 years old and in decent health so I think I would be fine if I caught COVID. It has almost been two years since the pandemic started and I haven’t gotten sick besides some minor colds. I would rather rely on natural immunity instead of the vaccine.

    5. dont want one

    6. Other health issues where my Doctor wants me to wait to get vaccinated.

    7. I think it is poison at worst and ineffective at best. Also the way the pushed it came off like a homeless man trying to lure me into an ally. The vaccine issue has made me lose 100% trust and faith in media and government i do not believe or trust anything from them anymore

    8. I have anxiety problems and other health issues.

    9. I’m actually scheduled to get a vaccination, I’m simply waiting for my appointment date.

    10. The places where the vaccination is available are always crowded and I am somewhat afraid of being outdoors or indoors in large crowds for too long these days.

    11. I am extremely skeptical of this “vaccine”, which has been proven ineffective (vaccinated people keep getting it, and what’s with all of the boosters?!) and has caused too many deaths and/or serious health issues/emergencies. I’ll take my chances on Covid any day over a “vaccine” that magically appeared from 8+ pharmaceutical companies in such an astonishingly short time. That is unheard of, and there is no way to know what is really in it, and what kind of problems those who are ignorant and fearful enough to get it will face later on down the road (that is, if they don’t die from the “vaccinations” themselves. The fact that the government has bribed, coerced, and is trying to FORCE everyone to get this “vaccine” is unprecedented, and quite frankly terrifying and sinister to me. I will NEVER allow a forced injection of god knows what into my own body, as that is wrong in every way there is. Vaccines allegedly keep those who get them safe from contracting whatever it is the vaccine is made for, so why would it be anyone’s business whether or not someone else is vaccinated? The Let’s Go Brandon administration and the dumbocrats that were fearmongering before the administration was even in place have done this by design. Spread fear, mistrust, and divisiveness, and lock people down to isolate them and make them suspicious and fearful of one another. Divide and conquer, because the fascist wannabe communists know that it is the only way to indulge their scummy self-serving agendas. United we stand.

    12. For health reasons.

    13. I’m not at high risk, so I don’t need it immediately. Waiting to see how effective it is.

    14. I don’t think it’s efficient…the prospect of being jabbed by endless boosters doesn’t appeal to me

    15. I’ve never had covid.

    16. I don’t trust the vaccine and have not have gotten covid since the pandemic started. Plus all of the misinformation surrounding covid, from the news media to social media sites. I just don’t know what to believe anymore but I feel that not getting vaccinated is the best option for me.

    17. I am still skeptical of the vaccine since people are still catching covid and its variants

    18. It doesn’t stop the transmission, or stop you from getting covid. I am at low risk anyways.

    19. I am at very low risk of infection as I do not work and wouldn’t accept any non-remote work. I also do not have any friends or family in the state where I moved to last year and I do not drink so I don’t go out to bars or events, or even restaurants.

    20. I simply have no trust in the big pharmacy and little more in the government. Throw in the fact that the vaccine was created too quickly and while it claims to have lowered infection rates and the severity of COVID-19, I’ve known several people who have been infected and they were fully vaccinated. In addition, I’ve seen many news articles reporting the same. Why should I receive a vaccine that provides such weak protection?

    21. Transportation and access. If a pill form or some self-administering option were available, I would.

    22. I don’t trust it.

    23. I have chosen not to take the vaccine

    24. Because I think its the Governments agenda to start killing people. I don’t trust Big Pharma or the Government. Also, people still get covid with the vaccine, seems like a waste of time.

    25. I live in Ohio, kinda wanna die. Also, it doesnt even dent the new variants; you still ge tit.

    26. do not trust them

    27. Because I believe that the vaccine is part of a either depopulation plan, or a tracking device. Either way there is no logical reason to get it.

    28. I am hesitant because of the possible side effects long-term because not much is known about it.

    29. I don’t see the benefit since I am not really at risk from covid.

    30. It’s against my religion.

    31. I don’t need a covid vaccination.

    32. I’m concerned about possible serious side effects from the vaccine that may show up some time down the road.

    33. Normally, I would consider it, but the talk of forced mandates has really soured me to it. I dont believe in taking away people’s health freedoms like that. I think it’s an insane intursion on civil liberties, and I fear if we all just accept it, we will become like Australia

    34. I only leave the house once or twice a week so it never really felt necessary to me.

    The biggest themes mentioned here seem to be distrust or hostility regarding the government and media and their various pro-vaccination pressurings, and the impression that the vaccine doesn’t work. (I wonder how much that one is relative to an expectation of ‘very occasional breakthrough cases’ that one might also trace to someone’s questionable communication choices.) An unspoken theme is arguably being in possession of relatively little information (like, what’s going on with the person who is still waiting to find out if the vaccines are effective? And do people just not know that the vaccine reduces the downside of covid, even if it is still possible to get it? Or do all of these people know things I don’t?).

    I wonder if much good could be done in this kind of situation with some kind of clearly independent and apolitical personal policy research institution, who prioritized being trustworthy and accurate. Like, I don’t trust the government or media especially either on this, but I and people I do trust have a decent amount of research capacity and familiarity with things like academic papers, so we arguably have an unusually good shot at figuring out the situation without having to make reference to the government or media much (assuming there aren’t more intense conspiracies afoot, in which case we will end up dead or tracking-deviced, I suppose). If I wasn’t in such a position, I feel like I’d pay a decent amount for someone to think about such things for me. I guess this is related to the service that doctors are meant to provide, though the thing I imagine would look very different to real doctors in practice.

  • Bernal Heights: acquisition of a bicycle

    The measure of a good bicycle, according to me, is that you can’t ride it without opening your mouth in joy and occasionally exclaiming things like ‘fuck yeah bicycle’. This is an idiosyncratic spec, and I had no reason to think that it might be fulfilled by any electric bicycle—a genre I was new to—so while I intended to maybe search through every electric bicycle in The New Wheel for one that produced irrepressible, physically manifest joy, I also expected this to likely fail, and to be meanwhile embarrassingly inexplicable and irritating to bike shop employees—people who often expect one’s regard for bikes to map fairly well to facts about their price, frame shape, and whether the gears are Shimano. But several bikes in, when I uncomfortably explained to a guy there that, while the ones I had tried so far were nice, bicycles had been known to make me, like, very happy, he said of course we should find the bicycle that I loved. So I at least felt somewhat supported in my ongoing disruption of his colleague’s afternoon, with requests that bicycle after bicycle be brought out for me to pedal around the streets of Bernal Heights.

    The guy would maneuver each bike out of the crowded shop and to the sidewalk, and adjust it to fit me, and we would chat, often about his suggestion that I maybe ride up to the hill on the other side of the main road. Which I would agree might be a good idea, before riding off, deciding that turning left was too hard, and heading in the other direction, through back streets and around a swooping circle park with a big ring road, where I would loop a few times if the mood took me.

    Some bicycles were heavy, and rode like refrigerators. Most bicycles were unsteady, and urged even my cycling-seasoned bottom to the seat while pedaling. Most bicycles added considerable assistance to going up hills. Many bicycles seemed fine.

    Bernal Heights, on the other hand, seemed awesome. As I paused before my habitual turn-in-the-wrong-direction one time, the house kitty-corner to me was playing music louder than I recall ever hearing anything play music that wasn’t a large-concert speaker. It was truly not considerate. And a middle-aged guy on my corner was having a great time, laughing, and was like, ‘Welcome to Cortland Avenue’. I pulled up and said that I hadn’t been here before actually, and didn’t know what he was talking about. He explained that Cortland Avenue was some kind of peaceful and placid place, and that they could use more of whatever this was.

    The whole street felt old-fashioned-okay somehow, and not really like my sense of modern America. I wanted to say it was a bit like the 80s in Hobart (the capital city of Tasmania, where I grew up) but since I’ve barely experienced the 80s in Hobart, I probably shouldn’t say that. Nothing shiny, nothing preying. Yellow things, sincere things, people who care about rock music, people who bought some vegetables, people talking to friends in streets and outdoor restaurants. So many electric bicycles—do I just not notice all the electric bicycles when I’m not on one? On one outing from the shop, a woman called out to me to say that she had also tried the bicycle that I was now riding, the other day, and wasn’t it good? I pulled up to tell her that I was actually struggling to find the ‘on’ switch, and she showed it to me.

    (To be clear, I converse with strangers in streets quite rarely in my normal life about three miles away.)

    I got a headache from all the bike-trying, and requested a lunch pause. Then I explored further down the street, and found a dineresque crêperie. I was practicing making imperfect choices fast, so in an uncharacteristic snap of decision I went in to get a lemon sugar crepe (which is only a potentially imperfect choice on axes other than deliciousness). The place was some kind of institution, and the man behind the counter seemed to be savoring the motions of crepe-provision. I had fun ordering, and sat outside. It was so nice there that I repeatedly tried to photograph it, but it wasn’t a kind of niceness that my phone could capture it seemed. Perhaps the fact that I was sitting in the street and didn’t look mildly distressed would convey something to an experienced viewer.

    Back at the bike shop, I had a bike in mind for if no amazing bike materialized, and continued working through the tail of the bike options.

    Then there was an amazing bike. There was not much visual foreshadowing of this: it was an unsleek thing, painted in an impure grey with questionable red highlights. But it felt like freedom. I could stand up on it. It moved as an extension of my body. An extension full of energy and exhilaration. My smile became round with delight and I swore gleefully. I rocketed up steep streets and to the circle park. I flew around it, elated, bumping over speed bumps, pedaling passionately around the upward side and flying down around the down. Then after quite a long investigation into a bike for which there was no actual open question, I made my way back to the shop.

    I said I’d buy it. They looked at their records, and their back storage, and their records and determined that they didn’t have one to sell. This was the floor bike, for trying, and not to be sold. Though they had a slightly bigger red one to sell.

    They carefully measured me, with a platform and a springed thing between my thighs and such, and determined that the red one was actually the right size for me, and the one I had ridden was too small. I wasn’t meant to ride the red one because it was a new bike for selling, not a floor model. But they would let me take it out a little anyway.

    It was nice. Was it as nice as the other one? I didn’t know—it seemed maybe less nice, but also now that I was obsessively paying attention to signs of ineffable goodness, and worried, I was probably just having less fun, no fault of the bike. It was basically the same as my perfect bike, but the right size, and possible to buy, and more beautiful, and not obviously less awesome, so probably I should get it and stop engaging in such fun-dampening neurosis.

    I went back. Then it occurred to me that I could still try that grey bike one more time. I did. It was awesome. It seemed obviously better than the red bike. It didn’t matter if I was caught in some tangle of neuroses: such joy would not be smothered. I stopped by the road and relayed my problem by text to my boyfriend, who wisely started googling for other stores that might have such a bike. Then I took photographs of the bike from all sides in the sun by the park. Then my phone with connectivity died. (For reasons to do with my own forgetfulness re phone charging and complications of phone plans, I had brought two phones: one with power, and one with connectivity.)

    I rode around and mentally rehearsed purchasing the floor bike. Did they need a floor bike for which they have no actual corresponding salable bikes? I’d pay as much as for a new bike. I’d pay more. They would be astonished and grateful. I’d talk to the manager, who would be free to disregard floor bike protocol, for such an exceptional case.

    I went back to the store. No, they would not sell me the floor bike. It didn’t belong to them. I could buy it in months, when floor bikes get replaced or something. I was also told: don’t do that—I hadn’t seen what the customers did to floor bikes. (What could customers possibly be doing to floor bikes to warrant such fear?) ‘Months’ was also about how long it would take them to order in a new bike.

    They let me use their wifi, and I reached my boyfriend again with my charged phone, and he had actually phoned a bunch of bike shops, like some kind of hero (or some kind of superhero with the ability to just talk to people in shops on the phone—if I phoned a bike shop, they might say something like “phh shu anganga mph ghe?” and I’d say I couldn’t hear them, but they wouldn’t hear me, and we’d go back and forth like that a few times, until it became too embarrassing to be borne). He had located a couple of very similar bikes, possibly one of them identical, at other bike shops in San Francisco and nearby Berkeley. I decided to go home and charge my phone, and so ended my and The New Wheel’s long afternoon together.

    At home, I charged my phone and acknowledged my failings re phone charging, and bravely acted on my boyfriend’s claim that it would be reasonable to just phone the most promising bike shop back to check it really was the same bike they had, before spending over an hour driving to Berkeley. It also became apparent that my other boyfriend would not hate taking me on a long bicycle-pursuing excursion in his car that evening.

    So we set out, me feeling kind of defensive and silly, because I could have got a bike that was better on every front except for ‘ineffable greatness’ hours earlier and with a lot less bothering other people. I vaguely attempted to defend myself as we went over the bridge, but it didn’t seem very necessary, and we got on to more interesting conversation.

    The bike shop, it turned out, was a few doors away from a house I used to live in, between a coffee shop with romantic memories, and a bench with different romantic memories, from multiple ancient times. Stepping into the thick past, I left my boyfriend to park the car and walked up to the new shop. It was a big warmly lit warehouse room, which I didn’t remember seeing when I lived here. Friendly: a place of children in baskets and wholesome rolled-up-trouser types. I read out the string identifying my desires to a plump, friendly man: “Gazelle ultimate T10+ 46inch”. He went looking for the corresponding item amidst the central sea of handlebars and frames. He couldn’t find it. Strange. He consulted his records, and the back storage, and his records, and another bike shop man. At last, it was right there—the problem had been that the record said that the bike was ‘dust’ colored, and he understandably hadn’t considered that someone would come up with that name for my beloved bike’s reddish-grey tone.

    Relieved but further paranoid for the preservation of the hard-to-measure magic, I got out my photographs from earlier, and asked him if there was anything different between the photograph and this bike. There was! The other bike had had some kind of fancy suspension seat post installed. They had the same for sale, so I asked for it. I rode the bike around half the block and back, and couldn’t tell if it was amazing, but it really was a short and constrained ride, and what more could I reasonably do? I bought it.

    We could barely push its giant, heavy body into the back of the car, and it made an alarming cracking sound, which hopefully was just the light changing position. We took it home.

    Another day, I took it out for a little ride, and it was great to power up the hills of San Francisco, and shoot along the flats. It’s the kind of bike that only adds power when you pedal, so it seemed that riding was still a lot of exercise, but your exertions got you all around the city, instead of halfway up the nearest hill. And while I just meant to have a little ride, I went further around the city than I may have ever been on a single outing. What had been an intractable country of mountainous slopes and distances and intersections like war zones was shrunk to my scale. And I felt safer, though faster, because usually getting out of the way of things requires my own feeble strength, which might be completely overcome by starting on a little bit of a bump or something. Now I could move more readily when I wanted to. And I could go fast enough to feel no guilt riding in the middle of the car lane, rather than the door-prone bicycle lane. I was about as fast as cars, and nimble.

    I had to admit though that it wasn’t the joy I had sought so hard. It was merely mundanely good. And I was tense, and San Francisco was frightening, and cars were everywhere, and it was all exhausting. Maybe now I was just too stressed, and it would be good later? Or maybe the bike was somehow adjusted slightly wrong, and the potential for that same joy could never be found among the myriad possible positional combinations? About twenty minutes from home, I realized that I actually couldn’t leave the bike and walk without my back hurting a lot. So, questionably, I got back on it, which was bearable, and rode home. I spent the next day or so in bed.

    Shorter rides favored the hypothesis that it was fun but not extraordinary.

    I returned to the first bike shop, and asked a man there to adjust my new bike to be exactly like their floor bike, ignoring the very likely possibility that they had adjusted the floor bike since my visit. He didn’t seem to obviously understand either the situation or my request, but was willing to make some changes, and phone me when done. He also sold me an expensive lock and some neat (and expensive) panniers. I went and tried to buy crepes, but the store was closed. But then I found a cafe and a restaurant next door to each other with back patios, a recent passion of mine, and took a nice iced coffee from one to the other, where I ate fries and read about words for an hour, in the company of a cat, someone else reading a book, and some kind of raw European music. It was pretty good. My phone died, because my life is too complicated and/or I’m an idiot. I went back to the bike store. He had changed some things, such as—promisingly—the angle of the handlebars. I rode home via the circle park, detouring to fly around it, iced coffee lodged in my front gear cables. It seemed amazing. But I’m uncertain, and doesn’t that mean it wasn’t? Well I smiled a lot on the way home, anyway.

  • Positly covid survey 2: controlled productivity data

    (This continues data from the survey I ran the other day on Positly; some other bits here and here.)

    Before asking everyone whether they had had covid, or have ongoing problems from it, or anything else, I asked them about their recent productivity relative to 2019. I was hoping to keep their productivity estimates from being influenced by their narratives about their covid situation and how it should affect their productivity.

    The clearest takeaway, I think: people who would later report ongoing post-covid symptoms also reported being way less productive on average than people who didn’t report any covid.

    Perhaps stupidly, I asked people about their productivity in one of three randomly selected ways, to better tell if results were sensitive to misunderstandings, wording or other vagaries of survey taking. While I stand by this for higher commitment surveys, here it meant trying to aggregate a bunch of slightly mismatched claims about how unproductive a year people are having, when I could have just been averaging a column of numbers. And people did give different-looking distributions of answers for different questions, and different numbers of people with covid got randomized into different kinds of questions, so combining them is messy.

    The different questions were these:

    1. How productive were you this week, relative to your average in 2019?
      • More than twice as productive
      • About 1.5x as productive
      • About as productive
      • About three quarters as productive
      • Less than half as productive
    2. How productive were you this week, as a percentage of average for you in 2019?

      (slider: 0-200%; instruction: For more than 200%, just put 200%)

    3. Were you much less productive in 2021 than in 2019?
      • Yes
      • No

    Below are the answers, divided up by question type. Note that the numbers of respondents involved are tiny and things are noisy. For example, the category question only got two long covid sufferers, and one of them was the nurse apparently working 200% as much as in 2019 but fatigued to the point of crashing her car. (In general, productivity varying because of demand as well as supply is an obvious-in-retrospect theme, according to my reading of the long answers to later questions.)

    | | 0-200% slider (average %) | Not ‘much less’ productive (fraction of people) | Category choice (average of 1st-5th option, higher = more productive) |
    |---|---|---|---|
    | Productivity of demographic | | | |
    | All respondents | 106% | 48% | 3.15 |
    | No covid | 108% | 54% | 3.09 |
    | Covid | 100% | 31% | 3.28 |
    | Covid & not long covid | 109% | 29% | 3.26 |
    | Long covid | 70% | 33% | 3.50 |
    | n in demographic | | | |
    | Total n | 85 | 54 | 89 |
    | Covid n | 18 | 13 | 25 |
    | Long covid n | 4 | 6 | 2 |

    (Probably I should have given you the ‘people who had covid more than two months ago’ numbers, but I didn’t and I’m short on time for this, sorry! All the data is available at the end.)

    It would be nice to combine all these answers. A natural way to do it might be to assign plausible-seeming numerical values to all the answers to questions 1 and 3, then average all these values.

    For instance, let’s use the following conversion:

    Question 1

    • More than twice as productive: 200%
    • About 1.5x as productive: 150%
    • About as productive: 100%
    • About three quarters as productive: 75%
    • Less than half as productive: 50%

    Question 3

    • Yes (i.e. much less productive): 80%
    • No: 100%

    With these we get:

    | Respondents | Average productivity relative to 2019* | n |
    |---|---|---|
    | All | 106% | 228 |
    | Covid | 106% | 56 |
    | Long covid | 89% | 12 |
    | Covid & not LC | 110% | 44 |
    | No covid | 106% | 172 |

    *According to the questionable quantification procedure described above (for those of you skipping to the table)
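    As a minimal sketch, the quantification procedure looks like this in Python. The conversion values are the ones given above; the function names and the sample responses are mine, invented for illustration (the real per-respondent data isn’t reproduced here):

    ```python
    # Convert each answer type to a common "% of 2019 productivity" scale,
    # then average across respondents regardless of which question they got.

    CATEGORY_TO_PCT = {  # question 1
        "More than twice as productive": 200,
        "About 1.5x as productive": 150,
        "About as productive": 100,
        "About three quarters as productive": 75,
        "Less than half as productive": 50,
    }
    YESNO_TO_PCT = {"Yes": 80, "No": 100}  # question 3

    def to_pct(question_type, answer):
        """Map one answer onto the common percentage scale."""
        if question_type == "slider":    # question 2: already a percentage
            return float(answer)
        if question_type == "category":  # question 1
            return CATEGORY_TO_PCT[answer]
        if question_type == "yesno":     # question 3
            return YESNO_TO_PCT[answer]
        raise ValueError(question_type)

    def combined_average(responses):
        """responses: list of (question_type, answer) pairs, one per person."""
        values = [to_pct(q, a) for q, a in responses]
        return sum(values) / len(values)

    sample = [
        ("slider", 120),
        ("category", "About as productive"),
        ("yesno", "No"),
        ("yesno", "Yes"),
    ]
    print(combined_average(sample))  # (120 + 100 + 100 + 80) / 4 = 100.0
    ```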

    A different thing to do is to call everything below 80% ‘much less productive’, then convert everything into the fraction of people avoiding that fate:

    | Demographic | P(not much less productive) | n |
    |---|---|---|
    | All | 68% | 228 |
    | Covid | 64% | 56 |
    | Long covid | 25% | 12 |
    | Covid & not LC | 75% | 44 |
    | No covid | 69% | 172 |
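    That reclassification can be sketched as follows; the per-question thresholds are my reading of the procedure (a plain ‘Yes’ to question 3 is counted as much less productive directly), and the sample answers are made up:

    ```python
    # Classify each answer as 'much less productive' (below 80% of 2019),
    # then compute the fraction of respondents avoiding that fate.

    def much_less(question_type, answer):
        if question_type == "slider":    # 0-200% slider answer
            return float(answer) < 80
        if question_type == "category":  # question 1 options below 80%
            return answer in ("About three quarters as productive",
                              "Less than half as productive")
        if question_type == "yesno":     # question 3: literally 'much less'
            return answer == "Yes"
        raise ValueError(question_type)

    sample = [
        ("slider", 120),
        ("category", "Less than half as productive"),
        ("yesno", "No"),
        ("yesno", "Yes"),
    ]
    fraction_ok = sum(not much_less(q, a) for q, a in sample) / len(sample)
    print(fraction_ok)  # 2 of 4 avoided it: 0.5
    ```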

    The ‘covid & not LC’ category is notably more productive than the ‘no covid’ category in both combinings, which seems interesting. Some things that might be going on:

    • Covid is overall negligibly bad for productivity, and the quarter of people whose lives would have gotten worse lately anyway classify it as ‘long covid’, leaving the other covid sufferers unusually productive. (This being the main thing going on seems inconsistent with other evidence to me, but maybe there’s some of that, or maybe my understanding of the other evidence is mistaken.)
    • People who get long covid are disproportionately those whose productivities were on a less uphill trajectory.
    • People who get covid and not long covid have a relatively productive time because they are freed from avoiding covid, whereas people without covid are often paying costs that are a large fraction of the cost of having covid to avoid it.
    • Maybe people’s sense of their own productivity is higher when they had reason to expect it might be low.
    • Maybe people with covid but no ongoing problems are disproportionately in households with other people who have some kind of problem from covid (including for instance having died, or having lost income from a brief illness), and capable of helping, and so are being more productive by necessity.
    • This data is so meagre and messy that reading anything into it that’s not about a huge effect is a mistake, and this is already too much thought on the topic.

    I tend to think it’s the last of these, but if anyone wants to actually do statistics, or anything else with this data:


  • Positly covid survey: long covid

    Here are some more careful results from a survey I ran the other day on Positly, to test whether it’s trivial to find people who have had their lives seriously impacted by long covid, and to get a better sense of the distribution of what people mean by things like ‘brain fog’, in bigger, vaguer, research efforts.

    Respondents are from the US, aged 20-40, and decided to take the survey—entitled ‘Lifestyle 2022’ with a longer description mentioning the pandemic—when offered good compensation for doing so, a couple of days ago.

    Most basic results

    • 228 responses
    • 56 people thought they had had covid; seven or eight of them on multiple occasions
    • 12 had ongoing health problems or symptoms after recovering from acute illness (21% of those infected)

    This was the basic long covid question:

    Do you seem to have ongoing health problems or symptoms after recovering from acute covid illness?

    In retrospect, it seems misinterpretable, but looking at other responses (see very bottom for all questions, or section ‘How bad are these long covid cases’ for more details on most of them), I don’t think they misinterpreted it much.

    Long covid from cases over two months ago, by vaccination status

    The most basic results are not so telling, because maybe all of those people with ongoing problems are among the masses recovering in recent weeks, and will recover from everything before long. So here we’ll just look at people who were infected more than two months ago.

    Also, let’s divide up rates by vaccination status at the time of infection.

    Minor details:

    • These results are about cases rather than people, i.e. people with two cases are registered twice below (I asked about the date, vaccination status and health consequences for each, so e.g. one person says they got long covid from the second case, so the first is counted under ‘no LC’).

    • ‘Vaccinated’ = had completed a full one- or two-shot course at least two weeks before infection


    | | All | Vac | No vac | (Vac N/A) |
    |---|---|---|---|---|
    | All | 39 | 6 | 31 | 2 |
    | LC | 9 | 2 | 7 | 0 |
    | no LC | 30 | 4 | 24 | 2 |
    | P(LC) | 0.23 | 0.33 | 0.23 | |

    In sum, in this (small) group:

    • Around a quarter of cases of covid >2 months ago were reported to precede some kind of ongoing health problems or symptoms
    • Around a sixth of covid cases happened to people vaccinated at the time
    • For vaccinated cases, the rate of long covid was about the same as for unvaccinated
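    The rates here follow directly from the case counts; as a quick sanity check (counts copied from the table, function name mine):

    ```python
    # Long covid rates by vaccination status, for cases from
    # more than two months ago, as reported in the survey.
    counts = {
        "vaccinated":   {"lc": 2, "no_lc": 4},
        "unvaccinated": {"lc": 7, "no_lc": 24},
    }

    def p_long_covid(group):
        """Fraction of this group's cases followed by ongoing symptoms."""
        g = counts[group]
        return g["lc"] / (g["lc"] + g["no_lc"])

    print(round(p_long_covid("vaccinated"), 2))    # 0.33
    print(round(p_long_covid("unvaccinated"), 2))  # 0.23
    ```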

    Or to put it visually:


    How bad are these long covid cases?

    A thing I wanted to know was how bad these ongoing symptoms are for people. So I asked some more quantitative questions, as well as for open-ended descriptions.

    There are only nine cases of long covid from more than two months ago, and I asked people a few different things about them, so it seems easier to tell you their answers than to try to summarize, especially because I think it’s helpful to see all of a person’s answers together:



    (8th column heading ends: “…you were on average in 2019”; for the second column, other options that nobody in this group agreed to were ‘I am unable to walk up stairs without resting afterward’, and ‘A doctor has diagnosed me with having specific organ damage’.)

    Taking answers at face value, about eight of these sound pretty bad to me, just on column Q grounds. However it seems likely that some amount of miscommunication is going on, so I trust more impressions coming from several questions. On that basis, maybe half sound like very likely a big deal to me, holistically (and most of the others plausibly so).

    To be maybe continued later…

    (Maybe coming up later: What’s going on with all these people who don’t want to be vaccinated? Are people still taking many precautions? Which ones? What fraction of people’s covid-infected acquaintances seem to be severely ongoingly unwell from it? —I had a lot of prepaid survey-taking minutes from people who didn’t have covid with which to satisfy some of my pandemic curiosities.)

    Full guidedtrack.com code for this survey

    *randomize: 1
    		*question: How productive were you this week, relative to your average in 2019?
    			More than twice as productive
    			About 1.5x as productive
    			About as productive
    			About three quarters as productive
    			Less than half as productive
    		*question: How productive were you this week, as a percentage of average for you in 2019?
    			*type: slider
    			*after: %
    			*max: 200
    			*before: For more than 200%, just put 200%
    			*save: slidersave
    		*question: Were you much less productive in 2021 than in 2019?
    			Yes
    			No
    *question: Have you had covid?
    	Yes
    		*set: covid
    	No
    		*set: nocovid
    	Unsure
    		*set: unsurecovid
    *if: covid
    	*question: Have you had covid more than once?
    		Yes
    			*set: multicovid
    		No
    	*if: not multicovid
    		*question: When did you get covid, roughly?
    			*type: calendar
    	*if: multicovid
    		*question: When did you get covid each time, roughly?
    			*type: paragraph
    	*question: Were you vaccinated when you got covid?
    		Yes, I'd had a full 1 or 2 shot course of vaccination for more than two weeks
    		Yes, and boosted for more than two weeks
    		No
    		I've had covid multiple times, and the answer is different for different times
    	*question: Do you seem to have ongoing health problems or symptoms after recovering from acute covid illness?
    		Yes
    			*set: longcovid
    		No
    		Too soon to say
    		*tip: If you have had covid more than once, answer yes if any bout of covid led to ongoing health problems
    *if: not covid
    	*question: How much are you taking precautions to avoid covid at the moment?
    		Not at all
    		A little
    		A lot
    		Very intensely
    		*save: cautionlevelslider
    	*question: Where do you think you fit among the population in taking precautions to avoid covid at the moment?
    		Among the least cautious 20% of people
    		Among the second least cautious 20% of people
    		In the middle 20% of people
    		Among the second most cautious 20% of people
    		Among the most cautious 20%
    		*save: cautionlevelslider
    	*question: What precautions are you taking lately to avoid covid?
    		*type: checkbox
    		Avoiding indoor crowds
    		Wearing masks indoors
    		Wearing masks outdoors
    		Reducing time indoors with friends
    		Avoiding offices
    		Caring for children to keep them from a high exposure setting
    		Wearing high quality masks, such as N95, KN95 or P100 masks
    		Washing or hand sanitizing extra
    		Reducing travel
    		Reducing time in stores
    		Using rapid tests before spending time together
    		Moving activities outdoors
    	*question: How many people do you know who had covid at least two months ago and survived?
    		*type: number
    		*tip: A rough guess is fine.
    	*question: How many of them are less than 40 years old?
    		*type: number
    		*tip: a rough guess is fine
    	*question: How many of those people (under 40 years old, survived covid at least two months ago) seem to be having longer term health problems as a result, that probably reduce their productivity by more than 20%?
    		*type: number
    		*tip: For instance, if your friends Alice (37 years old), Bob (24 years old) and Mary (92 years old) all had covid in 2020, and Alice and Mary have both been too tired to do anything since their infections, then the answer would be '1' (Alice). A rough guess is fine.
    	*question: Have you had a covid vaccination?
    		Yes
    		No
    			*set: novax
    	*if: novax
    		*question: Why haven't you had a covid vaccination?
    			*type: paragraph
    *if: longcovid
    	*question: Which of these things are true for you, as a result of having covid?
    		*type: checkbox
    		I feel substantially cognitively damaged
    		I had no covid symptoms at all, and have had no ongoing covid-related health problems
    		I am unable to read things I used to be able to read
    		I am unable to walk up stairs without resting afterward
    		My life is miserable
    		I am markedly less able to think clearly, more than half of the hours of the day
    		I had a really bad time for over a month
    		I have lingering symptoms, but nothing that makes my life more than 10% worse
    		My senses of taste and smell continue to be off
    		A doctor has diagnosed me with having specific organ damage
    		*tip: If you have had covid more than once, take into account the results of all bouts of covid you have had
    	*question: How much do your ongoing health problems from covid reduce your capacity to do things you would have normally done in a day?
    		*type: slider
    		*after: %
    	*if: multicovid
    		*question: Which bout(s) of covid led to ongoing health problems for you?
    			*type: paragraph
    			*tip: e.g. "second", or "both"
    	*question: Would you rather:
    		A) be rid of ongoing covid related health issues and symptoms forever
    			*set: prefcovone
    		B) a 10% increase in your income this year
    		C) Not applicable
    	*if: prefcovone
    		*question: Would you rather:
    			A) be rid of ongoing covid related health issues and symptoms forever
    				*set: prefcovtwo
    			B) a 50% increase in your income this year
    	*if: prefcovtwo
    		*question: Would you rather:
    			A) be rid of ongoing covid related health issues and symptoms forever
    			B) a 200% increase in your income this year
    *question: Are you currently working less as a result of the covid pandemic?
    	*type: checkbox
    	Yes, because I had covid and it left me with ongoing problems with physical or mental health
    	Yes, because of reduced availability of work
    	Yes, to avoid covid risks at work
    	Yes, to care for others (e.g. children)
    	Yes, because of other distractions, burdens or mental health issues stemming from the situation
    	Yes, for other reasons
    	No, I work as much as usual
    	*save: work
    *question: Roughly how many fewer hours do you think you are working per week than you were on average in 2019?
    	*type: number
    	*after: hours
    	*tip: e.g. If you used to work about 40 hours per week, and recently work about 35 hours per week, put "5 hours"
    *question: Do you have any chronic illness unrelated to covid?
    	Yes
    	No
    	Prefer not to say
    *if: covid
    	*question: In words, please tell us about the biggest impacts that having covid has had on your life
    		*type: paragraph
    		*tip: Numbers are especially helpful, e.g. 'I probably lose about ten minutes a day to coughing fits', or '20% reduction in my ability to work', or 'I used to run two miles a day, but now I can only manage one'.
    *question: In words, please tell us about the biggest impacts that the pandemic has had on your life
    	*type: paragraph
    	*tip: Numbers are especially helpful, e.g. 'I lose about twenty hours a week looking after my children because I can't send them to childcare', or '50% reduction in income, due to lack of available work', or 'I used to go to the gym twice a week, but it has mostly been closed'.
    *question: Anything else you'd like to say?
    	*type: paragraph
    Thank you!
    *program: EndOfActivityButton


  • Long covid: probably worth avoiding—some considerations

    I hear friends reasoning, “I’ll get covid eventually and long covid probably isn’t that bad; therefore it’s not worth much to avoid it now”. Here are some things informing my sense that that’s an error:

    A. Really bad anecdotes aren’t hard to find. I asked for people’s long-covid experiences on Twitter and got more than a hundred comments mostly full of truly awful stories, almost entirely from people I don’t know at all, presumably somehow brought via the active Long Covid community.

    This is a bit hard to interpret as a datum, due to the intense selection effects involved, but my main inferences are:

    1. The bad tail gets very bad.
    2. This bad tail is less like a quantitative reduction in labor than I had been picturing, and more like psychological torment: being very sleep deprived or brain damaged while dealing with a host of legitimately scary health problems (see below).
    3. The very bad part of the bad tail is not tiny. Like, if I requested lightning strike experiences, I don’t think I would get this kind of response.

    See this for more heavily selected but still scary anecdotes about what bad cases can look like.

    B. Bad anecdotes are common enough to show up in my vicinity. Among this cascade of terrible stories is a response from one of the handful of people from my extended network who I already knew had had covid, Oxford machine learning professor Michael Osborne. I take this as strong evidence that that level of bad experience isn’t vanishingly rare, though he has been well for the last few months, so this is only direct evidence of this kind of thing lasting for about two years.

    Excerpts from his account:

    (Another distant relative in my generation just told me that they have long covid, but I take that as much less evidence, since I only know about them having covid at all because they saw this long covid discussion.)

    C. Rates of ambiguously-maybe-quite-bad symptoms seem very high, even for people who only had mild covid

    This Norwegian study, n = 70k, has, for mild cases (in a sample mixing people who had covid 1-6 months prior with people who had it 11-12 months prior):

    • 10.2% with fatigue (6.4% higher than control in the 11-12 months case),
    • 7.1% with poor memory (3.5% higher than previous control),
    • 9.2% with brain fog (5.3% higher than previous control),
    • 6.9% with shortness of breath (5.6% higher than previous control).

    These numbers are for unvaccinated people; I’d maybe reduce them by a factor of two for being vaccinated, deferring to Matt Bell, who reasons, “A very recent report from the UK claims that full vaccination cuts the risk of lingering symptoms 28+ days after catching COVID by roughly half”.

    This meta-analysis of 81 studies finds:

    • “Approximately 1 in 3 individuals experienced fatigue 12 or more weeks following COVID-19 diagnosis.”

    • “Approximately 1 in 5 individuals exhibited cognitive impairment 12 or more weeks following COVID-19 diagnosis.”

    (Though I am told that a lot of the studies involved might have been small-n, hospitalized, no control, so I’m not sure how they dealt with that.)

    This survey suggests that among people with long covid (I think drawn from a support group), some especially undesirable symptoms are very common: “88.0% of the total respondents (3310) experienced either cognitive dysfunction or memory loss (Figure 8).”

    All of these symptoms come in degrees between slightly annoying and life-wrecking, making all of this hard to interpret. Maybe we can look at something more objective, like ‘can the person still do their job?’

    D. It looks like a lot of people can no longer do their jobs

    • Katie Bach of Brookings argued a few days ago that an estimate of 1.1 million people out of work because of long covid is reasonable, out of 103 million working age Americans she estimated had had covid, i.e. a roughly 1% chance of being entirely out of work:



    • Washington Post: “Hard data is not available and estimates vary widely, but based on published studies and their own experience treating patients, several medical specialists said 750,000 to 1.3 million patients likely remain so sick for extended periods that they can’t return to the workforce full time.”
    • This meta-analysis of 81 studies I mentioned earlier also looked at work: “29.0% and 47.4% of those who were employed premorbidly were not able to return to work”; “5% to 90% were unable to reach their pre-COVID employment level” (p. 128) (As noted earlier, a lot of the studies in the meta-analysis seem to be small n, involving hospitalized people, without controls, and I don’t know what they did about this. Also, it’s possible I’m misunderstanding what group the meta-analysis is about, given how crazy high the numbers are).

    Some harder to interpret data about long covid sufferers in particular (where I’m not sure how many people count as that) still suggests pretty major issues:

    Matt Bell says that this UK dataset has ~18% of non-hospitalized long covid sufferers with “activities limited a lot.”

    And looking at a survey I mentioned earlier again, people with long covid (and probably under selection to be relatively bad cases) report large losses of ability to work, and in other specific capabilities:

    • “The greatest area of impact reported was on work, with 86.2% (95% confidence interval 84.4 to 88.0%) of working respondents feeling mildly to severely unable to work - 29.1% (26.7% to 31.6%) severely…. Other areas of impact included making serious decisions 85.3% (80.7% to 89.8%), communicating thoughts and needs 74.8% (72.5% to 77.1%), having conversations with others 68.3% (65.8% to 70.8%), maintaining medication schedules 62.5% (59.8% to 65.1%), following simple instructions 54.4% (51.6% to 57.2%), and driving 53.2% (50.5% to 56.0%).” (p. 16)
    • “Of unrecovered respondents who worked before becoming ill, only 27.3% (95% confidence interval 25.3% to 29.4%) were working as many hours as they were prior to becoming ill at the time of survey, compared to 49.3% (40.8% to 57.9%) of recovered respondents (see Figure 11d). Nearly half 45.6% (43.2% to 48.0%) of unrecovered respondents were working reduced hours at the time of the survey, and 23.3% (21.3% to 25.4%) were not working at the time of the survey as a direct result of their illness. This included being on sick leave, disability leave, being fired, quitting, and being unable to find a job that would accommodate them. The remaining respondents retired, were volunteers, or did not provide enough information to determine their working status. Overall, 45.2% (42.9% to 47.2%) of respondents reported requiring a reduced work schedule compared to pre-illness. 22.3% (20.5% to 24.3%) were not working at the time of survey due to their health conditions.” p. 27.

    E. Other people’s previous back of the envelope calculations on this are not reassuring.

    Matt Bell:

    “If you’re a 35 year old woman, and your risk of ending up with lifelong long COVID from catching COVID is 2.8%, then catching COVID would be the same, statistically speaking, as losing (50 years * 0.18 * 0.028 * 365 days/year) = ~90 days of your life.”
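
    Bell’s arithmetic checks out; here is a quick rerun of it (all inputs, the 50 remaining years, the 18% quality-of-life reduction, and the 2.8% risk, are his stated assumptions, not mine):

```python
# Re-running Matt Bell's back-of-the-envelope estimate. All inputs are his
# stated assumptions: ~50 years of remaining life for a 35-year-old woman,
# long covid counted as an 18% quality-of-life reduction while affected,
# and a 2.8% chance of lifelong long covid conditional on catching covid.
remaining_years = 50
severity = 0.18
risk = 0.028
days_lost = remaining_years * severity * risk * 365
print(round(days_lost), "days")  # 92 days, matching his "~90 days of your life"
```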

    Scott Alexander:

    “Your chance of really bad debilitating lifelong Long COVID, conditional on getting COVID, is probably somewhere between a few tenths of a percent, and a few percent.”

    F. Having ‘survived’ covid looks associated with a 60% increased risk of death (after surviving covid) during the following six months

    According to a massive controlled study published in Nature (more readable summary here). It also looks like they are saying that this is for non-hospitalized covid patients, though the paper is confusing to me.

    I’m not sure whether to model this as some longer-than-a-few-weeks period of danger that is still done within the year (maybe 400 micromorts if done by six months), versus semi-permanently worse health (which would maybe be like aging about 8 years).
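
    For what it’s worth, a number in that ballpark falls out of a simple hedged calculation; the ~0.1% baseline annual mortality used below is my own rough assumption for a healthy thirty-something, not a figure from the study:

```python
# Hedged sketch: turning "60% increased risk of death over six months" into
# micromorts. The baseline annual mortality of ~0.1% for a healthy person in
# their 30s is an assumed round number, not taken from the Nature study.
baseline_annual_mortality = 0.001            # assumption: ~1000 micromorts/year
baseline_six_months = baseline_annual_mortality / 2
excess_risk = 0.6 * baseline_six_months      # the study's 60% relative increase
print(round(excess_risk * 1_000_000), "micromorts")  # 300, roughly the ~400 above
```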

    This is bad because death, but I’m maybe more alarmed by it because it supports the ‘post-covid illness is some fairly extreme body fuckage of which the fact that some people can’t think straight is merely a tip-of-the-iceberg symptom’ hypothesis over the ‘most of the story is that you lose some work time this year’ hypothesis. Both because death really suggests something wrong, and because they catalogue a lot of things wrong—from the WebMD summary: “The patients had a high rate of stroke and other nervous system ailments; mental health problems such as depression; the onset of diabetes; heart disease and other coronary problems; diarrhea and digestive disorders; kidney disease; blood clots; joint pain; hair loss; and general fatigue.”

    It’s also a bad sign about what a bad time you might have during that year. I don’t think these people go gently - it seems like a repeating theme of these long covid stories is that their victims have a lot of health scares and medical investigation, and this mortality data suggests to me that they are genuinely at risk of their lives, which I think would make the experience much worse, for me at least.

    G. Overall deaths from everything have been very unusually high at points in 2021, even in 15-64 age group

    From Our World In Data:

    For context:

    (The Center Square) – The head of Indianapolis-based insurance company OneAmerica said the death rate is up a stunning 40% from pre-pandemic levels among working-age people.

    “We are seeing, right now, the highest death rates we have seen in the history of this business – not just at OneAmerica,” the company’s CEO Scott Davison said during an online news conference this week. “The data is consistent across every player in that business.”

    OneAmerica is a $100 billion insurance company that has had its headquarters in Indianapolis since 1877. The company has approximately 2,400 employees and sells life insurance, including group life insurance to employers nationwide.

    Davison said the increase in deaths represents “huge, huge numbers,” and that it’s not elderly people who are dying, but “primarily working-age people 18 to 64” who are the employees of companies that have group life insurance plans through OneAmerica.

    “And what we saw just in third quarter, we’re seeing it continue into fourth quarter, is that death rates are up 40% over what they were pre-pandemic,” he said.

    “Just to give you an idea of how bad that is, a three-sigma or a one-in-200-year catastrophe would be 10% increase over pre-pandemic,” he said. “So 40% is just unheard of.”

    H. Sounds like these things involve various damage throughout the body, and my guess is that that ends up being regrettable in ways not captured in ‘hours lost to fatigue this year’

    See Nature study in F. I also feel like I’ve seen this a lot, but don’t have that many examples immediately on hand. Here’s one other example, not ideal because these are hospitalized younger people:

    For people younger than 65 who were hospitalized with COVID-19, the risk of death in the 12 months after the infection was 233% higher than it was for people who did not have the disease, results published in the journal Frontiers in Medicine have shown.

    Nearly 80% of all deaths of people in the study who had recovered from COVID-19 in the past 12 months were not due to cardiovascular or respiratory causes, suggesting that the impact of the virus is significant and wide-ranging, even after the initial infection has been fought off.

    My guess is that all the symptoms are a spectrum, and if the worst looks like an unbelievable amount of cognitive impairment and a potpourri of organ dysfunctions, or death post-infection, then probably everyone gets a handful of cognitive impairment and organ dysfunction.

    I. It’s easy to just think about these worst case outcomes, but there are a lot of more probable non-worst case outcomes that would still be a non-negligible bummer.

    I see people mostly estimating the worst cases, but my guess is that the more probable non-worst case outcomes (e.g. lesser fatigues and brain fogs, kidney damage, arrhythmias etc), are not nothing.

    J. Future unknowns

    Across all diseases, how much of their disutility is obvious in the first two years? Saliently right now: we’ve had Epstein-Barr for ages and only now noticed that it apparently has a ~1/300 chance of causing MS, usually showing up between 20 and 40, long after the virus, and wreaking brutal destruction. I’m not sure whether we would realize how bad HIV was if it had appeared two years ago and lots of people had it, but nobody had had it for more than two years yet.

    K. Long covid probably isn’t psychosomatic

    A French study found that long covid is barely associated with having had covid according to an antibody test, yet associated with believing one has had covid (which itself is unrelated to the antibody test results).

    At first I (and I think others) thought that if this wasn’t some error, then long covid was likely psychosomatic and not caused by physically having covid. But on further thought, that’s totally wrong: this pattern could be caused by beliefs causing illness, but it could also be caused by illness causing beliefs, which obviously happens all the time. That is, people’s guesses about whether they had covid are heavily influenced by their symptoms.

    Imagine a population of a thousand people. A hundred of them get covid this year, and ten get long covid caused by the virus. Fifty and five of those respectively notice that they have covid at the time. Also a hundred people get other illnesses that look like long covid (other viral infections, other mental health problems, other random things). Everyone knows that a lot of people are getting covid and don’t know they have covid. The hundred people who got some other illness that looks like long covid infer that they are plausibly in the asymptomatic covid group, since they seem to have long covid. Especially the ones who got some viral illness, so were actually a bit sick at around the time they developed their chronic illness. So we end up with people who think they have long covid having less covid than the general population, even though long covid caused by the covid virus is genuinely afflicting 10% of the covid infected. Note that everyone here might have non-psychosomatic illnesses.
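
    The toy numbers in that story can be tallied directly; everything below is the hypothetical population from the paragraph above, not real data:

```python
# Toy model from the paragraph above: beliefs about long covid can decouple
# from actual (antibody-verifiable) covid, even when the virus genuinely
# causes long covid in 10% of the people it infects.
population = 1000
had_covid = 100           # actually infected; would test antibody-positive
viral_long_covid = 10     # long covid genuinely caused by the virus
lookalike_illness = 100   # other illnesses that look like long covid

# Who believes they have long covid? All 10 with viral long covid (the 5 who
# never noticed their infection infer "asymptomatic covid" from symptoms),
# plus the 100 with lookalike illnesses who make the same inference.
believers = viral_long_covid + lookalike_illness   # 110 people
antibody_positive_believers = viral_long_covid     # only 10 of them

rate_among_believers = antibody_positive_believers / believers
rate_in_population = had_covid / population
print(f"{rate_among_believers:.1%} vs {rate_in_population:.1%}")
# Believers are slightly *less* likely to be antibody-positive (9.1% vs 10.0%),
# even though viral long covid really afflicts 10% of the infected.
```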

    It seems to me that we have other data that basically rules out the possibility that long covid is imaginary (e.g. see Nature study above on laboratory abnormalities and raised death rate). Though psychosomatic illness is weird - my understanding is that it could in principle still be psychosomatic, while yielding measurable physical laboratory abnormalities, though intuitively I’d be fairly surprised to learn that the same new psychosomatic syndrome had gripped millions in the wake of a particular belief they had, and raised their risk of death by half. Maybe I’m missing something here.

    L. The general discussion of what is going on with people’s brains sounds terrible

    The list of different plausible routes to brain damage occurring according to Nature—some brain cells getting broken, some blood restrictions causing little strokes, some immune system attacking your own brain type issues—is one I want very little to do with. For instance:

    …The researchers observed that, in slices of hamster brain, SARS-CoV-2 blocks the functioning of receptors on pericytes, causing capillaries in the tissue to constrict. “It turns out this is a big effect,” says Attwell.

    It’s a “really cool” study, says Spudich. “It could be something that is determining some of the permanent injury we see — some of these small-vessel strokes.”

    Another sample of current discussion of the brain damage situation that I really don’t want to be a part of, this time from NPR:

    Frontera led a study that found that more than 13% of hospitalized COVID-19 patients had developed a new neurological disorder soon after being infected. A follow-up study found that six months later, about half of the patients in that group who survived were still experiencing cognitive problems.

    The current catalog of COVID-related threats to the brain includes bleeding, blood clots, inflammation, oxygen deprivation and disruption of the protective blood-brain barrier. And there’s new evidence in monkeys that the virus may also directly infect and kill certain brain cells…

    Frontera was part of a team that studied levels of toxic substances associated with Alzheimer’s and other brain diseases in older COVID-19 patients who were hospitalized.

    “The levels were really high, higher than what we see in patients that have Alzheimer’s disease,” Frontera says, “indicating a very severe level of brain injury that’s happening at that time.”…

    Even COVID-19 patients who experience severe neurological problems tend to improve over time, Frontera says, citing unpublished research that measured mental function six and 12 months after a hospital stay.

    “Patients did have improvement in their cognitive scores, which is really encouraging,” she says.

    But half of the patients in one study still weren’t back to normal after a year….

    M. It sounds like covid maybe persists in your body for ages?

    Seems like the virus lives throughout your organs long after recovery, based on autopsies, including of mild/asymptomatic covid sufferers (summary, paper):

    “The study found that the virus had replicated across multiple organ systems even among patients with asymptomatic to mild COVID-19.

    The virus was detected in all 44 cases and across 79 of 85 anatomical locations and body fluids sampled”

    “We performed complete autopsies on 44 patients with COVID-19 to map and quantify SARS-CoV-2 distribution, replication, and cell-type specificity across the human body, including the brain, from acute infection through over seven months following symptom onset. We show that SARS-CoV-2 is widely distributed, even among patients who died with asymptomatic to mild COVID-19, and that virus replication is present in multiple pulmonary and extrapulmonary tissues early in infection. Further, we detected persistent SARS-CoV-2 RNA in multiple anatomic sites, including regions throughout the brain, for up to 230 days following symptom onset.”

    This seems like a bad sign for future problems.

    Ok, that was ‘covid is bad, primarily due to long covid’. But aren’t we all going to get it anyway?

    N. Later rounds of covid are probably bad too

    The reasoning above assumes that later covids are basically free once you’ve had it once, in a way that isn’t true for e.g. crashing your car. My guess is that later bouts are less bad on average, but far from free.

    In my survey, of three people with lasting problems who got covid at least twice, one got the problems with the first, one the second, and one said both contributed (though for the last person, the second was around a month ago). Not a great sample size, but seems like strong evidence that second-round long-covid isn’t unheard of.

    O. It’s not 100% that you will get it.

    I’d guess there’s a decent chance you’ll be able to avoid it, via reasonable-to-pay costs. For instance, maybe omicron basically brings herd immunity. Maybe rapid tests get more reliable, easy, and cheap. Maybe long covid becomes a bigger concern, and people worry enough to get R0 generally below 1. Or other technologies improve: Fend really does cut covid exhalations massively and is taken up widely, or something like Novid finds a way to be taken up. (Or your path takes you to being a hermit or living in China, or we spend next year hiding in bunkers from cyborg bees, and you are glad you didn’t pay all that covid risk up front for parties that aren’t happening.)

    (If we were doing exactly what we are doing so far, but with air purifiers in most buildings, would R0 have been ¼ instead of ~1, and would it have died out? Is the problem that we are psychologically incapable of maintaining slightly more caution than needed to keep infections steady?)
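
    For intuition on why R0 of ¼ rather than ~1 would be such a different world, here is a minimal sketch of how case counts scale over generations of infection (the starting count and number of generations are arbitrary illustrative choices):

```python
# How current infections scale over n generations of spread at a fixed R0.
# The starting count (100,000 cases) and horizon (10 generations, very
# roughly a couple of months) are arbitrary illustrative choices.
def cases_after(initial, r0, generations):
    return initial * r0 ** generations

start = 100_000
print(cases_after(start, 1.0, 10))   # R0 ~ 1: still 100,000 cases per generation
print(cases_after(start, 0.25, 10))  # R0 = 1/4: under a tenth of a case, i.e. extinct
```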

    Are things so predictable?

    P. The likelihood of you getting it probably does depend on how bad it is

    Whether you will definitely get covid depends on your—and everyone’s—assessment of how bad it would be to get. If the high end of possible long covid risks became clearly correct, probably many people would choose avoidance over paying those costs, so the options for a low-risk lifestyle would improve. Like, at first when people realized that it was potentially quite valuable to wash your hands after various bodily things, it would have been pretty annoying to be one of the few people trying to avoid contact with others’ so-called ‘germs’. One might have resigned oneself to getting every disease contained in anyone’s bodily fluids. But eventually the idea that hand-washing is important gained enough traction that everyone can care a lot about other people washing their hands after using the bathroom, say, and it imposes very little burden, because basically everyone has habits and infrastructure that make it easy to uphold hygiene.

    If everyone believed that covid was bad enough to be worth seriously avoiding, it would die out in a month (possibly with terrible economic downsides for a month). It probably isn’t that bad, but if it really is quite bad to get, and ways to avoid it exist, people will probably realize that eventually, and adjust.

    Q. Getting covid later is probably better than earlier.

    So far this trend seems strong: I would rather get covid now than in March 2020. I expect more of this, from better knowledge, medicine, vaccines, and availability of everything we already have.

    If I expect to get covid every year for the next five years, adding one more bout now is adding one more especially bad bout in expectation.

    R. Huge VOI in following behind the crowd, at a safe distance

    I know many people resigning themselves to getting covid this month, and results so far suggest that a lot of them will indeed be infected.

    If half of my friends are getting covid right now, there seems to be massive value in waiting another three months before reassessing precautions, so I can see how it goes for them. While noisy, I expect to get a better sense of the distribution of outcomes among people in my reference class—and especially whether there is a macroscopic chance of really dire outcomes—from this (I think regrettable) experiment than I get from cobbling together different studies about vaguely defined symptoms among ambiguous or not-quite-relevant demographics, with various guesstimated adjustments.

    There also seems to be a huge amount of uncertainty about how bad long covid is. I and my friends have various intuitions and guesses about where assorted factors of two push in this direction or that, and I doubt any of us should be super confident that we are the one with the right intuitions and guesses.

    I think waiting out this spike looks like the right move from a VOI perspective even if your own judgments come out in favor of long covid being probably fine. As long as you are decently uncertain—which it seems to me that you probably should be—there is a lot of information coming in (at least if you are someone in my social circles, but probably a lot of social circles are coming to have a lot more people with covid this month).


  • Survey supports ‘long covid is bad’ hypothesis (very tentative)

    I wanted more clues about whether really bad long covid outcomes were vanishingly rare (but concentrated a lot in my Twitter) or whether for instance a large fraction of ‘brain fogs’ reported in datasets are anything like the horrors sometimes described. So I took my questions to Positly, hoping that the set of people who would answer questions for money there was fairly random relative to covid outcomes.

    I hope to write something more careful about this survey soon, especially if it is of interest, but figure the basic data is better to share sooner. This summary is not very careful, and may e.g. conflate slightly differently worded questions, or fail to exclude obviously confused answers, or slightly miscount.

    This is a survey of ~230 Positly survey takers in the US, all between 20 and 40 years old. Very few of the responses I’ve looked at seem incoherent or botlike, unlike those in the survey I did around the time of the election.


    Have you had covid?

    Yes: 57

    No: 151

    Unsure: 22

    Do you seem to have ongoing health problems after recovering from acute covid illness?

    Yes: 12 out of 57 = 21%

    Are you currently working less as a result of the covid pandemic?

    “Yes, because I had covid and it left me with ongoing problems with physical or mental health” - about 6 tick this, though often in conjunction with other yes explanations.

    => 10% rate of people apparently working less at least partly due to Long Covid, among those who’ve had covid

    (Also, here is an example of someone responding that they work just as much as before:

    ‘I am a registered nurse, so I am around a lot of COVID. The lingering symptom that has been the worst is the fatigue. I feel like I am never rested. It actually lead to a car accident because I fell asleep driving after the second round of COVID…’

    Finding good questions is tricky - in a bad enough situation everyone might be doing very badly and yet look more productive due to necessity.)

    Agreement with a list of checkable statements about their lives

    8 people unambiguously checked boxes agreeing with statements that sounded especially brutal to me (they could check as many as they wanted from a longer list including less brutal things):

    • I feel substantially cognitively damaged
    • I am unable to walk up stairs without resting
    • I am markedly less able to think clearly, more than half the hours of the day
    • My life is miserable

    (I’m tentatively not including people who seemed to give conflicting answers in different places, though maybe some will make sense on further inspection)

    This seems to be 8/57 = 14% rate of brutal covid outcomes, though at least some of these are probably very recent - I haven’t filtered things out by when they got covid, though I did ask them (I’m hoping to go to sleep very soon).

    How much do your ongoing health problems from covid reduce your capacity to do things you would have normally done in a day?

    Only given to the 12 people who said they had ongoing health problems.


    Average: 46% reduction

    Median: 37% reduction

    People citing less than 30% reduction: 1

    For people with ongoing health issues, given a choice of A) ‘be rid of ongoing covid related health issues and symptoms forever’ or B) an increase in income this year:

    For 10% increase in income:

    6 would take health, 3 income, 2 N/A

    For 50% increase in income:

    3 would still take health, 3 would take income

    For 200% increase in income:

    1 still prefers health, 2 would take income

    Questions just given to people who didn’t have covid, about people they know:

    • How many people do you know who had covid at least two months ago and survived?
    • How many of them are less than 40 years old?
    • How many of those people (under 40 years old, survived covid at least two months ago) seem to be having longer term health problems as a result, that probably reduce their productivity by more than 20%?

    151 respondents estimated between them that they knew 479 people less than 40 years old who survived covid over two months ago (ignoring two people with implausibly high numbers of acquaintances). Of these, they estimated that 75 had developed longer term health problems that reduced their productivity by more than 20%.

    75/479 = 16%

    => Among youngish people who respondents knew to have recovered from covid more than two months ago, it seemed to them that about 16% of those people had more than 20% reduction in productivity from ongoing covid health effects
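    As a sanity check, the headline rates above can be recomputed from the raw counts. This is just the arithmetic from the summary, transcribed into a short script (no new data; the variable names are my own labels):

    ```python
    # Raw counts transcribed from the survey summary above.
    had_covid = 57
    ongoing_problems = 12      # reported ongoing health problems
    working_less = 6           # working less, citing long covid effects
    brutal_outcomes = 8        # unambiguously checked the 'brutal' statements
    known_under_40 = 479       # acquaintances <40 who survived covid 2+ months ago
    known_with_problems = 75   # of those, >20% productivity reduction reported

    def pct(numerator, denominator):
        # Percentage to one decimal place.
        return round(100 * numerator / denominator, 1)

    print(pct(ongoing_problems, had_covid))          # 21.1 (~21%)
    print(pct(working_less, had_covid))              # 10.5 (~10%)
    print(pct(brutal_outcomes, had_covid))           # 14.0 (~14%)
    print(pct(known_with_problems, known_under_40))  # 15.7 (~16%)
    ```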

  • Beyond fire alarms: freeing the groupstruck

    Crossposted from AI Impacts

    [Content warning: death in fires, death in machine apocalypse]

    ‘No fire alarms for AGI’

    Eliezer Yudkowsky wrote that ‘there’s no fire alarm for Artificial General Intelligence’, by which I think he meant: ‘there will be no future AI development that proves that artificial general intelligence (AGI) is a problem clearly enough that the world gets common knowledge (i.e. everyone knows that everyone knows, etc) that freaking out about AGI is socially acceptable instead of embarrassing.’

    He calls this kind of event a ‘fire alarm’ because he posits that this is how fire alarms work: rather than alerting you to a fire, they primarily help by making it common knowledge that it has become socially acceptable to act on the potential fire.

    He supports this view with a great 1968 study by Darley and Latané, in which they found that if you pipe a white plume of ‘smoke’ through a vent into a room where participants fill out surveys, a lone participant will quickly leave to report it, whereas a group of three (innocent) participants will tend to sit by in the haze for much longer1.

    Here’s a video of a rerun2 of part of this experiment, if you want to see what people look like while they try to negotiate the dual dangers of fire and social awkwardness.

    A salient explanation for this observation3 is that people don’t want to look fearful, and are perhaps repeatedly hit by this bias when they interpret one another’s outwardly chill demeanor as evidence that all is fine. (Darley and Latané favor a similar hypothesis, but where people just fail to interpret a stimulus as possibly dangerous if others around them are relaxed.)

    So on that hypothesis, thinks Eliezer, fire alarms can cut past the inadvertent game of chicken produced by everyone’s signaling-infused judgment, and make it known to all that it really is fire-fleeing time, thus allowing face-saving safe escape.

    With AI, Eliezer thinks people are essentially sitting by in the smoke, saying ‘looks fine to me’ to themselves and each other to avoid seeming panicky. And so they seem to be in need of the analogue of a fire alarm, and also (at least implicitly) seem to be expecting one: assuming that if there were a real ‘fire’, the fire alarm would go off and they could respond then without shame. For instance, maybe new progress would make AI obviously an imminent risk to humanity, instead of a finicky and expensive bad writing generator, and then everyone would see together that action was needed. Eliezer argues that this isn’t going to happen—and more strongly (though confusingly to me) that things will look basically similar until AGI—and so he seems to think that people should get a grip now and act on the current smoke or they will sit by forever.

    My take

    I forcefully agree with about half of the things in that post, but this understanding of fire alarms—and the importance of there not being one for AGI—is in the other half.

    It’s not that I expect a ‘fire alarm’ for AGI—I’m agnostic—it’s just that fire alarms like this don’t seem to be that much of a thing, and are not how we usually escape dangers—including fires—even when group action is encumbered by embarrassment. I doubt that people are waiting for a fire alarm or need one. More likely they are waiting for the normal dance of accumulating evidence and escalating discussion and brave people calling the problem early and eating the potential embarrassment. I do admit that this dance doesn’t look obviously up to the challenge, and arguably looks fairly unhealthy. But I don’t think it’s hopeless. In a world of uncertainty and a general dearth of fire alarms, there is much concern about things, and action, and I don’t think it is entirely uncalibrated. The public consciousness may well be oppressed by shame around showing fear, and so be slower and more cautious than it should be. But I think we should be thinking about ways to free it and make it healthy. We should not be thinking of this as total paralysis waiting for a magical fire alarm that won’t come, in the face of which one chooses between acting now before conviction, or waiting to die.

    To lay out these pictures side by side:

    Eliezer’s model, as I understand it:

    • People generally don’t act on a risk if they feel like others might judge their demonstrated fear (which they misdescribe to themselves as uncertainty about the issue at hand)
    • This ‘uncertainty’ will continue fairly uniformly until AGI
    • This curse could be lifted by a ‘fire alarm’, and people act as if they think there will be one
    • ‘Fire alarms’ don’t exist for AGI
    • So people can choose whether to act in their current uncertainty or to sit waiting until it is too late
    • Recognizing that the default inaction stems not from reasonable judgment, but from a questionable aspect of social psychology that does not appear properly sensitive to the stakes, one should choose to act.

    My model:

    • People act less on risks on average when observed. Across many people this means a slower ratcheting of concern and action (but way more than none).
    • The situation, the evidence and the social processing of these will continue to evolve until AGI.
    • (This process could be sped up by an event that caused global common knowledge that it is socially acceptable to act on the issue—assuming that that is the answer that would be reached—but this is also true of Eliezer having mind control, and fire alarms don’t seem that much more important to focus on than the hypothetical results of other implausible interventions on the situation)
    • People can choose at what point in a gradual escalation of evidence and public consciousness to act
    • Recognizing that the conversation is biased toward nonchalance by a questionable aspect of social psychology that does not appear properly sensitive to the stakes, one should try to adjust for this bias individually, and look for ways to mitigate its effects on the larger conversation.

    (It’s plausible that I misunderstand Eliezer, in which case I’m arguing with the sense of things I got from misreading his post, in case others have the same.)

    If most people at some point believed that the world was flat, and weren’t excited about taking an awkward contrarian stance on the topic, then it would indeed be nice if an event took place that caused basically everyone to have common knowledge that the world is so blatantly round that it can no longer be embarrassing to believe it so. But that’s not a kind of thing that happens, and in the absence of that, there would still be a lot of hope from things like incremental evidence, discussion, and some individuals putting their necks out and making the way less embarrassing for others. You don’t need some threshold being hit, or even a change in the empirical situation, or common knowledge being produced, or all of these things at once, for the group to become much more correct. And in the absence of hope for a world-is-round alarm, believing that the world is round in advance because you think it might be and know that there isn’t an alarm probably isn’t the right policy.

    In sum, I think our interest here should actually be on the broader issue of social effects systematically dampening society’s responses to risks, rather than on ‘fire alarms’ per se. And this seems like a real problem with tractable remedies, which I shall go into.

    I. Do ‘fire alarms’ show up in the real world?

    Claim: there are not a lot of ‘fire alarms’ for anything, including fires.

    How do literal alarms for fires work?

    Note: this section contains way more than you might ever want to think about how fire alarms work, and I don’t mean to imply that you should do so anyway. Just that if you want to assess my claim that fire alarms don’t work as Eliezer thinks, this is some reasoning.


    “One might think that the function of a fire alarm is to provide you with important evidence about a fire existing, allowing you to change your policy accordingly and exit the building.

    In the classic experiment by Latane and Darley in 1968, eight groups of three students each were asked to fill out a questionnaire in a room that shortly after began filling up with smoke. Five out of the eight groups didn’t react or report the smoke, even as it became dense enough to make them start coughing. Subsequent manipulations showed that a lone student will respond 75% of the time; while a student accompanied by two actors told to feign apathy will respond only 10% of the time. This and other experiments seemed to pin down that what’s happening is pluralistic ignorance. We don’t want to look panicky by being afraid of what isn’t an emergency, so we try to look calm while glancing out of the corners of our eyes to see how others are reacting, but of course they are also trying to look calm…

    …A fire alarm creates common knowledge, in the you-know-I-know sense, that there is a fire; after which it is socially safe to react. When the fire alarm goes off, you know that everyone else knows there is a fire, you know you won’t lose face if you proceed to exit the building.

    The fire alarm doesn’t tell us with certainty that a fire is there. In fact, I can’t recall one time in my life when, exiting a building on a fire alarm, there was an actual fire. Really, a fire alarm is weaker evidence of fire than smoke coming from under a door.

    But the fire alarm tells us that it’s socially okay to react to the fire. It promises us with certainty that we won’t be embarrassed if we now proceed to exit in an orderly fashion.”

    I don’t think this is actually how fire alarms work. Which you might think is a nitpick, since fire alarms here are a metaphor for AI epistemology, but I think it matters, because it seems to be the basis for expecting this concept of a ‘fire alarm’ to show up in the world. As in, ‘if only AI risk were like fires, with their nice simple fire alarms’.

    Before we get to that though, let’s restate Eliezer’s theory of fire response behavior here, to be clear (most of it also being posited but not quite favored by Darley and Latané):

    1. People don’t like to look overly scared
    2. Thus they respond less cautiously to ambiguous signs of danger when observed than when alone
    3. People look to one another for evidence about the degree of risk they are facing
    4. Individual underaction (2) is amplified in groups via each member observing the others’ underaction (3) and inferring greater safety, then underacting on top of that (2).
    5. The main function of a fire alarm is to create common knowledge that the situation is such that it is socially acceptable to take a precaution, e.g. run away.

    I’m going to call hypotheses in the vein of points 1-4 ‘fear shame’ hypotheses.

    fear shame hypothesis: the expectation of negative judgments about fearfulness ubiquitously suppresses public caution.

    I’m not sure about this, but I’ll tentatively concede it and just dispute point 5.
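    To make points 1-4 concrete, here is a toy numerical sketch of the fear shame dynamic. All function names, thresholds, and constants are my own invention for illustration; nothing here is measured or from the original experiments:

    ```python
    def perceived_risk(private_risk, calm_observers, discount=0.7):
        # Point 3: each outwardly calm observer is (mis)read as evidence of
        # safety, multiplying perceived risk down. The discount factor is an
        # arbitrary illustrative constant.
        return private_risk * discount ** calm_observers

    def leaves(private_risk, calm_observers,
               solo_threshold=0.2, observed_threshold=0.4):
        # Points 1-2: being observed raises the bar for acting, since visible
        # caution risks looking scared. Thresholds are arbitrary.
        threshold = solo_threshold if calm_observers == 0 else observed_threshold
        return perceived_risk(private_risk, calm_observers) > threshold

    # Point 4: the same private evidence sends a lone participant out the
    # door but leaves a group member sitting in the smoke.
    print(leaves(0.5, calm_observers=0))  # True  (0.5 > 0.2)
    print(leaves(0.5, calm_observers=2))  # False (0.5 * 0.49 ≈ 0.25 < 0.4)
    ```

    The point of the sketch is only that the same private evidence clears the lone-participant bar but not the observed-participant bar, and each additional calm face drags perceived risk down further.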

    Fire alarms don’t solve group paralysis

    A first thing to note is that fire alarms just actually don’t solve this kind of group paralysis, at least not reliably. For instance, if you look again closely at the rerun of the Darley and Latané experiment that I mentioned above, they just actually have a fire alarm4, as well as smoke, and this seems to be no impediment to the demonstration:

    The fire alarm doesn’t seem to change the high level conclusion: the lone individual jumps up to investigate, and the people accompanied by a bunch of actors stay in the room even with the fire alarm ringing.

    And here is a simpler experiment entirely focusing on what people do if they hear a fire alarm:

    Answer: these people wait in place for someone to tell them what to do, many getting increasingly personally nervous. The participants’ descriptions of this are interesting. Quite a few seem to assume that someone else will come and lead them outside if it is important.

    Maybe it’s some kind of experiment thing? Or a weird British thing? But it seems at least fairly common for people not to react to fire alarms. Here are a recent month’s tweets on the topic:

    The first video also suggests that the 1979 Woolworths fire killed ten people, all in the restaurant, because those people were disinclined to leave before paying their bill, due to a similar kind of unwillingness to diverge from normal behavior. I’m not sure how well supported that explanation is, but it seems to be widely agreed that ten people died, all in the restaurant, and that people in the restaurant had been especially unwilling to leave under somewhat bizarre circumstances (for instance, hoping to finish their meals anyway5, or having to be dragged out against their will6). According to a random powerpoint presentation I found on the internet, the fire alarm went off for four minutes at some point, though it’s possible that at that point they did try to leave, and failed. (The same source shows that all were found quite close to the fire escape, so they presumably all tried to leave prior to dying, but that probably isn’t that surprising.) This seems like probably a real case of people hearing a fire alarm and just not responding for at least some kind of weird social reasons, though maybe the fire alarm was just too late. The fact that everyone else in the 8-floor building managed to escape says there was probably some kind of fairly clear fire evidence.

    So, that was a sequence of terrifying demonstrations of groups acting just like they did in the Darley and Latané experiment, even with fire alarms. This means fire alarms aren’t an incredibly powerful tool against this problem. But maybe they make a difference, or solve it sometimes, in the way that Eliezer describes?

    How might fire alarms work? Let’s go through some possible options.

    By creating common knowledge of something to do with fire?

    This is Eliezer’s explanation above. One issue with it is that given that fire alarms are so rarely associated with fires (as Eliezer notes), the explanation ‘A fire alarm creates common knowledge, in the you-know-I-know sense, that there is a fire…’ seems like it must be markedly different from the precise mechanism. But if a fire alarm is not producing common knowledge of a fire, what is it producing common knowledge of, if anything?

    …common knowledge of the fire alarm itself?

    Fire alarms might produce common knowledge that there’s a fire alarm going off better than smoke produces common knowledge of smoke, since fire alarms are more aggressively observable, such that hearing one makes it very likely that others can hear it and can infer that you can hear it, whereas smoke can be observed more privately, especially in small quantities. Even if you point out the smoke in an attempt to create common knowledge, other people might think that you are mistaking steam for smoke due to your fear-tainted mindset. Smoke is more ambiguous. In the experiments, people who didn’t leave—seemingly due to being in groups—reportedly attributed their staying to the smoke probably not being smoke (which in fairness it wasn’t). Fire alarms are also ambiguous, but maybe less so.

    But it’s not obvious how common knowledge of the fire alarm itself avoids the problem, since then everyone has to judge how dire a threat a fire alarm is, and again one can have more and less fear-indicative choices.7

    …common knowledge of some low probability of fire?

    A perhaps more natural answer is that fire alarms produce common knowledge ‘that there is some non-negligible risk of fire, e.g. 1%’. This would be an interesting model, because if Eliezer is right that fire alarms rarely indicate fires and are probably less evidence of a fire than smoke8 then it must be that a) fire alarms produce common knowledge of this low chance of fire while smoke fails to produce common knowledge of a higher chance of fire, and b) common knowledge of a low risk is worth leaving for, whereas non-common knowledge of a higher risk is not worth leaving for.

    These both make sense in theory, strictly speaking:

    1. Fire alarms are intrinsically more likely to produce common knowledge (as described above)
    2. People might have a more shared understanding of the probability of fire implied by a fire alarm than of the probability of fire implied by smoke, so that common knowledge of smoke doesn’t produce common knowledge of an n% chance of danger but common knowledge of a fire alarm does.
    3. If you think there is a 5% risk of fire but that your friends might mistake you for thinking that there is a 0.01% risk of fire, then you might be less keen to leave than if you all have common knowledge of a 1% risk of fire.

    But in practice, it seems surprising to me if this is a good description of what’s going on. Some issues:

    • Common knowledge doesn’t seem that unlikely in the smoke case, where others are paying enough attention to see you leave.
    • If others actually don’t notice the smoke, then it’s not clear why leaving should even indicate fear to them at all. For instance, without knowing the details of the experiment in the video, it seems as though if the first woman with company had just quietly stood up and walked out of the room, she should not expect the others to know she is responding to a threat of fire, unless they too see the smoke, in which case they can also infer that she can infer that they have either seen the smoke too or they haven’t and have no reason to judge her. So what should she be scared of, on a story where the smoke just produces less common knowledge?
    • People presumably have no idea what probability of fire a fire alarm indicates, making it very hard for one to create common knowledge of a particular probability of fire among a group of people.

    Given these things, I don’t buy that fire alarms send people outside via creating common knowledge of some low probability of fire.

    …common knowledge that it isn’t embarrassing?

    Another possibility is that the fire alarm produces common knowledge of the brute fact that it is now not embarrassing to leave the building. But then why? How did it become non-embarrassing? Did the fire alarm make it so, or did it respond to the situation becoming non-embarrassing?

    …common knowledge of it being correct to leave?

    Maybe the best answer in this vicinity is ‘that there is a high enough risk that you should leave’. This sounds very similar to ‘that there is some particular low risk’, but it gloms together the ‘probability of fire’ issue and the ‘what level of risk means that you should leave’ issue. The difference is that if everyone was uncertain about the level of risk, and also about at what level of risk they should leave, the fire alarm is just making a bid for everyone leaving, thereby avoiding the step where they have to make a judgment about under what level of risk to leave, which is perhaps especially likely to be the step at which they might get judged. This also sounds more realistic, given that I don’t think anyone has much idea about either of these steps. Whereas I could imagine that people broadly agree that a fire alarm means that it is leaving time.

    On the other hand, if I imagine leaving a building because of a fire alarm, I expect a decent amount of the leaving to be with irritation and assertion that there is not a real fire. Which doesn’t look like common knowledge that it is the risk-appropriate time to leave. Though I guess viewed as a strategy in the game, ‘leave but say you wouldn’t if you weren’t being forced to, because you do not feel fear’ seems reasonable.

    In somewhat better evidence-from-imagination, if a fire alarm went off in my house, in the absence of smoke, and I went and stood outside and called the fire brigade, I would fear seeming silly to my housemates and would not expect much company. So I at least am not in on common knowledge of fire alarms being a clear sign that one should evacuate—I may or may not feel that way myself, but I am not confident that others do.

    Perhaps a worse problem with this theory is that it isn’t at all clear how everyone would have come to know and/or agree that fire alarms indicate the right time to leave.

    I think a big problem for these common knowledge theories in general is that if fire alarms sometimes fail to produce common knowledge that it isn’t embarrassing to escape (e.g. in the video discussed above), then it is hard for them to produce common knowledge most of the time, due to the nature of common knowledge. For instance, if I hear a fire alarm, then I don’t know whether everyone knows that it isn’t embarrassing for me to leave, because I know that sometimes people don’t think that. It could be that everyone immediately knows which case they are in by the nature of the fire alarm, but I at least don’t know explicitly how to tell.

    By providing evidence?

    Even if fire alarms don’t produce real common knowledge that much, I wouldn’t be surprised if they help get people outside in ways related to signaling and not directly tied to evidence of fire.

    For instance, just non-common-but-not-obviously-private evidence could reduce each person’s expected embarrassment somewhat, maybe making caution worth the social risk. That is, if you just think it’s more likely that Bob thinks it’s more likely that you have seen evidence of real risk, that should still reduce the embarrassment of running away.

    By providing objective evidence?

    Another similar thing that fire alarms might do is provide evidence that is relatively objective and relies little on your judgment, so you can be cautious in the knowledge that you could defend your actions if called to. Much like having a friend in the room who is willing to say ‘I’m calling it - this is smoke. We have to get out’, even if they aren’t actually that reliable. Or, like if you are a hypochondriac, and you want others to believe you, it’s nice to have a good physical pulse oximeter that you didn’t build.9

    This story matches my experience at least some. If a fire alarm went off in my house I think I would seem reasonable if I got up to look around for smoke or a fire. Whereas when I get up to look for a fire when I merely smell smoke, I think people often think I’m being foolish (in their defense, I may be a bit overcautious about this kind of thing). So here the fire alarm is helping me take some cautious action that I wanted to take anyway with less fear of ridicule. And I think what it is doing is just offering relatively personal-judgment-independent evidence that it’s worth considering the possibility of a fire, whereas otherwise my friends might suspect that my sense of smell is extremely weak evidence, and that I am foolish in my inclination to take it as such.

    So here the fire alarm is doing something akin to the job Eliezer is thinking of—being the kind of evidence that gives me widely acceptable reason to act without having to judge and so place the quality of my judgment on the line. Looking around when there’s a fire alarm is like buying from IBM or hiring McKinsey. But because this isn’t common knowledge, it doesn’t have to be some big threshold event—this evidence can be privately seen and can vary by person in their situation. And it’s not all or nothing. It’s just a bit helpful for me to have something to point to. With AI, it’s better if I can say ‘have you seen GPT-3 though? It’s insane’ than if I just say ‘it seems to me that AI is scary’. The ability of a particular piece of evidence to do this in a particular situation is on a spectrum, so this is unlike Eliezer’s fire alarm in that it needn’t involve common knowledge or a threshold. There is plenty of this kind of fire alarm for AI. “The median ML researcher says there is a 5% chance this technology destroys the world or something equivalently bad”, “AI can write code”, “have you seen that freaking avocado chair?”.

    My guess is that this is more a part of how fire alarms work than anything like genuine common knowledge is.

    Another motivation for leaving beside your judgment of risk?

    An interesting thing about the function of objective evidence in the point above is that it is not actually much to do with evidence at all. You just need a source of motivation for leaving the building that is clearly not very based on your own sense of fear. It can be an alarm telling you that the evidence has mounted. But it would also work if you had a frail mother who insisted on being taken outside at the first sign of smoke. Then going outside could be a manifestation of familial care rather than anything about your own fear. If the smell of smoke also meant that there were beers outside, that would also work, I claim.

    Some other examples I predict work:

    • If you are holding a dubiously covid-safe party and you actually want people who are uncomfortable with the crowding to go outside, then put at least one other thing they might want outside, so that they can e.g. wander out looking for the drinks instead of having to go and stand there in fear.
    • If you want people in a group who don’t really feel comfortable snorkeling to chicken out and not feel pressured, then make salient some non-fear costs to snorkeling, e.g. that each additional person who does it will make the group a bit later for dinner.
    • If you want your child to avoid reckless activities with their friends, say you’ll pay them $1000 if they finish high school without having done those things. This might be directly motivating, but it also gives them a face-saving thing they can say to their friends if they are ever uncomfortable.

    This kind of thing seems maybe important.

    By authority?

    A common knowledge story that feels closer to true to me is that fire alarms produce common knowledge that you are ‘supposed to leave’, at least in some contexts.

    The main places I’ve seen people leave the building upon hearing a fire alarm is in large institutional settings—dorms and schools. It seems to me that in these cases the usual thing they are responding to is the knowledge that an authority has decided that they are ‘supposed to’ leave the building now, and thus it is the default thing to do, and if they don’t, they will be in a conflict with for instance the university police or the fire brigade, and there will be some kind of embarrassing hullabaloo. On this model, what could have been embarrassment at being overly afraid of a fire is averted by having a strong incentive to do the fire-cautious action for other reasons. So this is a version of the above category, but I think a particularly important one.

    In the other filmed experiment, people were extremely responsive to a person in a vest saying they should go, and in fact seemed kind of averse to leaving without being told to do so by an authority.

    With AI risk, the equivalent of this kind of fire alarm situation would be if a university suddenly panicked about AI risk sometimes, and required that all researchers go outside and work on it for a little bit. So there is nothing stopping us from having this kind of fire alarm, if any relevant powerful institution wanted it. But there would be no reason to expect it to be more calibrated than random people about actual risk, much as dorm fire alarms are not more calibrated than random people about whether your burned toast requires calling the fire brigade. (Though perhaps this would be good, if random caution is better than consistent undercaution.)

    Also note that this theory just moves the question elsewhere. How do authorities get the ability to worry about fires, without concern for shame? My guess: often the particular people responding also have a protocol to follow, upheld by a further authority. For instance, perhaps the university police are required by protocol to keep you out of the building, and they too do not wish to cause some fight with their superiors. But at some point, didn’t there have to be an unpressured pressurer? A person who made a cautious choice not out of obedience? Probably, but writing a cautious policy for someone else, from a distance, long before a possible emergency, doesn’t much indicate that the author is shitting themselves about a possible fire, so they are probably totally free from this dynamic.

    (If true, this seems like an observation we can make use of: if you want cautious behavior in situations where people will be incentivised to underreact, make policies from a distance, and/or have them made by people who have no reason for fear.)

    I feel like this one is actually a big part of why people leave buildings in response to fire alarms. (e.g. when I imagine less authority-imbued settings, I imagine the response being more lax). So when we say there is no fire alarm for AI, are we saying that there is no authority willing to get mad at us if we don’t panic at this somewhat arbitrary time?

    One other nice thing to note about this model. For any problem, many levels of caution are possible: if an alarm causes everyone to think it is reasonable to ‘go and take a look’ but your own judgment is that the situation has reached ‘jump out of the window’ level, then you are probably still fairly oppressed by fear shame. Similarly, even if a foreign nation attacks an ally, and everyone says in unison, ‘wow, I guess it’s come to this, the time to act is now’, there will probably be people who think that it’s time to flee overseas or to bring out the nukes, and others who think it’s time to have a serious discussion with someone, and judgments will be flying. So for many problems, it seems particularly hard to imagine a piece of evidence that leads to total agreement on the reasonable course of action. The authority model deals with this because authority doesn’t mess around with being reasonable—it just cuts to the chase and tells you what to do.

    By norms?

    A different version of being ‘supposed to leave’ is that it is the norm, or what a cooperative person does. This seems similar in that it gives you reason to go outside, perhaps to the point of obligation, which is either strong enough to compel you outside even if you were still embarrassed, or anyway not related to whether you are fearful, and so unlikely to embarrass you. It still leaves the question of how a fire alarm came to have this power over what people are supposed to do.

    By commitment?

    Instead of having a distant authority compelling you to go outside, my guess is that you can in some situations get a similar effect by committing yourself at an earlier time where it wouldn’t have indicated fear. For instance, if you say, ‘I’m not too worried about this smoke, but if the fire alarm goes off, I’ll go outside’, then you have more reason to leave when the fire alarm does go off, while probably indicating less total fear. I doubt that this is a big way that fire alarms work, but it seems like a way people think about things like AI risk, especially if they fear psychologically responding to a gradual escalation of danger in the way that a boiling frog of myth does. They build an ‘alarm’, which sends them outside because they decided in the past that that would be the trigger.

    By inflicting pain?

    In my recollection, any kind of fire alarm situation probably involves an unbearably ear-splitting sound, and thus needs to be dealt with even if there is zero chance of fire. If leaving the building and letting someone else deal with it is available, it is an appealing choice. This mechanism is another form of ‘alternate motivation’, and I think is actually a lot like the authority one. The cost is arranged by someone elsewhere, in the past, who is free to worry on your behalf in such situations without shame; quite possibly the same authority. The added cost makes it easy to leave without looking scared, because now there is good incentive for even the least scared to leave, as long as they don’t like piercing shrieks (if you wanted to go really hard on signaling nonchalance, I think you could do so by just hanging out in the noise, but that end of the signaling spectrum seems like a separate issue).

    My guess is that this plays some role, speaking as a person who once fled an Oxford dorm enough times in quick succession to be fairly unconcerned by fire by the last, but who still feels some of the ungodly horror of that sound upon recollection.

    By alerting you to unseen fire?

    Even if some of these stories seem plausible at times, I find it hard to believe that they are the main thing going on with fire alarms. My own guess is that actually fire alarms really do mostly help by alerting people who haven’t received much evidence of fire yet, e.g. because they are asleep. I’m not sure why Eliezer thinks this isn’t so. (For instance, look up ‘fire alarm saved my life’ or ‘I heard the fire alarm’ and you get stories about people being woken up in the middle of the night or sometimes alerted from elsewhere in the building and zero stories about anything other than that, as far as I can tell on brief perusal. I admit though that ‘my friends and I were sitting there watching the smoke in a kind of nonchalant stupor and then the fire alarm released us from our manly paralysis’ is not the most tellable story.)

    I admit that the evidence is more confusing though: for instance, my recollection from a recent perusal of fire data is that people who die in fires (with or without fire alarms) are mostly not asleep. And the situation in general seemed pretty confusing. For instance, if I recall correctly, the most likely cause of a fatal fire appeared to be cigarette smoking, and the most likely time for it was the early afternoon. And while ‘conscious person smoking a cigarette at 1pm sets their room on fire and fails to escape’ sounds possible, I wouldn’t have pinned it as a central case. Some data also seemed contradictory, and I can’t find most of it again now, so I wouldn’t put much stock in any of this, except to note confusion.

    My guess is still that this is a pretty big part of how fire alarms help, based on priors and not that much contrary evidence.

    In sum: not much fire alarm for fires

    My guess is that fire alarms do a decent mixture of many things here: sometimes they provide straightforward evidence of fires, sometimes they wake people up, sometimes they compel people outside through application of authority or unbearable noise, and sometimes they probably even make it less embarrassing to react to other fire evidence, either via creating common knowledge or just via being an impersonal standard that one can refer to.

    So perhaps Eliezer’s ‘creating common knowledge of risk and so overcoming fear shame’ mechanism is part of it. But even if so, I don’t think it’s so much a distinct thing. Like, there are various elements here that are helpful for combatting fear shame—evidence about the risk, impersonal evidence, a threshold in the situation already deemed concerning in the past, common knowledge. But there’s not much reason or need for them to come together in a single revolutionary event. And incremental versions of these things also help—e.g. a few people thinking it’s more likely that a concern is valid, or common knowledge of some compelling evidence among five people, or someone making a throwaway argument for concern, or evidence that some other people think the situation is worse without any change in the situation itself.

    So—I think fire alarms can help people escape fires in various ways, some of which probably work via relieving paralysis from fear shame, and some of which probably relate to Eliezer’s ‘fire alarm’ concept, though I doubt that these are well thought of as a distinct thing.

    And on the whole these mechanisms are a lot more amenable to partialness and incremental effects than suggested by the image of a single erupting siren pouring a company into a parking lot. I want to put fire alarms back there with many other observations, like hearing a loud bang, or smelling smoke: ambiguous and context dependent and open to interpretation that might seem laughable if it is too risk-averse. In the absence of authority to push you outside, probably people deal with these things by judging them, looking to others, discussing, judging more, iterating. Fire alarms are perhaps particularly strong as a form of evidence, but I’m not sure they are a separate category of thing.

    If this is what fire alarms are, we often either do or could have them for AGI. We have evolving evidence. We have relatively person-independent evidence about the situation. We have evidence that it isn’t embarrassing to act. We have plenty of alternate face-saving reasons to act concernedly. We have other people who have already staked their own reputation on AGI being a problem. All of these things we could have better. Is it important whether we have a particular moment when everyone is freed of fear shame?

    Is there a fire alarm for other risks?

    That was all about how fire alarms work for fires. What about non-fire risks? Do they have fire alarms?

    Outside of the lab, we can observe that humans have often become concerned about things before they were obviously going to happen or cause any problem. Do these involve ‘fire alarms’? It’s hard for me to think of examples of situations where something was so clear that everyone was immediately compelled to act on caution, without risk of embarrassment, but on the other hand thinking of examples is not my forte (asking myself now to think of examples of things I ate for breakfast last week, I can think of maybe one).

    Here are some cases I know something about, where I don’t know of particular ‘fire alarms’, and yet it seems that caution has been abundant:

    1. Climate change: my guess is that there are many things that different people would call ‘fire alarms’, which is to say, thresholds of evidence by which they think everyone should be appalled and do something. Among things literally referred to as fire alarms, according to Google, are the Californian fires and the words of Greta Thunberg and scientists. Climate change hasn’t become a universally acknowledged good thing to be worried about, though it has become a required thing to be worried about on the left; if some particular event prompted that, it might have been a lot like a fire alarm, but I don’t know of one.
    2. Ozone hole: on a quick Wikipedia perusal, the closest thing to a fire alarm seems to be that “in 1976 the United States National Academy of Sciences released a report concluding that the ozone depletion hypothesis was strongly supported by the scientific evidence” which seems to have caused a bout of national CFC bannings. But this was presumably prompted by smaller groups of people already being concerned and investigating. This seems more like ‘one person smells smoke and goes out looking for fire, and they find one and come back to report and then several of their friends also get worried’.
    3. Recombinant DNA: my understanding is that the Asilomar conference occurred after an escalation of concern beginning with a small number of people being worried about some experiments, with opposition from other scientists until the end.
    4. Covid: this seems to have involved waves of escalating and de-escalating average concern, with very high variance in individual concern and action, in which purportedly some people have continued to favor more incaution to their graves, and others have seemingly died of caution. I don’t know if there has ever been near-universal agreement on anything, and there has been ample judgment in both directions about degrees of preferred caution.
    5. Nuclear weapons: I don’t know enough about this. It seems like there was a fairly natural moment for everyone in the world to take the risk seriously together, which was the 6th of August 1945 bombing of Hiroshima. But if it was a fire alarm, it’s not clear what evacuating looks like. Stopping being at war with the US seems like a natural candidate, but three days later Japan hadn’t surrendered and the US bombed Nagasaki, which suggests Hiroshima was taken as less of a clear ‘evacuation time’. But I don’t know the details, and for instance, maybe surrendering isn’t straightforwardly analogous to evacuating.
    6. AI: It seems like there has been nothing like a ‘fire alarm’ for this, and yet, for instance, most random ML authors agree that there is a serious risk.10

    My tentative impression is that history has plenty of concerns built on ambiguous evidence. In fact looking around, it seems like the world is full of people with concerns that are not only not shared by that many others, but also harshly judged. Many of which seem so patently unsupported by clinching evidence that it seems to me ‘rational socially-processed caution dampened by fear shame’ can’t be the main thing going on. I’ll get more into this later.

    Summary: there are no ‘fire alarms’ for anything, and it’s fine (kind of)

    In sum, it seems to me there is no ‘fire alarm’ for AGI, but also not really a fire alarm for fires, or for anything else. People really are stymied in responding to risks by fear of judgment. Many things can improve this, including things that fire alarms have. These things don’t have to be all or nothing, or bundled together, and there is plenty of hope of having many of them for AGI, if we don’t already.

    So upon noting that there will be no fire alarm for AGI, if your best guess previously was that you should do nothing about AGI, I don’t think you should jump into action on the grounds that you will otherwise be forever blind to a true signal. You should instead try to read the signals around you, looking out for these biases toward incaution.

    But also: fire alarms are built

    I think it’s interesting to notice how much fire alarms are about social infrastructure. Reading Eliezer’s post, I got the impression of the kind of ‘fire alarm’ that was missing as a clear and incontrovertible feature of the environment. For instance, an AI development that would leave everyone clear that there was danger, while still being early enough to respond. But the authority and pain infliction mechanisms are just about someone having created a trigger-action plan for you, and aggressive incentives for you to follow it, ahead of time. Even the common knowledge mechanisms work through humans having previously created the concept of a ‘fire alarm’ and everyone somehow knowing that it means you go outside. If fire alarms were instead a kind of organic object that we had discovered, with the kind of sensitivity to real fires that fire alarms have, I don’t even think that we’d run outside so fast. (I’m not actually even sure we would think of them as responding to fire—or like, maybe it would be rumored or known to fire alarm aficionados?)

    Developments are basically always worrying for some people and not for others, so it seems hard for anything like common knowledge to come from a particular development. If you want something like universal common knowledge that such-and-such is non-embarrassing to think now, you are more likely to get it with a change in the social situation. E.g. “Stephen Hawking now says AI is a problem” is arguably more like a fire alarm in this regard than AlphaGo—it is socially constructed, and involves someone else taking responsibility for the judgment of danger.

    Even the components of fire alarm efficacy that are about conveying evidence of fire—to a person who hadn’t seen smoke, or understood it, or who was elsewhere, or asleep—are not naturally occurring. We built a system to respond to a particular subtle amount of smoke with a blaring alarm. The fact that there isn’t something like that for AI appears to be because we haven’t built one. (New EA project proposal? Set up an alarm system so that when we get to GPT-7, piercing alarms blare from all buildings until it’s out and responsible authorities have checked that the situation is safe.)

    II. Fear shame and getting groupstruck

    I think a better takeaway from all this research on people uncomfortably hanging out in smoke-filled rooms is the fear shame hypothesis:

    Shame about being afraid is a strong suppressor of caution.

    Which is also to say:

    your relaxed attitude to X is partly due to uncalibrated avoidance of social shame, for most X

    (To be more concrete and help you to try out this hypothesis, without intending to sway you either way:

    • Your relaxed attitude to soil loss is partly due to uncalibrated avoidance of social shame
    • Your relaxed attitude to risk from nanotechnology is partly due to uncalibrated avoidance of social shame
    • Your relaxed attitude to risk from chemicals in paint is partly due to uncalibrated avoidance of social shame
    • Your relaxed attitude to Democratic elites drinking the blood of children is partly due to uncalibrated avoidance of social shame
    • Your relaxed attitude to spiders is partly due to uncalibrated avoidance of social shame)

    How is information about risk processed in groups in practice by default?

    Here it seems helpful to have a model of what is going on when a group responds to something like smoke, minus whatever dysfunction or bias comes from being scared of looking like a pansy.

    The standard fire-alarm-free group escape

    In my experience, if there is some analog of smoke appearing in the room, people don’t just wait in some weird tragedy of the commons until they drop dead. There is an escalation of concern. One person might say ‘hey, can you smell something?’ in a tone that suggests that they are pretty uncertain, and just kind of curious, and definitely not concerned. Then another person sniffs the air and says in a slightly more niggled tone, ‘yeah, actually - is it smoke?’. And then someone frowns as if this is all puzzling but still not that concerning, and gets up to take a look. And then if anyone is more concerned, they can chime in with ‘oh, I think there’s a lot of dry grass in that room too, I hope the spark generator hasn’t lit some of it’, or something.

    I’m not sure whether this is an incredibly good way to process information together about a possible fire, but it seems close to a pretty reasonable and natural method: each person expresses their level of concern, everyone updates, still-concerned people go and gather new information and update on that, this all repeats until the group converges on concern or non-concern. I think of this as the default method.

    It seems to me that what people actually do is this plus some adjustments from e.g. people expecting social repercussions if they express a different view to others, and people not wanting to look afraid. Thus instead we see the early reports of concern downplayed emotionally, for instance joked about, both allowing the reporter to not look scared, and also making it a less clear bid for agreement, so allowing the other person to respond with inaction, e.g. by laughing at the joke and dropping the conversation. I’m less clear on exactly what I observe that makes me think there is also a pull toward agreeing (that saying a thing is like making a bid for others to agree, and that disagreeing is a slightly costly social move), beyond my intuitive sense of such situations.

    It’s not obvious to me that crippling embarrassment is a bias on top of this kind of arrangement, rather than a functional part of it. If each person has a different intrinsic level of fear, embarrassment might be genuinely aligning people who would be too trigger-happy with their costly measures of caution. And it’s not obvious to me that embarrassment doesn’t also affect people who are unusually incautious. (Before trying to resolve embarrassment in other ways, it seems good to check whether it is a sign that you are doing something embarrassing.)

    Two examples of groups observing ambiguous warning signs without fire alarms in the wild, from the time when Eliezer’s post came out and I meant to write this:

    1. At about 3am my then-boyfriend woke up and came and poked his head around my door and asked whether I could smell smoke. I said that I could, and that I had already checked the house, and that people on Twitter could also smell it, so it was probably something large and far away burning (as it happened, I think Napa or Sonoma). He went to bed, and I checked the house one more time, to be sure and/or crazy.
    2. I was standing in a central square in a foreign city with a group of colleagues. There was a very loud bang, that sounded like it was a stupendously loud bang some short distance away. People in the group glanced around and remarked on it, and then joked about it, and then moved to other topics. I remained worried, and surreptitiously investigated on my phone, and messaged a friend with better research resources at hand.

    I think Case 2 nicely shows the posited fear shame (though both cases suggest a lack of it with close friends). But in both cases, I think you see the social escalation of concern thing. In the first case my boyfriend actually sought me out to casually ask about smoke, which is very surprising on a model where the main effect of company is to cause crippling humiliation. Then it didn’t get further because I had evidence to reassure him. In the second case, you might say that the group was ignoring the explosion-like-thing out of embarrassment. But I hypothesize that they were actually doing a ratcheting thing that could have led to group fear, that quickly went downward. They remarked casually on the thing, and jokingly wondered about bombs and such. And I posit that when such jokes were met with more joking instead of more serious bombs discussion, the ones who had been more concerned became less so.

    The smoke experiment video also suggests that this kind of behavior is what people expect to do: the first woman says, “I was looking for some sort of reaction from someone else. Even just the slightest little thing, that they’d recognize that there was something, you know, going on here. For me to kind of, react on that and then do something about it. I kind of needed prodding.”

    I think this model also describes metaphorical smoke. In the absence of very clear signs of when to act, people indeed seem embarrassed to seem too concerned. For instance, they are sometimes falling over themselves to be distanced from those overoptimistic AI-predictors everyone has heard about. But my guess is that they avoid embarrassment not by sitting in silence until they drown in metaphorical smoke, but with a social back and forth maneuver—pushing the conversation toward more concern each time as long as they are concerned—that ultimately coordinates larger groups of people to act at some point, or not. People who don’t want to look like feverish techno-optimists are still comfortable wondering aloud whether some of this new image recognition stuff might be put to ill-use. And if that goes over well, next time they can be a little more alarmist. There is an ocean of ongoing conversation, in which people can lean a little this way and that, and notice how the current is moving around them. And in general—before considering possible additional biases—it isn’t clear to me that this coordination makes things worse than the hypothetical embarrassment-free world of early and late unilateral actions.11

    In sum I think the basic thing people do when responding to risks in a group is to cautiously and conformingly trade impressions of the level of danger, leading to escalating concern if a real problem is arising.


    A notable problem with this whole story so far is that people love being concerned. Or at least, they are often concerned in spite of a shocking dearth of evidential support, and are not shy about sharing their concerns.

    I think one thing going on is that people mostly care about criticism coming from within their own communities, and that for some reason concerns often become markers of political alignment. So if for instance the idea that there may be too many frogs appearing is a recognized yellow side fear, then if you were to express that fear with great terror, the whole yellow side would support you, and you would only hear mocking from the heinous green side. If you are a politically involved yellow supporter, this is a fine state of affairs, so you have no reason to underplay your concern.

    This complicates our pluralistic inaction story so much that I’m inclined to just write it off as a different kind of situation for now: half the people are still embarrassed to overtly express a particular fear, but for new reasons, and the other half are actively embarrassed to not express it, or to express it too quietly. Plus everyone is actively avoiding conforming with half of the people.

    I think this kind of dynamic is notably at play in the climate change case, and weirdly-to-me also with covid. My guess is that it’s pretty common, at least to a small degree, and often not aligned with the major political sides. Even if there are just sides to do with the issue itself, all you need for this is that people feel a combination of good enough about the support of their side and dismissive enough of the other side’s laughter to voice their fears.

    In fact I wonder if this is not a separate issue, and actually a kind of natural outcome of the initial smelling of smoke situation, in a large enough crowd (e.g. society). If one person for some reason is worried enough to actually break the silence and flee the building, then they have sort of bet their reputation on there being a fire, and while others are judging that person, they are also updating a) that there is more likely to be a fire, and b) that the group is making similar updates, and so it is less embarrassing to leave. So one person’s leaving makes it easier for each of the remaining people to leave12. Which might push someone else over the edge into leaving, which makes it even easier to leave for the next person. If you have a whole slew of people leaving, but not everyone, and the fire takes a really long time to resolve, then (this isn’t game theory but my own psychological speculations) I can imagine the people waiting in the parking lot and the people sticking it out inside developing senses of resentment and judgment toward the people in the other situation, and camaraderie toward those who went their way.
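    The cascade speculated about above behaves a lot like a classic threshold model of collective behavior (à la Granovetter). As a minimal sketch—entirely my own toy model with made-up numbers, not anything from the post—suppose each person leaves once at least some personal number of others have already left, that number being where their fear outweighs their embarrassment:

    ```python
    def num_leavers(thresholds):
        """Each person i leaves once at least thresholds[i] others have
        already left. Iterate to a fixed point: stop when the count of
        people willing to leave no longer grows."""
        left = 0
        while True:
            willing = sum(1 for t in thresholds if t <= left)
            if willing == left:
                return left
            left = willing

    # With one unpressured person (threshold 0), the whole room empties:
    print(num_leavers([0, 1, 2, 3, 4]))  # -> 5
    # Remove that one brave first-mover and nobody leaves at all:
    print(num_leavers([1, 1, 2, 3, 4]))  # -> 0
    ```

    The point of the sketch is just that whether everyone ends up in the parking lot or nobody does can hinge on a single person’s threshold, which fits the intuition that one person betting their reputation on there being a fire can make it progressively easier for everyone else.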

    You can actually see a bit of something like this in the video of the Asch conformity experiments: when another actor says the true answer, the subject says it too and then is comradely with the actor.

    My guess is that in many cases even one good comrade is enough to make a big difference. Like, if you are in a room with smoke, and one other person is willing to escalate concern with you, it’s not hard to imagine the two of you reporting it together, while having mild disdain for the sheeple who would burn.

    So I wonder if groupishness is actually part of how escalation normally works. Like, you start out with a brave first person, and then it is easier to join them, and a second person comes, and you form a teensy group which grows (as discussed above) but also somewhere in there becomes groupish in the sense of its members being buoyed enough by their comrades’ support and dismissive enough of the other people that the concerned group are getting net positive social feedback for their concern. And then the concerned group grows more easily by there being two groups you can be in as a conformist. And by both groups getting associated with other known groups and stereotypes, so that being in the fearful group signals different things about a person than fearfulness. On this model, if there is a fire, this gets responded to by people gradually changing into the ‘building is on fire’ group, or newcomers joining it, and eventually that group becoming the only well respected one, hopefully in time to go outside.

    In sum, we see a lot of apparently uncalled for and widely advertised fearfulness in society, which is at odds with a basic story of fear being shameful. My guess is that this is a common later part of the dynamic which might begin as in the experiments, with everyone having trouble being the first responder.

    Note that this would mean the basic fire alarm situation is a less good model of the real-world problems of the kind we might blog about: by the time you are calling for people to act in spite of their reluctance to look afraid, you might already be the leader of a going-outside movement, which they could join in relatively conformist ease, perhaps more at the cost of seeming like a member of one kind of group over another than of straightforwardly looking fearful.

    Is the fear shame hypothesis correct?

    I think the support for this thesis from the present research is actually not clear. Darley and Latané’s experiment tells us that people in groups react less to signs of possible fire than individuals do. But is the difference about hiding fear? Does it reveal a bias? Is it the individuals who are biased, and not the group?

    Is there a bias at all?

    That groups and individuals behave differently doesn’t mean that one of the two is wrong. Perhaps if you have three sources of evidence on whether smoke is alarming, and they are overall pointing at ‘doubtful’, then you shouldn’t do anything, whereas if you only have one and it is also pointing at ‘doubtful’, you should often gather more evidence.
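    To put toy numbers on this (my own illustrative figures, nothing from the experiments): suppose the prior chance of fire is 10%, and each calm ‘seems fine’ signal is half as likely to occur if there really is a fire. A quick Bayesian sketch:

    ```python
    def posterior_fire(prior, likelihood_ratio, n_signals):
        """Posterior P(fire) after n independent 'seems fine' signals,
        each with likelihood ratio P(signal | fire) / P(signal | no fire)."""
        odds = (prior / (1 - prior)) * likelihood_ratio ** n_signals
        return odds / (1 + odds)

    # One doubtful signal leaves enough residual risk to justify checking:
    print(round(posterior_fire(0.10, 0.5, 1), 3))  # -> 0.053
    # Three independent doubtful signals push the risk considerably lower:
    print(round(posterior_fire(0.10, 0.5, 3), 3))  # -> 0.014
    ```

    The catch, as the conformity story suggests, is that three people sitting in the same smoky room are nowhere near independent sources, so treating their collective calm as three separate signals would overcount the evidence.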

    It could also be that groups are generally more correct due to having more data, and whether they are more or less concerned than individuals actually varies based on the riskiness of the situation. Since these kinds of experiments are never actually risky, our ability to infer that a group is under-reacting relies on the participants being successfully misled about the degree of risk. But maybe they are only a bit misled, and things would look very different if we watched groups and individuals in real situations of danger. My guess is that society acts much more on AI risk and climate change than the average individual would if they were somehow isolated from others with respect to those topics.

    Some evidence against a bias is that groups don’t seem to be consistently less concerned about risk than individuals, in the wild. For instance, ‘panics’ are a thing I often hear that it would be bad to start.

    Also, a poll of whoever sees such things on my Twitter suggests that while rarer, a decent fraction of people feel social pressure toward being cautious more often than the reverse.

    Are groups not scared enough or are individuals too scared?

    Even if there is a systematic bias between groups and individuals, it isn’t obvious that groups are the ones erring. They appear to be in these fire alarm cases, but a) given that they are in fact correct, it seems like they should get some benefit of the doubt, and b) these are a pretty narrow set of cases.

    An alternate theory here would be that solitary people are often poorly equipped to deal rationally with risks, and many tend to freak out and check lots of things they shouldn’t check, but this is kept in check in a group setting by some combination of reassurance from other people, shame about freaking out over nothing, and conformity. I don’t really know why this would be the situation, but I think it has some empirical plausibility, and it wouldn’t be that surprising to me if humans were better honed for dealing with risks in groups than as individuals. (D&L suggest a hypothesis like this, but think it isn’t this, because the group situation seemed to alter participants’ likelihood of interpreting the smoke as fire, rather than their reported ability to withstand the danger. I’m less sure that an inclination to be fearless wouldn’t cause people to interpret smoke differently.)

    One might think a reason against this hypothesis is that this shame phenomenon seems to be a bias in the system, so probably the set who are moved by it (people in groups) are the ones who are biased. But you might argue that shame is maybe a pretty functional response to doing something wrong, and so perhaps you should assume that the people feeling shame are the ones who would otherwise be doing something wrong.

    Is it because they want to hide their fear?

    In an earlier study, D&L observed participants react less to an emergency that other participants could see, even when the others couldn’t see how they responded to it.

    D&L infer that there are probably multiple different things going on. Which might be true, but it does pain me to need two different theories to explain two very similar datapoints.

    Another interesting fact about these experiments is that the participants don’t introspectively report interpreting the smoke as fire, wanting to escape, and being held back by concern about looking bad. If you ask them, apparently they say that they just didn’t think it was fire:

    “Subjects who had not reported the smoke also were unsure about exactly what it was, but they uniformly said that they had rejected the idea that it was a fire. Instead, they hit upon an astonishing variety of alternative explanations, all sharing the common characteristic of interpreting the smoke as a nondangerous event. Many thought the smoke was either steam or air-conditioning vapors, several thought it was smog, purposely introduced to simulate an urban environment, and two (from different groups) actually suggested that the smoke was a “truth gas” filtered into the room to induce them to answer the questionnaire accurately. (Surprisingly, they were not disturbed by this conviction.) Predictably, some decided that “it must be some sort of experiment” and stoicly endured the discomfort of the room rather than overreact.

    Despite the obvious and powerful report inhibiting effect of other bystanders, subjects almost invariably claimed that they had paid little or no attention to the reactions of the other people in the room. Although the presence of other people actually had a strong and pervasive effect on the subjects’ reactions, they were either unaware of this or unwilling to admit it.”

    I don’t take this as strong evidence against the theory, because this seems like what it might look like for a human to see ambiguous evidence and at some level want to avoid seeming scared. Plus if you look at the video of this experiment being rerun, the people in groups not acting do not look uniformly relaxed.

    For me a big plus in the theory of fear shame is that it introspectively seems like a thing. I’m unusually disposed toward caution in many circumstances, and also toward an analytic approach that doesn’t always match other people’s intuitive assessments of risk, and isn’t very moved by observing this. And I do feel the shame of it. This year has allowed particular observation of this: it is just embarrassing, for me at least, to wear a heavy-duty P100 respirator in a context where other people are not. Even if the non-social costs of wearing a better mask are basically zero in a situation (e.g. I don’t need to talk, I’m kind of enjoying not having my face visible), it’s like there is an invisible demand rising from the world, ‘why are you wearing such a serious mask? Is it that you think this is dangerous?’ (‘Only a little bit dangerous, please, I’m just like you, it’s just that on net I don’t really mind wearing the bigger mask, and it is somewhat safer, so why not?’13)

    But on further consideration, I think introspection doesn’t support this theory, because a much broader set of things than fear seem to produce a similar dynamic to seeing smoke in a group, or in other cases where I feel unable to take the precautions I would want because of being observed.

    Here are some actions that feel relatedly difficult to me—probably either because the outward behavior seems similar or because I expect a similar internal experience—but where the threat of seeming too fearful in particular isn’t the issue:

    1. Wearing a weird outfit in public, like a cape (this feels fairly similar to wearing a heavy-duty mask in public, e.g. I’m inclined not to, though there are no obvious consequences, and if I do, my brain becomes obsessed with justifying itself)
    2. Wearing no mask in a context where others have masks (my friend says this feels similarly hard to wearing an overly large mask to him)
    3. Getting up and leaving a room of people doing a questionnaire if there appeared to be hundred dollar bills falling from the sky outside the window (I expect this to feel somewhat similar to seeing smoke)
    4. Answering a question differently from everyone else in front of the room, as in the classic Asch conformity experiments (I expect this to feel a bit like seeing smoke, and the behavior looks fairly similar: a person is offered a choice in front of a group who all seem to be taking the apparently worse option)
    5. Being shown a good-seeming offer with a group of people, e.g. an ad offering a large discount on a cool object if you call a number now (I would find it hard to step out and phone the number, unless I did it surreptitiously)
    6. Being in a large group heading to a Japanese restaurant, and realizing that given everyone’s preferences, an Italian restaurant would be better (I think this would feel a bit like seeing smoke in the room, except that the smoke wasn’t even going to kill you)
    7. Sitting alone at a party, in a way that suggests readiness to talk, e.g. not looking at phone or performing solitary thoughtfulness (this makes me want to justify myself, like when wearing a big mask, and is very hard to do, maybe like standing up and leaving upon seeing smoke)
    8. Leaving a large room where it would be correct to say goodbye to people, but there are so many of them, and they are organized such that if you say goodbye to any particular person, many others will be watching, and to say goodbye to everyone at once you will have to shout and also interrupt people, and also may not succeed in actually getting everyone’s attention, or may get it too loudly and seem weird (this has a ‘there’s an obviously correct move here, and I somehow can’t do it because of the people’ feeling, which I imagine is similar to the smoke)
    9. If a class was organizing into groups in a particular way, and you could see a clearly better way of doing it, telling the class this
    10. Shouting a response when someone calls out a question to a crowd
    11. Walking forward and investigating whether a person is breathing, when they have collapsed but there is a crowd around them and you don’t know if anyone has done anything
    12. Getting up to help someone who has fallen into the subway gap when lots of people can see the situation
    13. Stepping in to stop a public domestic violence situation
    14. Getting up to tell a teacher when a group of other students are sticking needles into people’s legs (this happened to me in high school, and I remember it because I was so paralyzed for probably tens of minutes while also being so horrified that I was paralyzed)
    15. Asking strangers to use their credit card to make an important phone call on the weird public phones on a ship (this also happened to me, and I was also mysteriously crippled and horrified)
    16. Criticizing someone’s bad behavior when others will see (my friend says he would feel more game to do this alone, e.g. if he saw someone catcalling a woman rudely)
    17. Correcting a professor if they have an equation wrong on the board, when it’s going to need to be corrected for the lesson to proceed sensically, and many people can see the issue
    18. Doing anything in a very large room with about six people scattered around quietly, such that your actions are visible and salient to everyone and any noise or sudden motion you make will get attention
    19. Helping to clean up a kitchen with a group of acquaintances, e.g. at a retreat, where you are missing information for most of the tasks (e.g. where do chopping boards live, do things need to be rinsed off for this dishwasher, what is this round brown object, did it all start out this dirty?)
    20. Doing mildly unusual queueing behavior for the good of all. For instance, standing in a long airport queue, often everyone would be better off if a gap were allowed to build at the front of the queue and then everyone walked forward a longer distance at once, instead of everyone edging forward a foot at a time. This is because often people set down their objects and read on their phones or something while waiting, so it is nicer to pick everything up and walk forward five meters every few minutes than it is to pick everything up and walk forward half a meter every twenty seconds. Anyone in the queue can start this, where they are standing, by just not walking forward when the person in front of them does. This is extremely hard to do, in my experience.
    21. Asking or answering questions in a big classroom. I think professors have trouble getting people to do this, even when students have questions and answers.
    22. Not putting money in a hat after those around you have
    23. Interacting with a child with many adults vaguely watching
    24. Taking action on the temperature being very high as a student in a classroom
    25. Cheering for something you liked when others aren’t
    26. Getting up and dancing when nobody else is
    27. Walking across the room in a weird way, in most situations
    28. Getting up and leaving if you are watching something that you really aren’t liking with a group of friends

    Salient alternate explanations:

    1. Signaling everything: people are just often encumbered any time people are looking at them, and might infer anything bad about them from their behavior. It’s true that they don’t want to seem too scared, but they also don’t want to seem too naively optimistic (e.g. believing that money is falling from above, or that they are being offered a good deal) or to not know about fashion (e.g. because wearing a cape), or to be wrong about how long different lines are (e.g. in the Asch experiments).
    2. Signaling weirdness: as in 1, but an especially bad way to look is ‘weird’, and it comes up whenever you do anything different from most other people, so generally cripples all unusual behavior.
    3. Conformity is good: people just really like doing what other people are doing.
    4. Non-conformity is costly: there are social consequences for nonconformity (2 is an example of this, but might not be the only one).
    5. Non-conformity is a bid for being followed: if you are with others, it is good form to collaboratively decide what to do14. Thus if you make a move to do something other than what the group is doing, it is implicitly a bid for others to follow, unless you somehow disclaim it as not that. According to intuitive social rules, others should follow iff you have sufficient status, so it is also a bid to be considered to have status. This bid is immediately resolved in a common knowledge way by the group’s decision about whether to follow you. If you just want to leave the room and not make a bid to be considered high status at the same time—e.g. because that would be wildly socially inappropriate given your actual status—then you can feel paralyzed by the lack of good options.

      This model fits my intuitions about why it is hard to leave. If I imagine seeing the smoke, and wanting to leave, what seems hard? Well, am I just going to stand up and quietly walk out of the room? That feels weird, if the group seems ‘together’ - like, shouldn’t I say something to them? Ok, but what? ‘I think we should go outside’? ‘I’m going outside’? These are starting to sound like bids for the group agreeing with me. Plus if I say something like this quietly, it still feels weird, because I didn’t address the group. And if I address the group, it feels a lot like some kind of status-relevant bid. And when I anticipate doing any of these, and then nobody following me, that feels like the painful thing. (I guess at least I’m soon outside and away from them, and I can always move to a new city.)

      On this theory, if you could find a way to avoid your actions seeming like a bid for others to leave, things would be fine. For instance, if you said, ‘I’m just going to go outside because I’m an unreasonably cautious person’, on this theory it would improve the situation, whereas on the fear shame hypothesis, it would make it worse. My own intuition is that it improves the situation.

    6. Non-conformity is conflict: not doing what others are doing is like claiming that they are wrong, which is like asking for a fight, which is a socially scary move.
    7. Scene-aversion: people don’t like ‘making a scene’ or ‘making a fuss’. They don’t want to claim that there’s a fire, or phone 911, or say someone is bad, or attract attention, or make someone nearby angry. I’m not sure what a scene is. Perhaps a person has made one if they are considered responsible for something that is ‘a big deal’. Or if someone else would be right in saying, ‘hey everyone, Alice is making a bid for this thing to be a big deal’.

    These explanations are neither perfect, nor fully explanatory, nor obviously distinct from one another, but I won’t dive deeper right now. Instead, I’ll say a person is ‘groupstruck’15 if they are in any way encumbered by the observation of others.

    My own sense is that a mixture of these flavors of groupstruckness happen in different circumstances, and that one could get a better sense of which and when if one put more thought into it than I’m about to.

    A big question that all this bears on is whether there is a systematic bias away from concern about risks in public, e.g. in public discourse. If there is—if people are constantly trying to look less afraid than they are—then it seems like an important issue. If not, then we should focus on other things, for instance perhaps a lurking systematic bias toward inaction.

    My own guess is that the larger forces we see here are not about fear in particular, and after the first person ‘sounds the alarm’ as it were, and some people are making their way outside, the forces for and against the side of higher caution are more messy and not well thought of as a bias against caution (e.g. worrying about corporate profits or insufficient open source software or great power war mostly makes you seem like one kind of person or another, rather than especially fearful). My guess is that these dynamics are better thought of as opposing a wide range of attention-attracting nonconformism. That said, my guess is that overall there are somewhat stronger pressures against fear than in favor of it, and that in many particular instances, there is a clear bias against caution, so it isn’t crazy to think of ‘fear shame’ as a thing, if a less ubiquitous thing, and maybe not a very natural category.

    III. Getting un-groupstruck

    How can fear shame and being groupstruck be overcome? How are things like this overcome in practice, if they ever are? How should we overcome them?

    Some ideas that might work if some of the above is true, many inspired by aspects of fire alarms:

    1. A person or object to go first, and receive the social consequences of nonconformity
      For instance, a person whose concern is not discouraged by social censure, or a fire alarm. There is no particular need for this to be a one-off event. If Alice is just continually a bit more worried than others about soil loss, this seems like it makes it easier for others to be more concerned than they would have been. Though my guess is that often the difference between zero and one people acting on a concern is especially helpful. In the case of AI risk, this might just mean worrying in public more about AI risk.
    2. Demonstrate your non-judgmentalness
      Others are probably afraid of you judging them often. To the extent that you aren’t also oppressed by fear of judgment from someone else, you can probably free others some by appearing less judgmental.
    3. Other incentives to do the thing, producing plausible deniability
      Cool parties to indicate your concern, prestigious associations about it…
    4. Authorities enforcing caution
      Where does the shame-absorbing magic of a real fire alarm come from, when it has it? From an authority such as building management, or your school, or the fire brigade, who you would have to fight to disobey.
    5. ‘Fire wardens’
      A combination of 1 and 2 and maybe 8. The experiment above found that people responded very fast to a fire warden telling them to move. Here, a policy made from a distance sends in a person whose job it is to authoritatively tell you to leave. This looks pretty effective for fires, anecdotally. For AI safety, one equivalent might be a person in a company whose job it is to watch over some analysis of the safety of different projects, with the authority to tell people that projects have to be set down sometimes. In general, set up genuine authority on the questions you want to have guidance for when the time comes (rather than making calls at the time), and allow them to set policy in coolness ahead of time, and grant them the ability to come in with a megaphone and a yellow vest when you want to be warned.
    6. Clash with another conformist behavior
      For instance, if everyone is sitting by in some smoke, but also everyone does what they are told by a police person, then calling in the police might dislodge them.
    7. Politicization
      Once there are multiple groups who feel good about themselves, it is probably easier for people to join whichever might have initially felt too small and non-conformist. On the downside, I imagine it might be harder for everyone to ultimately join, and also this sounds messy and I’ve only thought about it for a few minutes.
    8. Policy from outside the paralysis
      If you leave your dorm because there is a fire alarm, the dean who made the policy that requires you to does not have to feel awkwardly afraid each time the alarm goes off and you have to leave the building. (As discussed above.) In general, arranging to make cautious policies from places where caution won’t be embarrassing seems helpful.
    9. A slightly better empirical case that the time for concern is now
      These forces aren’t all-powerful—if people are worried enough, they will often act in spite of embarrassment, or cease being embarrassed. Plus, if the evidence is good enough that someone acts, that can help others act (see 1).
    10. A shift in the general overton window
      Thinking climate change will probably cause intense disaster and may destroy the world and requires urgent action is now the norm, and thinking that it might be bad but will probably not be that bad and shouldn’t be the highest priority risks being an asshole.
    11. A new framing or emphasis of attention
      E.g. It’s not about being scared of lifelong disability, it’s about respecting the frontline workers and the work they are putting in day in and day out dealing with people who insist on partying in this disaster.
    12. Personal trigger for action
      It can probably be valuable to state ahead of time a trigger that you think would cause you to do a thing, so that you at least notice if your standards are slipping because you don’t want to do the thing. I don’t see why this should be particularly related to any threshold at which society recognizes interest in an issue to be non-embarrassing.
    13. Smaller rooms
      If your auditorium of people hearing a fire alarm were instead a hundred rooms with five people in each, some of the fives of people would probably manage to leave, which if visible might encourage others to go. It’s easier to get common knowledge that a thing isn’t embarrassing with five people than with five hundred people. My guess is also that people would leave the room in the smoke faster if they were in pairs who were messaging with each other as part of the fake task. Because bringing up the smoke to one person isn’t so hard, and if a pair finds that they are both concerned, it is easier for two people to leave together. Thus for instance organizing small group discussions of an issue might be better for getting people’s genuine levels of concern on the table.
    14. Escalating scale of company
      Related to the above, my guess is that if a person is in a larger group implicitly, e.g. a community, and is concerned, they will try to get the mild attention of a single person and discuss it privately, then escalate from there. E.g. first you jokingly mention the worry to your boyfriend, then if he doesn’t laugh that much, you admit that maybe it could conceivably be a real thing, then you both speculate about it a bit and learn a bit more, then you say that you are actually a bit worried, and then he says that too, then you start to feel out your friends, etc. My guess is that this helps a lot with mitigating these paralyses. Thus making it easier seems helpful. For instance, if you are running an event where you think people are going to be crippled from dissenting from a certain view in front of the room, you could have them first discuss the question with a single person, then with a small group.16
    15. Citable evidence
      If objective, citable evidence that you could justify your caution with is much more helpful than evidence for private consumption, then you can help mitigate fear shame by providing that sort of evidence. For instance, survey data showing that the median ML researcher thinks AI poses an extreme risk.
    16. Make a fire alarm
      As noted above, fire alarms are not natural phenomena—they are built. If you thought fire alarms were a thing, and their absence was important, then trying to build one seems like perhaps a good move. (If you were considering devoting your life to trying to engineer a friendly AI revolution on a short timeline for want of a fire alarm, perhaps more so.) Given the ambiguities in what exactly a fire alarm is doing, this might look different ways. But maybe something like a measure of risk (which needn’t be accurate at all) which triggers the broadcast of an alert and call for a specific act of caution from specific parties, which was generally thought to be authoritative or otherwise desirable to listen to ahead of time.

    Conclusions: forget fire alarms, down with fear shame and groupstrickenness

    In conclusion, fire alarms don’t seem that important in the battle against fear shame, and fear shame also doesn’t seem like a great description of what’s going on. People seem frequently encumbered into apparent irrationality in the company of others, which seems important, but there seem to be lots of things to do about it. I think we should plausibly do some of them.

    Action conclusions

    I’m saying:

    DON’T: say ‘there will never be a fire alarm, so this is basically the situation we will always be in’ and flee the building/work on AI safety out of an inability to distinguish this from the dire situation.

    DO: consider whether your position is unduly influenced by social incentives that don’t track the real danger of the situation—for instance, whether you would find it embarrassing among your current associates to express deep concern for AI risk—and try to adjust your level of concern accordingly.

    DO: make it easier for everyone to follow their assessment of the evidence without oppressive social influences at a personal level, by:

    1. practicing voicing your somewhat embarrassing concerns, to make it easier for others to follow (and easier for you to do it again in future)
    2. reacting to others’ concerns that don’t sound right to you with kindness and curiosity instead of laughter. Be especially nice about concerns about risks in particular, to counterbalance the special potential for shame there. [or about people raising points that you think could possibly be embarrassing for them to raise]

    DO: consider thinking about designing policies and institutions that might mitigate the warping of fear shame and social encumberment (some ideas above).

    DO: make ‘fire alarms’, if you think they are important. Find measurable benchmarks with relatively non-subjective-judgment-based import. Find them ahead of time, before social incentives hit. Get authoritative buy-in re their import and the reasonable precautions to take if they are met. Measure the benchmarks carefully and publicize our distance from them.

    In sum, I think you should take seriously the likelihood that you and everyone else are biased in the direction of incaution or inaction—as it seems like there is good evidence that you might be—but that this is not especially well thought of in terms of ‘fire alarms’.


    1. A single real participant accompanied by two actors instructed to remain calmly in the room will also sit by, but this seems unsurprising and unconcerning, if we assume people in groups normally share information and partly defer to one another. Probably a lone human surrounded by actors claiming to have thought about AGI and come to the view that it is totally fine would also not prioritize AGI, which seems fine. 

    2. The video doesn’t show the more interesting case with a group of innocent participants. I checked with Dominic Abrams, the professor featured in it, and he said it was a genuine experiment, i.e. with real participants, rather than just a re-enactment, but that they didn’t write it up. He recalls that they also did the condition with three innocent participants, and that people were always slower if not alone. Even as essentially anecdote, I find the video pretty interesting. 

    3. Though this explanation has more trouble with the observation of an earlier experiment that people were inactive when they knew others could see a problem, even though they were invisible to the rest of the group, and so arguably protected from judgment, as noted by Darley and Latané 1968. (“We have found (Darley & Latané, 1968) that the mere perception that other people are also witnessing the event will markedly decrease the likelihood that an individual will intervene in an emergency. Individuals heard a person undergoing a severe epileptic-like fit in another room. In one experimental condition, the subject thought that he was the only person who heard the emergency; in another condition, he thought four other persons were also aware of the seizure. Subjects alone with the victim were much more likely to intervene on his behalf, and, on the average, reacted in less than one-third the time required by subjects who thought there were other bystanders present.”) 

    4. Or at least a smoke alarm—technically I think a fire alarm is the thing that sends ringing throughout a large building in response to some trigger, whereas the smaller thing with a single detector and attached sound maker is a smoke detector. 

    5. “Evacuation expert Prof Ed Galea said “People who had purchased and paid for their meal… Even though they could see the smoke, they could smell the smoke, they could hear the alarms going off, they felt they had sufficient time to complete their meals before evacuating.”” https://www.fireco.uk/3-reasons-people-ignore-fire-alarms/ 

    6. “Another witness of the fire called Kate said: “I remember reading in the newspaper at the time that one man had to be dragged from the restaurant because he wanted to finish his meal despite the flames across the floor.”” https://www.mancunianmatters.co.uk/life/09052013-ill-never-forget-it-manchester-remembers-tragic-woolworths-fire-of-1979-that-claimed-10-lives-in-blaze/ 

    7. It could be that randomly at that next level, people are scared enough to leave. But then it feels like our understanding of fire alarms here is at the level ‘randomly people are more worried about them’. 

    8. That matches my experience but it sounds like maybe the ones that automatically call the fire department really do indicate a fire at least 15% of the time (I think I saw much higher numbers too somewhere). 

    9. Note to hypochondriacs: this doesn’t work as well as you might think—people will doubt your pulse oximeter, and your judgment about when to say pulse oximeters are probably just broken. If you have two pulse oximeters to provide ironclad evidence, this can make them even more suspicious of the whole situation. Which seems somehow fair, though I know of no good explanation for how it is fair. 

    10. According to a large survey of machine learning experts that I ran with others, the median chance that high level machine intelligence has ‘extremely bad (e.g. human extinction)’ level outcomes is about 5%. 

    11. In the particular case of AI I am more worried than average, and the coordinated time to act will be later, and I suspect too late, so maybe I would prefer to ignore it. But this is a specific issue where I already have a view, and I don’t see that considering the larger structure of fire alarms lends support to my desire in general. 

    12. On the alternate model that people are basically conformist, this seems even more true. It’s probably worse as a conformist to be the first to leave than to conform with a much smaller group when there is a larger available group. 

    13. And wearing a p100 in public is actually non-conformist-public-concern on easy mode, because it also covers up your face and averts the need for you to make expressively apologetic or excusatory faces at random strangers. ‘Yes I know, I know, but look I’m not oblivious, I do know—probably I am doing something you don’t understand…I’m looking at something you can’t see, and I’m looking at my watch because I’m in a hurry for something you don’t know about but that definitely means I need this excessive mask; I’m probably escorting the queen somewhere or something like that, and you know, she’s very old and it’s reasonable for her to be cautious, or I mean obviously it’s not that reasonable, like I wouldn’t do it, but you know, she’s the queen, and she’s from another time, so we have patience with her foibles’. 

    14. As an interesting datapoint about this, people seem to talk as if they are compelled to act as a group, when there is no obvious reason that they have to, and they don’t seem to agree with the group consensus. e.g. ‘The fire alarm in the block of apartments we r staying in is going off but we are ignoring it?’ 

    15. Thanks to a friend for this word. 

    16. This sort of thing maybe requires that you can maintain a view at least somewhat different from that which you are most comfortable expressing.

      Interestingly in the study, people actually took much longer to notice the smoke when they were in company (based on the authors’ claim that the point of noticing ‘was a relatively easy observation to make, for the subjects invariably showed a distinct, if slight, startle reaction.’). It took a median of under five seconds for people alone to notice the smoke, and 20 seconds for the first person in a group (combining both group conditions). The authors attribute this to people looking around all over the place when alone, but keeping their eyes in line in a group, for fear of seeming to stare or something. Which sounds plausible, but I do wonder a bit if there is something else going on, for instance where at a basic perceptual level if you are in a group with an implicit consensus that things are fine, you don’t endanger that with exploration of peripheral signs of danger. (It could also be that it’s easier to see something moving when everything is still in a room.) 

  • Punishing the good

    Should you punish people for wronging others, or for making the wrong call about wronging others?

    For example:

    1. A newspaper sends me annoying emails all the time, but suppose that empirically if they didn’t behave like this, they would get markedly fewer subscribers, and may not survive. And suppose their survival is in fact worth a little annoyance for a lot of people, we all agree. Such that if I was in their position, I agree that I would send out the annoying emails. Should I resent them and unsubscribe from their paper for their antisocial behavior, or praise them and be friendly because overall I think they made the right call?
    2. Suppose Bob eats beef, which he thinks makes him feel somewhat better and so be better able to carry out his job as a diplomat negotiating issues in which tens of thousands of lives are at stake. He also thinks it is pretty bad for the cows, but worth it on net. Suppose he’s right about all of this. Five hundred years later, carnivory is illegal and hated, and historians report that Bob, while in other regards a hero, did eat beef. Should the people of 2521 think of Bob as an ambiguous figure, worthy of both pride and contempt? or should they treat him as purely a hero, who made the best choice in his circumstances?

    I have one intuition that says, ‘how can you punish someone for doing the very best thing they could have done? What did you want them to do? And are you going to not punish the alternative person, who made a worse choice for the world, but didn’t harm someone in the process? Are you just going to punish everyone different amounts?’

    But an argument for the other side—for punishing people for doing the right thing—is that it is needed to get the incentives straight. If Alice does $100 of harm to Bruce to provide $1000 of help to Carrie, then let’s suppose that that’s good (ignoring the potential violation of property rights, which seems like it shouldn’t be ignored ultimately). But if we let such things pass, then Alice might also do this when she guesses that it is only worth $60 to Carrie, if she cares about Carrie more than Bruce. Whereas if we always punish Alice just as much as she harmed Bruce, then she will take the action exactly when she would think it worth it if it was her own welfare at stake, rather than Bruce’s. (This is just the general argument for internalizing externalities - having people pay for the costs they impose on others.)
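    The incentive logic above is simple enough to check mechanically. A minimal sketch, using the numbers from the example (the function and its parameters are my own illustrative framing, not anything from the text):

```python
# Toy model of the internalization argument: Alice fully values
# Carrie's benefit, ignores Bruce's harm, and acts when her own
# payoff is positive. Internalizing the externality means paying
# a fine equal to the harm she imposes.

def alice_acts(benefit_to_carrie, harm_to_bruce, internalize):
    fine = harm_to_bruce if internalize else 0
    return benefit_to_carrie - fine > 0

# Without the fine, Alice acts even in the socially bad case:
# $60 of benefit bought with $100 of harm.
assert alice_acts(60, 100, internalize=False)

# With the fine equal to the harm, she acts exactly when the
# benefit exceeds the harm, as if Bruce's welfare were her own.
assert not alice_acts(60, 100, internalize=True)
assert alice_acts(1000, 100, internalize=True)
```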

    This resolution is weirder to the extent that the punishment is in the form of social disgrace and the like. It’s one thing to charge Bob money for his harms to cows, and another to go around saying ‘Bob made the best altruistic decisions he could, and I would do the same in his place. Also I do think he’s contemptible.’

    It also leaves Bob in a weird position, in which he feels fine about his decision to eat beef, but also considers himself a bit of a reprehensible baddie. Should this bother him? Should he try to reform?

    I’m still inclined toward punishing such people, or alternatively to think that the issue should be treated with more nuance than I have managed here, e.g. distinguishing punishment in the form of others’ opinions of you from more straightforward punishments.

  • Lafayette: empty traffic signals

    Seeking to cross a road on the walk into downtown Lafayette, instead of the normal pedestrian crossing situation, we met a button with a sign, ‘Push button to turn on warning lights’. I wondered, if I pressed it, would it then be my turn to cross? Or would there just be some warning lights? What was the difference? Do traffic buttons normally do something other than change the lights? I clearly believe they do. They make it my turn. But they don’t send a wire into the ‘whose turn is it’ variable deep in the ought-sphere, so what exactly do they do?

    I suspected that this button didn’t change whose turn it was, and it felt empty and devoid of some special potency of being a traffic button.

    I liked to imagine that it was just a normal traffic button, but taking a more nihilistic view of its role. In which case, its nihilistic view seemed to have practical consequences! It wasn’t being as good a traffic button while saying that it didn’t change whose turn it is. It genuinely fails to coordinate the traffic so well, because here am I unable to garner the ‘right’ to cross with confidence, and there are the drivers unsure what I’m doing. But shouldn’t a traffic button be able to do its job regardless of its philosophical commitments, or without pretending to have philosophical commitments it doesn’t have?

    One might say that the thing going on is that it being ‘my turn’ is a fact about everyone’s expectations. For instance, if the drivers will expect me to cross, then it is ‘my turn’. (I’m tempted to say ‘if the drivers think it is my turn, then it is my turn’, but what are the drivers then thinking?) This doesn’t seem quite right, in that the drivers themselves are asking whether this light means that it is the pedestrian’s turn, and all of us seem to be asking something about the underlying truth, not about each other. Also, if I run erratically into the road, the drivers and I may both come to expect that I am going to cross, but it still isn’t my turn.

    I fantasized that I had woken up in a new world which was just like the old world, but where everything was like the traffic light. I would phone the doctor’s office later to ask if it was ok to cancel my appointment this late, they would just say, ‘I’ll change what it says in this schedule’.

    ‘But is it ok?’

    ‘I will not record your cancellation.’

    ‘Should I pay you?’

    ‘I am not charging you.’

    ‘But is that like a favor, or is it the policy? Have I wronged your medical practice? Do I owe you really? Tell me if I was bad!’

    ‘I erased your name from this box on my piece of paper.’

    My tentative take is that turns are real, and we created them, and traffic buttons have genuine power over them, and if a traffic button doesn’t recognize their existence it is potentially at a real disadvantage, perhaps in a similar way to how a chair maker who doesn’t recognize chairs as a thing is at a disadvantage.

    (To be clear, I expect philosophers have much better thought out views on this, and welcome people telling me what they are–this is me thinking aloud, not philosophical advice.)

  • Lafayette: traffic vessels

    This week I’m in Lafayette, a town merely twenty-three minutes further from my San Franciscan office than my usual San Franciscan home, thanks to light rail. There are deer in the street and woods on the walk from the train to town.

    On this occasion at least, Lafayette doesn’t feel properly like a small town to me. I think it’s the main road. A lot of the town is spread along this single road, but the road itself doesn’t feel like its main deal is being Lafayette’s main road. It feels more focused on being an energetic transitway between somewhere and somewhere else, neither in Lafayette. Which probably isn’t even that true, since there is a perfectly giant highway also spanning Lafayette just north of it. Maybe the problem is that it’s too wide, so that the town feels like it’s tenuously accumulated on the sides of a road like plaque, rather than the road being an organic vessel of the town. Or, you know, I’m imagining things.

    I seem to imagine things a lot regarding some kind of road Feng Shui (note: I know nothing about actual Feng Shui). My mind natively reads roads as conduits of some kind of ‘energy’, and tries to apply appropriate intuitive physics. For instance, if you have big flows in and out of a place, relative to the place itself, it won’t feel like its own place. It will feel like a section of a larger place. For instance, the typical random intersection in a big American city can’t be a place with its own local vibe, where you might feel like staying, because it can’t be very separate from the surrounding city that its energy-traffic is constantly being exchanged with. It’s just going to feel like a section of various energetic routes elsewhere.

    This intuitive physics is sort of like the physics of streams with leaves and debris in them. For a place to be placelike, and appealing to stay in, it needs to have enough nooks or ponds or complications for the fast flowing streams in and out to eddy around in and slow down and let the debris swirl to a halt. And this main street is a big stream running through a small place.

    This is all contradicted by the frequency with which people like to stand in narrow thoroughfares at parties even in the face of literal physical streams of partygoers pressing against them. (My intuition on this case is that the pressure of the partygoer liquid is so high that it somehow makes sense to be stuck in the doorway, but I don’t explicitly see how this model even makes sense.)

    I don’t know of any pro evidence for this model, but my brain just keeps on having it.

  • Typology of blog posts that don't always add anything clear and insightful

    I used to think a good blog post should basically be a description of a novel insight.

    To break it down more, on this view:

    1. A blog post should have a propositional claim (e.g. ‘the biggest externalities are from noise pollution’, or ‘noise pollution is a concept’ vs. expression of someone’s feelings produced by externalities, or a series of reflections on externalities). A ‘propositional claim’ here can be described straightforwardly in words, and usually conveys information (i.e. it says the world is one way instead of another way).
    2. It should be a general claim—i.e. applicable to many times and places and counterfactuals (e.g. ‘here is how tragedies of the commons work: …’ vs. ‘here is a thing that happened to me yesterday: …’)
    3. It should be a novel claim (e.g. a new reason to doubt one of the explanations put forward for the demographic transition)
    4. The claim should be described, which is to imply that the content should be:
      1. Verbal (or otherwise symbolic, e.g. a table of numbers surrounded by text would count)
      2. Explicit (saying the things it means, rather than alluding to them)
      3. Mostly concerned with conveying the relevant propositions (vs. for instance mostly concerned with affecting the reader’s mood or beliefs directly)

    I probably would have agreed that the odd vignette was also a good blog post, but ideally it should be contained in some explicit discussion of what was to be learned from it. I probably wouldn’t have held my more recent Worldly Positions blog1 in high esteem.

    I now think that departures from all of these things are often good. So in the spirit of novel descriptions of explicit and general claims, I have made a typology of different combinations of these axes.

    Before getting to it, I’ll explain some part of the value of each category that I think I overlooked, for anyone similar to my twenty year old self.

    Worthy non-propositional-claim content

    Minds have many characteristics other than propositional beliefs. For instance, they can have feelings and attitudes and intuitions and grokkings and senses. They can meditate and chop onions quickly and look on the bright side and tend to think in terms of systems. They can also have different versions of ‘beliefs’ that don’t necessarily correspond to differences in what propositions they would assent to. For instance, they can say ‘it’s good to exercise’, or they can viscerally anticipate a better future when they choose to exercise. And even among straightforward beliefs held by minds, there are many that aren’t easily expressed in words. For instance, I have an impression of what summer evenings in the garden of a lively country restaurant were like, but to convey that sense to you is an art, and probably involves saying different propositional things in the hope that your mind will fill in the same whatever-else in the gaps. So this belief doesn’t seem to live in my mind in a simple propositional form, nor easily make its way into one.

    All of this suggests that the set of things that you might want to communicate to a mind is large and contains much that is not naturally propositional.2

    Minds can also take many inputs other than propositional claims. For instance, instructions and remindings and stories and music and suggestions implicit in propositional claims and body language and images. So if you want to make available a different way of being to a mind—for instance you want it to find salient the instability of the global system—then it’s not obvious that propositional claims are the best way.

    Given that minds can take many non-propositional inputs, and take many non-propositional states, you should just expect that there are a lot of things to be said that aren’t naturally propositional, in form or content. You should expect messages where the payload is intended to influence a mind’s non-propositional states, and ones where the mode of communication is not propositional.

    …in communicating propositional claims

    There are different versions of ‘understanding’ a proposition. I like to distinguish ‘knowing’ or ‘understanding’ a thing — which is to say, seeing it fit into your abstract model of the world, being inclined to assent to it — and ‘realizing’ it — intuitively experiencing its truth in the world that you live in. Joe Carlsmith explores this distinction at more length, and gives an example I like:

    If asked, one would agree that the people one sees on a day to day basis — on the subway, at parties, at work — all have richly detailed and complex inner lives, struggles, histories, perspectives; but this fact isn’t always present and vivid in one’s lived world; and when it becomes so, it can make an important difference to one’s ethical orientation, even if the propositions one assents to have not obviously changed.

    I repeatedly have the experience of ‘already knowing’ some obvious thing that people always say for ages before ‘realizing’ it. For instance, ‘the map is not the territory’. (“Of course the map isn’t the territory. Why would it be? That would be some stupid mistake, thinking that the map was the territory. Like, what would your model of the situation even be like? That the place you live is also your own mind?”) Then at some point it actually hits me that stuff that seems to be in the world ISN’T IN THE WORLD; WHAT SEEMS LIKE THE WORLD IS MY OWN MIND’S IMAGE OF THE WORLD. For instance, long after seeming to know that ‘the map isn’t the territory’ I was astonished to realize that those things that are just boring in their basic essence, like sports statistics and home care magazines, things that seem to be fundamentally drab, are not like that at all. They gleam with just as much allure as the things I am most compelled by, from many vantage points out there—just not mine. And in such a case I say to myself, ‘Oh wow, I just realized something…huh, I guess it is that the map is not the territory…but I knew that?’. Probably reading this, you are still thinking, ‘um yes, you weren’t aware that boringness is person-dependent?’ And I was aware of that. I ‘knew’ it. And I even knew it in some intuitively available ways—for instance, just because I find Married at First Sight interesting, I did not expect my boyfriend to find it so. In particular, in approaching my boyfriend with the news that I have been watching a bunch of Married at First Sight, I viscerally did not expect ‘boyfriend sympathizes with appeal of objectively excellent show’ type observations (in fact he liked it, and I was in fact surprised). But still the boringness of other subjects is depicted to me as part of them, like being red is depicted as in the world (whereas ‘liable to reduce my hunger’ say, is I think more accurately represented by my mind as a feature of myself). 
And ‘realizing’ that that isn’t right changes how the world that I spend my concrete days in seems.

    (I know I have hardly explained or defended this claim that ‘realizing’ is a thing, and important, but I’m not going to do that properly here.)

    All of these ‘realizations’ seem to be non-propositional. You already had some proposition, and then you get something else. I think of ‘realizing’ a proposition as acquiring a related non-proposition. To realize the proposition ‘other people have inner lives’ is to take in some non-proposition. Perhaps a spacious sense of those other minds being right there around you. If you are communicating a proposition, to have it actually realized, you want to get its non-proposition partner into the recipient’s mind also. This isn’t really right, because each proposition probably has a multitude of intuitive realizations of it, and each intuitive sense of the world could be part of appreciating a multitude of different propositions. But at any rate, communicating a proposition well, so that the other person can really make use of it, often seems to involve conveying a lot of its non-propositional brethren.

    Worthy non-descriptive communication

    Closely related to non-propositional content is non-descriptive communication, which I shall call ‘evocative’ communication.

    I’m thinking of a few different axes as being related to descriptiveness of communication:

    • Verbalness (consisting of words, e.g. “donkeys are nice” vs. a video of a nice donkey)
    • Explicitness (saying in words the thing you mean, rather than demonstrating it or suggesting it or subtly causing it to creep into the background of the picture you are painting without naming it. E.g. “I want us to follow this protocol” vs. “Most reasonable people are following this protocol now”)
    • Neutrality (not setting out to affect the readers’ emotions except via content itself)

    I think of the most vanilla communication as being explicit, verbal and neutral. And this seems pretty good for conveying propositional content. But I suspect that non-propositional content is often conveyed better through evocative communication.

    (Or perhaps it is more like: communicating propositional claims explicitly with language is uniquely easy, because explicit language is basically a system we set up for communicating, and propositions are a kind of message that is uniquely well suited to it. But once we leave the set of things that are well communicated in this way, and given that there are lots of other ways to communicate things, non-descriptive forms of communication are much more likely to be helpful than they were.)

    Relatedly, I think non-descriptive communication can be helpful in making the ‘realizing’ versions of propositional claims available to minds. That is, in really showing them to us. So in that way, evocative communication seems also potentially valuable for communicating propositional content well.

    Worthy communication of non-propositional things descriptively

    Going the opposite way—trying to communicate ineffable things in words—also seems valuable, because a) groping nearby propositionally does contribute to understanding, and b) only understanding things in ineffable ways leaves them unavailable to our reasoning faculties in important ways.

    Worthy non-generality

    I thought that if things were not general, then they were particularly unimportant to talk about. All things equal, isn’t it way better to understand a broad class of things better than a single thing?

    Some ways this is misleading:

    • Understanding specific things is often basically a prerequisite for understanding general things. For instance, devising a general theory of circumstances under which arms races develop will be harder without specific information about the behavior of specific nations historically, to inspire or constrain your theorizing.
    • Understanding specific things one after another will often automatically lead to your having an intuitive general model, through some kind of brain magic, even in cases where you would have had a hard time making an explicit model. For instance, after you have seen a thousand small disputes run their course, you might have a pretty good guess about how the current dispute will go, even if you couldn’t begin to describe a theory of argumentation for the relevant community.
    • Specific things are often broadly relevant to the specific world that you live in. For instance, exactly what happened in a particular past war might determine what current obligations should be and what sentiments are warranted, and who is owed, and what particular current parties might be expected to want or take for granted. Which is perhaps only of much interest in a narrow range of circumstances, but if they are the circumstances in which we will live for decades, it might be consistently material.

    Worthy non-originality of content

    On my naive model, you don’t want to repeat something that someone else said, because there is implicitly no value in the repetition—the thing has already been said, so re-saying adds nothing and seems to imply that you are either ignorant or hoping to dupe ignorant others into giving you undeserved credit.

    But on a model where many claims are easy enough to accept, but hard to realize, things look very different. The first time someone writes down an idea, the chances of it really getting through to anyone with much of its full power are low. The typical reader needs to meet the idea repeatedly, from different angles, to start to realize it.

    In a world like that, a lot of value comes from rehashing older ideas. Also in that world, rehashing isn’t the easy cashing in of someone else’s work. Writing something in a way that might really reach some people who haven’t yet been reached is its own art.

    Worthy non-originality of communication

    I think I also kind of imagined that once an idea had been put into the ‘public arena’ then the job was done. But another way in which unoriginality is incredibly valuable is that each person can see only such a minuscule fraction of what has ever been written or created (and they can’t even see what they can’t see) that locating particularly apt bits and sharing them with the right audience can be as valuable as writing the thing in the first place. This is curating and signal boosting. For these, you don’t even need to write anything original. But again, doing them well is not trivial. Knowing which of the cornucopia of content should be shown to someone is a hard intellectual task.


    Here is my tentative four-dimensional typology of kinds of blog posts. Any blog post maps to a path from some kind of content on the left, through some kind of communication to publication on the right. Content varies on two axes: generality and propositionalness. Communication varies in evocativeness. And blog posts themselves vary in how early in this pipeline the author adds value. For instance, among posts with a general propositional idea as their content, communicated in a non-propositional way, there are ones where the author came up with the idea, ones where the author took someone else’s idea and wrote something evocative about it, and ones that are repostings of either of the above. Thus, somewhat confusingly, there are 16 (pathways from left to right) x 3 (steps per pathway) = 48 total blog post types represented here, not the 36 you might expect from the number of squares.

    I include a random assortment of examples, some obscure, zooming probably required (apologies).

    Blog post typology

    Main updates

    1. Lots of worthy things are hard to describe in words
    2. ‘Realizing’ is a thing, and valuable, and different to understanding
    3. Details can be good
    4. Having ideas is not obviously the main place one can add value


    1. It’s good to write all manner of different kinds of blog posts
    2. It’s good to just take other people’s ideas and write blog posts about them, especially of different kinds than the original blog posts
    3. It’s good to just take one’s own ideas and write second or third blog posts saying exactly the same thing in different ways

    Other thoughts

    These different sorts of blog posts aren’t always valuable, of course. They have to be done well. Compellingly writing about something that isn’t worthy of people’s attention, or curating the wrong things can be as bad as the good versions of these things are good.

    Epistemic status: overall I expect to find that this post is badly wrong in at least one way in short order, but to be sufficiently interested in other things that I don’t get around to fixing it. Another good thing about rehashing others’ ideas is that you can make subtle edits where they are wrong.


    1. Older posts here 

    2. I don’t want to make strong claims about exactly what counts as propositional—maybe these things are propositional in some complicated way—but hopefully I’m pointing at an axis of straightforward propositionalness versus something else, regardless. 

  • Do incoherent entities have stronger reason to become more coherent than less?

    My understanding is that various ‘coherence arguments’ exist, of the form:

    1. If your preferences diverged from being representable by a utility function in some way, then you would do strictly worse in some way than by having some kind of preferences that were representable by a utility function. For instance, you will lose money, for nothing.
    2. You have good reason not to do that / don’t do that / you should predict that reasonable creatures will stop doing that if they notice that they are doing it.

    For example, from Arbital:

    Well, but suppose I declare to you that I simultaneously:

    • Prefer onions to pineapple on my pizza.
    • Prefer pineapple to mushrooms on my pizza.
    • Prefer mushrooms to onions on my pizza.

    Suppose I tell you that I prefer pineapple to mushrooms on my pizza. Suppose you’re about to give me a slice of mushroom pizza; but by paying one penny ($0.01) I can instead get a slice of pineapple pizza (which is just as fresh from the oven). It seems realistic to say that most people with a pineapple pizza preference would probably pay the penny, if they happened to have a penny in their pocket.[1]

    After I pay the penny, though, and just before I’m about to get the pineapple pizza, you offer me a slice of onion pizza instead–no charge for the change! If I was telling the truth about preferring onion pizza to pineapple, I should certainly accept the substitution if it’s free.

    And then to round out the day, you offer me a mushroom pizza instead of the onion pizza, and again, since I prefer mushrooms to onions, I accept the swap.

    I end up with exactly the same slice of mushroom pizza I started with… and one penny poorer, because I previously paid $0.01 to swap mushrooms for pineapple.

    This seems like a qualitatively bad behavior on my part. By virtue of my incoherent preferences which cannot be given a consistent ordering, I have shot myself in the foot, done something self-defeating. We haven’t said how I ought to sort out my inconsistent preferences. But no matter how it shakes out, it seems like there must be some better alternative–some better way I could reason that wouldn’t spend a penny to go in circles. That is, I could at least have kept my original pizza slice and not spent the penny.

    In a phrase you’re going to keep hearing, I have executed a ‘dominated strategy’: there exists some other strategy that does strictly better.

    On the face of it, this seems wrong to me. Losing money for no reason is bad if you have a coherent utility function. But from the perspective of the creature actually in this situation, losing money isn’t obviously bad, or reason to change. (In a sense, the only reason you are losing money is that you consider it to be good to do so.)

    It’s true that losing money is equivalent to losing pizza that you like. But losing money is also equivalent to a series of pizza improvements that you like (as just shown), so why do you want to reform based on one, while ignoring the other?
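    The quoted money pump is simple enough to simulate. A minimal sketch (the preference table and names are my own framing of the Arbital example):

```python
# Cyclic preferences: onion > pineapple > mushroom > onion.
# Each swap is locally approved by the agent, yet one lap around
# the cycle costs a penny for nothing.

# prefers[a] is the set of toppings judged strictly worse than a
prefers = {
    "onion": {"pineapple"},
    "pineapple": {"mushroom"},
    "mushroom": {"onion"},
}

def accepts(current, offered, price):
    """Accept a swap when the offered slice is strictly preferred
    and the surcharge is at most one penny."""
    return current in prefers[offered] and price <= 0.01

wallet = 1.00
slice_held = "mushroom"
# The dealer's sequence of offers, ending where we started:
for offered, price in [("pineapple", 0.01), ("onion", 0.0), ("mushroom", 0.0)]:
    if accepts(slice_held, offered, price):
        wallet -= price
        slice_held = offered

print(slice_held, round(wallet, 2))  # same mushroom slice, a penny poorer
```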

    If you are even a tiny bit incoherent, then I think often a large class of things are actually implicitly worth the same amount. To see this, consider the following diagram of an entity’s preferences over money and apples. Lines are indifference curves in the space of items. The blue lines shown mean that you are indifferent between $1 and an apple on the margin across a range of financial and apple possession situations. (Not necessary for rationality.) Further out lines are better, and you can’t reach a further out line by traveling along whatever line you are on, because you are not indifferent between better and worse things.

    Indifference curves between numbers of apples and dollars

    Trades that you are fine with making can move your situation from anywhere on a line to anywhere on the same line or above it (e.g. you will trade an apple for $1 or anything more than that).

    Now let’s say that you are also indifferent between an apple and $2, in general:

    Two conflicting sets of indifference curves

    With two incoherent sets of preferences over the items in question, then there are two overlapping sets of curves.

    If indifference curves criss-cross, then you can move anywhere among them while remaining indifferent - as long as you follow the lines, you are indifferent, and the lines now get you everywhere. Relatedly, you can get anywhere while making trades you are happy with. You are now indifferent to the whole region, at least implicitly. (In this simplest case at least, basically you now have two non-parallel vectors that you can travel along, and can add them together to get any vector in the plane.)

    Indifference spreads through all the intersecting indifference curves

    For instance, here, four apples are seen to be equally preferable to two apples (and also to two apples plus two dollars, and to one apple plus two dollars). Not shown because I erred: four apples are also equivalent to $-1 (which is also equivalent to winning the lottery).
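    That last equivalence can be checked concretely. In the sketch below (my own construction, not from the post), every step is a trade the agent is by hypothesis indifferent to—an apple for $1, or an apple for $2, in either direction—and chaining them walks the bundle from four apples to a one-dollar debt:

```python
# Each trade swaps one apple against $1 or $2, in either direction.
# The agent is (incoherently) indifferent to all of them, so every
# intermediate bundle counts as 'equally good'.

def trade(bundle, d_apples, d_dollars):
    apples, dollars = bundle
    # Sanity check: only the four indifference trades are allowed.
    assert abs(d_apples) == 1 and abs(d_dollars) in (1, 2) \
        and d_apples * d_dollars < 0, "not an indifference trade"
    return (apples + d_apples, dollars + d_dollars)

bundle = (4, 0)                 # start: four apples, no money
for _ in range(4):              # sell four apples at the $1 rate
    bundle = trade(bundle, +(-1), +1)
for _ in range(5):              # buy at $2, sell back at $1: -$1 per lap
    bundle = trade(bundle, +1, -2)
    bundle = trade(bundle, -1, +1)

print(bundle)  # (0, -1): four apples have been 'traded' into a $1 debt
```

Each step is one of the two stated exchange rates, so an observer granting both indifferences has to grant every intermediate equivalence too.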

    If these were not lines, but higher-dimensional planes with lots of outcomes other than apples and dollars, and all of the preferences about other stuff made sense except for this one incoherence, the indifference hyperplanes would intersect and the entity in question would be effectively indifferent between everything. (To see this a different way, if it is indifferent between $1 and an apple, and also between $2 and an apple, and it considers a loaf of bread to be worth $2 in general, then implicitly a loaf of bread is worth both one apple and two apples, and so has also become caught up in the mess.)

    (I don’t know if all incoherent creatures value everything the same amount - it seems like it should maybe be possible to just have a restricted region of incoherence in some cases, but I haven’t thought this through.)

    This seems related to having inconsistent beliefs in logic. When once you believe a contradiction, everything follows. When once you evaluate incoherently, every evaluation follows.
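    The logical half of that analogy is the principle of explosion (ex falso quodlibet): from a contradiction, any proposition follows. In Lean, for instance, it is a one-liner:

```lean
-- From believing both p and ¬p, any proposition q follows.
example (p q : Prop) (h : p ∧ ¬p) : q := absurd h.left h.right
```

The suggestion in the text is that incoherent evaluation plays the same role for preferences that the contradiction plays here for beliefs.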

    And from that position of being able to evaluate anything any amount, can you say that it is better to reform your preferences to be more coherent? Yes. But you can also say that it is better to reform them to make them less coherent. Does coherence for some reason win out?

    In fact slightly incoherent people don’t seem to think that they are indifferent between everything (and slightly inconsistent people don’t think they believe everything). And my impression is that people do become more coherent with time, rather than less, or a mixture at random.

    If you wanted to apply this to alien AI minds though, it would seem nice to have a version of the arguments that go through, even if just via a clear account of the pragmatic considerations that compel human behavior in one direction. Does someone have an account of this? Do I misunderstand these arguments? (I haven’t actually read them for the most part, so it wouldn’t be shocking.)

  • Holidaying and purpose

    I’m on holiday. A basic issue with holidays is that it feels more satisfying and meaningful to do purposeful things, but for a thing to actually serve a purpose, it often needs to pass a higher bar than a less purposeful thing does. In particular, you often have to finish a thing and do it well in order for it to achieve its purpose. And finishing things well is generally harder and less fun than starting them, and so in other ways contrary to holidaying.

    This isn’t a perfect relationship though, so a natural way to mitigate the trade-off is to just look harder until you find things that serve a worthy purpose while being non-committal and consistently non-arduous. For instance, you can exercise or learn about history or practice guitar or write half-assed blog posts without real conclusions or narrative consistency.

    There is also probably good holidaying to be done that doesn’t seem obviously purposeful, and maybe that is more in the spirit of holidaying. Perhaps one should avoid too much purpose, lest one end up not holidaying?

    Today I travelled by rowing boat across a lake and back, with my boyfriend and some of his family.


    Now we are going to the zoo.

  • Coherence arguments imply a force for goal-directed behavior

    Crossposted from AI Impacts

    [Epistemic status: my current view, but I haven’t read all the stuff on this topic even in the LessWrong community, let alone more broadly.]

    There is a line of thought that says that advanced AI will tend to be ‘goal-directed’—that is, consistently doing whatever makes certain favored outcomes more likely—and that this is to do with the ‘coherence arguments’. Rohin Shah, and probably others,1 have argued against this. I want to argue against them.

    The old argument for coherence implying (worrisome) goal-directedness

    I’d reconstruct the original argument that Rohin is arguing against as something like this (making no claim about my own beliefs here):

    • ‘Whatever things you care about, you are best off assigning consistent numerical values to them and maximizing the expected sum of those values’
      ‘Coherence arguments’2 mean that if you don’t maximize ‘expected utility’ (EU)—that is, if you don’t make every choice in accordance with what gets the highest average score, given consistent preferability scores that you assign to all outcomes—then you will make strictly worse choices by your own lights than if you followed some alternate EU-maximizing strategy (at least in some situations, though they may not arise). For instance, you’ll be vulnerable to ‘money-pumping’—being predictably parted from your money for nothing.3
    • ‘Advanced AI will tend to do better things instead of worse things, by its own lights’
      Advanced AI will tend to avoid options that are predictably strictly worse by its own lights, due to being highly optimized for making good choices (by some combination of external processes that produced it, its own efforts, and the selection pressure acting on its existence).
    • ‘Therefore advanced AI will maximize EU, roughly’
      Advanced AI will tend to be fairly coherent, at least to a level of approximation where becoming more coherent isn’t worth the cost.4 Which will probably be fairly coherent (e.g. close enough to coherent that humans can’t anticipate the inconsistencies).
    • ‘Maximizing EU is pretty much the same as being goal-directed’
      To maximize expected utility is to pursue the goal of that which you have assigned higher utility to.5
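    The ‘money-pumping’ vulnerability mentioned in the first point can be sketched concretely. This is a minimal illustration, not from the post; the preference cycle and fee are hypothetical, chosen only to show the mechanism: an agent with cyclic pairwise preferences will pay to trade ‘up’ at each step and can be walked in a circle indefinitely, ending where it started but poorer.

```python
# Hypothetical cyclic preferences: the agent prefers B over A, C over B,
# and A over C. Each individual swap looks like an improvement.
CYCLE = {("B", "A"): True, ("C", "B"): True, ("A", "C"): True}

def prefers(offered, have):
    """True if the agent would rather hold `offered` than `have`."""
    return CYCLE.get((offered, have), False)

def run_pump(start="A", fee=1, rounds=9):
    """Offer trades around the cycle, charging `fee` per accepted trade."""
    have, money = start, 0
    offers = ["B", "C", "A"] * (rounds // 3 + 1)
    for offered in offers[:rounds]:
        if prefers(offered, have):  # each trade is an improvement locally...
            have, money = offered, money - fee
    return have, money  # ...but the agent circles back, predictably poorer

have, money = run_pump()
# After nine accepted trades the agent holds its original item, A,
# and has paid nine units in fees for nothing.
```

A coherent agent (one whose pairwise preferences admit a consistent ranking) would refuse at least one trade in the cycle, and the pump stops.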

    And since the point of all this is to argue that advanced AI might be hard to deal with, note that we can get to that conclusion with:

    1. ‘Highly intelligent goal-directed agents are dangerous’
      If AI systems exist that very competently pursue goals, they will likely be better than us at attaining their goals, and therefore to the extent there is a risk of mismatch between their goals and ours, we face a serious risk.

    Rohin’s counterargument

    Rohin’s counterargument begins with an observation made by others before: any behavior is consistent with maximizing expected utility, given some utility function. For instance, a creature just twitching around on the ground may have the utility function that returns 1 if the agent does whatever it in fact does in each situation (where ‘situation’ means, ‘entire history of the world so far’), and 0 otherwise. This is a creature that just wants to make the right twitch in each detailed, history-indexed situation, with no regard for further consequences. Alternately the twitching agent might care about outcomes, but just happen to want the particular holistic unfolding of the universe that is occurring, including this particular series of twitches. Or it could be indifferent between all outcomes.

    The basic point is that rationality doesn’t say what ‘things’ you can want. And in particular, it doesn’t say that you have to care about particular atomic units that larger situations can be broken down into. If I try to call you out for first spending money to get to Paris, then spending money to get back from Paris, there is nothing to say you can’t just have wanted to go to Paris for a bit and then to come home. In fact, this is a common human situation. ‘Aha, I money pumped you!’ says the airline, but you aren’t worried. The twitching agent might always be like this—a creature of more refined tastes, who cares about whole delicate histories and relationships, rather than just summing up modular momentarily-defined successes. And given this freedom, any behavior might conceivably be what a creature wants. 

    Then I would put the full argument, as I understand it, like this:

    1. Any observable sequence of behavior is consistent with the entity doing EU maximization (see observation above)
    2. Doing EU maximization doesn’t imply anything about what behavior we might observe (from 1)
    3. In particular, knowing that a creature is an EU maximizer doesn’t imply that it will behave in a ‘goal-directed’ way, assuming that that concept doesn’t apply to all behavior. (from 2)

    Is this just some disagreement about the meaning of the word ‘goal-directed’? No, because we can get back to a major difference in physical expectations by adding:

    4. Not all behavior in a creature implicates dire risk to humanity, so any concept of goal-directedness that is consistent with any behavior—and so might be implied by the coherence arguments—cannot imply AI risk.

    So where the original argument says that the coherence arguments plus some other assumptions imply danger from AI, this counterargument says that they do not. 

    (There is also at least some variety in the meaning of ‘goal-directed’. I’ll use goal-directedRohin to refer to what I think is Rohin’s preferred usage: roughly, that which seems intuitively goal-directed to us, e.g. behaving similarly across situations, and accruing resources, and not flopping around in possible pursuit of some exact history of personal floppage, or peaceably preferring to always take the option labeled ‘A’.6)

    My counter-counterarguments

    What’s wrong with Rohin’s counterargument? It sounded tight. 

    In brief, I see two problems:

    A. The whole argument is in terms of logical implication. But what seems to matter is changes in probability. Coherence doesn’t need to rule out any behavior to matter, it just has to change the probabilities of behaviors. Understood in terms of probability, argument 2 is a false inference: just because any sequence of behavior is consistent with EU maximization doesn’t mean that EU maximization says nothing about what behavior we will see, probabilistically. All it says is that the probability of a behavioral sequence is never reduced to zero by considerations of coherence alone, which is hardly saying anything.

    You might then think that a probabilistic version still applies: since every entity appears to be in good standing with the coherence arguments, the arguments don’t exert any force, probabilistically, on what entities we might see. But:

    B. An outside observer being able to rationalize a sequence of observed behavior as coherent doesn’t mean that the behavior is actually coherent. Coherence arguments constrain combinations of external behavior and internal features—‘preferences’7 and beliefs. So whether an actor is coherent depends on what preferences and beliefs it actually has. And if it isn’t coherent in light of these, then coherence pressures will apply, whether or not its behavior looks coherent. And in many cases, revision of preferences due to coherence pressures will end up affecting external behavior. So 2) is not only not a sound inference from 1), but actually a wrong conclusion: if a system moves toward EU maximization, that does imply things about the behavior that we will observe (probabilistically).

    Perhaps Rohin only meant to argue about whether it is logically possible to be coherent and not goal-directed-seeming, for the purpose of arguing that humanity can construct creatures in that perhaps-unlikely-in-nature corner of mindspace, if we try hard. In which case, I agree that it is logically possible. But I think his argument is often taken to be relevant more broadly, to questions of whether advanced AI will tend to be goal-directed, or to be goal-directed in places where it was not intended to be.

    I take A) to be fairly clear. I’ll lay out B) in more detail.

    My counter-counterarguments in more detail

    How might coherence arguments affect creatures?

    Let us step back.

    How would coherence arguments affect an AI system—or anyone—anyway? They’re not going to fly in from the platonic realm and reshape irrational creatures.

    The main routes, as I see it, are via implying:

    1. incentives for the agent itself to reform incoherent preferences
    2. incentives for the processes giving rise to the agent (explicit design, or selection procedures directed at success) to make them more coherent
    3. some advantage for coherent agents in competition with incoherent agents

    To be clear, the agent, the makers, or the world are not necessarily thinking about the arguments here—the arguments correspond to incentives in the world, which these parties are responding to. So I’ll often talk about ‘incentives for coherence’ or ‘forces for coherence’ rather than ‘coherence arguments’.

    I’ll talk more about 1 for simplicity, expecting 2 and 3 to be similar, though I haven’t thought them through.

    Looking coherent isn’t enough: if you aren’t coherent inside, coherence forces apply

    If self-adjustment is the mechanism for the coherence, this doesn’t depend on what a sequence of actions looks like from the outside, but on what it looks like from the inside.

    Consider the aforementioned creature just twitching sporadically on the ground. Let’s call it Alex.

    As noted earlier, there is a utility function under which Alex is maximizing expected utility: the one that assigns utility 1 to however Alex in fact acts in every specific history, and utility 0 to anything else.

    But from the inside, this creature you excuse as ‘maybe just wanting that series of twitches’ has—let us suppose—actual preferences and beliefs. And if its preferences do not in fact prioritize this elaborate sequence of twitching in an unconflicted way, and it has the self-awareness and means to make corrections, then it will make corrections8. And having done so, its behavior will change. 

    Thus excusable-as-coherent Alex is still moved by coherence arguments, even while the arguments have no complaints about its behavior per se.

    For a more realistic example: suppose Assistant-Bot is observed making this sequence of actions: 

    • Offers to buy gym membership for $5/week 
    • Consents to upgrade to gym-pro membership for $7/week, which is like gym membership but with added morning classes
    • Takes discounted ‘off-time’ deal, saving $1 per week for only using gym in evenings

    This is consistent with coherence: Assistant-Bot might prefer that exact sequence of actions over all others, or might prefer incurring gym costs with a larger sum of prime factors, or might prefer talking to Gym-sales-bot over ending the conversation, or prefer agreeing to things.

    But suppose that in fact, in terms of the structure of the internal motivations producing this behavior, Assistant-Bot just prefers you to have a gym membership, and prefers you to have a better membership, and prefers you to have money, but is treating these preferences with inconsistent levels of strength in the different comparisons. Then there appears to be a coherence-related force for Assistant-Bot to change. One way that that could look is that since Assistant-Bot’s overall behavioral policy currently entails giving away money for nothing, and also Assistant-Bot prefers money over nothing, that preference gives Assistant-Bot reason to alter its current overall policy, to avert the ongoing exchange of money for nothing.9 And if its behavioral policy is arising from something like preferences, then the natural way to alter it is via altering those preferences, and in particular, altering them in the direction of coherence.
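    One way to make ‘inconsistent levels of strength in the different comparisons’ concrete: if an agent’s internal comparisons are stored as pairwise preferences rather than as a single utility function, incoherence shows up as a cycle among those comparisons, and a maker (or the agent itself) can detect it mechanically. A minimal sketch, with the representation and names my own assumption, not anything from the post:

```python
def has_cycle(prefs):
    """prefs: iterable of (better, worse) pairs, e.g. ("gym-pro", "gym").
    Returns True if the pairwise comparisons contain a cycle, i.e. cannot
    be collapsed into any consistent ranking of options."""
    graph = {}
    for better, worse in prefs:
        graph.setdefault(better, []).append(worse)

    def visit(node, stack, done):
        # Depth-first search; a node revisited on the current path is a cycle.
        if node in stack:
            return True
        if node in done:
            return False
        stack.add(node)
        for nxt in graph.get(node, []):
            if visit(nxt, stack, done):
                return True
        stack.discard(node)
        done.add(node)
        return False

    done = set()
    return any(visit(n, set(), done) for n in list(graph))

has_cycle([("A", "B"), ("B", "C"), ("C", "A")])  # cyclic: incoherent
has_cycle([("A", "B"), ("B", "C"), ("A", "C")])  # acyclic: rankable
```

On this toy picture, ‘reform in the direction of coherence’ would mean dropping or reweighting comparisons until no cycle remains, at which point the remaining preferences admit a consistent ordering and the money-for-nothing policy disappears.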

    One issue with this line of thought is that it’s not obvious in what sense there is anything inside a creature that corresponds to ‘preferences’. Often when people posit preferences, the preferences are defined in terms of behavior. Does it make sense to discuss different possible ‘internal’ preferences, distinct from behavior? I find it helpful to consider the behavior and ‘preferences’ of groups:

    Suppose two cars are parked in driveways, each containing a couple. One couple are just enjoying hanging out in the car. The other couple are dealing with a conflict: one wants to climb a mountain together, and the other wants to swim in the sea together, and they aren’t moving because neither is willing to let the outing proceed as the other wants. ‘Behaviorally’, both cars are the same: stopped. But their internal parts (the partners) are importantly different. And in the long run, we expect different behavior: the car with the unconflicted couple will probably stay where it is, and the conflicted car will (hopefully) eventually resolve the conflict and drive off.

    I think here it makes sense to talk about internal parts, separate from behavior, and real. And similarly in the single agent case: there are physical mechanisms producing the behavior, which can have different characteristics, and which in particular can be ‘in conflict’—in a way that motivates change—or not. I think it is also worth observing that humans find their preferences ‘in conflict’ and try to resolve them, which suggests that they, at least, are better understood in terms of both behavior and underlying preferences that are separate from it.

    So we have: even if you can excuse any seizuring as consistent with coherence, coherence incentives still exert a force on creatures that are in fact incoherent, given their real internal state (or would be incoherent if created). At least if they or their creator have machinery for noticing their incoherence, caring about it, and making changes.

    Or put another way, coherence doesn’t exclude overt behaviors alone, but does exclude combinations of preferences, and preferences beget behaviors. This changes how specific creatures behave, even if it doesn’t entirely rule out any behavior ever being correct for some creature, somewhere. 

    That is, the coherence theorems may change what behavior is likely to appear amongst creatures with preferences. 

    Reform for coherence probably makes a thing more goal-directedRohin

    Ok, but moving toward coherence might sound totally innocuous, since, per Rohin’s argument, coherence includes all sorts of things, such as absolutely any sequence of behavior. 

    But the relevant question is again whether a coherence-increasing reform process is likely to result in some kinds of behavior over others, probabilistically.

    This is partly a practical question—what kind of reform process is it? Where a creature ends up depends not just on what it incoherently ‘prefers’, but on what kinds of things its so-called ‘preferences’ are at all10, and what mechanisms detect problems, and how problems are resolved.

    My guess is that there are also things we can say in general. It is too big a topic to investigate properly here, but some initially plausible hypotheses about a wide range of coherence-reform processes:

    1. Coherence-reformed entities will tend to end up looking similar to their starting point but less conflicted
      For instance, if a creature starts out being indifferent to buying red balls when they cost between ten and fifteen blue balls, it is more likely to end up treating red balls as exactly 12x the value of blue balls than it is to end up very much wanting the sequence where it takes the blue ball option, then the red ball option, then blue, red, red, blue, red. Or wanting red squares. Or wanting to ride a dolphin.

      (I agree that if a creature starts out valuing Tuesday-red balls at fifteen blue balls and yet all other red balls at ten blue balls, then it faces no obvious pressure from within to become ‘coherent’, since it is not incoherent.)
    2. More coherent strategies are systematically less wasteful, and waste inhibits goal-directionRohin, which means more coherent strategies are more forcefully goal-directedRohin on average
      In general, if you are sometimes a force for A and sometimes a force against A, then you are not moving the world with respect to A as forcefully as you would be if you picked one or the other. Two people intermittently changing who is in the driving seat, who want to go to different places, will not cover distance in any direction as effectively as either one of them. A company that cycles through three CEOs with different evaluations of everything will—even if they don’t actively scheme to thwart one another—tend to waste a lot of effort bringing in and out different policies and efforts (e.g. one week trying to expand into textiles, the next week trying to cut everything not involved in the central business).
    3. Combining points 1 and 2 above, as entities become more coherent, they generally become more goal-directedRohin. As opposed to, for instance, becoming more goal-directedRohin on average, but individual agents being about as likely to become worse as better as they are reformed. Consider: a creature that values red balls at 12x blue balls is very similar to one that values them inconsistently, except a little less wasteful. So it is probably similar but more goal-directedRohin. Whereas it’s fairly unclear how goal-directedRohin a creature that wants to ride a dolphin is compared to one that wanted red balls inconsistently much. In a world with lots of balls and no possible access to dolphins, it might be much less goal-directedRohin, in spite of its greater coherence.
    4. Coherence-increasing processes rarely lead to non-goal-directedRohin agents—like the one that twitches on the ground
      In the abstract, few starting points and coherence-motivated reform processes will lead to an agent with the goal of carrying out a specific convoluted moment-indexed policy without regard for consequence, like Rohin’s twitching agent, or to valuing the sequence of history-action pairs that will happen anyway, or to being indifferent to everything. And these outcomes will be even less likely in practice, where AI systems with anything like preferences probably start out caring about much more normal things, such as money and points and clicks, so will probably land at a more consistent and shrewd version of that, if 1 is true. (Which is not to say that you couldn’t intentionally create such a creature.)

    These hypotheses suggest to me that the changes in behavior brought about by coherence forces favor moving toward goal-directednessRohin, and therefore at least weakly toward risk.

    Does this mean advanced AI will be goal-directedRohin?

    Together, this does not imply that advanced AI will tend to be goal-directedRohin. We don’t know how strong such forces are. Evidently not so strong that humans11, or our other artifacts, are whipped into coherence in mere hundreds of thousands of years12. If a creature doesn’t have anything like preferences (beyond a tendency to behave certain ways), then coherence arguments don’t obviously even apply to it (though discrepancies between the creature’s behavior and its makers’ preferences probably produce an analogous force13 and competitive pressures probably produce a similar force for coherence in valuing resources instrumental to survival). Coherence arguments mark out an aspect of the incentive landscape, but to say that there is an incentive for something, all things equal, is not to say that it will happen.

    In sum

    1) Even though any behavior could be coherent in principle, if it is not coherent in combination with an entity’s internal state, then coherence arguments point to a real force for different (more coherent) behavior.

    2) My guess is that this force for coherent behavior is also a force for goal-directed behavior. This isn’t clear, but seems likely, and also isn’t undermined by Rohin’s argument, as seems commonly believed.

    Two dogs attached to the same leash are pulling in different directions. Etching by J. Fyt, 1642

    1. For instance, Richard Ngo agrees here, and Eric Drexler makes a related argument here, section 6.4. 

    2. This has more on these arguments. 

    3. I haven’t read all of this, and don’t yet see watertight versions of these arguments, but this is not the time I’m going to get into that. 

    4. Assuming being ‘more coherent’ is meaningful and better than being ‘less coherent’, granting that one is not coherent, which sounds plausible, but which I haven’t got into. One argument against is that if you are incoherent at all, then it looks to me like you can logically evaluate any bundle of things at any price. Which would seem to make all incoherences identical—much like how all logical contradictions equivalently lead to every belief. However this seems unlikely to predict well how creatures behave in practice if they have incoherent preferences. 

    5. This isn’t quite right, since ‘goal’ suggests one outcome that is being pursued ahead of all others, whereas EU-maximizing implies that all possible outcomes have an ordering, and you care about getting higher ones in general, not just the top one above all others, but this doesn’t seem like a particularly relevant distinction here. 

    6. I am not sold on this usage myself for ‘goal-directed’—there is an appeal to using that phrase for ‘pursues goals’ in its most basic sense, but I am also tentatively in favor of having as many concepts as possible. 

    7. It seems perhaps misleading to call these ‘preferences’, if they are incoherent, and so do not together implicate orderings of outcomes being better than one another. If a creature is not coherent, what are even the objects of its decision calculus? I am inclined to think in terms of ‘decision criteria’, e.g. ‘given X and Y, choose X’, and ‘given Y and Z, choose Y’, which don’t necessarily imply anything about ‘given Z and X, choose …’, but I haven’t thought about this much, and it seems like a technical detail of the creature in question. Whatever they are though, if the creature has behavior, then there are internal dynamics that produce it. When exactly an aspect of these should be considered a ‘preference’ for the sake of this argument isn’t entirely clear to me, but would seem to depend on something like whether it tends to produce actions favoring certain outcomes over other outcomes across a range of circumstances (similar to the unclear definition of ‘agent’). 

    8. The ‘right’ way to correct your own incoherent preferences seems complicated and not obviously well defined or existent, and perhaps there is not much more to say than that what you do will depend on your design. It’s also not clear to me that a genuinely incoherent creature should necessarily want to reform, by its own lights, but that is a question for another time—here I’m assuming that the coherence arguments do have this implication that seems commonly attributed to them. My guess is that in practice, such creatures often do want to reform, and exactly how they do it doesn’t matter for my argument here. 

    9. I’m describing the force potentially felt by Assistant-Bot itself, but to the extent that its makers, or users also have preferences for money over nothing, and wish to use Assistant-Bot, and can alter it, they would seem to have similar incentives to mitigate its self-defeating behavior. 

    10. The creature’s ‘preferences’ can’t be in terms of consistent numerical values assigned to everything, because those would be consistent. So what are they? For instance, one might imagine that they are pairwise comparisons between some kind of things (which can include ‘A > B’ and ‘B > C’ and ‘C > A’), or that they are a set of ‘situation—>action’ mappings, or they are a noisy ‘options—>feelings’ mapping combined with a set of deontological constraints over actions and feelings (‘choose things you feel better about, except don’t choose things out of selfishness, except when you feel more than 50% scared…’, etc.). 

    11. For instance, humans are insistently incoherent on the Allais paradox.

    12. That seems pretty interesting—but note that well-designed computers have been known to do things that took humans participating in biological and cultural evolution hundreds of thousands of years before, so the inference here is not straightforward, and the forces of coherence depend on the costs of reform, which depend on the machinery for it. Also, we don’t know what other forces were in play—there might even have been forces for apparent incoherence, e.g. insofar as hypocrisy can benefit social animals, and dishonesty is complicated (The Elephant in the Brain discusses such ideas). 

    13. For instance, the coherent creature that evaluates red balls differently on Tuesday and Wednesday might be in conflict with its creators, if they have a more consistent red ball evaluation, giving them reason to reform it. You might class this under the question, ‘what kinds of advanced AI will people want?’, but the reason for it is very similar to the reasons for internal pressure for coherence. If you refuse to pay $13 for a red ball, and your AI then goes out and buys you one for $15 because it is Tuesday, then the pair of you together could have done better. 


  • Animal faces

    [Epistemic status: not reflective of the forefront of human understanding, or human understanding after any research at all. Animal pictures with speculative questions.]

    Do the facial expressions of animals mean anything like what I’m inclined to take them to mean?















    On the one hand, we can’t have been the first creatures to have the innovation of facial expressions. And the same kind of animal is often seen to have different facial expressions, which has got to happen for a reason. But on the other hand, the whole race of dolphins just looks vaguely elated, and surely that can’t track the collective personality of the species.





    I suppose the answer is that many animals do have meaningful facial expressions, and even that broadly ‘smiling’ matches our broadly ‘smiling’, but that for the most part, they don’t match in their subtleties, and species-wide features don’t track anything.

    That all sounds very reasonable, but can I really look at this creature and not see it as happy?


    Perhaps I can’t, but my hard-to-quash beliefs are just wrong. How tragic in that case to have a visual lie right on one’s face—standing in the way of our ever looking each other in the eye and understanding at all, even if we were to spend a lifetime together.

    And seals—they seem to look extremely contented more than half the time, but not all of the time. So it’s not that they are automatically contented-looking. What does it mean?










    Are these creatures not contented?


    Do the facial expressions of humans mean anything like what I’m inclined to take them to mean?

    I tend to read slightly different smiles with a lot of flavor and significance, for instance.

    But when I look in the mirror, my face doesn’t look like what I mean it to look like. At least not reliably and with fine specificity. I’m way better than chance on ‘smile’ when intending to smile, but ‘friendly smile’ can easily end up as ‘ridiculous to the point of questioning how anyone takes me seriously’ smile, or ‘somewhat intimidating smile’.


    And people’s default faces have features that my brain interprets as expression—if one person looks more grouchy or judgmental or open or lighthearted across the board than another, does that mean something? My automatic interpretation of things thinks so, but it seems very questionable.



  • Quarantine variety

    Among people sheltering from covid, I think there is a common thought that being stuck in your home for a year begets a certain sameyness, that it will be nice to be done with.

    It’s interesting to me to remember that a big chunk of the variety that is missing in life comes from regular encounters with other people, and their mind-blowing tendencies to do and think differently to me, and jump to different conclusions, and not even know what I’m talking about when I mention the most basic of basic assumptions.

    And to remember that many of those people are stuck in similar houses, similarly wishing for variety, but being somewhat tired of a whole different set of behaviors and thoughts and framings and assumptions.

    Which means that the variety is not fully out of safe reach in the way that, say, a big lick-a-stranger party might be. At least some of it is just informationally inaccessible, like finding the correct answer to a hard math problem. If I could somehow spend a day living like a person stuck in their house across the street lives, I would see all kinds of new things. My home itself—especially with its connection to the internet and Amazon—is capable of vastly more variety than I typically see.

  • Why does Applied Divinity Studies think EA hasn't grown since 2015?

    Applied Divinity Studies seeks to explain why the EA community hasn’t grown since 2015. The observations they initially take to show that the EA community hasn’t grown are:

    1. GiveWell money moved increased a lot in 2015, then grew only slightly since then.
    2. Open Phil (I guess money allocated) hasn’t increased since 2017.
    3. Google Trends “Effective Altruism” ‘grows quickly starting in 2013, peaks in 2017, then falls back down to around 2015 levels’.

    Looking at the graph they illustrate with, 1) is because GiveWell started receiving a large chunk of money from Open Phil in 2015, and that chunk has remained around the same over the years, while the money not from Open Phil has grown.

    So 1) and 2) are both the observation, “Open Phil has not scaled up its money-moving in recent years”.

    I’m confused about how this observation seems suggestive about the size of the EA community. Open Phil is not a community small-donations collector. You can’t even donate to Open Phil. It is mainly moving Good Ventures’ money, i.e. the money of a single couple: Dustin Moskovitz and Cari Tuna.

    One way that I could imagine Open Phil’s spending saying something about the size of the EA community is that the community might provide funding opportunities for Open Phil, so that its growth was reflected in Open Phil’s spending. But this would require EA growth at a scale that produced large enough funding opportunities, that met Open Phil’s specific criteria, to show up amidst hundreds of millions of dollars of annual grant-making. I think this at least requires argument.

    I’m further confused when, in trying to explain the purported end of growth, ADS says, ‘One possibility is that there was not a strange hidden cause behind widespread stagnation. It’s just that funding slowed down, and so everything else slowed down with it’, then goes on to explore the possibility that funding from Open Phil/Good Ventures has slowed down in line with this ‘widespread’ stagnation (in different aspects of Open Phil and one Google Trends result). They find that indeed it has.

    There is a strange hidden cause! Namely, the underlying structural relationship between Open Phil and a collection consisting of Open Phil (counted twice) and one other thing.

    (In fairness, ‘widespread’ might at that point in the post also include the stagnation since 2010 and 2013 respectively of Google searches for the popular Rationalist blogs Less Wrong and Slate Star Codex. But stagnations that old seem unlikely to be symptomatic of an EA post-2015 stagnation, so probably the author didn’t intend to include them in it.)

    While it’s not relevant to my above confusion, I should also note that even if Open Phil funding of GiveWell were a better metric of the size of EA than I understand, if the data for 1) reached 2020, it would probably tell a different story: Open Phil’s allocation to GiveWell charities grew a lot—to $100 million, up from $54 million the previous year (according to the figure). So there isn’t even a stagnation of Open Phil allocation to GiveWell to be confused by.

    I agree that 3) is evidence of the EA community receiving less attention since 2017 (though not 2015). But mild evidence, because searching for “effective altruism” on Google is probably not that closely associated with commitment to EA, as ADS notes, and because Google Trends is just pretty complicated to make sense of.

    I think it’s worth finding better evidence, before we get to speculating about such things. For instance, this 2018 post by Peter Hurford looks at about thirty metrics, only two of which peaked in or before 2015. I haven’t read that post closely, but even my impressions from skimming it seem like a more informative start than a single Google trend.

    In the process of coming up with explanations, ADS does also point to further relevant trends: the aforementioned flatness of Google Trends for two major rationalist blog names, the aforementioned further measure of Open Phil activity via Good Ventures (which looks like the other Open Phil activity measure), and the trend of Giving What We Can membership.

    Giving What We Can membership seems potentially very informative. Does it support the view that EA hasn’t grown since 2015? No—GWWC membership has apparently grown by about 7x since 2015. ADS correctly notes that this is a measure of “cumulative interest” whereas how many people Google a term is a measure of “active interest”. That is, the total GWWC members on a day are the sum of everyone who has ever wanted to become a GWWC member, whereas searches for “join GWWC”, say, would be a measure of how many are wanting to do that right now. But ADS’s claim was that EA hasn’t grown. It wasn’t that EA’s rate of growth hasn’t grown. And the size of the EA community is about cumulative interest, to the extent interest persists. (For some activities, you might expect them to happen again every day that you remain involved, but neither joining a society for giving away 10% of your money, nor Googling “effective altruism” is one.) Or skipping all this complication about cumulativeness, the size of EA is very much the same kind of thing as the size of GWWC membership.

    In sum, I don’t have a clear picture of how EA has grown since 2015, but as far as I can tell, the main relevant evidence that this post presents is that Google Trends “effective altruism” is flattish, Giving What We Can membership is growing (though not exponentially), and non-Open-Phil GiveWell money moved is growing. (Also, including information not in the post, it seems that Open Phil money moved to GiveWell was flat for a while then grew a lot in 2020, but it remains unclear to me how this is relevant anyway.) I’m inclined to interpret this evidence as mildly supporting ‘EA has grown since 2015’, but it doesn’t seem like much evidence either way. I think we should at least hold off on taking for granted that EA hasn’t grown since 2015 and trying to explain why.

  • Sleep math: red clay blue clay

    To me, going to bed often feels more like a tiresome deprivation from life than a welcome rest, or a painless detour through oblivion to morning. When I lack patience for it, I like to think about math puzzles. Other purposeful lines of thought keep me awake or lose me, but math leads me happily into a world of abstraction, from which the trip to dreamland comes naturally.

    (It doesn’t always work. Once I was still awake after seemingly solving two Putnam problems, which is about as well as I did in the actual Putnam contest.)

    A good puzzle for this purpose should be easy to play with in one’s head. For me, that means it should be amenable to simple visualization, and shouldn’t have the kind of description you have to look at multiple times. A handful of blobs is a great subject matter; an infinite arrangement of algebra is not.

    Recently I’ve been going to sleep thinking about the following puzzle. I got several nights of agreeable sleep out of it, but now I think I have a good solution, which I’ll probably post in future.

    Suppose that you have 1 kg of red clay that is 100 degrees and 1 kg of blue clay that is 0 degrees. You can divide and recombine clay freely. If two pieces of clay come into contact, temperature immediately equilibrates—if you put the 1kg of red clay next to 0.5 kg of blue clay, all the clay will immediately become 66⅔ degrees. Other than that the temperature of the clay doesn’t change (i.e. no exchange with air or your hands, no radiation, etc.). Your goal is to end up with all of the blue clay in a single clump that is as hot as possible. How hot can you make it? (Equivalently: how cold can you make the red clay?)

    HT Chelsea Voss via Paul Christiano

    Clue: it’s more than 50 degrees.
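
    For anyone who wants a numerical hint without working it out, here is a sketch of one strategy (my own construction, not necessarily the solution the author has in mind): divide both clays into n equal pieces, then pass each cold blue piece in turn along the whole row of hot red pieces, equilibrating pairwise at each contact.

```python
# Toy simulation of one strategy for the clay puzzle (my own sketch,
# not necessarily the intended solution). Equal masses equilibrate to
# the average of their two temperatures.

def counterflow(n):
    """Split both clays into n equal pieces; pass each blue piece in
    turn along the whole row of red pieces. Returns the temperature of
    the recombined blue clump."""
    red = [100.0] * n
    blue_temps = []
    for _ in range(n):
        b = 0.0  # a fresh blue piece at 0 degrees
        for j in range(n):
            avg = (b + red[j]) / 2  # equal masses -> simple average
            b, red[j] = avg, avg
        blue_temps.append(b)
    return sum(blue_temps) / n  # recombining the pieces averages them

for n in [1, 2, 3, 10, 100]:
    print(n, counterflow(n))
```

    With n = 1 this is just a single contact (50 degrees), but n = 2 already gives 62.5 degrees, n = 3 gives 68.75, and the result keeps climbing as n grows, so the clue is comfortably satisfied.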

  • Arrow grid game


    There’s something I like about having different systems all the time. Apparently.

  • Remarks on morality, shuddering, judging, friendship and the law

    I lately enjoyed listening to Julia Galef and Jonathan Haidt discuss Haidt’s theorized palate of ‘moral foundations’—basic flavors of moral motivation—and how Julia should understand the ones that she doesn’t naturally feel.

    I was interested in Julia’s question of whether she was just using different words to those who for instance would say that incest or consensual cannibalism are ‘morally wrong’.

    She explained that her earlier guest, Michael Sandel, had asked whether she didn’t ‘cringe’ at the thought of consensual cannibalism, as if he thought that was equivalent to finding it immoral. Julia thought she could personally cringe without morally condemning a thing. She had read Megan McArdle similarly observing that ‘liberals’ claim that incest is moral, but meanwhile wouldn’t befriend someone who practices it, so do in fact morally object after all1. As if morality were defined by friendship preferences or inclination to pass judgment. Again Julia is perplexed. Do these people agree with her, and just call their facial movements and personal social preferences ‘morality’? Michael Sandel even explained that he didn’t want to ban consensual cannibalism, he just thought it was immoral. Julia says, “I don’t even know what people are talking about when they talk about whether a thing is moral or immoral, if they’re not talking about wanting to use legal pressure to change people’s behavior”.

    Here’s how I see morality and its relationship with shuddering, judging, declining friendship, and the law:

    Morality is about what is actually good.

    Shuddering, cringing and wincing are instinctive guesses about what is actually good. But much like your instinctive guesses about whether a spider is dangerous or whether you are driving well, they can err, and can even continue to err after you have seen the error. They are especially likely to err when the goodness of a thing is not readily available to your imagination. For instance, if you are a twelve-year-old hearing about sex, you might shudder, but you are missing a lot of the good bits of the picture. Or if you are thinking about a baby coming from a test-tube in 1977, you may be mostly focused on the test tube situation, and not on the century of happy life that the child will have, and the love and joy in the family that would have been emptier.

    Judging people is a response to your guesses about morality, but also to your guesses about other qualities like social status, effectiveness, personal aesthetics, and likelihood of being immoral in the future. You might judge people because they seem ridiculous, or slow, or aesthetically repellant, without thinking that the world would be objectively better off with a random different person existing instead. Or you might judge a person’s enjoyment of watching YouTube videos of plane crashes, without thinking that it is doing any harm.

    Choices about friendship seem to bring in everything that ‘judging’ does, but are also substantially complicated by preferences for actual interactions. For instance, you might have no moral problem with cooking, while dispreferring cooking-obsessed friends. Sometimes you might even prefer friends with traits that you judge—you might find continental philosophy ridiculous, while also enjoying that your friend Bob is so into it, since he explains bits of it in a language that you understand, and it’s fun to tease each other.

    The law is a complicated agreement between a collection of people for cooperating in pursuit of their different ends. It is very unclear that a thing being immoral (in the sense of ‘actually making the world worse’) means that the law should control it:

    • If the population mostly don’t know that the thing is bad, then it doesn’t seem like they should agree to be constrained from doing it. Similarly if they are in broad disagreement, e.g. if you and 40% of people think A is moral, but 40% of people think it is immoral, and 20% are very unsure, then it may be that all of you think that it shouldn’t be legislated. The government doesn’t have access to the truth about it, neither requiring nor banning A would seem like a good compromise in the current state of knowledge, and either would probably have to be undemocratic.
    • Sometimes there would be harm in the law controlling things, or it wouldn’t be a good judge. e.g. maybe everyone agrees that it is immoral to cheat on your partner, or to not pay attention to signs that your teenage son is depressed because you are into playing a computer game, but most people wouldn’t want the law policing these things.
    • The law has certain actions available to it, and sometimes none is the best response to a moral concern. For instance, perhaps a person seems unfairly critical of others in a way that seems wrong to their friends. This is arguably better rectified with quiet words, sharp looks, or a reduction in invitations and endorsements, than by fines or sentences.
    • It can be better ultimately for people to make good choices through understanding the situation themselves over time, even while initially making errors, rather than having their good behavior forced or otherwise incentivized. For instance, it might be better for a person to learn compassion than to be trained by numb rote to act kindly, even if kindness is always better in the moment. Or, if the good behavior in question was enjoying the best experiences of art: you can’t actually force someone to experience a thing.
    • Coercion itself might be bad, or it might be fundamentally better for people to be free. (I’m being agnostic about what is or isn’t moral.)

    Julia and these other people aren’t using words differently, I interpret2: Michael thinks a world where people consensually eat one another is worse, whereas Julia thinks it isn’t—her own displeasure at it is a little bad, but this is presumably nothing compared to whatever satisfaction those involved enjoy. Michael and Julia both have an emotional response, and perhaps neither would be friends with the participants. But Julia doesn’t want to use the law because she thinks the act doesn’t make the world worse, whereas Michael may or may not want to use the law, but isn’t talking about that—he’s saying that it makes the world worse.


    1. here, but gated, so just going by Julia’s recollection. 

    2. To be clear, I’m going by Julia’s description in this discussion with Haidt, and haven’t e.g. listened to the discussion with Sandel. 

  • Coffee trucks: a brilliant idea that someone should do?

    I sometimes wonder if the world should have coffee trucks, like ice cream trucks, roaming the street. Especially when half the population is working from home.

    Coffee seems ideal for this because:

    1. you can probably prepare it fresh in a briefly stopping vehicle,
    2. you don’t need much variety,
    3. people are often flexible about when they get it,
    4. they often like to get it as a short break from work in which they bump into other people,
    5. more than half of the US population consumes it at least once every day, so potential demand is radically higher than for most foodstuffs!
    6. most people don’t have the means to make cafe-quality coffee in their home,
    7. it doesn’t go bad easily.

    There are clearly stationary coffee trucks, like food trucks. I think moving trucks may also exist or have existed at some point, but I don’t think I’ve ever seen one, and it’s hard to find mention of them. Here are some people who had the same idea, but since nobody was expecting a moving coffee bus, they got more traffic sitting still like a cafe. But surely people not expecting your product to exist is an obstacle not unique to circulating coffee trucks. Here is someone with the same idea, and someone who says there was one in a holiday park once, and some others who think they exist in Australia and Buffalo, but it sounds like they might be thinking of stationary coffee trucks.

    I’m not sure that it’s good for any kind of truck to exist if it makes noise all the time as it travels through neighborhoods, delighting some, but surely disrupting others. But I think that can be entirely resolved digitally: instead of the truck playing music, the service could have an app that plays music when the truck is approaching, if you signed up for that. Then you touch the notification to turn off the music and at the same time report whether you want a coffee.

    Am I missing something?

  • Oliver Sipple

    The other day I read Wikipedia arguably too much, and consequently came to know the story of Oliver Sipple. Here’s my summary of the story according to these two Wikipedia pages and this page:

    In September of 1975, Oliver (‘Billy’) Sipple was an ex-marine of thirty-three, injured in Vietnam and living in San Francisco. He was in and out of the veterans’ hospital, six years into civilian life.

    One afternoon, he stood in a crowd of thousands of people to see the visiting President Gerald Ford leave a San Francisco hotel from across the street. Ford stopped to wave. Suddenly, a shot sounded, and Oliver saw a woman nearby adjusting the aim of her revolver. He lunged and grabbed her arm, sending the second bullet into the hotel, injuring a man inside.

    Oliver was thanked for saving the president, and celebrated as a hero by the media. A heroic veteran.

    Soon the media learned that he was in fact a heroic gay veteran.

    Oliver had shared his sexual orientation with the San Francisco gay community—or at least he had worked at a gay bar, paraded for gay pride, demonstrated for gay rights, helped in the (LGBT) Imperial Court System, and worked on the campaign to elect openly gay board of supervisors candidate Harvey Milk. But he hadn’t shared it with his family in Detroit, who had more old-fashioned impressions about the morality of homosexuality. He also hadn’t shared it with the world at large, who, after all, lived at a time when evidence of a gay person being a public hero was considered fascinating news.

    How did the media learn about this? Perhaps there were many sources, or would have been eventually. But the morning after the shooting, two prominent gay activists each outed Oliver to the San Francisco Chronicle. One was Reverend Ray Broshears, leader of the ‘Lavender Panthers’. The other was Oliver’s own friend, Harvey Milk.


    Harvey is reported to have explained privately to a friend, “It’s too good an opportunity. For once we can show that gays do heroic things, not just all that caca about molesting children and hanging out in bathrooms.”

    The next day, Herb Caen, the San Francisco Chronicle reporter who received these messages, reported to the world that Oliver was gay. He added that Oliver was friends with Harvey Milk, and speculated that President Ford hadn’t invited him to the White House because of his sexual orientation.

    Somewhere in here, Oliver asked that the media not report on the topic of his sexual orientation, lest his family or current employer learn of it. It’s not clear to me whether this request came early enough for them to definitively know, when they first reported on it, that he didn’t want them to, since apparently Caen ‘couldn’t contact him’.

    At any rate, the topic was reported on thoroughly. Gay activists called for his recognition as a gay hero. He was deluged by reporters, and hid at a friend’s house, at which point they turned to interviewing Harvey Milk. Harvey opined that President Ford’s gratitude would indeed have flowed more generously had Oliver been straight.

    Oliver’s mother was purportedly harassed by her neighbors, and declared her intent never to speak to him again. He was estranged from his family. His father at some point instructed his brother to forget that he had a brother.

    Oliver sued the reporter Caen and numerous newspapers and publishers for the invasion of his privacy. The suit was dismissed, but he fought on. In 1984 a state court of appeals held that he had become news, and his sexual orientation was part of the story.

    Oliver didn’t do well after becoming a hero. He drank heavily, was diagnosed with schizophrenia, put on weight, and needed a pacemaker. Over a drink, he was heard to say that he regretted grabbing the gun.

    It is said that he eventually reconciled with his family, but it is also said that his father didn’t let him come to his mother’s funeral, so granting both stories it may have been a late or mild reconciliation.

    One February day in 1989, Oliver’s friend found him dead in his San Francisco apartment, alongside a bottle of Jack Daniels and a running television. He was 47.

    Years later, journalistic ethics professors found this an instructive class discussion topic.

  • Neck abacus

    My points-for-anxiety system continued to help, but was encumbered by the friction of getting my phone out to mark points. Thus I have turned to wearable abaci.

    necklace 1

    necklace 2

    I made this necklace according to very roughly this picture, using knitting wool and beads I bought at one point to use as virtual in-house currency and now found in a box in my room. It works well! The beads don’t shift unless I move them, which is easy and pleasing. It seems clearly more convenient than my phone. (Plus, I can show off to those in the know that I have 4 or 6 or 24 or 26 of something!) I am also for now reminded when I look in a mirror to consider whether I can get a point, which is currently a plus.

    You can buy bracelet versions in a very small number of places online, and also keychain or general hanging clip-on versions, but I don’t think I saw necklaces anywhere. This seems striking, given the clear superiority to a phone counter for me so far, and the likely scale of phone counter usage in the world.

  • Training sweetness

    (This is my attempt to summarize the ‘Taste & Shaping’ module in a CFAR 2018 participant handbook I have, in order to understand it better (later version available online here). It may be basically a mixture of their content and my misunderstandings. Sorry for any misunderstandings propagated. I also haven’t checked or substantially experimented with most of this, but it seems so far like a good addition to my mental library of concepts.)

    Some things seem nice, and you just automatically do them (or gravitate toward them), and have to put in effort if you don’t want that to happen. Other things seem icky, and even though maybe you know they are good, you won’t get around to them for months even if they would take a minute and you spend more than that long every week glancing at them and deciding to do them later. (In my own dialect, the former are ‘delicious’. As in, ‘oh goody, my delicious book’).

    How delicious things seem is caused by a kind of estimate by your brain of how good that thing will be for the goals it thinks you have.

    Your brain makes these estimates in a funny way, with some non-obvious features:

    • The causal connections between things in the brain’s model are not the ones you would give if asked to describe the situation. For instance, you might say that practicing piano causes you to get better at piano, while in the model, practicing piano mostly causes you to be bad at the piano, since you usually experience being bad at piano immediately after you experience practicing it.
    • The effects of an action are based mostly on past visceral experiences with similar actions. For instance, if you usually hit your thumb when you use a hammer, then when you get out a hammer today, it might seem non-delicious. Whereas if you are just told that most people hit their thumbs when using hammers, this might not affect deliciousness as much. It is as though it is not in the right language for your brain’s model to take it in. (My guess is that it is more likely to get taken in if you translate it into ‘experience’ via imagining.)
    • The connection between an action and an outcome is modeled as much weaker when more delay occurs between them. So if you press a button which has a good effect in half a second and an equally bad effect in ten seconds, this will sum up in the estimate as good overall, because your brain will model the second effect more weakly.
    • If B is delicious, and you demonstrate a strong empirical connection between A and B in language your brain’s model can take in, then A will often come to also be delicious. Thus if doing Z leads to A which leads to the excellent B much later, if the connection between A and B is made clear, then Z can become delicious, even though it is fairly distant from the ultimately good outcome.
    • Since adjusting the deliciousness of options happens based on experience, it is difficult to update ones that happen rarely. For instance, if you want to train a pigeon to peck out a drawing of a tree, you can’t just reward it when it happens to do that, because it will take way too long for it to even do it once. A way to get around this is to start by rewarding it if it pecks at all, then reward it if it pecks along in a line (then maybe stop rewarding it for pecking at all, since it knows it has to do that now to get the pecking in a line reward), then reward it if it pecks a more tree-shaped line, and so on. This is called ‘shaping’.
    • Your brain generalizes between things, so if it tried an action and that was bad, then it will estimate that another action like that one is probably also bad. So if someone punishes you when you do almost the right thing, that can make your brain estimate that doing the right thing is bad. This is especially harmful if it doesn’t receive a punishment for doing things very far away. For instance, if playing the piano badly gets a frown, and not playing the piano at all gets nothing, your brain might avoid the piano, rather than homing in on the narrow band of good piano playing right next to the punishable bad piano playing. This and the last point mean that if you are trying to teach your brain what is good by giving it extra rewards or punishments as soon as it does things, you want to give it rewards for anything near the best action, at least at first.
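
    As a toy illustration of the delay point above (my own toy model and numbers, not from the CFAR handbook), suppose the brain weights each outcome by an exponentially decaying function of its delay. Then a button with a good effect in half a second and an equally bad effect in ten seconds nets out as positive:

```python
# Toy model of delay discounting (my own illustration, not from the
# CFAR handbook): each outcome's value is weighted by an exponentially
# decaying function of its delay.

GAMMA = 0.7  # hypothetical per-second discount factor

def felt_value(outcomes):
    """outcomes: list of (value, delay_in_seconds) pairs."""
    return sum(v * GAMMA ** t for v, t in outcomes)

# A button with equal-sized good and bad effects at different delays:
button = [(+1.0, 0.5), (-1.0, 10.0)]
print(felt_value(button))  # positive: the later harm is discounted away
```

    The same pair of effects at equal delays would cancel to exactly zero; only the asymmetry in delay makes the button seem good overall.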

    Quick takeaways:

    1. How nice things seem is in your deliciousness-model, not the world
    2. Your deliciousness-model can be pragmatically shifted, much like a bucket of water can be shifted. Things that are awful can become genuinely nice.
    3. If a thing seems like it should be nice, but your deliciousness-model is rating it as not nice, you can think about why it is wrong and how to communicate its error to it. Has it not taken in the nice consequence? Does it not understand the causal connection, because the consequence takes too long to happen? Does it not realize how bad things are even when you are not near the piano?
    4. You should generally reward or punish yourself according to whether you want yourself to do ‘things like this’ more or less. Which often means rewarding yourself for getting closer to your goal than in the most available possible worlds where you looked at social media all afternoon or played a computer game, even if your success was less than in some hard to find narrow band nearby.

    (I called this post ‘training sweetness’ because the thought of changing which things taste sweet or not via ‘training’ sounds kind of wild, and reminds me that what seems like real, objective niceness in the world is what we are saying is in your mind and malleable, here. I don’t know whether a literal sweet taste can be retrained, though it seems that one can come to dislike it.)

  • Remember that to value something infinitely is usually to give it a finite dollar value

    Just an occasional reminder that if you value something so much that you don’t want to destroy it for nothing, then you’ve got to put a finite dollar value on it. Things just can’t be infinitely more important than other things, in a world where possible trades weave everything together. A nice illustration from Arbital:

    An experiment in 2000–from a paper titled “The Psychology of the Unthinkable: Taboo Trade-Offs, Forbidden Base Rates, and Heretical Counterfactuals”–asked subjects to consider the dilemma of a hospital administrator named Robert:

    Robert can save the life of Johnny, a five year old who needs a liver transplant, but the transplant procedure will cost the hospital $1,000,000 that could be spent in other ways, such as purchasing better equipment and enhancing salaries to recruit talented doctors to the hospital. Johnny is very ill and has been on the waiting list for a transplant but because of the shortage of local organ donors, obtaining a liver will be expensive. Robert could save Johnny’s life, or he could use the $1,000,000 for other hospital needs.

    The main experimental result was that most subjects got angry at Robert for even considering the question.

    After all, you can’t put a dollar value on a human life, right?

    But better hospital equipment also saves lives, or at least one hopes so. It’s not like the other potential use of the money saves zero lives.

    Let’s say that Robert has a total budget of $100,000,000 and is faced with a long list of options such as these:

    • $100,000 for a new dialysis machine, which will save 3 lives
    • $1,000,000 for a liver for Johnny, which will save 1 life
    • $10,000 to train the nurses on proper hygiene when inserting central lines, which will save an expected 100 lives

    Now suppose–this is a supposition we’ll need for our theorem–that Robert does not care at all about money, not even a tiny bit. Robert only cares about maximizing the total number of lives saved. Furthermore, we suppose for now that Robert cares about every human life equally.

    If Robert does save as many lives as possible, given his bounded money, then Robert must behave like somebody assigning some consistent dollar value to saving a human life.

    We should be able to look down the long list of options that Robert took and didn’t take, and say, e.g., “Oh, Robert took all the options that saved more than 1 life per $500,000 and rejected all options that saved less than 1 life per $500,000; so Robert’s behavior is consistent with his spending $500,000 per life.”

    Alternatively, if we can’t view Robert’s behavior as being coherent in this sense–if we cannot make up any dollar value of a human life, such that Robert’s choices are consistent with that dollar value–then it must be possible to move around the same amount of money, in a way that saves more lives.

    In particular, if there is no dollar value for which you took all of the opportunities to pay less to save lives and didn’t take any of the opportunities to pay more to save lives, and ignoring complications with lives only being available at a given price in bulk, then there is at least one pair of opportunities where you could swap one that you took for one that you didn’t take and save more lives, or at least save the same number of lives and keep more money, which at least in a repeated game like this seems likely to save more lives in expectation.
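
    To make the coherence argument concrete with the illustrative numbers from the quoted passage (the costs and lives saved are the example’s, not real data), a life-maximizing Robert just ranks options by cost per life saved and funds the cheapest first:

```python
# Ranking the quoted example's options by cost per life saved. The
# numbers are the illustrative ones from the Arbital passage, not
# real data.

options = [
    ("dialysis machine", 100_000, 3),
    ("liver for Johnny", 1_000_000, 1),
    ("central-line hygiene training", 10_000, 100),
]

# A life-maximizing Robert funds options in order of cost per life saved.
for name, cost, lives in sorted(options, key=lambda o: o[1] / o[2]):
    print(f"{name}: ${cost / lives:,.0f} per life saved")
```

    Whatever the budget cutoff turns out to be, Robert’s choices then look exactly as if he had priced a life somewhere between the last option he took and the first one he rejected.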

    I used to be more feisty in my discussion of this idea:

    Another alternative is just to not think about it. Hold that lives have a high but finite value, but don’t use this in naughty calculative attempts to maximise welfare! Maintain that it is abhorrent to do so. Uphold lots of arbitrary rules, like respecting people’s dignity and beginning charity at home and having honour and being respectable and doing what your heart tells you. Interestingly, this effectively does make human life worthless; not even worth including in the calculation next to the whims of your personal emotions and the culture at hand.

  • What didn't happen

    I have a secret fiction blog, which I intermittently mean to publish things on, but apparently haven’t in over ten years, which seems like a reasonable point at which to make it less secret. Here is the start. It’s not very long.

    Here is an excerpt inspired by events leading to my first kiss (names changed; the overlap with the name of my later partner is coincidental):

    The main argument for believing other people are conscious is that in all other respects they resemble you. Carrie stared tiredly into the crowd of blurs surrounding her and found this argument uncompelling. She couldn’t actually imagine thinking any of the things that had recently been shouted near her, which strengthened the hypothesis that nobody else was thinking them either. Which pressed the question of why someone was simulating this particular reality for her, and what the significance was of a tall man screeching ‘It’s beer pong o’clock!’.

    She had the same unease with movies often. Did that scene of the couple driving in their car add something to the plot? Either half the movie was revealing information entirely invisible to her, or film producers went to great expense to make films a certain length despite the fact that no story required it. She liked to think that if she spent years studying this it would all make sense, as she regularly insisted to other people that everything did if you studied it enough. Part of her was terrified that this wasn’t true. When it got too insistent a bigger, more heavily armed part of her would menacingly point out, ‘that doesn’t make sense and you have zero reason to believe it’ and the other part would whimper ‘what if that doesn’t matter?’ and go off to wring its hands in a less conscious corner. A short handsome boy sat down unusually close to Carrie, and she turned to make funny noises at him.

    “Paul. How do you do?”

    “Uh..I..do..am.. Carrie..fine, actually.. not.. sorry, never mind”, Carrie smiled reassuringly.

    “You’re cute. What do you do?” He pretended to be pushed closer to her by someone else sitting on his other side.

    When she was younger Carrie had had a reasonably high prior on her having a male partner, or several, in her lifetime. By the time she was eighteen and still didn’t have a single close friend, let alone a male one, ‘kiss someone, ever’ was well down her list of unrealistically optimistic goals, between ‘stop global warming’ and ‘build a computer that understands everything’. So the fact that this boy seemed to be coming on to her suggested that she was misunderstanding human mating behaviour even worse than she suspected, or that he was much more drunk than he seemed.

    “I try to save the world, but I’m not very good at it. Also I’m not interested in romance at the moment because I’ve just realized that other people probably aren’t conscious, so I think it would be hard to relate to one, and kind of creepy to hang out with them, and other bits would be too much like necrophilia.. so I might go home soon actually”

    “You do philosophy?” he smiled.

    “Not officially”.

    “You’re fun. Come inside and dance with me.”

    “Only if you convince me that you’re probably not a zombie”

    He looked deep into her eyes and made a reassuring smile. His eyes were soft, brown, and impenetrable. She felt completely alone. “I promise you I’m not, and I should know.”

    Nonplussed, angered by his dismissive stupidity, but sheepishly unable to forgo an opportunity to dance with a male, Carrie followed him inside woozily. She wasn’t sure whether to be disappointed or amused at the lack of shattering force with which extremely important philosophical considerations could influence human mating.

  • The ecology of conviction

    Supposing that sincerity has declined, why?

    It feels natural to me that sincere enthusiasms should be rare relative to criticism and half-heartedness. But I would have thought this was born of fairly basic features of the situation, and so wouldn’t change over time.

    It seems clearly easier and less socially risky to be critical of things, or non-committal, than to stand for a positive vision. It is easier to produce a valid criticism than an idea immune to valid criticism (and easier again to say, ‘this is very simplistic - the situation is subtle’). And if an idea is criticized, the critic gets to seem sophisticated, while the holder of the idea gets to seem naïve. A criticism is smaller than a positive vision, so a critic is usually not staking their reputation on their criticism as much, or claiming that it is good, in the way that the enthusiast is.

    But there are also rewards for positive visions and for sincere enthusiasm that aren’t had by critics and routine doubters. So for things to change over time, you really just need the scale of these incentives to change, whether in a basic way or because the situation is changing.

    One way this could have happened is that the internet (or even earlier change in the information economy) somehow changed the ecology of enthusiasts and doubters, pushing the incentives away from enthusiasm. e.g. The ease, convenience and anonymity of criticizing and doubting on the internet puts a given positive vision in contact with many more critics, making it basically impossible for an idea to emerge not substantially marred by doubt and teeming with uncertainties and summarizable as ‘maybe X, but I don’t know, it’s complicated’. This makes presenting positive visions less appealing, reducing the population of positive vision havers, and making them either less confident or more the kinds of people whose confidence isn’t affected by the volume of doubt other people might have about what they are saying. Which all make them even easier targets for criticism, and make confident enthusiasm for an idea increasingly correlated with being some kind of arrogant fool. Which decreases the basic respect offered by society for someone seeming to have a positive vision.

    This is a very speculative story, but something like these kinds of dynamics seems plausible.

    These thoughts were inspired by a conversation I had with Nick Beckstead.

  • Things a Katja-society might try (Part 2)

    (Part 1)

    1. Carefully structured and maintained arguments for any interesting claims that people believe. For instance, I would like to see the argument for any of the causes generally considered Effective Altruist carefully laid out (I’m not claiming that these don’t exist, just that they aren’t known to me).

    2. A wider variety of accommodations. For instance, you could rent houses in cheap versions of this sort of style:





    (A view of the interior of Nasir ol Molk Mosque located in Shiraz, Iran. Image: Ayyoubsabawiki, CC BY-SA 4.0, via Wikimedia Commons)

    Shah Mosque

    (Interior view of the dome of the Shah Mosque, Isfahan, Iran. Photo: Amir Pashaei, CC BY-SA 4.0, via Wikimedia Commons)

    Sheikh Lotfallah Esfahan

    (Sheikh Lotfallah Mosque, Isfahan, Iran. Photo by Nicolas Hadjisavvas, CC BY 2.5, via Wikimedia Commons)

    3. Adult dorms. An organization would buy a complex that could house something like a few hundred people, with common areas and such. They would decide on the kind of community they wanted, and what services they would provide (e.g. cleaning, food, nice common areas). There would be a selection process to get in. If you lived there, you would be assumed part of the community, like in school.

    4. Well-directed quantitative contests and assessments for adults, that put numbers on things that the adults would like to care about. If there were a Save The World From X-Risk Olympiad, or an ‘expected relative value of your intellectual contributions’ number that was frequently updated, it would be easier to optimize for those things relative to e.g. optimizing for number of internet-dogs who visited your page, or success at memorizing Anki cards.

    5. Social-implication-limited socializing services. There are many reasons to want to be around other people, and not all of them are strongly correlated with wanting the entire cluster of attributes that come with the specific personal relationships that you want. For instance, if you want some social pressure to have your act together sometimes, but the kinds of people you make friends with are too forgiving, you can have someone with their act together stop by sometimes and totally expect you to be on top of things. Or if you are sick and want someone nice to take care of you, yet none of your specific personal relationships are at a point where getting vomit on them would be a plus? Or if you just sometimes want to tell a young person some useful life-lessons, so you can be helpful instead of irrelevant, you don’t have to go out and have a whole relationship with a younger person to do that.
      (If this went well, an ambitious version might try to move into transactionalizing less-transactional relationships, where for instance you say you want to have a long term slightly flirtatious yet old fashioned relationship with the person you buy bread from, and your preferences are carefully assessed and you are sent to just the right bread seller, and you don’t even like bread, but you like Raf’s bread because you know the whole story about his epic fight for the recipe, and you were there when the buckwheat got added, and the smell reminds you of a thousand moments of heartfelt joy at his counter, laughing together over a pile of loop-shaped loaves like this one. Which is exactly what you wanted, without the previous risks and trials and errors of being unusually open in bakeries across your city.)

  • The art of caring what people think

    People care what people think. People often strive to not care what people think. People sometimes appear to succeed.

    My working model though is that it is nearly impossible for a normal person to not care what people think in a prolonged way, but that ‘people’ doesn’t mean all people, and that it is tractable and common to change who falls into this category or who in it is salient and taken to represent ‘people’. And thus it is possible to control the forces of outside perception even as they control you. Which can do a lot of the job of not caring what other people think.

    To put it the other way around, most people don’t care what other people think, for almost all values of ‘other people’. They care what some subset of people think. So if there are particular views from other people that you wish to not care about, it can be realistic to stop caring about them, as long as you care what some different set of people think.

    Ten (mostly fictional) examples:

    1. You feel like ‘people’ think you should be knowledgeable about politics and current events, because they are always talking about such things. You read some philosophers through the ages, and instead feel like ‘everyone’ thinks you should be basically contributing to the timeless philosophical problems of the ages. (Also, everyone else has some kind of famous treatise - where is yours?)
    2. You haven’t really thought through which causes are important, but ‘people’ all seem to think it’s nuclear disarmament, so looking into it feels a bit pointless. You go to a weekend conference on soil depletion and experience the sense that ‘people’ basically agree that soil degradation is THE problem, and that it would be embarrassing to ask if it isn’t nuclear disarmament, without having a much better case.
    3. You are kind of fat. You wish you didn’t care what ‘people’ thought, but you suspect they think you’re ugly, because you’ve seen ‘people’ say that or imply it. You read about all the people who appreciate curviness, and recalibrate your sense of what ‘people’ think when they see you.
    4. You can hardly think about the issue of gun regulation because you feel so guilty when you aren’t immediately convinced by the arguments on your side, or don’t have an eloquent retort for any arguments the other side comes up with. You wish you were brave enough to think clearly on any topic, but knowing everyone agrees that you would be contemptible if you came to the wrong conclusion, you are stressed and can’t think or trust your thoughts. You become an undergraduate and live in a dorm and hang out with people who have opposing views, and people who don’t care, and people who think it’s unclear, and people who think that thinking clearly is more important than either side. Your old sense of ‘people’ condemning the bad side is replaced by a sense that ‘people’ want you to have a novel position and an interesting argument.
    5. You tried out writing poetry, and to your surprise you really like it. You want to share it, but you think people will laugh at you, because it’s all poetic. You wish you didn’t care what people thought, because you want to express yourself and get feedback. But ‘people’ in your mind are in fact your usual crowd of Facebook friends, and they are not poetic types. But if you instead share your writing on allpoetry.com, you are surrounded by people who like poetry and compliment yours, and soon you are thinking ‘people liked my poem!’.
    6. You kind of think climate change is a big deal, but ‘people’ seem to think it isn’t worth attention and that you should focus on AI risk. It doesn’t seem like their arguments are great, but getting into it and being the one person with this crazy view isn’t appealing. So you tell the next five people you meet from your social circles about the situation, and they are all like, ‘what? climate change is the worst. Who are these cranks?’ and then you feel like socially there are two sides, and you can go back and have the debate.
    7. You want to write about topics of enduring importance, but you can’t bear to be left out of what people are talking about, and you feel somehow silly writing about the simulation argument when everyone is having a big discussion together about the incredibly important present crisis. So you make an RSS feed or a Twitter list of people who keep their eye on the bigger questions, and converse with them.
    8. You feel like people are super judgmental of everything, so that it’s hard to even know what flavor of hummus you like, as you anticipate the cascade of inferences about your personality. The only thing that keeps you expressing preferences at all is the disdain you expect looms for indecisive people. So you notice who around you gives less of this impression, and hang out with them more.
    9. You imagine liking being a mathematician, but the other kids have decided that physics is cooler, and you don’t want to be left as the only one doing a less cool degree. So you do math anyway, and a year later you have new friends who think math is cooler than physics.
    10. You hang out with various groups. Some clusters are so ubiquitously accomplished that you think they must have let you in by mistake. In others, people turn to look when you walk in, and a crowd gathers to talk to you. You find yourself gravitating to the former groups, then developing an expectation that ‘people’ are never impressed by you, and being discouraged. So you hang out in broader circles and are buoyed up by ‘people’ being regularly interested in you and your achievements.
  • In balance and flux

    Someone more familiar with ecology recently noted to me that it used to be a popular view that nature was ‘in balance’ and had some equilibrium state, that it should be returned to. Whereas the new understanding is that there was never an equilibrium state. Natural systems are always changing. Another friend who works in natural management also recently told me that their role in the past might have been trying to restore things to their ‘natural state’, but now the goal was to prepare yourself for what your ecology was becoming. A brief Googling returns a National Geographic article by Tik Root along the same lines: ‘The “balance of nature” is an enduring concept. But it’s wrong.’ In fairness, they seem to be arguing against both the idea that nature is in a balance so intense that you can easily disrupt it, and the idea that nature is in a balance so sturdy that it will correct anything you do to it, which sounds plausible. But they don’t say that ecosystems are probably in some kind of intermediately sturdy balance, in many dimensions at least. They say that nature is ‘in flux’ and that the notion of balance is a misconception.

    It seems to me though that there is very often equilibrium in some dimensions, even in a system that is in motion in other dimensions, and that that balance can be very important to maintain.

    Some examples:

    • bicycle
    • society with citizens with a variety of demeanors, undergoing broad social change
    • human growing older, moving to Germany, and getting pregnant, while maintaining a narrow range of temperatures and blood concentrations of different chemicals

    So the observation that a system is in flux seems fairly irrelevant to whether it is in equilibrium.

    Any system designed to go somewhere relies on some of its parameters remaining within narrow windows. Nature isn’t designed to go somewhere, so the issue of what ‘should’ happen with it is non-obvious. But the fact that ecosystems always gradually change along some dimensions (e.g. grassland becoming forest) doesn’t seem to imply that there is not still balance in other dimensions, where they don’t change so much, and where changing is more liable to lead to very different and arguably less good states.

    For instance, as a grassland gradually reforests, it might continue to have a large number of plant eating bugs, and bug-eating birds, such that the plant eating bugs would destroy the plants entirely if there were ever too many of them, but as there become more of them, the birds also flourish, and then eat them. As the forest grows, the tree-eating bugs become more common relative to the grass-eating bugs, but the rough equilibrium of plants, bugs, and birds remains. If the modern world was disrupting the reproduction of the birds, so that they were diminishing even while the bugs to eat were plentiful, threatening a bug-explosion-collapse in which the trees and grass would be destroyed by the brief insect plague, I think it would be reasonable to say that the modern world was disrupting the equilibrium, or putting nature out of balance.

    The fact that your bike has been moving forward for miles doesn’t mean that leaning a foot to the left suddenly is meaningless, in systems terms.

  • A great hard day

    (Draft from a while ago, lightly edited.)

    There were times when I was younger that I used to fantasize about having a friend at all, let alone a boyfriend. And there were times when I thought that if I could just figure out how to make life consistently bearable, I’d really be onto something. So when I say how great my life is, it means that hard lives can get a lot better, not that mine is likely to be consistently more awesome than yours (I hope).


    Today was great. I arrived in the world caught in a bundle of sheets with my boyfriend. Half asleep, I decided to wake him up by incrementally escalated cuddling, which I assume is similar in its benefits to those slowly loudening alarms.

    At work I came across a thing that was that most unpleasant combination, of implicitly disrespectful and genuine evidence that I might be bad. Which I dealt with largely with calm, curiosity, and good intent. I thought about it and wrote down considerations. Then I asked a couple of other people about that and about another customarily shameful and distressing question, for good measure. I felt something good in my mind growing in strength, and exercising it made other things blossom: what had been an uncomfortable reserve into a fruitful friendliness.

    I had gone to bed last night with a headache, and fallen asleep thinking that on the outside view headaches often disappear by sleeping, but that my intuition said that this one would get worse. By the time I finished making lunch today, it was so bad that I quickly degenerated into an unhappy heap. It was that kind of headache where you feel hot and fragile and your neck aches and you wonder if you have meningitis but you don’t have the strength to get into that kind of inquiry. I lay in my reclining chair and thought that it would be wise to take painkillers, but that would involve doing things.

    My boyfriend came and looked after me. He put all the things around me - tea and sugar in a bowl and apple sauce and little packets of MeWe peanut butter and painkillers. He got another table for it all, and rubbed my neck, and looked in my eyes, and talked to me about what I care about in the world. I nibbled at the sugar and sipped the tea. I played Sporcle and learned about historic dates and American presidents, and I didn’t feel like I should be doing something else.

    I took some xanax, in case my headache was being worsened by my unease about it. I suppose it knocked out my unease at all levels about anything, because after recovering a bit I just kept wanting to work, until I’d been at work for about 10.5 hours, even having missed two hours to wretchedness in the middle of the day.

    I felt communality with the people far away reaching out to me across the internet. My room was full of warm lamps and orange wood, with green leaves here and there. My housemates made me meatballs and pasta and my boyfriend brought them to me with butter and parsley and dill. I was comfortable in my fully-reclined chair. I thought about things and made decisions. Someone sent me a book they were writing, and I liked it.

  • Evolution from distinction difference

    If we have norms such that each copy of a common behavior must be a tiny step away from its parent, rather than a giant step or no step, this would seem to make culture much more amenable to gradient descent via evolution than it otherwise would be.

    Is the latter somehow reason for us seeing the former? For instance, did ancient groups who frowned on really weird people and who felt awkward being too conformist outflourish other groups with their better evolved cultural norms and artifacts?

    Also, is this picture true and applicable to the real world? Is the size of these steps in human culture such that culture learns well? Can you see such effects in the world?

    Is this why women’s clothing—which seems expected to vary more between women—also changes faster over time than men’s clothing? (Is that even true?)

  • The distinction distance

    People have a strong tendency to be different from one another (e.g. are horrified to be caught in the same dress, find it weird to order the same dishes as their companion without comment or to choose the same art for their living room). Yet they also have a strong tendency to conform.

    These are even in the same areas, and the best behavior seems to be balancing on an edge between the two forces. You don’t want to wear either a dress that someone else is wearing, nor a dress in a style that hasn’t been worn since the 1600s.

    I have noticed both of these human forces before, but I hadn’t seen them so vividly as acting in the same realm. You don’t want your essay to be on an identical topic to another student’s, but you also don’t want it to be outside the bounds of what the professor thinks of as an essay, or expressing views beyond a short hop from those others would endorse.

    This makes me imagine the curlicues of culture as growing in the fertile interstitial zone between options too conformist to consider and options too wild to consider. Kind of like a Mandelbrot set or a tidal flat or a cellular automaton. There’s a similar pattern in the space of ways the whole of culture could have been: if everyone was very conformist about everything, it would be monotony, and if everyone immediately flung themselves as far away from anyone else as they could on every axis, it would be another kind of monotony. But with this balance of effects as it is, we get some complicated spiraling evolution of art movements and attitudes, trousers and tools. Each idea bringing forth riffs of it in every direction.

    Inspired by a conversation with Robin Hanson, where he probably basically said the main point here, that these two forces act in opposition.

  • Massive consequences

    Hypothesis: whenever you make a choice, the consequences of it are almost as likely to be bad as good, because the scale of the intended consequences is radically smaller than the scale of the chaotic unintended effects. (The expected outcome is still as positive as you think, it’s just a small positive value plus a very high variance random value at each step.)
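As a minimal numerical sketch of the hypothesis (the numbers are mine, chosen only to make the intended effect much smaller than the chaos): model each choice as a small positive expected value plus a large zero-mean noise term, and the realized outcome comes out net-bad nearly half the time, while the average stays positive.

```python
# Monte Carlo sketch: small intended benefit + large chaotic term.
# All numbers here are illustrative assumptions, not from the text.
import random

random.seed(0)  # deterministic for reproducibility

intended_benefit = 1.0   # the small, positive intended consequence
chaos_scale = 100.0      # far larger unintended variance

outcomes = [intended_benefit + random.gauss(0, chaos_scale)
            for _ in range(100_000)]
bad = sum(o < 0 for o in outcomes) / len(outcomes)

print(f"average value per choice: {sum(outcomes) / len(outcomes):.2f}")
print(f"fraction of choices net-bad: {bad:.2f}")  # close to one half
```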

    This seems different from how things are usually conceived, but does it change anything that we don’t already know about?

    Could this be false?

  • Current cryonics impressions

    People around me often sign up for cryonics, and say that it is very important. My guess is that this argument for it, heavily inspired by Waitbutwhy’s much longer piece, as well as years of talking to people around me and reading their blogs, is correct:

    1. One day people will probably live much longer than they do now.

      Probably we will work out how to beat the diseases of aging, as we have many of the infectious diseases. Eventually dying at age 90 of heart disease will seem as much of a needless tragedy as dying of an infection at age 45 does to us now.

    2. One day we will probably be able to ‘freeze’ and usefully thaw organs like brains using vitrification.

      We can already do this with other organs. For instance a rabbit kidney can apparently already be vitrified then warmed up and put back in a rabbit and work.

    3. People can start to successfully evade the diseases of aging as soon as science reaches the freezing part of 2, even if it hasn’t got to the thawing part or to 1 yet. Because once you are vitrified, you can wait quite a long time for further developments.

    4. There is a decent chance that we are already at the freezing part of 2. For instance, a defrosted vitrified rabbit brain apparently appeared to be in good order, though I assume we don’t know how to reattach brains to rabbits, alas.

    5. The chance that we are there on the freezing is high enough that people dying soon (by our current standards of irrevivability) should generally be vitrified instead of burned or buried, if the chance to survive longer is worth the price to them.

    6. You can sign up for something like this at the cost of a not-super-expensive life insurance policy, though I think the more promising techniques at the moment aren’t available yet to purchase.

    I haven’t actually signed up for this, but I might, and if I thought there was a higher chance of me dying sooner, I would get around to figuring it out more urgently. So I thought I’d point it out to others older than me, who might want to think about it more promptly.

    I found Waitbutwhy’s essay on these topics pretty good.

  • Ways of being with you

    Suppose you want to get in touch with your care for someone, or to feel empathy for them. One way is to imagine what it is like to be them, looking out of their eyes and feeling what they are feeling. For instance, if your friend did something frustrating, you might imagine the experience of choosing under pressure that might have led to it, and you might thereby feel warmth toward them and sympathy for their error.

    But I think it is more common to do something else, kind of like imagining them from the outside, but such that their mental content is also somehow accessible. For instance, you might imagine them hurrying around, being frazzled and not sure what to do, trying. And it’s not just that they look like they might be those things - in imagination, it is available that they are. While you aren’t in their mind, you also aren’t just observing its consequences and inferring things about it from a distance. You experience their feeling.

    But what is it to experience a person’s sadness, say, without experiencing it as an aspect of their experience? i.e. as something that requires an imagining of their mental experience to imagine, in the way that imagining the wetness of a dog would seem to require—or be greatly aided by—imagining the dog?

    In the everyday togetherness of being with someone, I am also rarely actually picturing the world through their eyes, and yet I am often doing some kind of ‘seeing things from their perspective’ and some kind of ‘being with them’ that feels like relatively direct contact with their mind. (What is it to experience a mind, if not to look at the conscious experience that is a mind’s main deal?) I am engaging with them very differently from how I would engage with an object, or with a set of observations of a distant person. Is it strange that your experience of being you is not the thing that I focus on when I’m experiencing you, together with you?

  • Speaking of the efficiency of utopia

    I noted that it is probably reasonable for people to be wary of things introducing themselves as ‘efficiency’, since maximization of specific metrics has some tendency to go badly.

    On the other hand, ‘efficiency’ doesn’t mean anything about building explicit or quantitative machinery. It just means getting a lot of the desired output per input. So one might wonder why, if these explicit efforts would tend to make things worse for our actual goals, we would pursue efficiency in such ways, and continue to call that ‘efficiency’. For those who think quantitative pursuit of well-defined goals has been a bad strategy overall, shouldn’t ‘efficient daycare’ suggest a daycare where we have used our best intuitions for holistically improving the experience?

    I think one reason why not is that you don’t tend to have much evidence that a thing is efficient unless you are in the business of making quantitative measurements. You are more likely to say ‘this daycare is unusually efficient’ if you have been measuring ‘costs of providing legally adequate childcare’ across different centers, than if you have been observing children and workshopping policies for their thriving. If you have been doing the latter, you will probably just call it something else.

    It seems to me that we would be better off with more specific concepts though, rather than conflating striving with an eye to a defined metric and doing the best we can with our materials. I suggest narrow efficiency and inclusive efficiency. An assembly line is narrowly efficient. Utopia is inclusively efficient.

  • Covid cafes

    I’m puzzled that during the pandemic so few cafes near me have moved to serving customers outside, by moving their ordering and payment apparatus to the doorway. I’ve seen about five cafes in San Francisco do this (few enough that none are conveniently close).

    (I wanted to include a photo, but I actually just can’t find a picture online, such an obscure idea it is, I guess?)

    Is this harder than it looks to organize? And even if it is for a small business run by a single person without a spare second all year to think about aerosols or reorganize, I’m still surprised that Starbucks doesn’t have its act together more.

    How bad is it to have your customers indoors to order and pick up their drinks? Some Microcovid estimates for San Francisco, right now:

    • Indoor: about 20 microcovids
    • Outdoor: about 1 microcovid

    So very roughly 20 extra microcovids per visitor.

    How many visitors? Seems like a Starbucks restaurant serves maybe 500 customers per day = ~10k extra microcovids per day. A lot, but still 100 days to cause an extra case of covid. So only a few extra cases of covid per year, per cafe. Then maybe a factor of two for people they infect. Should they move their whole ordering counter to the front of the store just to avoid like seven people getting covid per year?

    Well, that’s maybe 14% of a death per year per cafe, and maybe one person disabled longer term.

    Which seems pretty bad, for an arrangement of furniture. For instance, if some brand of cafe had a characteristic setup of tables such that every year in one in seven of these cafes someone died, and in just about every outlet, someone was badly injured every year, I think their negligence would be their main claim to fame, if they were not straight up shut down by the law. Am I wrong?

    We can also ask, how bad is it for someone to get covid, then compare to the costs of rearranging. Let’s very roughly guess: P(death) of 1.7% * estimated statistical value of life of $9M * 15% of life left at average age of US covid death of about 76 * a factor of 2 for disability (a guesstimate I’ve heard, based on way more likely but way less bad than death, though I wonder about this, since it is also affecting people much younger). That gives us $0.05/microcovid.1 So doing it outside seems worth very roughly $1 to each customer, or about $500 per day across customers.

    This has ignored spread from each person who gets it to others, which seems hard to reason about, but it seems that so far a person who gets covid spreads it to more than one other person directly on average, since the number of cases has gone up overall. For the social costs of this, we also care about further cases caused indirectly, but that seems hard to reason about, so let’s say roughly (and I think optimistically) that if you give an extra customer covid, that causes around one additional covid case. This gives us a total social cost of around $1000 per day from not moving the counter.
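    The arithmetic above can be laid out end to end. This is just the post's own rough numbers restated, not an authoritative estimate; every figure below is a guess carried over from the text:

```python
# Rough sketch of the covid-cafe arithmetic; all numbers are the post's
# own loose estimates, carried over verbatim.

indoor_ucov = 20         # microcovids per indoor visit (Microcovid estimate)
outdoor_ucov = 1         # microcovids per outdoor visit
customers_per_day = 500  # guess for a busy Starbucks

extra_ucov_per_visit = indoor_ucov - outdoor_ucov            # ~20, roughly
extra_ucov_per_day = extra_ucov_per_visit * customers_per_day  # ~10k

# Dollar value of a microcovid:
p_death = 0.017            # guessed infection fatality rate
value_of_life = 9e6        # estimated statistical value of life, USD
life_fraction_left = 0.15  # at average US covid death age of ~76
disability_factor = 2      # rough doubling for longer-term disability

cost_per_case = p_death * value_of_life * life_fraction_left * disability_factor
cost_per_ucov = cost_per_case / 1e6   # ~$0.05 per microcovid

direct_cost_per_day = extra_ucov_per_day * cost_per_ucov  # ~$500
social_cost_per_day = direct_cost_per_day * 2             # ~$1000 with onward spread
```

The totals come out near the post's $500 and $1000 per day figures; the factor of 2 for onward spread is the same optimistic "one further case per case" assumption as above.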

    (This doesn’t account for the customers who find your cafe too risky and avoid it.)

    The effort of moving the counter seems unlikely to be this high.

    But maybe it isn’t negligible, and cafes can’t afford to do it without recouping some costs from customers, and there just isn’t demand?

    Do customers not care? If an identical coffee were $2.00 more at one cafe instead of another nearby, I expect a lot of people to habitually frequent the cheaper one. Do people not think covid is that bad? (Why not?) My guess would have been that there were a variety of people, and that many people were going to great lengths to avoid getting covid, so at least their patronage would be altered by the covid dose one gets with one’s coffee. But that doesn’t seem true.

    Especially perplexing evidence reached me via a trip to Philz. I am told they did have some kind of outdoors serving, which makes sense since they have a large window in the front of their store. But when I went there, in the middle of the pandemic, they had just moved everything back into the store and proudly told me that ‘I could come inside!’ as if I was really hanging out for an opportunity for some indoor co-breathing but had been barred from this by some kind of paternalism. I continued on my way, looking for an outdoor cafe, but couldn’t find one, so eventually came back because at least Philz could be ordered from my phone and had a window open. But their behavior suggests that there is some kind of public demand for the joy of ordering indoors.

    I wonder if I’m wrong somehow. I’ve historically been pretty pro-Starbucks, but when their customers’ lives were at stake, they seem to have just thrown a bunch of them away. I wish I saw their perspective better. (Not to suggest they are worse than many others. I wish I understood any of them.)

    1. For a different estimate, I think my house puts a microcovid at about 0.2 minutes of loss in expectation (including from death and disability). If an average person here earns very roughly $50k, maybe they value their time at roughly $25/h (their salary), which is $0.08 per 0.2 minutes, so $0.08 per microcovid. 

  • Elephant seal 2

  • Feedback for learning

    I wish there were better (or more known to me) quantitative tests of skills that are good for me to have. I often find learning things when there is tight feedback pretty fun. For instance, I play geography and history quizzes on Sporcle with an addictive vigor, and enjoy learning various languages on Duolingo, and various facts via Anki. I used to memorize poetry from the bathroom walls. But none of these seems that useful (in Anki I mostly learn about famous art and art movements). And meanwhile, I fail to know all manner of things that would be good to know, and forget most of what I read. (For instance, I’d like to know many more details of machine learning, how the US government works, and what happened in most of history, and I wish I remembered the details of The Precipice or The Better Angels of our Nature or even War and Peace—which I haven’t read much of, substantially because I keep losing track of what is going on or who the characters are.)

    I feel like the situation was better in high school: there were contests and exams for all kinds of stuff that seemed decently useful, like math and chemistry. I guess one problem with my current situation is that because I learn things in a more distributed way, they don’t tend to come with a well matched test. A couple of times I have learned some math for instance, then looked for a suitable exam, but most exams don’t match the exact math that I learned. I suppose a solution might be to only learn things from places that also offer tests, but somehow this doesn’t sound great. Perhaps it is that the tests I found before didn’t seem that promising for rapid feedback anyway - you had to turn to a different page of a pdf or book and search through and check your own solutions. Imagine how much less fun Sporcle would be if instead of instantaneously responding to your touch, nothing happened until the end of a quiz, at which point it emailed you a pdf of the answers.

    Some of this seems like it might have an easy solution that someone else knows about, at least for areas that are readily quantitatively examinable, such as math. So if I seem to be wrong, please correct me!

    Do you know of well-gamified learning opportunities or tests for things that are worth studying?

    I had fun practicing calibration of predictions, which seems useful, until the app idiosyncratically broke for my account in a permanent way. I also recently noticed the existence of Quantum Country, which apparently teaches about quantum mechanics via essays with built in spaced repetition, so I’m also excited to try that.

  • Oceans of snails

    Somehow while knowing that a) ‘sea snail’ was a concept, and b) beaches were strewn with sea shells, which involve various snail-reminiscent spirals, I failed to reach the stunning conclusion that the oceans are substantially inhabited by these kinds of characters:

    land snail

    But it’s true. And it seems that not only are the spiral-looking shells from snails, but various flat and non-spiral ones are too.


    Little Mountain 5, CC BY-SA 3.0, via Wikimedia Commons

    Julia exquisita

    Alexandre Laporte, CC BY-SA 4.0, via Wikimedia Commons


    Image: Malcolm Storey, BioImages - the Virtual Fieldguide (UK) through the Encyclopedia of Life and the creative commons CC BY-NC-SA license.

    Cowrie snail NOAA (Photo Collection of Dr. James P. McVey, NOAA Sea Grant Program), Public domain, via Wikimedia Commons

    This is a meaningful update because snails are a particular kind of thing I’m familiar with, and furthermore I like them.

  • Play with neural net

    I found Tensorflow Playground pretty cool to play with, in case anyone is looking for the kind of intuitive understanding of neural nets that futzing with a thing with one’s hands contributes to.

    Tensorflow playground

    One game you can play is pretending to be a training algorithm. How would you adjust the weights, if you wanted to classify which of two spirals a point comes from?
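    Playing trainer by hand is exactly what gradient descent automates. As a toy illustration (my own sketch, not the Playground's actual spiral setup), here is a single neuron learning to put negative and positive points on opposite sides of a boundary, nudging its weights a little against the error each step, the way you would by hand:

```python
import math

def sigmoid(z):
    # squash a weighted sum into a 0-1 "which class?" score
    return 1 / (1 + math.exp(-z))

# toy data: points left of zero are class 0, right of zero are class 1
data = [(-2, 0), (-1, 0), (1, 1), (2, 1)]

w, b = 0.0, 0.0  # the weights you would otherwise adjust by hand
lr = 0.5         # learning rate: how big each nudge is

for _ in range(200):
    for x, y in data:
        p = sigmoid(w * x + b)
        # cross-entropy gradient: move weights against the prediction error
        w -= lr * (p - y) * x
        b -= lr * (p - y)
```

After training, the neuron confidently classifies both ends of the line; the spiral task is the same game with more neurons and two input features.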

  • Top li'l pots

    I am a connoisseur of a certain genre of mostly-snack food, combining aesthetics, convenience and the idiosyncratic types of actual food that I like to eat (central examples of food whose substance I like, ignoring form, are parsley, farro, black lentils, pecans, craisins, dark chocolate, feta cheese, Swedish crispbread, cream, ginger, brown or raw sugar on its own. Central non-examples are cinnabons, most cereals, cheetos, milk chocolate, cheese-flavored anything, pea protein, Coca-Cola, parmesan cheese, sweet potatoes, spinach, beyond burgers.)

    Here is a list of some of the best li’l pot style foods I have come across so far (not all literally in small pots). Links for illustration, not guaranteed to be value-for-money instances.

    Ones regarding which my enthusiasm has extended to writing, photography, transcendent-style elation, notable ongoing quality of life improvement, or extended consideration of the ethical implications of eating all of them before anyone else knows that they exist

    • Mewe snickerdoodle peanut butter packets
    • Hanuta hazelnut wafers
    • Serenity food squashes
    • Jonny pops, especially cherries and dark chocolate, but also strawberry, chocolate mint, probably most other flavors
    • Many Marks & Spencer’s prepared foods, if you are in the UK (e.g. coconut pieces, cream yogurts, wasabi peas, cheeses, raspberry jelly, flapjacks, Our Best Ever Trifle, berries, small sausages, rice crackers, cappuccino chocolate things; but mostly I like the experience of being able to construct your own bento box of foods like this for lunch)
    • Ferrero Rochers - I realize these shouldn’t generally be a sustaining food, but sometimes if I lose my appetite for some reason, and yet would benefit from eating, they are one of the few things that are still appealing

    Ones I have reliably liked and eaten for some period

    Ones I like but haven’t gotten super into so far

    Ones in a similar genre that I’m not that into but other people really like

    Ones that are almost or sometimes great but fall down on at least one dimension

    • Potato chips and fritos: great once in mouth, but oily on hands and packet non-pleasing
    • Mi goreng (or in cups): delicious and mostly convenient but opening flavor packets is hard and often causes them to spurt on hands and nearby things
    • Blueberries: excellent food but ask to be washed and are then wet and dripping from the holes in package; plus often some are moldy. M&S I think does this well. Some are much more delicious than others.
    • Jello cups: promising but jello is usually not substantial enough, and one time I cut my hand pretty bloodily trying to open the Jello brand ones. Pretty good from M&S, with raspberries in.
    • Rice pudding: a promising type of food in a promising form, but I don’t know of an ideal instance of it for my tastes
    • Cereal cups: seems maybe great if I liked cereal
  • Unpopularity of efficiency

    I feel like ‘efficiency’ is often scowled at. It is associated with factories and killing and commercialization, and people who are no fun. Things are openly criticized for being oriented toward efficiency. Nobody hopes to give their children an efficient childhood or asks for an efficient Valentine’s day, unless they want to get it over with. I expect wariness in listeners at talk of efficient charity.

    This intrigues me, because in what I take to be its explicit definition, ‘efficiency’ is almost the definition of goodness manifest. The efficiency of a process is the rate with which it turns what you have into what you want.

    I usually wince when people criticize efficiency, and think they are confused and should be criticizing the goal that is being pursued efficiently. Which does seem basically always true. For instance, if they are saying their childcare center cares only for efficiency, they probably mean that it is doing something like trying to minimize financial costs without breaking the law. Perhaps by fitting many children into a room with minimal oversight or attention to thriving. Here, I would complain that the childcare center cares only about its profits and not breaking the law. If it was fulfilling my own values efficiently, that would be awesome.

    However I think there is more merit to efficiency’s poor reputation than I have given credit for. Because pursuing efficiency does seem to systematically lead to leaving things out. Which I suppose perhaps makes sense, for creatures who don’t explicitly know what their values are, and especially who have trouble quantifying them. If you set out to build an efficient daycare center, chances are that you don’t know exactly what makes a daycare center good, and are even less well equipped to put these things into numbers and build machinery to optimize those numbers. (This would be much like the AI alignment problem but where the AI you are trying to direct is made of your own explicit reasoning. It might also be what Seeing Like a State is about, but I haven’t read it.) It’s still not clear to me why this would systematically turn out actively worse than if you didn’t aim for efficiency, or whether it does (my guess is that it usually doesn’t, but sometimes does, and is notable on those occasions). If efficiency has really earned its poor reputation, I wonder if I should be more worried about this.

  • What is up with spirituality?

    What is up with spirituality? I mean, from an atheistic perspective?

    In my experience, atheists tend to focus on the empirical question of whether there is an all-powerful supernatural creature behind all that we observe. And yeah, there probably isn’t.

    But having won that point, what does one make of the extreme popularity of religion? I think the usual answer given is something like ‘well, we used to be very ignorant and not have good explanations of natural phenomena, plus we tend to see agents in everything because our agent detection software is oversensitive’.

    Which might explain the question ‘Why would people think a supernatural agent controls things?’ But that seems like only a corner of religion.

    Another big part of religion—and a thing that also occurs outside religion—seems to be ‘spirituality’—a cluster of things I find hard to describe, but which seem pretty disconnected from explanatory questions of where trees came from or why the crops failed.

    Some stylized facts about spirituality:

    • People across religions have ‘spiritual experiences’ that involve particular styles of feeling
    • People can have spiritual attitudes more broadly
    • Spiritual attitudes involve less interest in ‘worldly’ or petty things
    • Spiritual things often involve sacredness and ritual
    • Spiritual feelings are often associated with awe and the word ‘transcendent’
    • Spiritual experiences do not present themselves as about hedonic enjoyment of the experience
    • Spirituality is connected with ‘meaning’ and tends to feel deep and significant
    • Spirituality tends toward an interest in philosophy, especially metaphysics, especially regarding whether everything is one, the nature of selfhood, the possibility and nature of ‘souls’, the nature of love
    • Certain drugs seem to induce spiritual or adjacent states
    • Spirituality tends to be fairly positive, in the sense that from a satisfactorily spiritual mental place, one expects the world to seem broadly good or at least compatible with peace
    • Spirituality is related to ideas of connecting with something beyond yourself
    • Spirituality is sincere
    • Spirituality is serious
    • Spirituality is normative
    • I suspect that from a spiritual vantage, this list might seem like a mundane missing of the point

    Why are people into spirituality? Why is it a kind of thing that humans can experience? Why is it a thing they experience so much?

  • Wordtune review

    Wow, Facebook is advertising an AI-driven writing tool to me. I think this is the first time I’ve seen a mainstream potentially useful product coming from all of this. Let’s see how it does. I’ve started by writing this paragraph quickly myself, and then I will rewrite it using Wordtune, to demonstrate its features. Then I may rewrite it further using its premium features or my more honed writing, all going well.

    Wow, Facebook is showing me a machine-learning based writing platform. I believe that this is the first mainstream application of all this. See how it goes. I wrote this paragraph on my own quickly, and then rewrote it using Wordtune to demonstrate its features. Then I may rewrite it using its premium features or using my more refined writing skills, if all goes well. —Wordtune

    I am intrigued by Facebook’s advertisement with regards to an artificial intelligence (AI) writing tool. I’m not aware of any mainstream application arising from these efforts. Let us see how it goes. I created this paragraph quickly myself, and now I’m going to rewrite it using Wordtune to demonstrate all it has to offer. Afterwards, I will be able to incorporate premium features in it or craft it into a more refined style, all going well. —formal Wordtune

    Wow, Facebook is advertising a writing tool which uses artificial intelligence (AI) that I did not know existed. This is something of a landmark event; I’ve not yet seen a mainstream useful product coming out of anything like this. I’m looking forward to seeing how it goes. I’ve begun by writing this paragraph myself but I will then rewrite it with the aid of Wordtune to illustrate its various features. Then I may rewrite it further using its premium features or my more honed writing, all going well. —longer Wordtune

    Wow, Facebook’s touting a writing tool powered by artificial intelligence. This is the first time I’ve seen a mainstream use coming from all this. Let’s see what it does. I wrote it quickly, and I’m gonna rewrite it with Wordtune, just so I can show you how it works. After that I’ll maybe rewrite it more using its premium features or my own better skills. —casual Wordtune

    Wow, Facebook offers me an AI-driven writing tool. This is the first mainstream product I’ve seen from all of this. Let’s see. I first wrote the paragraph, and then I rewrote it with Wordtune, so you can see its features. Then I will retake it using its premium features or my better writing, all going well. —shorter Wordtune

    Facebook just surprised me with an ad for an AI writing tool. Could it be the first useful mainstream product to come from all this?  Let’s take it for a ride. I wrote this paragraph fast, then asked Wordtune to rewrite it, to show you its moves. I might repeat the demonstration with its other style options, or mine, fate willing. —me spending more time

    A Facebook ad for AI-improved writing! GPT-3-like tools enter the mainstream. Let’s try it. I jot five lines, apply Wordtune, and watch. Next I might compare paid features or my own skills. —me shorter

    Facebook takes a break in its endless stream of extremely comfortable-looking bra suggestions and creepy-but-welcome commercials for things I mentioned in conversation or thought of in a dream once or something, to offer me artificial intelligence to automate half my writing process. Wait, what? Is it the future at last? Shit! I like that half of my writing process. Well, let’s see it. I’ll write a shitty paragraph, and then see what this ‘Wordtune’ can do with it. Then maybe I’ll see what I can do with it, to compare (and see if it’s all over for my joyous thesaurus expeditions). —me, humorous conversational

  • Tentative covid surface risk estimates

    My household previously made some highly uncertain estimates of the covid risk from bringing random objects that other people have recently been touching into our home, for instance salads and groceries and an endless stream of Amazon packages. The official guidance is very vague, e.g. “…not thought to be the main way the virus spreads”. Our bad estimates were fairly low, so we decided to basically ignore it in our covid risk accounting, except for habitually taking some reasonable precautions.

    Then the covid rates here increased by a factor of ten, so we decided it would be good to look at it again.

    So today I tried to estimate this from this paper (HT Ben Weinstein-Raun and Catherine Olsson) in which a group of researchers swabbed various door handles and trash cans and crosswalk buttons and the like in a small Massachusetts city and measured the covid RNA detectable on them. They also used the amounts they measured to estimate the infectiousness if someone else were to touch the surface and then touch their face.

    Here I offer you a draft of an elaborate Guesstimate spreadsheet on the topic, in case you are interested in such things. Hopefully someone is, and will tell me ways that it is wrong, then later I might offer you something reliable. At present, it is probably pretty unreliable, and should only be used insofar as you would otherwise use something even more unreliable. Some non-exhaustive evidence of its unreliableness:

    • I haven’t actually read much of the paper
    • The answers in the spreadsheet have changed substantially while I have felt about as confident in them as I do now
    • There are numerous places where it seems dubious to me, or where I made up numbers
    • I try not to be the sort of person who only shares things if they are of high quality, even if the stakes are high
    • This calculation is ignoring the efforts everyone is making to be safe, so you might underestimate the risks if surfaces look low risk in this study because supermarket employees are actually constantly wiping them down, for instance. So it should probably be interpreted more like ‘if you take levels of caution similar to those taken with the study surfaces…’.

    Interesting tentative conclusions so far, rely upon at own risk:

    • a person with covid touching something then you touching it then touching your face is worth extremely roughly 13 microcovids (uCov) (with 90% confidence of 0.7 to 110 according to my estimate, but I wouldn’t trust that)
    • thus such a touch-touch-face sequence with a random person in San Francisco (where I live) at the moment is ~0.7 uCov (give or take an order of magnitude)
    • adding further wild guesses about how many touches are involved in acquiring groceries, I get that a 30 item San Francisco grocery trip is worth about 5 uCov (give or take an order of magnitude)
    • that would mean about two cases from groceries in San Francisco per week (give or take an order of magnitude), which doesn’t sound crazy to me. (You might think it does sound crazy, because surely we would notice that by now, but I’m pretty uncertain about our collective ability to observe and infer things.)

    The basic reasoning is this:

    • If an infected person touches a surface, it looks like it has about a 13% chance of becoming detectably infectious by this method (based on 36 samples from grocery store surfaces which are estimated by me to receive very roughly 2 covid-infected touches per day yielding 4 positive samples, along with some complication with distributions that make the arithmetic strange.)
    • Average risk from touching one of these positive-testing surfaces is very roughly 100 microcovids (taking an estimate half way between the average grocery surface infectiousness and the average surface infectiousness, according to the paper)
    • So if a person with covid touches a surface, then you touch it and then touch your face, this gives us about 13% of 100 microcovids = 13 microcovids (.0013% chance of covid)
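    Multiplied out, the chain above looks like this. The first two numbers are the post's own rough estimates; the prevalence figure is merely back-derived from the ~0.7 uCov claim, so it is a consistency check rather than independent data:

```python
# The touch-touch-face chain, with the post's rough numbers.

p_surface_infectious = 0.13  # chance an infected person's touch leaves a
                             # detectably infectious surface (from swab data)
ucov_if_positive = 100       # microcovids from touching a positive surface
                             # and then touching your face

# risk if the previous toucher definitely had covid:
ucov_per_infected_toucher = p_surface_infectious * ucov_if_positive  # ~13 uCov

# chance a random San Franciscan toucher is currently infectious,
# back-derived from the post's ~0.7 uCov-per-random-touch figure:
p_random_person_infectious = 0.7 / 13

ucov_random_touch = ucov_per_infected_toucher * p_random_person_infectious

# a 30-item grocery trip at ~5 uCov then implies roughly this many
# touch-touch-face sequences per trip:
implied_sequences = 5 / ucov_random_touch  # ~7
```

So the ~5 uCov grocery trip amounts to guessing around seven of these touch-then-face sequences per shop, each against current local prevalence.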

    I welcome all kinds of comments, e.g. even ‘I want to know why this cell says 27, but I can’t be bothered reading all these things’.

  • Li'l pots

    As a pandemic-era purchaser of foods for a large household of time-thirsty researchers, I can tell you an interesting thing about the demand for cheese in this context:

    1. If you spend a lot of money on a nice cheese, wrapped up in some fancy foreign label, there is a good chance that it will languish sadly in the back of the fridge for months until someone notices that it is moldy and throws it away, or makes a last-ditch attempt to cut up the whole thing and compel the group to eat it. Maybe on the way there, someone will take a single slice of it once, and put it in a zip-loc bag, where it will remain until the end.

    2. If you spend a few dollars on a six-pack of generic single-serve cheese-cubes with nuts, they will fly from the fridge and you will be acknowledged for this triumph of shopping, and more such cheese will be needed by the next grocery order.

    It was initially hypothesized by a housemate that this was due to error. The cheese cubes are more expensive per unit of cheese, while also consisting of worse cheese. Which is fairly suggestive of overall worseness. One could further note that they involve substantially more packaging, and take up more space per cheese. So a natural theory is that the cheese-cube eating housemates are erring, due to some kind of short-sighted non-endorsed laziness.

    I’m with the cheese-cube eaters, except at least ten times more passionately (for instance, I am writing an essay in favor of the position). It’s not about the quality-adjusted cheese per dollar. Getting out a pre-opened hunk of cheese, examining the color and consistency of its moist edges, awkwardly undressing it further from its tight, torn, damp plastic casing, finding a knife and something to cut it on, cutting some, wrapping the rest again, fitting it back in the fridge, and cleaning up the knife and counter, is an experience. And it’s not a good one. It has all kinds of wetness and ineffectual muscular exertions and tenuous balancing and making hard decisions about risk under uncertainty and washing things. Whereas reaching out your hand for a cheese-cube pack then de-lidding it into the trash—while not an amazing experience—is I’d say overall positive, being substantially comprised of the initial sighting of your desired cheese-cube-pack and then the immediate having of it. At worst it is a very short experience. And it makes perfect sense to prefer twenty seconds of the cheese cube experience to two minutes of the better cheese experience enough to overwhelm the other stakes of the choice.

    I have relatively consistent preferences in this direction, whereas the rest of the house seems to vary by food. Others lunch on intermediately aged leftovers ladled from stacked tupperwares, while if I’m not going to make something fresh, I prefer just-add-boiling-water pots of vegetable curry or microwaveable instant rice (I do add butter and herbs though, which is a slight departure from the genre). Others have been known to eat yogurt spooned into a bowl from a giant tub, while I eat from towers of fresh stackable single-serve yogurt pots. Snack foods tend to cater to my interests here better, perhaps because everyone hopes to be more lazy and individualistic for snacks. There are tiny bags of chips and string cheeses and nut butter pouches and apple chips and fruit strips and protein bars.

    My boyfriend affectionately refers to these objects of my desire as ‘li’l pots’ (probably a term he grabbed from some gleeful past exclamation of mine) and often offers me ‘some sort of li’l pot?’ for breakfast, whether it be oatmeal or yogurt or rice pudding or mashed potato to be determined.

    I claim that this is not about appreciating aesthetic qualities less. It is about appreciating more aesthetic qualities. Packaging can be beautiful and simple and pleasing to use, but it is often painful to behold and also painful to try to open: packets that tear open half way down the side, or can’t be opened at all with normal-range female grip strength, or that naturally explode their contents on nearby objects unless you do something that I haven’t discovered yet, or that cut your hands then leave you holding a small overflowing tub of water.

    The arrangement of objects can similarly give pleasure or suffering: my stack of fresh white yogurt bled through with passionfruit, lemon or red berries asks to be reached for, whereas the jumble of giant containers of yogurt and sour cream and cream cheese on top and in front of each other and strewn between with other objects trying to fit somewhere, has no such appeal. String cheeses living upright and individuated are much more appealing than string cheeses attached together in a large string cheese blanket inside another plastic packet horizontally hidden under some other cheeses. Small stacks of different types of protein bar laid out for eating are more pleasing than a rubble of large packages thrown into a drawer.

    Snack drawer

    And actions can be aesthetically pleasing or not. Peeling a grape with your teeth is pleasing. Breaking through the tin-foil-paper on top of a new jar of Caro can be very pleasing. Wrestling an icy firm-like-slightly-decomposing-wood field roast sausage from its skin-tight twisted plastic tubing is not pleasing. Any kind of tearing that is difficult and involves a new liquid appearing that you are not equipped to deal with is not pleasing. Anything that naturally calls for more than two hands is not pleasing unless you are a group of people. Making judgment calls about food safety is not pleasing. Actions that require finding, dirtying, and cleaning multiple objects tend not to be pleasing unless there’s a real payoff.

    It’s not just about the time. There are preparation rituals that are beautiful and compelling. I am secretly a bit pleased that our coffee machine is being replaced and we are temporarily relegated to measuring fresh grounds into my beautiful orange French press, then pouring boiling water into the roiling black soup of them, then slowly pressing the mandala-like metal plunger onto them, perhaps watching mysterious currents shooting up the sides of the clear glass tower. I enjoy choosing a cup, and directing a smooth black torrent into its belly. I like shaking the cream carton with vigor, and pouring a dollop of its heavy white cloud into the black depths, to curl and spiral through it.

    Which is not to say that others should agree with my evaluation of li’l pots. The same series of actions is probably a very different experience for different people. For one person, there might be a single action ‘get out some cheese’, and in a half-conscious flurry it happens, and they are soon focused on the eating of the cheese. For a different person ‘get out some cheese’ means something more like ‘take out the cream cheese and the yogurt and balance them tenuously near the fridge, then reach in and get the intended cheese from a slightly wet and slimy pool on the shelf, then replace the cream cheese and the yogurt, then try to open the cheese while touching only the dry bits, then be unable to rip the plastic on the first three tries but hurt your finger somewhat, then look for scissors to cut it, then fail to find them and look for a knife instead, then use the knife to somewhat recklessly cut the edge of the packet, then try to rip it again from there, then get it suddenly and nearly lose the cheese (in the process grab the wet package and the cheese and give up on that particular separation), then open a cupboard with the least cheesy part of your hand and take out a chopping board and put the cheese on it, then hope that the pool of liquid running from the cheese doesn’t run onto the counter, then wash your hands because of the fridge slime and the cheese water, then cut off some cheese with the knife, then take out a storage container and move the rest of the cheese from its entire now-useless package into the container, then throw out the bag, then wipe up the cheese water that dripped while moving the bag, then move the yogurt and the cream cheese again, then put the cheese block back in the pool of liquid, then replace the yogurt and the cream cheese, then wipe up the cheese water from the counter, then wash the knife, then take your cheese to your room so that you can lie down for a bit before eating the fucking thing.’


    Note: I do have OCD, so my love of indeterminate liquids, contamination, decisions about safety risks, and additional reasons to wash my hands is lower than it might be for a more typical human.

  • Who should you expect to spend your life with?

    Striking things about the figure below, which I got from Our World in Data, on time use [edit: oops, all only known to be true in America]:

    • People spend increasing time alone over their whole lives, with the exception of roughly their twenties. This surprises me a bit because it seems like people like spending time with other people, and I would expect them to increasingly succeed at it with experience and time to acquire partners and families and friends.
    • From 31 to 45, people spend more time with children¹ on average than they spend with any other category of person, including for instance partners and colleagues.
    • You might think all this children time would be substituting for some partner time, but as the children time swoops downward by three quarters, partner time stays about the same.
    • People are at a relationship-time-steady-state between about thirty and sixty. I imagine that many people start relationships in that time, so does that mean that they also stop them at about the same rate, or gradually reduce time with their partners at a rate matching others’ forming of new relationships? Are people radically less likely to start relationships after about thirty?
    • People spend broadly decreasing time with every group except their partner over time, from some early peak for each trend—in the teenage years for friends and family, and in the 20s and 30s for colleagues and children. I wonder how many people just like being alone and with their partners more than most other options, and steadily optimize for that, once they have been sociable enough to find a partner in their early years.
    • Coworker time peaks at age 25-30 and goes slowly downward before the retirement descent. Is that from people dropping out of the workforce? Earning themselves a nice private office? Some difference between junior and senior roles?
    • People spend fairly consistent time with their friends after a decline from 18 to 40. Retirement doesn’t increase it. Spending three hours a day fewer with children doesn’t increase it. I guess those things go to solitude.

    In other news, Our World In Data seems awesome.

    1. I’m guessing that this means ‘any children’ rather than ‘their own children’, because the rate for 15-year-olds seems high.

  • What if we all just stayed at home and didn't get covid for two weeks?

    I keep thinking about how if at any point we were all able to actually quarantine for two weeks¹ at the same time, the pandemic would be over.

    Like, if instead of everyone being more or less cautious over a year, we all agreed on a single two-week period to hard quarantine. With plenty of warning, so that people had time to stock up on groceries and do anything important ahead of time. And with massive financial redistribution in advance, so that everyone could afford two weeks without work. And with some planning to equip the few essential-every-week-without-delay workers (e.g. nurses, people keeping the power on) with unsustainably excessive PPE.

    This wouldn’t require less total risky activity. If we just managed to move all of the risky activity from one fortnight to the one before it, then that would destroy the virus (and everyone could do as many previously risky activities as they liked in the following fortnight!). It could be kind of like the Christmas week except twice as long and the government would pay most people to stay at home and watch movies or play games or whatever. Maybe the TV channels and celebrities could cooperate and try to put together an especially entertaining lineup.

    How unrealistic is this? It sounds pretty unrealistic, but what goes wrong?

    Some possible things:

    1. To actually coordinate that many people, you would need to have serious policing—beyond what is an acceptable alternative to a year-long pandemic—or serious buy-in—beyond what is possible in any normal place of more than ten people.
    2. Even if you could basically coordinate that many people, you would fail in a few places. And if you fail anywhere, then the disease will gradually build back up.
    3. You can’t just have everyone buy groceries for a given fortnight at some point in the preceding months, because there aren’t enough groceries in warehouses or enough grocery producers able to spin up extra weeks of grocery production on short notice (I am especially unsure whether this is true).
    4. The people who really do have to work are too many to over-prepare well for it in a month.
    5. It would cost really a lot of money.
    6. It would need to be longer than two weeks if you wanted to actually crush the disease, because some people are probably infectious for abnormally long times.
    7. You would need everyone not living alone to stay away from those they live with, to avoid spreading covid within houses, making this a more extreme proposition than it first seems, very hard to police, and basically impossible for households with small children or other very dependent members.
    8. It’s just way too much logistical effort to make this happen well.

    1, 2 and 7 look like the clearest problems to me. I don’t know enough to say if 3, 4 or 8 are real obstacles, and it seems like the US federal government has sent out a lot of money already, so 5 could at worst be solved by doing this thing at the time the money was sent out. 6 seems true, but I’m not sure if the length it would need to be is out of the question, if the other questions are solved.

    7 is pretty bad even in a community without dependent people, because it requires active effort from everyone to protect themselves within their houses, which seems much less likely to be ubiquitously complied with than a request to not go to effort to do something (i.e. more people will find the energy to stay on their sofas than will find the energy to set up their room to prepare food in it for a fortnight). Then the dependent people who really need to stay with someone else seem even harder to get the end-of-fortnight risk down for. I could imagine dealing with these problems by spreading people out as much as feasible and requiring longer quarantines for pairs. But the difficulty of that—or need for extending the length of the whole thing—seem quite costly.

    On 2 and 7: even if you don’t actually stop the pandemic, and have to schedule another occasional ‘firebreak’ in activity once cases build up again, it seems like it could hugely reduce the human cost, without more total caution (just moving the caution in time).

    (Also, if you did it for four weeks instead of two, you would only end up with cases where two failures met, i.e. where someone improbably got covid during the first two weeks, then improbably passed it on to another person in the second.)
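    The arithmetic behind that parenthetical can be sketched as a toy calculation (not epidemiology; the household count and per-fortnight failure probability below are made-up numbers, purely for illustration):

    ```python
    # Toy model: suppose each of N households independently "fails" a given
    # fortnight of hard quarantine with some small probability p, meaning
    # someone there catches covid during that fortnight and is still
    # infectious at its end. A two-week quarantine leaks wherever there is
    # one failure; a four-week quarantine leaks only where failures happen
    # in both consecutive fortnights.

    def expected_leaks(n_households: int, p_fail: float, fortnights: int) -> float:
        """Expected number of households still infectious at the end."""
        return n_households * p_fail ** fortnights

    two_week = expected_leaks(1_000_000, 0.01, 1)   # one failure suffices
    four_week = expected_leaks(1_000_000, 0.01, 2)  # two failures must coincide
    print(two_week, four_week)
    ```

    Under these made-up numbers, doubling the quarantine length cuts expected leak-through by a factor of a hundred, which is the force of the two-failures-must-meet point.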

    On 4, one way you might swing this is to have many of the people who work during the two weeks then do their own hard quarantine in the following two weeks, where they can be replaced by some of the workers with similar skills who were at home during the main round.

    Many of these depend on scale, and location. For instance, this can clearly often work at the level of a group house, and is probably too ambitious for a large and ideologically diverse nation (especially one that isn’t really organized for people to consistently wear masks after a year). Could it work at the level of a relatively anti-covid city? (The city would then have to limit or quarantine incoming travelers, but that seems doable for many cities.) A small town? A small college campus? A highly religious adult community, where the church was in favor? There are a lot of human groups in the world.

    Have I got the main reasons this wouldn’t work? Is there some good solution to them that I haven’t seen?

    Has anyone done something like this? There have been lots of lockdowns, but have there been time-bounded almost-total lockdowns scheduled in advance, with huge efforts to avert people needing to take risks during that particular period (e.g. treating moving risks to the time earlier as great compared to running them that week)?

    1. Or however long it takes a person to reliably stop spreading covid, after contracting it. 

  • Thoughts on the inner ring

    I enjoyed C. S. Lewis’ The Inner Ring, and recommend you read it. It basically claims that much of human effort is directed at being admitted to whatever the local in-group is, that this happens easily to people, and that it is a bad thing to be drawn into.

    Some quotes, though I also recommend reading the whole thing:

    In the passage I have just read from Tolstoy, the young second lieutenant Boris Dubretskoi discovers that there exist in the army two different systems or hierarchies. The one is printed in some little red book and anyone can easily read it up. It also remains constant. A general is always superior to a colonel, and a colonel to a captain. The other is not printed anywhere. Nor is it even a formally organised secret society with officers and rules which you would be told after you had been admitted. You are never formally and explicitly admitted by anyone. You discover gradually, in almost indefinable ways, that it exists and that you are outside it; and then later, perhaps, that you are inside it.

    There are what correspond to passwords, but they are too spontaneous and informal. A particular slang, the use of particular nicknames, an allusive manner of conversation, are the marks. But it is not so constant. It is not easy, even at a given moment, to say who is inside and who is outside. Some people are obviously in and some are obviously out, but there are always several on the borderline. And if you come back to the same Divisional Headquarters, or Brigade Headquarters, or the same regiment or even the same company, after six weeks’ absence, you may find this secondary hierarchy quite altered.

    There are no formal admissions or expulsions. People think they are in it after they have in fact been pushed out of it, or before they have been allowed in: this provides great amusement for those who are really inside. It has no fixed name. The only certain rule is that the insiders and outsiders call it by different names. From inside it may be designated, in simple cases, by mere enumeration: it may be called “You and Tony and me.” When it is very secure and comparatively stable in membership it calls itself “we.” When it has to be expanded to meet a particular emergency it calls itself “all the sensible people at this place.” From outside, if you have despaired of getting into it, you call it “That gang” or “they” or “So-and-so and his set” or “The Caucus” or “The Inner Ring.” If you are a candidate for admission you probably don’t call it anything. To discuss it with the other outsiders would make you feel outside yourself. And to mention talking to the man who is inside, and who may help you if this present conversation goes well, would be madness.

    My main purpose in this address is simply to convince you that this desire is one of the great permanent mainsprings of human action. It is one of the factors which go to make up the world as we know it—this whole pell-mell of struggle, competition, confusion, graft, disappointment and advertisement, and if it is one of the permanent mainsprings then you may be quite sure of this. Unless you take measures to prevent it, this desire is going to be one of the chief motives of your life, from the first day on which you enter your profession until the day when you are too old to care. That will be the natural thing—the life that will come to you of its own accord. Any other kind of life, if you lead it, will be the result of conscious and continuous effort. If you do nothing about it, if you drift with the stream, you will in fact be an “inner ringer.” I don’t say you’ll be a successful one; that’s as may be. But whether by pining and moping outside Rings that you can never enter, or by passing triumphantly further and further in—one way or the other you will be that kind of man.

    The quest of the Inner Ring will break your hearts unless you break it. But if you break it, a surprising result will follow. If in your working hours you make the work your end, you will presently find yourself all unawares inside the only circle in your profession that really matters. You will be one of the sound craftsmen, and other sound craftsmen will know it. This group of craftsmen will by no means coincide with the Inner Ring or the Important People or the People in the Know. It will not shape that professional policy or work up that professional influence which fights for the profession as a whole against the public: nor will it lead to those periodic scandals and crises which the Inner Ring produces. But it will do those things which that profession exists to do and will in the long run be responsible for all the respect which that profession in fact enjoys and which the speeches and advertisements cannot maintain.

    His main explicit reasons for advising against succumbing to this easy set of motives are that it runs a major risk of turning you into a scoundrel, and that it is fundamentally unsatisfying—once admitted to the in-group, you will just want a further in-group; the exclusive appeal of the in-group won’t actually be appealing once you are comfortably in it; and the social pleasures of company in the set probably won’t satisfy, since those didn’t satisfy you on the outside.

    I think there is further reason not to be drawn into such things:

    1. I controversially and uncertainly claim that even the good of having especially high social status is a lesser kind of good relative to those available from other arenas of existence. Though I suppose it increases your life expectancy somewhat.
    2. It is roughly zero sum, so hard to wholly get behind and believe in, what with your success being net bad for the rest of the world.
    3. To the extent it is at the cost of real craftsmanship and focus on the object level, it will make you worse at your profession, and thus less cool in the eyes of God, or an ideal observer, who are even cooler than your local set.

    I think Lewis is also making an interesting maneuver here, beyond communicating an idea. In modeling the behavior of the coolness-seekers, you put them in a less cool position. In the default framing, they are sophisticated and others are naive. But when the ‘naive’ are intentionally so because they see the whole situation for what it is, while the sophisticated followed their brute urges without stepping back, who is naive really?

  • On fundamental solitude

    A quote from Aldous Huxley that has stuck in my mind more than perhaps any other over the years:

    “We live together, we act on, and react to, one another; but always and in all circumstances we are by ourselves. The martyrs go hand in hand into the arena; they are crucified alone. Embraced, the lovers desperately try to fuse their insulated ecstasies into a single self-transcendence; in vain. By its very nature every embodied spirit is doomed to suffer and enjoy in solitude. Sensations, feelings, insights, fancies—all these are private and, except through symbols and at second hand, incommunicable. We can pool information about experiences, but never the experiences themselves. From family to nation, every human group is a society of island universes.”

    I used to be fairly troubled by this kind of thought. These days I’m more inclined to think of memories of myself, my own writing from yesterday, my sense of a person in my arms, words vibrating my inner ears as light bounces between someone’s eyes and mine, words reaching me across the internet from a stranger, barely understandable lines from thousand year old pages, as more of a piece—physical communications between scattered consciousness. All interpreted with more or less insight and confidence and detail and sense of being an ‘experience’ and not just ‘information’, depending on the quality and nature of the message. But my ‘imagining’ of your mental state, and my ‘knowing’ of my own are both guesses. The sense that they are different is a pragmatic, superficial, quantitative one, not the symptom of a deep metaphysical separation.

  • Public selves

    A question that I return to in life strategy is whether to lean heavily on ‘spending one’s weirdness points wisely’—otherwise put, cowering lonely behind a cardboard cutout of the most forgettable person while proffering optimized propaganda through carefully selected slots—or whether to offer the world a fuller view of oneself.

    A few arguments as I see them:

    • Hiding allows you to be strategic, showing anything that is good to show, hiding anything that is not. Surely that is better, then, than any alternative, which must involve showing things that are bad to show, or not showing things that are good to show?
    • Not necessarily! People can tell which strategy you are using, and usually the things that are ‘bad to show’ are bad for you to show, but other people would be perfectly interested to see them. So it is less cooperative, and people may respond to that, which may on a longer term view be bad for you.
    • Also, which strategy you are enacting overall, or what you are doing in the past or future, can change whether something is good or bad to share. For instance, maybe you have personal problems that it would be both nice to have in the open, and helpful for others to know that you also face. If you are usually open about things, mentioning these might be no big deal, and so worth it on net. Whereas if you tend to be private, then suddenly announcing a personal problem will seem like a bigger deal, so the costs might outweigh the benefits.
    • There is something good about actually knowing other people - being part of a global intellectual society of real people, not of robotic fictions created by people. Being open contributes to this world being actual.

    There are intermediate options too, of course. Are there good principled ones?

    What considerations am I missing?

  • Are the consequences of groups usually highly contingent on their details?

    How much of the impact of an organization is covered by it being ‘a group of size M working on X’, relative to the specifics of how and what they do in working on X? What if we also include a single scale of how functional they are?

    For instance, does it mostly matter that my research group, AI Impacts, is a certain size force for AI risk related thinking (with the size determined by the number and competence of people and the functionality of the organization, say), or does it matter whether we write journal articles or news stories or blog posts or research pages, or whether we choose our projects collectively vs. individually, or whether we get most of our feedback internally vs. externally? Maybe most of these things can mostly be translated into ‘functionality’. But the ‘type of thing we are producing’ one doesn’t seem to as easily.

    How much does what exactly you are producing matter? It could matter almost entirely or not at all, to my knowledge. For instance, I have some intuition that ‘there are about three small orgs in that space’ is a reasonable description of how much effort is going into a goal, but I also have intuitions that, say, minor differences in the responsiveness or navigability or style of a website can make the difference between it seeming great or annoying or crankish, and being read or not, or liked by different people. Which seems like it should just have a fairly different effect. These seem vaguely in conflict.

    I originally meant for AI Impacts to be an unusual kind of entity, and the form of the output (website of ongoingly updated research pages on modular questions, hierarchically supporting one another) was a key part of the experiment. Which doesn’t imply a strong view on the importance of format—experimentation might make sense if format is so important it is worth searching alternative ones, but it might also make sense if it is so unimportant that it won’t mess anything up.

    But there are costs to doing unusual things (e.g. people are confused about what you are doing, other entities aren’t organized to interface with you), so if format is totally unimportant, maybe we should move to more normal things. (Or the same goes if format is important, and this one is clearly worse.)

    This question is also relevant in how you pick organizations to donate to, so maybe people who regularly do that know the answers. Does it mostly matter that you have competent people working on the right cause, or having checked that, do you generally also need to look closely at exactly what they are doing?

  • What is going on in the world?

    Here’s a list of alternative high level narratives about what is importantly going on in the world—the central plot, as it were—for the purpose of thinking about what role in a plot to take:

    • The US is falling apart rapidly (on the scale of years), as evident in US politics departing from sanity and honor, sharp polarization, violent civil unrest, hopeless pandemic responses, ensuing economic catastrophe, one in a thousand Americans dying by infectious disease in 2020, and the abiding popularity of Trump in spite of it all.
    • Western civilization is declining on the scale of half a century, as evidenced by its inability to build things it used to be able to build, and the ceasing of apparent economic acceleration toward a singularity.
    • AI agents will control the future, and which ones we create is the only thing about our time that will matter in the long run. Major subplots:
      • ‘Aligned’ AI is necessary for a non-doom outcome, and hard.
      • Arms races worsen things a lot.
      • The order of technologies matters a lot / who gets things first matters a lot, and many groups will develop or do things as a matter of local incentives, with no regard for the larger consequences.
      • Seeing more clearly what’s going on ahead of time helps all efforts, especially in the very unclear and speculative circumstances (e.g. this has a decent chance of replacing subplots here with truer ones, moving large sections of AI-risk effort to better endeavors).
      • The main task is finding levers that can be pulled at all.
      • Bringing in people with energy to pull levers is where it’s at.
    • Institutions could be way better across the board, and these are key to large numbers of people positively interacting, which is critical to the bounty of our times. Improvement could make a big difference to swathes of endeavors, and well-picked improvements would make a difference to endeavors that matter.
    • Most people are suffering or drastically undershooting their potential, for tractable reasons.
    • Most human effort is being wasted on endeavors with no abiding value.
    • If we take anthropic reasoning and our observations about space seriously, we appear very likely to be in a ‘Great Filter’, which appears likely to kill us (and unlikely to be AI).
    • Everyone is going to die, the way things stand.
    • Most of the resources ever available are in space, not subject to property rights, and in danger of being ultimately had by the most effective stuff-grabbers. This could begin fairly soon in historical terms.
    • Nothing we do matters, for any of several reasons (moral non-realism, infinite ethics, living in a simulation, being a Boltzmann brain, …?)
    • There are vast quantum worlds that we are not considering in any of our dealings.
    • There is a strong chance that we live in a simulation, making the relevance of each of our actions different from that which we assume.
    • There is reason to think that acausal trade should be a major factor in what we do, long term, and we are not focusing on it much and ill prepared.
    • Expected utility theory is the basis of our best understanding of how best to behave, and there is reason to think that it does not represent what we want. Namely, Pascal’s mugging, or the option of destroying the world with all-but-one-in-a-trillion probability in exchange for a proportionately greater utopia, etc.
    • Consciousness is a substantial component of what we care about, and we not only don’t understand it, but are frequently convinced that it is impossible to understand satisfactorily. At the same time, we are on the verge of creating things that are very likely conscious, and so being able to affect the set of conscious experiences in the world tremendously. Very little attention is being given to doing this well.
    • We have weapons that could destroy civilization immediately, which are under the control of various not-perfectly-reliable people. We don’t have a strong guarantee of this not going badly.
    • Biotechnology is advancing rapidly, and threatens to put extremely dangerous tools in the hands of personal labs, possibly bringing about a ‘vulnerable world’ scenario.
    • Technology keeps advancing, and we may be in a vulnerable world scenario.
    • The world is utterly full of un-internalized externalities and they are wrecking everything.
    • There are lots of things to do in the world, we can only do a minuscule fraction, and we are hardly systematically evaluating them at all. Meanwhile massive well-intentioned efforts are going into doing things that are probably much less good than they could be.
    • AI is a powerful force for good, and if it doesn’t pose an existential risk, the earlier we make progress on it, the faster we can move to a world of unprecedented awesomeness, health and prosperity.
    • There are risks to the future of humanity (‘existential risks’), and vastly more is at stake in these than in anything else going on (if we also include catastrophic trajectory changes). Meanwhile the world’s thinking about and responsiveness to these risks is incredibly minor, and they are not taken seriously.
    • The world is controlled by governments, and really awesome governance seems to be scarce and terrible governance common. Yet we probably have a lot of academic theorizing on governance institutions, and a single excellent government based on scalable principles might have influence beyond its own state.
    • The world is hiding, immobilized and wasted by a raging pandemic.

    It’s a draft. What should I add? (If, in life, you’ve chosen among ways to improve the world, is there a simple story within which your choices make particular sense?)

  • Meditative thinking

    I got fairly into meditation in 2018. Not in a way where you do a huge amount of meditation, or seek guidance on how to do meditation well, or on whether what you are doing is meditation at all. I don’t think I even graduated from doing meditation largely in five minute bouts. I just really dug a set of mental things which seemed related to meditation. I was riding an ethos. I suppose meditation was a particularly namable point in the space, but not obviously the most exciting part of it, so I ended up being pretty excited about ‘meditation…?’ while not actually meditating that much.

    I also remained so ignorant about meditation traditions and previous human meditation experience that you might doubt that I’m talking about the same thing. (I did talk to other people a little, and did about five guided meditations, which seemed like a different kind of thing anyway, but also very awesome.)

    With that said, here’s a kind of meditation I made up. I call it, ‘meditative thinking’. In it, you do all the stuff you would usually do while meditating: be in a meditation-conducive position (I do lying on my back, which I hear is a bad idea, but it is very comfy); set a meditation timer with bells at the start and end (mine also makes water sounds between them, but this probably isn’t crucial); close your eyes; be aware of your breathing; then just go on being aware of stuff in a conspicuously conscious way. But before all this, choose a question to think about. And then, once meditating, think about the question. Maintain the same mental stance as you normally would meditating, insofar as that makes sense. Be aware of your breathing, and your awareness, and your thinking. If you stop thinking about the question, gently bring your awareness back to it.

    I still find this good. Though it’s possible that it gets much of its benefit from being a ritualistic way to actually think about a question for five or ten minutes without reflexively opening Facebook because thinking doesn’t correspond to a tab or occupy one’s hands.

  • San Francisco outing

    [Seventeen photos from a San Francisco outing, December 23, 2020.]

  • Discussion on the choice of concepts

    “The reason that you can currently make toast without doing great damage is just that your toaster is stupid.”

    “Can ‘stupid’ be correctly applied to toasters?”


    “What if I say no?”

    “Well, if you have a conception of stupidity that can’t be applied to toasters, and one that can, why would you choose the one that can’t?”

    “But I don’t have two—I’m talking about the actual concept”

    “There isn’t an actual concept, there are a bajillion concepts, and you can use whichever one you want.”

    “There’s one that people mean”

    “Not really—each person has a slightly different usage, and probably hasn’t pinned it down. For instance if you ask them if toasters are stupid, they might be unsure.”

    “Yes! They are unsure because they are trying to guess what the real concept is, from their limited collection of exposures to it. If it were about making one up, why would they be uncertain?”

    “They might be uncertain which one they want to make up”

    “You’re saying when people say words, they are ascribing meanings to them that they just made up, according to which definition they like most?”

    “Well, they like definitions that fit with other people’s usage a lot more than other ones.”

    “I think they are just guessing what the real meaning is”

    “There isn’t a real meaning”

    “Ok, what the consensus meaning is”

    “There isn’t a consensus”

    “Yeah but they believe there is one”

    “You’re like a word-meaning nihilist—you want only these ‘real’ word meanings, or at least these word meanings of consensus, yet you know they don’t exist. That seems sad.”

    “Maybe, but that doesn’t make it wrong. And also, I was talking about what other people do.”

    “What does it matter what other people do? You can use whatever meanings you want.”

    “That seems unfriendly somehow”

    “What if you do it in a friendly way? For instance, where a meaning is ambiguous, you choose the best one. For instance, you say toasters can be stupid?”

    “It’s more a vibe of do-it-alone responsibility for everything, thinking of others as machinery that happens to be near you, that rings my alarm bells. Leaving the common experience of word usage to stand outside the system, as it were, and push the common stock of concepts in the way that you calculate best. At least it seems somehow lonely and cold”

    “That’s a bit dramatic - I think the odd nudge in a good direction is well within the normal human experience of word usage. Plus, often people clearly redefine words somewhat in the context of a specific conversation. Would it be so strange if within our conversation we deemed ‘stupid’ applicable to toasters? Not doing so seems like it will only limit our discussion and edge us toward taking up some further concept like shmoopid to fill the gap.”

    “It’s not clear at all to me that that is the only bad consequence at stake. For instance, words have all kinds of connotations besides what you explicitly think of them as about. If you just declare that stupid applies to toasters, then try to use it, you’ll doubtless be saying all kinds of things about toasters that you don’t mean. For instance, that they are mildly reprehensible, and that you don’t like them.”

    “I don’t know if I would have used it if I didn’t implicitly accept the associations, and this is a risk one seems to always run in using words, even when you would deem them to apply.”

    “Hmm. Ok, maybe. This sounds like a lot of work though, and I have done ok not thinking about using my influence over words until this day.”

    “You think you have done ok, but word meanings are a giant tragedy of the commons. You might have done untold damage. We know that interesting concepts are endlessly watered down by exaggerators and attention seekers choosing incrementally wider categories at every ambiguity. That kind of thing might be going on all over the place. Maybe we just don’t know what words could be, if we were trying to do them well, instead of everyone being out to advance their own utterings.”

  • What's good about haikus?

    Fiction often asks its readers to get through a whole list of evocative scenery to imagine before telling them anything about the situation that might induce an interest in what the fields and the flies looked like, or what color stuff was. I assume that this is fun if you are somehow more sophisticated than me, but I admit that I don’t enjoy it (yet).

    I am well capable of enjoying actual disconnected scenery. But imagining is effort, so the immediate action of staring at the wall, say, seems like a better deal than having to imagine someone else’s wall to be staring at. Plus, a wall is already straining my visual-imaginative capacities, and there are probably going to be all kinds of other things, and some of them are probably going to be described in exotic words chosen to hammer in whatever kind of scenic je ne sais quoi is going to come in handy later in the book, so I’m going to have to look them up or think about them while I keep from forgetting the half-built mental panorama constructed so far. It’s a chore.

    My boyfriend and I have recently got into reading haikus together. They mostly describe what things look like a bit, and then end. So you might think I would dislike them even more than the descriptive outsets of longer stories. But actually I ask to read them together every night.

    I think part of it is just volume. The details of a single glance, rather than a whole landscape survey, I can take in. And combined with my own prior knowledge of the subject, it can be a rich picture. And maybe it is just that I am paying attention to them in a better way, but it seems like the details chosen to bring into focus are better. Haikus are like a three-stroke drawing that captures the real essence of its subject. My boyfriend also thinks there is often something clean about the images.

    Some by Matsuo Bashō from our book The Essential Haiku, edited by Robert Hass [ETA: line positionings not as in original]:

    In the fish shop
    the gums of the salt-bream
    look cold

    Early fall—
    The sea and the rice fields
    all one green.

    Another year gone—
    hat in my hand,
    sandals on my feet.

    More than ever I want to see
    in these blossoms at dawn
    the god’s face.

    The peasant’s child,
    husking rice, stops
    and gazes at the moon.

    Year after year
    On the monkey’s face
    A monkey face

    In sum:

    Ten words
    Draw on my mind
    Cleaner than fifty lines

  • A vastly faster vaccine rollout

    When a traveler introduced smallpox to New York City in 1947, the city—and in particular its health commissioner, Israel Weinstein—apparently ran an epic vaccination campaign, reaching 5 million people in the first two weeks.1 That is, four hundred thousand vaccinations per day. That’s the entire population of San Francisco in two days.

    For covid, the first New York City vaccine was given on the 14th of December, and if I understand correctly, by the 10th of January, twenty-seven days later, 203,181 doses had reportedly been given. That’s around eight thousand doses per day: a factor of fifty fewer.

    That’s a pretty incredible difference. Why is New York fifty times slower at delivering covid vaccines in 2021 than it was at delivering smallpox vaccines in 1947?
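
    As a sanity check on that factor of fifty, here is the arithmetic implied by the numbers quoted above (a rough sketch; both sets of figures are approximate):

    ```python
    # Rough check of the rates quoted above. The 1947 figure is the
    # reported 5 million vaccinations in two weeks; the 2020-21 figure
    # is the reported 203,181 doses in 27 days.
    smallpox_daily = 5_000_000 / 14   # ~357,000 doses/day in 1947
    covid_daily = 203_181 / 27        # ~7,500 doses/day in early 2021

    ratio = smallpox_daily / covid_daily
    print(round(smallpox_daily), round(covid_daily), round(ratio))
    # roughly 357,000/day vs 7,500/day: a factor of about 47
    ```

    So the quoted figures put the ratio a little under fifty, close enough for the question to stand.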

    Part of the answer is presumably ‘regression to the mean’: if thousands of different cities at different times try to roll out vaccinations quickly, with a similar basic ability to do so, and there is some random chance in how well it goes, then the one we tell stories about seventy years later will be one that got surprisingly lucky. You shouldn’t expect your own effort to go as well as that best one. But—without having done the statistics—I don’t think you should expect your attempt to be fifty times worse. New York didn’t get that lucky.

    Perhaps there are other differences in the problems faced. For instance, the current vaccine needs refrigeration, or we are so late in the disease spread that we can’t all crowd together around fast-moving vaccinators, or be rounded up in our natural crowded habitats, like classrooms or offices.

    Though the 1947 situation looks harder in important ways too. For one thing, there was no time to prepare. The vaccination project began the day that the disease was identified to health commissioner Weinstein. By 2pm, he was apparently holding a news conference publicly asking residents to be vaccinated. With covid, there were about ten months to prepare. For another thing, now people have smartphones with which they can be alerted, and computers and online systems that might be used to coordinate them and tell them what to do.

    I heard about this episode from the Rachel Maddow Show, and read about it in the New York Times, both of which bring it up to note the inadequacy in the current vaccine efforts. The New York Times says a rollout like this, “almost certainly couldn’t happen today”, and offers some explanations:

    1. Complicated relationships between city and other governments these days

      “In 1947, the city was able to act alone, as opposed to navigating a complicated relationship with the governor of New York and the federal government,” said Dr. Irwin Redlener, director of the Pandemic Resource and Response Initiative at Columbia University’s Earth Institute. “The city was able to say, ‘We’re going after this,’ and then make it happen.”

    2. A ‘hollowing out’ of the public health infrastructure

      But this time, with the coronavirus pandemic, New York faces a logistical hurdle. Experts in infectious disease point to a hollowing out of the public health infrastructure — not just in the city, but across the country.

    3. Lack of public faith in medical science, government and the media

      “This was the height of polio in the United States,” he said. “People had a much better sense of the impact of infectious disease. They saw it all the time, and they were rightly fearful. But they were also optimistic that medical science could conquer this. In 1947, there was tremendous faith in the medical community, unlike today.”…

      …Yet, [infectious disease experts] believe the biggest obstacle is not distribution but the public’s distrust of government, science and the media.

      “We’re coming out of a train wreck of messaging,” Dr. Redlener said. “We’ve learned that politics is poison to a public health initiative, especially during a crisis. Honesty and straightforward, clear messaging are absolutely critical.”

      In 1947, Dr. Weinstein was the only voice with a megaphone. He spoke and people listened.

      “Back then, there was a much simpler media landscape,” Ms. Sherman said as she laid out the Ad Council’s campaign, which is due to kick off early next year. “In today’s environment, we’re dealing with a highly, highly fragmented media. We’ll be relying on micro-influencers who are the trusted voices.”

    They seem to favor #3, noting that it is what ‘experts believe’. But it seems so implausible—am I totally misunderstanding the claim? On the current margin, if lack of public trust was making the vaccine rollout even ten times slower, wouldn’t we see campaigns begging us to go out and get the incredibly accessible vaccine, rather than seeing elderly people camping in outside queues to get vaccinated, and most people being told that they just can’t have a vaccine for months or a year? Perhaps they mean that ultimately the number of doses given out will be limited by public willingness to receive them? (Which seems surely true, but not necessarily the answer to an interesting question).

    The NYT’s other suggestions don’t seem immediately wrong, but are too vague for me to understand. I guess I’d like to know how things went differently at an object level. At what point, in 1947, did someone decide that it was their job to gather a volunteer army to administer vaccines, for instance? Do people work in similar roles nowadays? Did they think of this? Did they expect to get into trouble for having a bunch of lay people giving injections? Do they have to fill out a lot of paperwork to recruit them, whereas in 1947 they would have just shouted for them in the street? (I don’t know.) If New York had had these constraints, would their vaccination campaign have looked like ours, or is there more to explain?

    I suppose I have several questions:

    1. What is really going on, that can account for a near 50x slowdown?
    2. Why does the New York Times have such an unenlightening, vague, and seemingly wrong discussion so close to where one could have a genuinely interesting one?
    3. (Are these things related?)
    4. My guess after thinking about it for ten seconds is that the gap in speed is to do with more things being regulated and formalized. The difference in time it would take me to physically cause someone to have a cookie versus to legally sell someone a cookie seems huge enough for this kind of thing to account for large slowdowns, and I do expect creating a massive band of volunteer non-nurses to administer a vaccine to require time-consuming paperwork or to be straight up illegal. How does this explanation sound?

    This isn’t just an idle inquiry, and shouldn’t be just another interesting story for another payload of political disapproval.

    Naively extrapolating, New York City could be fully vaccinated in about seven weeks if we knew how to do what was done in 1947. At the current rate, which will presumably change, vaccinating everyone would take years.2
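
    The extrapolation behind those figures can be made explicit (a sketch; the population figure of roughly 8.4 million and the assumption of two doses per person are mine, not from the sources above):

    ```python
    # Naive extrapolation: how long to give every New Yorker two doses
    # at each of the two rates? Assumes ~8.4 million residents and two
    # doses each (both assumptions mine).
    doses_needed = 8_400_000 * 2
    rate_1947 = 5_000_000 / 14    # doses/day, smallpox campaign
    rate_2021 = 203_181 / 27      # doses/day, early covid rollout

    weeks_at_1947_rate = doses_needed / rate_1947 / 7
    years_at_2021_rate = doses_needed / rate_2021 / 365
    print(round(weeks_at_1947_rate, 1), round(years_at_2021_rate, 1))
    # about 7 weeks at the 1947 rate, about 6 years at the early-2021 rate
    ```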

    What if someone figured out how to replicate even part of what New York did before at Weinstein’s direction? In America alone, around three thousand people are dying each day now, as they wait for the vaccine. My boyfriend’s ninety year old grandmother in Vermont was diagnosed with covid last week. Her center was scheduled to begin vaccinating its residents this Wednesday.

    (Regardless of what makes things slower these days, good on everyone who is working to forward the vaccination effort, and also those doing their best to make it appropriately safe. Good luck being fast and right.)

    P.S. In fact the whole of America vaccinated fewer people in the first two weeks of having a covid vaccine than New York did in 1947:

    Also interestingly, others are not doing better.

    [Photos: school children waiting in line for immunization shots at a child health station in New York City; a crowd waiting to be vaccinated at the Department of Health building, New York City.]

    Photos from the Office of War Information and Al Ravenna at the World Telegram and Sun, via the Library of Congress

    1. My information about this is all from the New York Times, Wikipedia, and the Rachel Maddow Show 

    2. Naive extrapolation says six years, but this is especially naive since at that rate covid will speed things up by reaching many people before the vaccine does. Plus we should probably expect some speedup over time, if it was going to take that long. Or something different to happen. 

  • The time I got really into poker

    One time I decided it would be good to learn to play poker. I had probably learned to play some form of poker a couple of other times before, and forgotten. One way to play a game a lot is to play it with a computer rather than other people. An iPad turns Agricola from one of the slowest games that casual board gamers might still be bothered to play to something you can play a few quick rounds of over lunch. I downloaded some kind of poker app, and began. It was maybe 9pm, and I was maybe sitting on my bed, in maybe Berkeley. My memories are pretty unclear. The app was green I think, like some kind of casino table.

    In the app there were a series of tables with increasing stakes. The first one was ‘your grandmother’s table’ or something, at which you needed almost no money to play, and were given a lot of advice about what to do. If you won several games there, you could afford a single game at the next table up, and so on. If you lost enough that you could no longer buy in at your higher table, you would drop down, and if you could no longer afford grandmother’s, then you could transfer American dollars to the app-maker in exchange for more fake poker money, and keep going.

    I got the hang of the rules and began to play fast. And I got the hang of not losing and bought my way into higher tables. I played faster. I didn’t pause. The ends of games got the same reflexive flash of choice as any other part of the game. Time passed, and lots and lots of games. My mind started to wander, even while I played. It wandered to a memory of long ago, more vivid and detailed than memories that come when I call for them. How strange. I played very fast. And more memories appeared, intense against the smooth background of mindless poker. I don’t usually remember things for pleasure—recollection is a neutral activity, if not about something particularly upsetting or nice. But these were somehow pleasing. Not because they were about anything that would usually inspire happiness—they were mundane scenes, like a road that I crossed once to get to a gas station, and the look of the gas station, and the feeling of the sky and the car and the other passenger waiting and the afternoon (not a real example necessarily; I forget all of the particulars now)—but in their mere pungent existence, they felt somehow satisfying. I drifted between them and frantic yet peaceful poker. Hours passed. I often wondered what I had just done—what cards I had played, or why—and realized that I had no explicit recollection. More hours passed, and more scenes from younger years projected fragrantly into the flickering virtual cards. I don’t think I consciously explored the strange mental landscape, transfixed as I was by the irresistible torrent of poker moves to be made. I took action after action definitively, yet lived dreamlike above it. After nine hours, with morning beginning to establish itself, I stopped.

  • A different dictionary

    I enjoyed James Somers’ account (HT Liron Shapira) of how Webster’s dictionary used to be much more beautiful than dictionaries today, for instance:

    “…Flashing differs from exploding or disploding in not being accompanied with a loud report. To glisten, or glister, is to shine with a soft and fitful luster, as eyes suffused with tears, or flowers wet with dew.”

    Did you see that last clause? “To shine with a soft and fitful luster, as eyes suffused with tears, or flowers wet with dew.” I’m not sure why you won’t find writing like that in dictionaries these days, but you won’t. Here is the modern equivalent of that sentence in the latest edition of the Merriam-Webster: “glisten applies to the soft sparkle from a wet or oily surface.”

    Who decided that the American public couldn’t handle “a soft and fitful luster”? I can’t help but think something has been lost. “A soft sparkle from a wet or oily surface” doesn’t just sound worse, it actually describes the phenomenon with less precision. In particular it misses the shimmeriness, the micro movement and action, “the fitful luster,” of, for example, an eye full of tears — which is by the way far more intense and interesting an image than “a wet sidewalk.”

    It’s as if someone decided that dictionaries these days had to sound like they were written by a Xerox machine, not a person, certainly not a person with a poet’s ear, a man capable of high and mighty English, who set out to write the secular American equivalent of the King James Bible and pulled it off.

    My guess is that dictionaries became ‘official’ and so are written in an official style. And that official style is contrary to color and poetry, because these things are in some way personal, showing glints of a specific idiosyncratic soul. And part of the point of officialness is that the entity having it is expansively clean and impersonal, involving specific people only insofar as they can be homogenized and branded and made fungible. You’re not supposed to look them in the eye.

    I wonder if this is related to dictionaries being commonly treated as authoritatively defining words rather than documenting a group of people’s efforts to feel around for their meanings. Somers also says:

    “Notice, too, how much less certain the Webster definition seems about itself, even though it’s more complete — as if to remind you that the word came first, that the word isn’t defined by its definition here, in this humble dictionary, that definitions grasp, tentatively, at words, but that what words really are is this haze and halo of associations and evocations, a little networked cloud of uses and contexts.”

    I got curious about the author of this dictionary, Noah Webster, and learned about his 1783 Blue-Backed Speller (or formally, ‘A Grammatical Institute of the English Language’, speller section), which taught children to read and spell. It was the most popular American book at one point, according to Wikipedia, which also makes it sound like Webster might have had a substantial part in making American English different from English English:

    His most important improvement, he claimed, was to rescue “our native tongue” from “the clamour[30] of pedantry” that surrounded English grammar and pronunciation. He complained that the English language had been corrupted by the British aristocracy, which set its own standard for proper spelling and pronunciation.[31] Webster rejected the notion that the study of Greek and Latin must precede the study of English grammar. The appropriate standard for the American language, argued Webster, was “the same republican principles as American civil and ecclesiastical constitutions.” This meant that the people-at-large must control the language; popular sovereignty in government must be accompanied by popular usage in language…

    …As time went on, Webster changed the spellings in the book to more phonetic ones. Most of them already existed as alternative spellings.[33] He chose spellings such as defense, color, and traveler, and changed the re to er in words such as center. He also changed tongue to the older spelling tung, but this did not catch on.[34]

    Here is a version of Webster’s dictionary online, and Somers describes how to add it to your computer (which surprisingly worked for me in spite of his article being from 2014.)

  • Why not? potato chips in a box edition

    In reviewing my year I came across these photos of a box of potato chips I took in January on a plane. I took them because it seemed so much better to me than chips in a bag:

    chip box 1 chip box 2 chip box 3 chip box 4

    The corners of the box opened out, so that it was like a bowl.

    As usual, I wonder, is it actually better or does it just seem so to me? If it is better, why aren’t most chips sold this way?

    Ways it seems better to me:

    • you don’t get oil from the bag all over your hand
    • you can see what you are picking up, which is somehow pleasing
    • and which also allows more delicate grasping, rather than just thrusting your hand into the food, which seems better for sharing
    • it has a natural surface to sit on, which is more pleasing than a bag that sits on its side
    • it would fit neatly in a drawer, instead of taking up a large shapeless space (this is getting minor, but I do imagine it affecting what I would in fact want to buy)

    Further evidence that this is good is that people often pour chips into bowls.

    A downside is that it seems to involve more packaging, though I’m not sure how the environmental impacts of cardboard and normal chip packaging compare. Other foods regularly involve cardboard boxes, along with numerous layers of other packaging material, so it would be surprising if that was prohibitively costly. I think the foil-like layer is needed to avoid air getting in and making the chips stale (chips are actually packaged with nitrogen to keep them fresh, apparently).

    There could easily be costs I don’t know about. For instance, apparently normal chip packaging involves carefully chosen layers of polymer for things like repelling moisture and avoiding package breakage. And some materials will change the taste of ingredients. I also wonder if you could do a similar thing with creative use of entirely normal chip packaging, though there is less of an existence proof there. I’m imagining something like this:

    chip drawing

  • Condition-directedness

    In chess, you can’t play by picking a desired end of the game and backward chaining to the first move, because there are vastly more possible chains of moves than your brain can deal with, and the good ones are few. Instead, chess players steer by heuristic senses of the worth of situations. I assume they still back-chain a few moves (‘if I go there, she’ll have to move her rook, freeing my queen’) but just leading from a heuristically worse to a heuristically better situation a short hop away.

    In life, it is often taken for granted that one should pursue goals, not just very locally, but over scales of decades. The alternative is taken to be being unambitious and directionless.

    But there should also be an alternative that is equivalent to the chess one: heuristically improving the situation, without setting your eye on a particular pathway to a particular end-state.
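
    The chess version of this can be sketched as a one-step greedy rule: no end-state in mind, just a move to whichever reachable situation scores best on a heuristic (a toy illustration; the ‘living room’ state, moves, and scores are all made up):

    ```python
    # Toy contrast with backward chaining: instead of planning a path to
    # a fixed end state, repeatedly pick the available move whose
    # resulting situation scores best heuristically.
    def condition_directed_step(situation, moves, apply_move, heuristic):
        # One greedy hop: no goal state, just 'a better situation than now'.
        return max(moves, key=lambda m: heuristic(apply_move(situation, m)))

    # Hypothetical example: improving a 'living room niceness' score.
    state = {"tidy": 0, "light": 0}
    moves = ["tidy up", "open curtains"]

    def apply_move(s, m):
        s = dict(s)
        if m == "tidy up":
            s["tidy"] += 2
        else:
            s["light"] += 1
        return s

    heuristic = lambda s: s["tidy"] + s["light"]
    print(condition_directed_step(state, moves, apply_move, heuristic))
    # picks 'tidy up', the heuristically best single hop
    ```

    At larger scales, the analogue is choosing whichever available action most improves your situation by your own heuristic sense, with no planned path to a particular end.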

    Which seems like actually what people do a lot of the time. For instance, making your living room nice without a particular plan for it, or reading to be ‘well read’, or exercising to be ‘fit’ (at least insofar as having a nice living space and being fit and well-read are taken as generally promising situations rather than stepping stones immediately prior to some envisaged meeting, say). Even at a much higher level, spending a whole working life upholding the law or reporting on events or teaching the young because these put society in a better situation overall, not because they will lead to some very specific outcome.

    In spite of its commonness, I’m not sure that I have heard of this type of action labeled as distinct from goal-directedness and undirectedness. I’ll call it condition-directedness for now. When people are asked for their five year plans, they become uncomfortable if they don’t have one, rather than proudly stating that they don’t currently subscribe to goal-oriented strategy at that scale. Maybe it’s just that I hang out in this strange Effective Altruist community, where all things are meant to be judged by their final measure on the goal, which perhaps encourages evaluating them explicitly with reference to an envisaged path to the goal, especially if it is otherwise hard to distinguish the valuable actions from doing whatever you feel like.

    It seems like one could be condition-directed and yet very ambitious and not directionless. (Though your ambition would be non-specific, and your direction would be local, and maybe they are the worse for these things?) For instance, you might work tirelessly on whatever seems like it will improve the thriving of a community that you are part of, and always know in which direction you are pushing, and have no idea what you will be doing in five years.

    Whether condition-directedness is a good kind of strategy would seem to depend on the game you are playing, and your resources for measuring and reasoning about it. In chess, condition-directedness seems necessary. Somehow longer term plans do seem more feasible in life than in chess though, so it is possible that they are always better in life, at the scales in question. I doubt this, especially given the observation that people often seem to be condition-directed, at least at some scales and in some parts of life.

    (These thoughts currently seem confused to me - for instance, what is up with scales? How is my knowing that I do want to take the king relevant?)

    Inspired by a conversation with John Salvatier.

  • Evening drawing

    This evening I became tempted by a YouTube video of an artist painting a portrait, which led me to be tempted by another such video, and then more of them, and then by one of these artists’ websites, and then by my own pencils and paper. (I did not become tempted by YouTube videos advertising breaking news of some sort of crazy Trump riot, since I decided not to ‘check the internet’ until bed time).

    Some observations on drawing:

    • In my world, art supplies are a kind of archetype of a thing that is viscerally exciting vastly beyond its likelihood of being used (and even further beyond its likelihood of being used well). Like, look at this paper I got! (I did, extensively.) Think of the potential it holds! But actually, for now, it’s sitting under my bed. Colored paper
    • It is sometimes very hard to draw without poking my tongue out. Not in the sense that without my tongue out, drawing is hard—I mean, while drawing, it seems infeasible to keep my tongue in my mouth. This feels natural and unsurprising, like how if you stand on one leg it is hard to not fall over, not like an epileptic seizure overcoming my tongue. But stated explicitly, it is pretty weird.
    • It is also frequently very hard to draw without tilting my head from side to side. I first noticed this in high school art class, where I sat drawing and listening to the other students talk. One of them pointed out to another that looking around the room, students were tilting their heads to and fro all over the place. I also overheard an eye-opening discussion of how awesome it was that we were going to war in Iraq imminently, since that might clear space in the ranks of military helicopter pilots, allowing the excited student to fill it. Drawing and overhearing seem like good complementary activities.
    • It’s hard to really get behind my attempts to draw well, because it doesn’t feel like they are leading anywhere. I would be more into it if it was part of a compelling plan to save the world. Perhaps I don’t really believe in leisure enough to wholeheartedly do it, and don’t disbelieve in it enough to work all of the time. Alas.
    • Even without the world perhaps needing saving, drawing feels a bit pointless, since we have photography. But this doesn’t make sense: if you want to create an image of a scene, much better to draw it than to have to create it in real life then photograph it. For instance, most of the time, the scene you want probably isn’t realistic in the sense that it looks exactly like some bit of reality. How about computer graphics then? Yes, but even if arbitrary styles of image could be computer generated, doing this is itself basically just another kind of drawing—you still have to understand what patterns on the page would produce what in the minds of viewers, and you still have to invent ideas worth sending to the minds of viewers. It’s just that the interface for putting down colors and shapes is different. That said, usually I just draw people, who might be better photographed, so it’s unclear that these thoughts apply. I feel like there are other good things about it that I haven’t captured. For instance, there are lots of great paintings just attempting to capture what is in front of them.
    • I feel like I learn to draw faces better over the years via taking in insights about what they look like. For instance, ‘there is that little dip above the lip that you can fill with shadow’, or ‘eyes go half way down’, or the very early, ‘mustaches go under noses, not on top of them’.
    • I basically can’t imagine faces, at least to anywhere near the degree of realism with which I can draw them (checking this now, I’m struggling to imagine a generic smiley face, though arguably succeeding). It’s interesting to me that a person can draw what they can’t imagine.
    • I have the intuition that you can learn to draw more easily by looking at someone else’s drawing than you can by looking at a photograph of its subject. I decided to test this, and drew three people—one from a painting, one from a photograph, and one from my head. Can you guess which is which? (If so, how?) Drawing of a woman's face 1 Drawing of a woman's face 2 Drawing of a woman's face 3

  • Centrally planned war

    We watched Dunkirk, and wondered how many military deaths are for reasons more of logistics than of facing the enemy. Probably lots - we have heard that war is made of colossal logistical feats, so probably they often fail, and often lives depend on them.

    (Imagine organizing a party with hundreds of thousands of people at it. Imagine that is located in an overseas country, where you don’t have a house, and everyone hates you. Imagine that it goes for several years. Imagine it is a very stressful party for the partygoers, but also you are counting on them to carry out some hard and terrifying tasks for you during the party. Imagine you anticipate many deaths during the proceedings.)

    Which made me wonder, why is war so centrally planned? Why wouldn’t all these logistical details be simpler and cheaper in the usual ways if each soldier looked after himself mostly? Similar to how it works better for each person to look after themselves during peacetime, rather than having commanders organize and dictate the whole peaceful existence effort. Thoughts?

  • Intended enjoyment

    I am sometimes unsure what is meant to be enjoyed in things, for instance the short story The Gioconda Smile, or In the mood for love which people often enjoy a lot. Which seems like it shouldn’t be a problem, as long as I find something to enjoy in them. But it also seems like I would be seriously missing something if I was buying bags of coffee all these years just to appreciate the thick, substantive quality of the paper bag between my teeth as I chewed it. How many of my enjoyments are like this?

  • Review: The Gioconda Smile

    (Spoiler alert: discusses entire plot of The Gioconda Smile by Aldous Huxley)

    I’ve been reading short stories lately, which are often confusing to me, and I frequently wish that the author resolved the actual tension and relieved my actual curiosity more, by including some sort of short note at the end on what they were even trying to do.

    With that said, I read Aldous Huxley’s The Gioconda Smile, and was somewhat confused by it. I mean, it was a story. But since I got it from ‘50 Great Short Stories…a comprehensive selection from the world’s finest short fiction’, I’m expecting it to be somehow surpassingly great.

    The protagonist is a relatively uncaring womanizer, who we first meet making a friendly visit to one woman while another waits in his car, to be dropped off before he gets home to his wife, who he finds disgusting in her aged ill-health. He seems to be largely driven by avoiding boredom and upholding an image of himself as a sophisticated and charming stud. He knows that his tendency to feel loathing in the place of pity toward the unfortunate in general is not ‘comely’, but has abandoned feeling shame about this unfortunate fact of his nature. When his wife dies, he immediately marries his young mistress, on an apparently humor-fueled whim and with little regard for her. He is ever detached, lightheartedly throwing back strategic retorts to questions sent his way, never actually with another person in spirit. He seems intelligent and reasonable aside from being callous to the point of ridiculous imprudence. For instance, if you marry a woman as a ‘practical joke’, this might predictably cause inconvenience to your own life at some point, even if you are dedicatedly indifferent to her welfare.

    In fact the biggest thing that goes wrong is that the intriguing woman he visits but doesn’t dally with—she with the mysterious Gioconda (Mona Lisa) smile—falls in love with him, surreptitiously murders his wife, attempts too late to claim him, then frames him for the murder.

    So why is this good to read? What would one come to this for?

    It is probably a memorable story, but why is it worth remembering? Has it added anything rich to my life?

    Is it perhaps a moralizing story? The protagonist is gratuitously awful, which is some hint. One way it could be this is as a tale of his errors coming back to get him, where by emptily seducing every woman around, he was carelessly pitching powerful passions against one another, and somebody was bound to be crazy enough to do something terrible (his second wife also tries to kill herself, upon realizing that her husband doesn’t love her). But why write that? For people who would feel no moral compunctions about the guy’s behavior in general but may not have considered the pragmatic consequences?

    Another way it might be a moral story is that his own joyful amusements are cut short by another’s incredibly selfish callousness, which serves him right after his own years of the same? I don’t know, this doesn’t seem like a very interesting moral saga.

    Maybe it is meant as an interesting, amoral portrait of a callous but smiling man, coldly charming women as he goes on with his empty existence, until the wake of his own heartlessness flips his boat? I didn’t come away with a vivid sense of his character, but perhaps I wasn’t paying enough attention.

    The title suggests that it is actually a portrait of the enigmatically smiling woman in the background, though she is an infrequent character. The protagonist imagines her as having mysterious depths to her personality in a way that his straightforward wife and lover do not, but again I didn’t end up with a well-formed sense of them.

    Maybe it’s for the murder mystery plot? The real murderer wasn’t revealed until the end, but as soon as it became clear that the wife was murdered, it seemed pretty obvious who had done it. For instance, if you assume that it wasn’t the husband (since that would conflict with the narration of his confused perspective) then the other visitor at the time of the poisoning was ol’ Gioconda smile. I guess it could have been an unnamed servant. The protagonist still seems confused about who the murderer could be at this point though, which is a bit strange if the author agrees that it is obvious.

    Maybe the point is to try to guess the murderer before it comes out that it was a murder? There were clues. For instance, Gioconda was like ‘your wife seems very sick, she could die any time’ and then the wife dies a few hours later. Then when the husband later goes to see Gioconda, she is like, ‘I’m impressed by how you put up with your wife for so long, but I saw that you weren’t really into it, and actually you needed a soul mate; also I love you’. But it seems like you would only guess that this is a murder mystery at that point if your mind happened upon the thought that this is all suspicious, at which point you have also solved it.

    It has some insightful observation of psychology. For instance, the protagonist reflecting on Gioconda’s decorations while waiting for her, illuminating both her psychology and his own:

    Photographs of Greek statuary, photographs of the Roman Forum, coloured prints of Italian masterpieces, all very safe and well known. Poor, dear Janet, what a prig—what an intellectual snob! Her real taste was illustrated in that water-colour by the pavement artist, the one she had paid half a crown for (and thirty-five shillings for the frame). How often he had heard her tell the story, how often expatiate on the beauties of that skilful imitation of an oleograph! “A real Artist in the streets,” and you could hear the capital A in Artist as she spoke the words. She made you feel that part of his glory had entered into Janet Spence when she tendered him that half-crown for the copy of the oleograph. She was implying a compliment to her own taste and penetration. A genuine Old Master for half a crown. Poor, dear Janet!

    This seems richer to me. But The Gioconda Smile was made into multiple films, which seems surprising if the value is in the unspoken thoughts of characters or manner of description, since these are hard to translate into action.

    It seems likely that I’m missing something.

  • On writing like a butterfly

    I thought it would be interesting to try to write my review of The Diving Bell and the Butterfly in my head without setting pen to paper until the end, and to convey at least some of it by blinking, since I find the fact that the author wrote the whole book in this way astonishing. Perhaps experiencing that process myself would improve my understanding of things, such that I wouldn’t be astonished.

    I think trying to do this was an even better exercise than I expected, though by the end I was frustrated to the point of tears, and I’m still feeling kind of annoyed, having just put it up.

    (Hopefully this was also a vivid and enlightening experience of signing up for annoying projects, which I do often, but usually the annoyance is months later than the agreeing, so I’m not sure that my intuitive anticipations make the connection.)

    Before I go and do something anti-annoying, I figure I should write some notes on the experience, while it is fresh.

    Some notes:

    • It did feel fairly encumbering. There were nascent sentences that I might have tried to poke in somewhere, then play around with, then look at and move or get rid of, where the prospect of trying to do some equivalent of all that in my head while keeping hold of the broader paragraph was too intimidating, and I watched them go by. And the sentences I did write felt like half my attention was on something like balancing them on the end of a stick and not having them fall on the floor, and really sculpting them would have required too much dexterity.
    • Though I think in some sense they were much more sculpted than usual, because I did think about each one for longer, and often hone it into something more succinct and memorable instead of writing down the first ramble that entered my mind. I’m not sure how that fits with the above observation.
    • It felt mentally strength-building - as if I was exercising a capability that would improve, which was exciting, and I briefly fantasized about a stronger and defter inner world.
    • I started out looking at things around me as I composed, like my resting computer, and the table, and the sea. But after a while, I realized that I was staring intently at a long rug with about as many Persian whorls as paragraphs in my prospective post, and that as I envisaged the current sentence, I was mentally weaving it around some well-placed sub-curls of its paragraph-whorl. Looking away from it, it was harder to remember what I had been saying. (I have noticed before that thinking in the world, I end up appropriating the scenery as some kind of scratch paper - you can’t write on it, but you can actually do a lot with reinterpreting whatever it already contains.)
    • For words with lots of synonyms, I kept selecting one, then forgetting which and having to select again (e.g. ‘lively’ or ‘energetic’ or ‘vigorous’?)
    • I originally set out to compose the whole thing before writing it, but this was fairly hard and seemed somewhat arbitrary, so after composing the basic outline and a few paragraphs, somewhat discouraged by the likelihood of forgetting them again imminently, I decided that I could instead compose chunks at a time rather than having to do it all at once. In the end I did it in paragraph chunks. Which is probably a much easier task than Bauby had, since if someone was coming to transcribe stuff for hours, one probably wants more than one paragraph relatively well prepared.
    • Thinking lots of thoughts without saying or writing them can be agitating in a particular way.
    • It took about 20 minutes for my boyfriend and me to transcribe a single sentence using roughly the winking method described in the book, for a speed of around 1 word per minute. The scheme was for him to run his finger over an alphabet reorganized by letter frequency, then for me to wink when he reached the desired letter. We added some punctuation, and a ‘pause! let me think!’ signal, and ‘yes’, and ‘no’. These last three got a lot of use. It basically worked as expected, though one time we made an error, and I didn’t know what to do, so I continued from the beginning of the word again, which made the sentence nonsensical, which confused him for a while, but he figured it out.
    • I wondered why Bauby and his assistant didn’t use Morse code, or something more efficient. We didn’t try this, but some forum users also wonder this, and one claims that he can wink out about 20 words per minute in Morse code, but that the large amount of blinking involved is ‘pretty tiring’.
    • We made a huge amount of use of my boyfriend guessing the rest of the word, from context and the first few letters. In the book, Bauby describes how people frequently mess that up, or fail to check that they have guessed correctly, or refuse to guess and conscientiously coax forth every letter. This all sounds terrible.
    • I’m aware that some people probably compose things entirely in their heads all the time (people have all kinds of mental situations - some people can also reliably imagine a triangle without it being more like the feeling of a triangle laid out in a kind of triangle-like space, or breaking apart and becoming a volcano full of red and white flowers), and my notes here probably sound to them like a person saying ‘for a bizarro experience, I tried to walk across the room without holding on to things, but it was obviously a total disaster - knees bending every which way, and imagine balancing a whole floppy and joint-strewn human body on top of two of those things, while moving! Such sympathy I have for those who have lost their walking frames.’ I’m curious to hear from them whether this is what it sounds like.
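    (A tangent, since I was curious: the gain from reorganizing the alphabet by letter frequency can be sketched with a rough calculation. The sketch below is my own back-of-envelope version, using a standard approximate English letter-frequency table; it is not Bauby’s actual scheme, which used French frequencies.)

```python
# Rough estimate of how much partner-assisted letter scanning speeds up
# when the alphabet is reordered by letter frequency. Assumes the partner
# scans one letter per step and the listener winks at the desired letter.

# Approximate English letter frequencies (percent), a standard table.
freq = {
    'e': 12.7, 't': 9.1, 'a': 8.2, 'o': 7.5, 'i': 7.0, 'n': 6.7,
    's': 6.3, 'h': 6.1, 'r': 6.0, 'd': 4.3, 'l': 4.0, 'c': 2.8,
    'u': 2.8, 'm': 2.4, 'w': 2.4, 'f': 2.2, 'g': 2.0, 'y': 2.0,
    'p': 1.9, 'b': 1.5, 'v': 1.0, 'k': 0.8, 'j': 0.15, 'x': 0.15,
    'q': 0.10, 'z': 0.07,
}
total = sum(freq.values())

def expected_scans(order):
    """Average number of letters the partner scans per letter selected."""
    return sum(freq[c] / total * (order.index(c) + 1) for c in freq)

alphabetical = sorted(freq)                              # a, b, c, ...
by_frequency = sorted(freq, key=freq.get, reverse=True)  # e, t, a, ...

print(f"alphabetical order: {expected_scans(alphabetical):.1f} scans/letter")
print(f"frequency order:    {expected_scans(by_frequency):.1f} scans/letter")
```

    With this table, frequency ordering cuts the expected number of letters scanned per selection by roughly a third. The much larger savings in practice, though, came from guessed word completions.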

  • Review: The Diving Bell and the Butterfly

    I suspect my mind of taking its observations of a person’s physical energy and dexterity as strong evidence about their mental quickness and clarity.

    The existence and the wrongness of this presumption were brought into relief for me by reading Jean-Dominique Bauby’s The Diving Bell and the Butterfly, on his life with locked-in syndrome. Because realizing that the author’s lively and intelligent voice issued from a single blinking eye looking out of a mostly inert body felt like seeing a magic trick.

    But perhaps it is just that a writing process without vigorous back-and-forth with a mind-expanding piece of paper or virtual paper—a place to lay out one’s words and see them and move them around, without having to keep the entire gameboard in one’s head—sounds mentally paralyzing to me.

    At this point, I realize that in all likelihood it is my own mind that is weak and wasted from clinging to these crutches, and that has learned to fear starting out by itself with no piece of paper to reassuringly catch its thoughts and promise that nothing will be forgotten. Bauby’s mind may well have become stronger between his time as eloquent editor of French Elle and writing this book.

    It probably isn’t the feat of its creation though that makes this enjoyable to read, or makes my mind keep coming back to it. It’s a bright and compelling window into another person’s mind. (The bizarre tragedy of the plot probably doesn’t hurt either.)

    Another thing I found interesting about the book was that the setting of being immobile in a hospital bed doesn’t give the reader many of the clues they might usually use to make out the character of the protagonist. How does he hold himself, when he can hold himself? How does he talk to his partner, when he can do more than blink? What did he want in life, before the goalposts were changed and, as he says, not drooling all the time became a cherished ambition? (We do learn that he is 44 years old, and that if he is going to drool, he would prefer to drool on cashmere.) Yet, I did not have a sense of blankness about his personality. From his narration of the world, I felt that I knew him. However, toward the end, he narrates the final day before the stroke that mostly detached his mind from his body. And I realize that if I had met him in that way first, I would have a different impression of him. And I liked the way around that I did meet him. So I wonder if it would be better to more often meet people’s trains of thought for a while before getting to see their appearance or overt behaviors.

    I composed each paragraph of this post in my head except about fifteen words of editing, and dictated the first line to my boyfriend via winking.

  • Mistakes to want

    The end of the year is a classic time for reflecting on the year. And a classic part of reflecting is noticing mistakes you have made. I admit that I don’t relish this: having made mistakes, admitting to them, and looking at them further all pain me, and I find it hard to call things mistakes. It’s because to make a mistake would seem to be to make the world worse than it could have been, and thus to indelibly reduce the total goodness of the universe at the end of time, which feels like a big deal and the worst (only?) evil.

    Possibly others don’t have the same mental hangups around such things as I do, or have thought clearly about this earlier, but just in case not, I’ll spell out how it actually isn’t bad at all, even in these terms (or at least offer some somewhat scattered thoughts on the matter).

    Let us distinguish between two kinds of mistakes: ‘innocent mistakes’ and ‘evil mistakes’.

    Suppose you made a mistake. Was it a mistake given what you knew and understood at the time? Then that’s an ‘evil mistake’. For instance, if you think kicking dogs is worse than not kicking them, and you kick one anyway to experience what badness feels like. (It sounds kind of strange to call it a mistake, since it was what you intended, but whatever.)

    If your mistake was not an error given your understanding at the time, then it’s an ‘innocent mistake’. For instance, if you made an unsuccessful choice about what projects to do, or if you hurt your brother’s feelings because it hadn’t occurred to you that he would be sensitive about his appearance. It is tempting to say that an innocent mistake wasn’t a mistake in the relevant sense. You couldn’t have really done better in your actual circumstances of limited knowledge and limited thought, at least not predictably. Your conscience should be clear, at least. You did the best you could, so you do not seem to deserve blame or regret.

    Yet if blame and regret are for teaching people, then it seems you should have them, at least if they are the ways for you to notice or feel your mistake. For instance, perhaps your sister should say ‘you were so mean to Bob!’ or you should find your thoughts sadly dwelling on your choice of projects. On the other hand, there is not much point blaming the evil mistake-maker, unless your blame hurts them in a way that gives them a selfish reason to hold back in the future, supposing that they haven’t come around. Just pointing out that they were bad is no news to them. Regret might help them more, but it’s a bit unclear how, since by the time they are regretting their evil, it seems they have already changed their mind about whether they prefer evil. In sum, blame, regret and genuine moral wrongness seem to come apart: blame and regret are often helpful for the innocent mistake-maker, and less so for the evil mistake-maker, while it is the evil mistake-maker who is a genuine moral failure.

    A different way of putting this is that the notion of innocence put forward here is a bit different from the normal one - for instance, if you were thoughtless because you were young and hadn’t thought about not being thoughtless yet, and implicitly didn’t think being thoughtful was worth it, and you hurt someone, then you are innocent in the current sense, but quite possibly a guilty asshole in more common senses. We all agree that you should have the error of your ways pointed out to you, but I’m claiming that you shouldn’t take this to mean that you genuinely made the world worse than you might have, or feel that it is a true negative mark in some moral ledger.

    Only innocent mistakes are helpful for learning. And only evil mistakes represent having genuinely made the world worse when you could (in the relevant sense) have made it better. So looking back on the year, one can hope without terror to see many innocent mistakes, and no evil mistakes.

  • Seeing the edge of the world

    A nice thing about looking at the ocean that I noticed today is that it is unusually easy to interpret the view as a close up of the edge of a giant wet ball of rock in space, and thus to more compellingly visualize the fact that I live on one of those, and some of all that that entails.

    The Pacific Ocean where I am got dark, so here’s the Norwegian Sea 2.5 years ago to illustrate. (Picture: the Norwegian Sea, probably)

  • Desires as film posters

    Sometimes I like to think of desires as like film posters. You come across them, and they urge you to do something, and present it in a certain way, and induce some inclination to do it. But film posters are totally different from films. If you like a film poster, you don’t have to try to see the film. There is no metaphysical connection between the beauty of a film poster and the correctness of you seeing the film. It’s some evidence, but you have other evidence, and you get to choose. A film poster can be genuinely the most beautiful film poster you’ve ever seen, without the film being a worthwhile use of two hours. That’s largely an orthogonal question. If you put up the poster on your wall and look at it lovingly every day, and never see the film, that doesn’t need to be disappointing—it might be the best choice, and you might be satisfied in choosing it.

  • Measuring up to incredible potential

    Yesterday I wrote that people often talk as if events are basically determined by people’s values and capabilities, ignoring the difficulty of figuring out which opportunities to take, or even noticing opportunities.

    I think one reason to have a better model is that this one doesn’t account for a substantial category of felt difficulty in being a human, possibly encouraging a general sense that one is ubiquitously failing, what with not seeming to be demonstrably grabbing the best of a vast multitude of possible options at each moment.

    My own experience for instance involves often not remembering even the options that I noticed before, and these not seeming that multitudinous, except relative to my pitiful capacity to hold things in my mind. Also, regularly noticing new aspects of the world suggestive of many options whose existence hadn’t even dawned on me previously, even though they would be pretty obvious to a creature for whom logic was free. And mostly being extremely uncertain about what actions will lead to what outcomes. If I habitually model myself as roughly homo economicus, deftly directing my resources toward my goals, at worst according to my pristine priors, it seems like I am in danger of continually finding my real self incredibly disappointing. Plus, abstracting away parts of the situation that might be rich with potential for improvement.

  • Infinite possibilities

    Naively, for instance from the perspective of me as a child, it seems like a person has vastly many possible options at each moment, leading out in every direction, where many of them surely lead to amazing things, and thus it should be very easy to have an incredibly great life and make a huge positive difference to the world.

    The problem with this is that having the ability to do incredible things, and wanting to do those incredible things, is not enough. If you can also do a bazillion other non-incredible things, then you also have to be able to pick out the incredible path from among the rest, and even if you do, a moment later it hits another incomprehensibly complicated intersection of unmarked paths, and you have to do it again.

    This perhaps sounds obvious, but I think we do often still talk as if what happens is determined by people’s goals and their capabilities, and ignore the issue of computing which exercise of capabilities will bring about which goals, or leave it as hopefully irrelevant noise in the model. My tentative guess is that this is a real impediment to thinking about the world and strategizing about life well.

    I don’t know if anyone has a better model, or has thought about how bad this is. My tentative guess is that it is bad. It seems like something economists would think about, but I’m not sure what it would be called.

  • An open Christmas card to my stepfather


    Happy Christmas!

    I made you a link post of things you might find interesting on the internet. (Please don’t feel obliged to look at them!)

    Lots of love, Katja oxo


    (Picture: Christmas Eve, by Carl Larson, 1904-5)


    1. George Saunders, e.g. The Braindead Megaphone, and Tent City, USA

    2. Epic tomato sorting machine

    3. Slatestarcodex archives, e.g. on meaning and the problem of evil and thinking under conformity pressure (these are incidentally all funny stories, but the site is mostly non-fiction)

    4. Test your ability to distinguish real people from AI generated faces (or to just see unreal people, look at thispersondoesnotexist.com)

    5. Julia Galef, probably via Rationally Speaking Podcast

    6. Robin Hanson: Politics isn’t about policy; The Great Filter - Are We Almost Past It?

    7. Did President Truman not realize that Hiroshima was a city?

    8. Politics and the English Language: I found the writing ironically uncompelling, but the idea interesting

    9. Hamilton the Musical

    10. How to live for much longer in expectation

    11. Xkcd, e.g.


    Informative ones: Earth’s temperature over time, chart of what improvements are worth the time

  • An open Christmas card to my brothers


    Dear Brothers,

    Happy Christmas! I made you a list of things you might like on the internet. I won’t tell you which of you each was intended for, so feel especially free to skip any that don’t look interesting.

    Love, Katja


    (Picture: nearing Christmas in New York 2017)




    George Saunders: The incredible Buddha boy, The New Mecca

    Archive of 3495 Peter Dombrovskis photographs

    Nice buildings that are mostly mosques (click the x when it misleadingly says that you need to log in to see more)

    Opulent Joy: my friend’s art blog

    Unsong: one of my favorite books that I haven’t finished reading.

    How to do what you love: Paul Graham has some other good essays too

    AI dungeon: have cutting edge AI write you a story

    The oldest video: strange

    It’s as simple as one, two, three

    3Blue1Brown, e.g. The unexpectedly hard windmill question or the first of a 15 part series on linear algebra, if you are in for something more involved

    Extensive Mandelbrot set

    Rent a real goat for your video calls

    The (Mormon) Tabernacle Choir

    Universal love, said the cactus person

    The guns falling silent at the end of WWI, reconstructed into sound from ‘sound ranging’ equipment that translated sounds onto visual film at the time.

    Very impressive patty cake a cappella (and more: 3D pretty a cappella patty cake)

    AI fills in a cat/house/etc if you draw the outline. I thought these buildings were pretty good:

    eg1 eg2

    Hamilton (the musical)

    How to live for much longer in expectation

  • Open Christmas card to my father


    Dear Daddy,

    Merry Christmas! I got you a list of things on the internet I thought you might find interesting. No obligation to look at them. (Many things I’ve enjoyed lately are old or about history, so there’s a good chance that you have seen them or seen much better things, but oh well.)

    Lots of love from Katja oxo


    (Picture: our Christmas tree)



    1. AI dungeon: a text adventure written by AI in response to your responses

    2. AI can also generate coherent completions of photographs (See examples under ‘Completions’)

    3. A lecture about the Bronze Age collapse (I enjoyed this, though your historical understanding being thousands of times greater than mine might make it less good for you)

    4. Slave voyages: a memorial in the form of a database of 36,000 specific slaving expeditions that took place between 1514 and 1866. Perhaps most easily appreciated in the form of this timelapse of the journeys (note that you can click on any dot and see the details of the ship and captives). I’m pretty impressed at them finding this much data.

    5. On the shortness of life (I enjoyed this, though don’t necessarily agree)

    6. How to live for much longer in expectation

    7. The isles of Greece

    8. The tallest building, the Burj Khalifa, is pretty cool looking:

    (Pictures: the Burj Khalifa, from Unsplash)

  • Open Christmas card to my mother


    Dear Mama,

    Happy Christmas!

    Rather than be so bold as to give you a further object to take care of, I gathered for you a collection of things I think you might like on the internet. I CC the world, in case they are interested, or want to add anything.

    Lots of love, Katja xox


    (Picture: Geboorte van Christus, Robert van den Hoecke, after Jan van den Hoecke, 1632 - 1668)



    1. Baby elephants

    2. A nice video of a 1965 living room clog dance

    3. Round seals that I gathered earlier

    4. Dance your PhD winner (though perhaps you showed that to me originally..)

    5. The complete works of Alfred Sisley

    6. Archive of Olegas Truchanas’ photography

    7. The 1890s, around the world, on film

    8. Positly: Perhaps you have enough opportunities to interview people, but if you want to run your own surveys of strangers online, this streamlines the process a lot (i.e. you basically write some questions and put money in the slot and they handle getting strangers to answer your questions and paying them). For instance, I asked some Americans why they voted as they did, and some other things about their experience of the election.

    9. Hamilton the Musical: I know I probably recommended this before, but a) I should note that it might be the strongest artistic recommendation I’ve ever given, and b) if you did listen to it, I recommend listening to it again.

    10. How to live for much longer in expectation (I know you don’t believe in this, but I encourage you to think about it more, because it’s important to me that you avoid dying if feasible.)

    11. Rijksmuseum: the national museum of the Netherlands has a pleasing website for their pleasing art collection.

    Here are some that I like today:

    Morning Ride along the Beach, Anton Mauve, 1876
    Fishing Pinks in Breaking Waves, Hendrik Willem Mesdag, c. 1875 - c. 1885
    Landschap in de omgeving van Den Haag, Jacob Maris, 1891
    Cellar of the Artist’s Home in The Hague, Johan Hendrik Weissenbruch, 1888

  • Moral moralizing

    Most people get their ethics from a combination of trusting what is normally done in their society and doing what they feel is right.

    It seems to me that this has been utterly discredited as a reliable source of ethical advice, because it is the same one that recommended to the average person slavery as labor, and genocide as standard geopolitical strategy, and rape as prize, and torture as entertainment.

    I don’t know of a clearly better source, for a typical person who isn’t a professional ethicist. But given the intense fallibility of this one, I’m inclined to say that the resulting moral views should be held with uncertainty and modesty.

    From this I’m inclined to infer that people should not be moralizing about moral disagreements.

    I do share the urge to moralize when I’m in a moral disagreement, but is this wrong?

  • What is food like?

    I have an Anki1 deck of things I feel like a failure regarding. Instead of each card having a question that I see if I can remember the answer to, it has a potentially shameful thing that I see if I still feel bad about. Each time I look at one, as well as marking it correct to the extent that I no longer feel bad about it, I briefly do a little bit to make it better. (Learn about the thing I’m embarrassed to not know about, practice the skill that I don’t have, think about whether it’s a real problem, etc). My sense is that one can often feel bad about something for a long time which one could alternatively make marked progress on in a very short time.

    This time I drew a card marked, ‘can’t critique food’. Admittedly not the most horrifying of failures, and I don’t currently feel too bad about it, or remember writing it down. But I suppose that what I had in mind is that when people discuss the merits or subtleties of different foods and food establishments, I often feel like the topic is outside my domain, and furthermore suspect that if my strongest views of the moment were revealed—e.g. “good fries > bad fries » not fries”, or “I want to eat something cool and wet and I prefer it involve parsley”, or “pea protein is the worst”—I would seem childish.

    So I read online briefly and found that Jonathan Gold is a famous food critic, then read a very small amount of his writing. Now I will have a go at critiquing food at all, which I expect is a decent step toward being passable at it (while also fulfilling my intention to occasionally do things I haven’t done).


    On an almost empty teacup of Mary’s Gone Crackers I found in my bedroom while cleaning up recently.

    Food that you know has been sitting open on your desk for a week can be tainted by a foul flavor of unease. But my expectations for these crackers were cleansed with the fresh-toasted snap and delicate flavor of my first nibble of one.

    Crispy things are often light and insubstantial—crunch and heft seem to often overlap in coarse, unyielding foods. Yet this Mary’s Gone cracker had both a pleasing, easy crackle and a real density. And while in no way chewy, it was good to chew: coming apart not into a wheat slurry, but into a textured rubble of seed meal, satisfying and nutty. I wanted to find the fragments lost in the corners of my mouth. I wanted to crush the last seeds with my teeth. All the while the bright taste of toasted herbs lingered, sometimes veering into a burning. Not of chili, but of flavor.

    Then my boyfriend became available for chatting, and I chatted with him, ceasing to resist the temptation to eat virtually all of the crackers. I bit them thoughtfully as we spoke, knowing I was eating up the opportunity to critique them, but only sparing enough attention from the conversation to favor this end, on grounds that they were presently delicious.

    In sum, Mary’s Gone Crackers are very good crackers, and survive surprisingly well in an uncovered teacup. If you keep a stack of them in a teacup with a somewhat but not much larger diameter, this can also be spatially pleasing.

    1. A spaced repetition flashcard system 

  • Elephant seal


  • Opposite attractions

    Is the opposite of what you love also what you love?

    I think there’s a general pattern where if you value A you tend to increase the amount of it in your life, and you end up feeling very positively about various opposites of A—things that are very unlike A, or partially prevent A, or undo some of A’s consequences—as well. At least some of the time, or for some parts of you, or in some aspects, or when your situation changes a bit. Especially if you contain multitudes.


    • Alice values openness, so tends to be very open: she tells anyone who asks (and many people who don’t) what’s going on in her life, and writes about it abundantly on the internet. But when she is embarrassed about something, she feels oppressed by everyone being able to see her so easily. So then she hides in her room, works at night when nobody is awake to think of her, and writes nothing online. Because for her, interacting with someone basically equates to showing them everything, her love of openness comes with a secondary love of being totally alone in her room.
    • Bob values connecting with people, and it seems hard in the modern world, but he practices heartfelt listening and looking people in the eye, and mentally jumping into their perspectives. He often has meaningful conversations in the grocery line, which he enjoys and is proud of. He goes to Burning Man and finds thousands of people desperate to connect with him, so that his normal behavior is quickly leading to an onslaught of connecting that is more than he wants. He finds himself savoring the impediments to connection—the end of an eye-gazing activity, the chance to duck out of a conversation, the walls of his tent—in a way that nobody else at Burning Man is.
    • An extreme commitment to honesty and openness with your partner might lead to a secondary inclination away from honesty and openness with yourself.
    • A person who loves travel also loves being at home again afterward, with a pointed passion absent from a person who is a perpetual homebody.
    • A person who loves jumping in ice water is more likely to also love saunas than someone who doesn’t.
    • A person who loves snow is more likely to love roaring fires.
    • A person who loves walking really enjoys lying down at the end of the day.
    • A person who surrounds themselves with systems loves the total abandonment of them on holiday more than someone who only had an appointment calendar and an alarm clock to begin with.
    • A person with five children because they love children probably wants a babysitter for the evening more than the person who ambivalently had a single child.
    • A person who loves hanging out with people who share an interest in the principles of effective altruism is often also especially excited to hang out with people who don’t, on the occasions when they do that.
    • A person who directs most of their money to charity is more obsessed with the possibility of buying an expensive dress than their friend who cares less about charity.
    • A person who is so drawn to their partner’s company that they can’t stay away from them at home sometimes gets more out of solitary travel than someone more solitariness-focused in general.
    • A person craving danger also cares about confidence in safety mechanisms.
    • A person who loves the sun wants sunglasses and sunscreen more than a person who stays indoors.

    This pattern makes sense, because people and things are multifaceted, and effects are uncertain and delayed. So some aspect of you liking some aspect of a thing at some time will often mean you ramp up that kind of thing, producing effects other than the one you liked, plus more of the effect that you liked than intended because of delay. And anyway you are a somewhat different creature by then, and maybe always had parts less amenable to the desired thing anyway. Or more simply, because in systems full of negative feedbacks, effects tend to produce opposite effects, and you and the world are such systems.

  • Thesauring

    Sometimes I really enjoy opening a thesaurus. I don’t know why. It’s just a moment of opening it, anticipating feeling around amongst the meanings of different words, weighing their rightnesses, which seems like a kind of heavenliness, sometimes. I think it was better in 2018, and now I’m mostly remembering that.

  • Genetic magic

    23andMe are now willing to guess where one’s ancestors are from at the level of counties. For instance, as well as thinking I have 19% Swedish ancestry, they now guess that it is primarily from Västra Götaland County. Which is in fact where my current Swedish relatives cluster. Their guesses in Ireland center on Cork, with Limerick and Tipperary next door 4th and 8th most likely (of 26 counties), and those two are where the few 17th-19th Century relatives I know about seem to have come from in Ireland, so that also seems pretty good.


    Much as I believe all that stuff about one’s body being full of cells that contain genetic code that is shared by one’s relatives, and about historic movement and mixing of populations being low, it’s awesome to actually see someone take a fairly good guess at what part of what country your obscure relatives lived in hundreds of years ago by examining your spit.

  • Some things I'm looking forward to in 2021: probable post-pandemic edition

    1. Fewer deaths all around
    2. A giant party at my house
    3. A portion of the research I feel bad about not doing just becoming irrelevant (e.g. what’s the evidence about surfaces now? Are we badly underestimating the harms of long covid?)
    4. Leaving my house in an unprepared fashion and seeing where it takes me
    5. Whatever it was that I used to do in places other than my house, that I actually can’t seem to remember or explicitly pinpoint and plan from a distance, but which I vaguely miss (possibly this is basically just 4)
    6. Seeing friends who live in faraway places such as Berkeley
    7. Going on a cross-country train and embracing the general lack of hygiene and space
    8. Seeing non-household friends without inadvertently spending a fraction of my attention on air dynamics and mask stability
    9. The stakes of everyday personal choices being lowered enough that people being thoughtless or foolish isn’t a critical threat to friendliness, harmony or anyone’s life
    10. Helping the economy of restaurants and cafes recover
    11. Casual minor encounters and non-encounters with strangers in shops, streets and cafes (and these being more of the meeting places of the world, and internet comments being less, for more people)
    12. Meeting new people regularly
    13. More distinct places and place-vibes to conveniently be in
    14. One fewer area to worry that I should be doing something better
    15. Listening to an audiobook on crowded public transit on the way to my downtown office, and then being at my office
    16. Reading and writing in notebooks in cafes
    17. The world feeling so big that you can go to other places and find people thinking about entirely different things
    18. Being dirty
    19. Crowded, dimly lit places with atmosphere and mild uncertainty
    20. Resolution of this episode
    21. Watching everyone else enjoy things coming back
  • What is it good for? But actually?

    I didn’t learn about history very well prior to my thirties somehow, but lately I’ve been variously trying to rectify this. Lately I’ve been reading Howard Zinn’s A People’s History of the United States, listening to Steven Pinker’s The Better Angels of Our Nature, watching Ken Burns and Lynn Novick’s documentary about the Vietnam War, and watching Oversimplified history videos on YouTube (which I find too lighthearted for the subject matter, but if you want to squeeze extra history learning into your leisure and dessert time, compromises can be worth it).

    There is a basic feature of all this that I’m perpetually confused about: how has there been so much energy for going to war?

    It’s hard to explain my confusion, because in each particular case, there might be plenty of plausible motives given–someone wants ‘power’, or to ‘reunite their country’, or there is some customary enemy, or that enemy might attack them otherwise–but overall, it seems like the kind of thing people should be extremely averse to, such that even if there were plausibly good justifications, they wouldn’t just win out constantly, other justifications for not doing the thing would usually be found. Like, there are great reasons for writing epic treatises on abstract topics, but somehow, most people find that they don’t get around to it. I expect going to some huge effort to travel overseas and die in the mud to be more like that, intuitively.

    To be clear, I’m not confused here about people fighting in defense of things they care a lot about—joining the army when their country is under attack, or joining the Allies in WWII. And I’m not confused by people who are forced to fight, by conscription or desperate need of money. It’s just that across these various sources on history, I haven’t seen much comprehensible-to-me explanation of what’s going on in the minds of the people who volunteer to go to war (or take part in smaller dangerous violence) when the stakes aren’t already at the life or death level for them.

    I am also not criticizing the people whose motives I am confused by–I’m confident that I’m missing things.

    It’s like if I woke up tomorrow to find that half the country was volunteering to cut off their little finger for charity, I’d be pretty surprised. And if upon inquiring, each person had something to say—about how it was a good charity, or how suffering is brave and valiant, or how their Dad did it already, or how they were being emotionally manipulated by someone else who wanted it to happen, or how they wanted to be part of something—each one might not be that unlikely, but I’d still feel overall super confused, at a high level, at there being enough total energy behind this, given that it’s a pretty costly thing to do.

    At first glance, the historical people heading off to war don’t feel surprising. But I feel like this is because it is taken for granted as what historical people do. Just as in stories about Christmas, it is taken for granted that Santa Claus will make and distribute billions of toys, because that’s what he does, even though his motives are actually fairly opaque. But historical people presumably had internal lives that would be recognizable to me. What did it look like from the inside, to hear that WWI was starting, and hurry to sign up? Or to volunteer for the French military in time to fight to maintain French control in Vietnam, in the First Indochina War that preceded the Vietnam War?

    I’d feel less surprised in a world where deadly conflict was more like cannibalism is in our world. Where yes, technically humans are edible, so if you are hungry enough you can eat them, but it is extremely rare for it to get to that, because nobody wants to be on any side of it, and they have very strong and consistent feelings about that, and if anyone really wanted to eat thousands or millions of people, say to bolster their personal or group power, it would be prohibitively expensive in terms of money or social capital to overcome the universal distaste for this idea.

  • The wild

    The idea of art about nature doesn’t sound exciting to me in the abstract. Perhaps I remember that I am evolutionarily supposed to see it and go, ‘oh fantastic, it’s green and blue near each other, maybe I’m in for some reproductive success’, and that doesn’t sound very inspiring. (Yes, I know that simple evolutionary situations can feel inspiring from the inside.)

    But a kind of art about nature that I can especially get behind is that which to me evokes some of the wild alien vigor of nature, that I sometimes experience for instance sitting in my overgrown back garden, contending with a flock of ascendant giant poppy faces and a stilled frenzy of branches gaining ground and sky about my seat.

    It is a sharp exotic aliveness, an electric intent for living that wants to overwhelm the strictures of physical space and come curling and unfolding through the air, with an explosive energy that has no truck with time.

    Not alien like an alien mind, but more like an alien spirituality, that doesn’t know anything so mundane as minds. But while you revere under the perfect spires of goodness, it comes in unexpected and unintelligible from the edges and without looking you in the eye, just overwhelms you with how what really made the world is intensely here and intensely not in your understanding. This was always the world.

    “Everything wants to live”, my botanist grandfather would explain to me. Perhaps for plants it is often more borderline whether living is in store. Regardless, their unrelenting striving for it is more visceral to me than that of animals. Animals are more placidly set in their bodies. Plants take every opportunity to push outwards, expanding into new limbs and thickening out old ones in their unthinking, unfailing fervor.

    A lot of nature in art isn’t like this. Often it is idyllic, or evocative at a landscape level, or sleepy furniture for a human story. But here is some art that does portray it somewhat, for me, though I don’t know if the artists actually meant to:

    The Hill Path, Ville d’Avray, by Alfred Sisley

    Irises, Vincent Van Gogh (1889)

    Myrtle tree in rainforest at Mount Anne, southwest Tasmania, 1984 , by Peter Dombrovskis (under copyright)

    I have some of the same image with these lines:

    “Reduced to a Gothic skeleton, the abbey is penetrated by beauty from above and below, open to precisely those elements it had once hoped to frame for pious young men, as an object for their patient contemplation. But that form of holy concentration has now been gone longer than it was ever here. It was already an ancient memory two hundred years ago, when Wordsworth came by. Thistles sprout between the stones. The rain comes in. Roofless, floorless, glassless, “green to the very door”—now Tintern is forced to accept the holiness that is everywhere in everything.” - Zadie Smith, Some notes on Attunement

  • Blog plant

    I like to have plants in honor of things I’ve done. I’m not consistent or proportionate about it, and admittedly I also have quite a few plants that I intend to be in honor of things I haven’t yet done. But now that this blog has apparently functional comments and images and analytics, as well as words on pages on the internet, I declare it fully bloggy, and my effort to make a blog complete. Here’s my blog plant:


    I welcome perspectives on good marginal improvements toward WSSP being a pleasing blog to interact with. (I’m aware that the subscription possibilities are not as salient as they could be.) It is a static site made using Jekyll and kept on Github, with its comments looked after by Disqus and its analytics by Google and its pictures by Photobucket.

  • Why quantitative methods are heartwarming

    From Twitter:

    I haven’t actually seen Moneyball, but it does sound heartwarming, and I have had to hide my tears when someone described a payment app their company was working on, so I’m probably in Julia’s category here.

    If I didn’t feel this way though, reading this I might imagine it as some alien nerdly aberration, and not a way that I could feel from the inside, or that would seem the ‘right’ way to feel unless I became brain-damaged. Which I think is all wrong—such feelings seem to me to be a warm and human response to appreciating the situation in certain ways. So I want to try to describe what seems to be going on in my mind when my heart is warmed by quantitative methods and efficient algorithms.

    When using good quantitative methods makes something better, it means that there wasn’t any concrete physical obstacle to it being better in the past. We were just making the wrong choices, because we didn’t know better. And often suffering small losses from it at a scale that is hard to imagine.

    Suppose the pricing algorithm for ride sharing isn’t as good as it could be. Then day after day there will be people who decide to walk even though they are tired, people who wait somewhere they don’t feel safe for a bit longer, countless people who stand in their hallway a bit longer, people who save up their health problems a bit more before making the expensive trip to a doctor, people who decide to keep a convenient car and so have a little bit less money for everything else. All while someone who would happily drive each of them at a price they would happily pay lives nearby, suffering for lack of valuable work.

    I’m not too concerned if we make bad choices in baseball, but in lots of areas, I imagine that there are these slow-accreting tragedies, in thousands or millions or billions of small inconveniences and pains accruing each day across the country or the world. And where this is for lack of good algorithms, it feels like it is for absolutely nothing. Just unforced error.

    Daily efforts and suffering for nothing are a particular flavor of badness. Like if someone erroneously believed that it was important for them to count to five thousand out loud at 10am each day, and every day they did this—and if they traveled they made sure there would be somewhere non-disturbing to do it, and if they stayed up late they got up by 10am; and if they were doing something they stepped out—there would be a particular elation in them escaping this senseless waste of their life, perhaps mixed with sorrow for what had been senselessly lost.

    Also, having found the better method, you can usually just do it at no extra cost forever. So it feels reelingly scalable in a way that a hero fighting a bad guy definitively does not. This feels like suddenly being able to fly, or walk through walls.

    So basically, it is some combination of escape from a senseless corrosion of life, effortlessly, at a scale that leaves me reeling.

    Another thing that might be going on, is that it is a triumph of what is definitely right over what is definitely wrong. Lots of moral issues are fraught in some way. No humans are absolutely bad and without a side to the story. But worse quantitative methods are just straightforwardly wrong. The only reason for picking baseball players badly is not knowing how to do it better. The only reason for using worse estimates for covid risk is that you don’t have better ones. So a victory for better quantitative methods is an unsullied victory for light over darkness in a way that conflicts between human forces of good and bad can’t be.

    Yet another thing is that a victory for quantitative methods is always a victory for people. And if you don’t know who they are, that means that they quietly worked to end some ongoing blight on humanity, and did it, and weren’t even recognized. Often, even the good they did will look like a boring technical detail and won’t look morally important, because saving every American ten seconds doesn’t look like saving a life. And I’m not sure if there is anything more heartwarming than someone working hard to do great good, relieving the world from ongoing suffering, knowing that neither they nor what they have given will be appreciated.

  • The institution of email

    There seems to be a common phenomenon where people get messages, then fail to respond to them, then feel bad. And the rarer strategy of actually dealing with all of one’s emails promptly doesn’t even seem obviously better. Was that how things were with letters or telegrams? Is it just that there are so many messages now, because they are easy to send?

    Could email, say, have gone a different way?

    I act as though I’m assuming email involves implied norms, though I didn’t agree to them. For instance, if someone writes me a personal message, I think I should not delete it without reading it, unless they deserve my disregard for some reason. If someone sends me a message asking me to do something, I act like I think I should do it in a timely fashion, if it is a reasonable request. If I write to someone, I feel like I should make it less terse than ‘ok.’. At least, many people seem to constantly feel bad for failing to uphold some standards that are I guess implicitly bought into by having an email address. (And not having an email address would be pretty wild.)

    Was this bad luck, and we could instead have developed happier norms? For instance, that there is no expectation that one read emails, or that responding in about a month is proper, or that terseness is expected, as with text messages? Or the norm that you are meant to respond within a few seconds, or consider it dead, like verbal speech? Or the norm where if you want to put something in someone else’s to-do list, you find a slot in their Calendly and send them money for it? My guess is no - the norms are mostly implied by the length and cost and permanence of the form. Could we have a form that better implies happier norms in that case?

  • Flights of wonder

    Today I put up the last page in AI Impacts’ (primarily Ronny Fernandez’s) investigation into how human-made flying machines compare to evolved ones. (Relevant to how we expect human efforts to build minds to compare to evolved minds.) Evolution mostly won.

    Some other interesting things I learned in the process of editing this:

    • Monarch butterflies can probably fly at least 100km on the energy in about a quarter of a raspberry (not counting the energy they get from the wind, which seems to be central to their flying methods. And not to suggest that they eat raspberries, I am just more familiar with those than flower nectar).
    • People who estimate monarch butterfly ‘performance parameters’ sometimes do so by attaching plasticine to dead butterflies to get them to the right weight and balance, and then hand-throwing them across the room and noting measurements for those “in which no obvious pitching up or stalling occurred after release” (or at least they did in 1979).
    • Paramotors are a thing. They are like a giant fan you wear on your back to turn your paraglider into a powered vehicle.
    • A model airplane crossed the Atlantic on a gallon of fuel (this was the furthest per Joule of the machines in our collection).
    • This pedal-powered flying machine crossed the English Channel.

      This might be a slightly different one: Gossamer Albatross with bikes, maybe

  • Unexplored modes of language

    English can be communicated via 2D symbols that can be drawn on paper using a hand and seen with eyes, or via sounds that can be made with a mouth and heard by ears.

    These two forms are the same language because the mouth sounds and drawn symbols correspond at the level of words (and usually as far as sounds and letters, at least substantially). That is, if I write ‘ambition’, there is a specific mouth sound that you would use if converting it to spoken English, whereas if you were converting it to spoken French, there might not be a natural equivalent.

    As far as I know, most popular languages are like this: they have a mouth-sound version and a hand-drawn (or hand-typed) version. They often have a braille version, with symbols that can be felt by touch instead of vision. An exception is sign languages (which are generally not just alternate versions of spoken languages), which use 4-D symbols gestured by hands over time, and received by eyes.

    I wonder whether there are more modes of languages that it would be good to have. Would we have them, if there were? It’s not clear from a brief perusal of Wikipedia that Europe had sophisticated sign languages prior to about five hundred years ago. Communication methods generally have strong network effects—it’s not worth communicating by some method that nobody can understand, just like it’s not worth joining an empty dating site—and new physical modes of English are much more expensive than for instance new messaging platforms, and have nobody to promote them.

    Uncommon modes of language that seem potentially good (an uninformed brainstorm):

    • symbols drawn with hands on receiver’s skin, received by touch. I’ve heard of blind and deaf people such as Helen Keller using this, but it seems useful for instance when it is loud, or when you don’t want to be overheard or to annoy people nearby, or for covert communication under the table at a larger event, or for when you are wearing a giant face mask.
    • symbols gestured with whole body, like interpretive dance, but with objective interpretation. Good from a distance, when loud, etc. Perhaps conducive to different sorts of expressiveness, like how verbal communication makes singing with lyrics possible, and there is complementarity between the words and the music.
    • symbols gestured with whole body, interpreted by computer, received as written text. What if keyboards were like a Kinect dance game? Instead of using your treadmill desk while you type with your hands, you just type with your arms, legs and body in a virtual reality whole-body keyboard space. Mostly good for exercise, non-sedentariness, feeling alive, etc.
    • drumming/tapping, received by ears or touch. Possibly faster than spoken language, because precise sounds can be very fast. I don’t know. This doesn’t really sound good.
    • a sign version of English. This exists, but is rare. Good for when it is loud, when you don’t want to be overheard, when you are wearing a giant face mask or are opposed to exhaling too much on the other person, when you are at a distance, etc.
    • symbols drawn with hands in one place, e.g. the surface of a phone, or a small number of phone buttons, such that you could enter stuff on your phone by tapping your fingers in place in a comfortable position with the hand you were holding it with, preferably still in your pocket, rather than awkwardly moving them around on the surface while you hold it either with another hand or some non-moving parts of the same hand, and having to look at the screen while you do it. This could be combined with the first one on this list.
    • What else?

    Maybe if there’s a really good one, we could overcome the network effect with an assurance contract. (Or try to, and learn more about why assurance contracts aren’t used more.)

  • Walk to a San Francisco FedEx

    Today I had three work projects so close to done that I might be able to put something up on each, which would usually encourage work enthusiasm. But when I started on the first, I was struck by a strong inclination to stop and do something else. I didn’t immediately, but the inclination remained. And such inclinations make work worse in themselves, because when each new sentence or next motion engenders a little flinch away from it, the whole thing gets so slow and encumbered and hard to concentrate on that it makes sense to be repelled by it. And the thought of moving to the next little bit on the other projects seemed similarly distasteful.

    Sitting in my chair was likewise aversive, and I thought perhaps a walk would help. (Often I find similar work compelling, so it seems a shame to wrestle myself into a slow and encumbered version of it at a time when I don’t.) I also had a package to take to FedEx. If I listened to the Alignment Newsletter Podcast while I walked and delivered the package, I migh