June 14, 2020

Misconceptions in the AI community about technical AI safety

Source: A lot of this is copied from this doc

  • AI safety is about avoiding Terminator/Skynet/robot apocalypse, not about problems with technical merit:
    • Actually, AI safety is largely about how to get AI agents to generate the behavior that we want in spite of our inability to correctly specify good objectives/rewards/loss functions/cost functions. This is a clearly defined technical problem (see the toy sketch after this list).
    • [Note: this is a problem already, and it’s only going to get worse as action spaces increase in size and planning/RL capability improves.]
    • [Note: this is not the trolley problem.]
    • [Note: articles like this one are not helping AI safety’s image: https://www.smithsonianmag.com/innovation/what-happens-when-artificial-intelligence-turns-us-180949415/ ]
  • Related: AI safety is about worries over machine consciousness.
    • See above.
  • Related: Elon Musk is a spokesperson for AI safety / Elon is the reason folks are working on AI safety.
    • Elon is an individual with his own opinions, but see above about the technical problem we’re going after.
  • Related: Everyone in AI safety buys into the deep learning hype and thinks very capable AI/AGI/superintelligence is just around the corner.
    • Actually, the community ranges widely in its skepticism or optimism, but everyone thinks the problem statement above is important (see why below). Yes, there are people who believe there is no fundamental problem left to solve on the path to AGI, but there are also many who strongly disagree.
  • Corollary: the only reason people work in AI safety is because they think AGI/superintelligence will happen very soon.
    • Actually, even if this is a very long-term issue, it is important for some people to start thinking about it now, because:
      • Some folks, especially in academia or in labs with a lot of freedom, need to think about long-term problems.
      • Much of the work is applicable in the short term to less capable or narrower AI systems with misspecified objectives.
      • It takes time to figure these problems out.
      • What if by some chance it does happen sooner, N decades from now?
      • The solution might mean redefining the AI problem away from rationality in isolation (an agent that optimizes an exogenously specified reward function) toward defining the rationality of a human-robot system. This means that work on solutions to rationality in isolation might not be useful in the long term.
  • AI safety researchers are doing it as a PR move to further their careers.
    • People are very passionate about figuring out how to build AI, long term, in a way that’s purely a plus to humanity.
    • If anything, in the current academic environment it is a PR hazard for AI researchers.
  • AI safety researchers are feeding the media frenzy about Terminator.
    • Much like it misreports other kinds of AI work, the media likes to exaggerate and skew AI safety work.
    • AI safety researchers have far less of a media voice than the average AI researcher thinks.
  • Related: AI safety is a new concern that comes from hype.
    • AI safety was discussed by many of the field’s founding figures and has been a topic since the field’s inception (e.g., Turing, Wiener).
  • Related: AI safety researchers choose examples that are unrealistic and incendiary.
    • We should be clearer, but examples like the paperclip maximizer are easy tools for illustrating that rationality in isolation is insufficient unless we find ways for agents to do the right thing despite our inability to think of every edge case / the optimal solution ahead of time. No, we won’t create an agent that turns the universe into paperclips, but we have already released agents that ended up showing us videos that skew our beliefs or polarize us; these are side effects of not having specified everything we care about.
  • There are too many people working on such a long-term problem / there are too many resources being allocated to it.
    • It’s less than 1% of AI funding, and only about 50 researchers.
    • It’s important work: not everyone should do it, but someone definitely should.
    • It’s work that’s actually useful in the short term too.
  • Related: AI safety distracts attention from fairness/capability/…
    • That’s like saying that working on biology distracts attention from chemistry.
    • It’s work that’s actually useful in the short term too.
  • Related: AI safety research is only useful if AGI is likely to emerge
    • Small probabilities of high-impact outcomes still matter.
    • AI safety feeds back into building good AI systems and is basic research on the problem
  • AI safety research is only about AGI/superintelligence
    • Most AI safety researchers work on long-, medium-, and short-term safety problems (including transparency, explainability, dual use, and even fairness).
    • Again, the alignment problem is just as applicable to current AI systems.
  • This is not real safety. Real safety is physical safety, typically defined in terms of collisions, safe sets, keep-out sets, and stability.
    • There is a technical definition that is broader than physical safety: a constraint on regret, where regret is measured with respect to the human’s actual objective, which the agent does not get to observe (the toy sketch after this list computes exactly this kind of regret). Some of us call this beneficial AI instead of safe AI.
  • Major goals for communicating to the AI community:
    • Establish that there is technical merit, inform about the technical problem.
    • Also, alignment is a problem today too!
    • Establish that it is not about robots taking over the world (and that the examples are hyperbolic illustrations of what happens when you keep the current definition of AI unchanged and extrapolate).
    • Establish that this is worthwhile for someone to work on even if it is only a long-term issue, but point out that there are short-term systems that suffer from the same problem.
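
To make the objective-misspecification and regret points above concrete, here is a minimal toy sketch (the action names and reward numbers are my own hypothetical illustration, not from the source or any real system): a rational agent optimizes the proxy reward it was given, and we compute the regret it incurs with respect to the true objective it never observes.

```python
# Toy sketch with illustrative (made-up) actions and reward values.
# The agent rationally optimizes the proxy reward it was given; regret is
# measured against the true objective, which the agent never observes.

actions = ["show_clickbait", "show_balanced", "show_nothing"]

# The misspecified proxy: what we told the agent to optimize (e.g., clicks).
proxy_reward = {"show_clickbait": 1.0, "show_balanced": 0.6, "show_nothing": 0.0}

# What we actually care about; the agent never sees this.
true_reward = {"show_clickbait": -0.5, "show_balanced": 0.8, "show_nothing": 0.0}

# A perfectly rational agent, optimizing in isolation, picks the proxy-optimal action.
chosen = max(actions, key=lambda a: proxy_reward[a])

# Regret: true value lost relative to the best action under the human's actual objective.
regret = max(true_reward.values()) - true_reward[chosen]

print(f"agent chooses: {chosen}")                     # show_clickbait
print(f"regret w.r.t. true objective: {regret:.2f}")  # 1.30
```

The point of the sketch is that the agent is not malfunctioning: it does exactly what its objective says, and the harm comes entirely from the gap between the proxy and the true objective.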

Updated Jul 03, 2020


#Literature-notes #AI-safety