Stanford, UMass Amherst develop algorithms that train AI to avoid specific misbehaviors

As robots, self-driving cars and other intelligent machines weave AI into everyday life, a new way of designing algorithms can help machine-learning developers build in safeguards against specific undesirable outcomes, such as racial and gender bias, and help earn societal trust.
Nov 21, 2019
A team of researchers at Stanford and the University of Massachusetts Amherst published a paper in Science outlining a new technique that translates a goal, such as avoiding gender or racial bias, into mathematical criteria that a machine-learning algorithm can use to train an AI application to avoid that behavior. "We want to advance AI that respects the values of its human users and justifies the trust we place in autonomous systems," said Emma Brunskill, an assistant professor of computer science at Stanford and senior author of the paper.
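The core idea can be illustrated with a small sketch. The snippet below is a hypothetical, simplified illustration (not the researchers' actual code): a user-supplied measure of undesirable behavior is evaluated on held-out data, and a high-confidence statistical bound (here, a Hoeffding bound, assumed for illustration) decides whether the trained model may be returned at all. If the constraint cannot be certified, the procedure returns "No Solution Found" rather than deploying a model that might misbehave. The function and parameter names are invented for this example.

```python
import numpy as np

def hoeffding_upper_bound(samples, delta):
    """One-sided Hoeffding upper confidence bound on the mean of samples
    assumed to lie in [0, 1]; holds with probability at least 1 - delta."""
    n = len(samples)
    return float(np.mean(samples)) + np.sqrt(np.log(1.0 / delta) / (2.0 * n))

def safety_test(candidate_model, unfairness_measure, safety_data,
                threshold, delta=0.05):
    """Return the candidate model only if, with confidence 1 - delta, its
    average unfairness (a user-defined metric in [0, 1]) is below
    `threshold`; otherwise refuse to return a model."""
    samples = np.array([unfairness_measure(candidate_model, x)
                        for x in safety_data])
    if hoeffding_upper_bound(samples, delta) <= threshold:
        return candidate_model
    return "No Solution Found"
```

A developer would plug in their own `unfairness_measure`, e.g. a per-example indicator of a disparate prediction across groups; the key design choice is that the safeguard is checked statistically before the model is released, instead of being left to post-hoc auditing.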