The Behaviors of BERT Attention Heads in Stereotype Detection