Named entity recognition: challenges in document annotation, gazetteer construction and disambiguation