Performance evaluation of general-domain question-answering systems

María-Dolores Olvera-Lobo; Juncal Gutiérrez-Artacho

doi:10.3989/redc.2013.2.921

Authors

María-Dolores Olvera-Lobo: CSIC, Unidad Asociada Grupo SCImago, Madrid; Departamento de Información y Documentación, Universidad de Granada
Juncal Gutiérrez-Artacho: Departamento de Traducción e Interpretación, Universidad de Granada

Keywords:

Question-answering systems, performance analysis, definitional questions, factoid questions, list questions, evaluation

Abstract

Information overload is felt more strongly on the Web than elsewhere. Question-answering (QA) systems are considered an alternative to traditional information retrieval systems because they give correct, understandable answers rather than just a list of documents. Four QA systems available online were analyzed: START, QuALiM, SEMOTE, and TrueKnowledge. They were tested with a wide range of questions that called for definitions, facts, and closed lists across different subject areas. The answers were assessed with several specific measures (MRR, TRR, FHS, MAP, and precision). The results are encouraging: although the systems differ from one another, all are potentially valid for retrieving precise information of diverse types and across subject areas.
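
The measures named in the abstract can be illustrated with a minimal Python sketch. It assumes each question yields a ranked list of answers that a human assessor has marked correct or incorrect. The function names, the sample judgments, and the choice to normalize average precision by the number of correct answers actually returned are assumptions made for illustration, not the procedure used in the study.

```python
from typing import Dict, List

def reciprocal_rank(judgments: List[bool]) -> float:
    """1/rank of the first correct answer; 0.0 if no answer is correct."""
    for rank, correct in enumerate(judgments, start=1):
        if correct:
            return 1.0 / rank
    return 0.0

def total_reciprocal_rank(judgments: List[bool]) -> float:
    """Sum of 1/rank over every correct answer in the ranked list."""
    return sum(1.0 / rank
               for rank, correct in enumerate(judgments, start=1) if correct)

def first_hit_success(judgments: List[bool]) -> float:
    """1.0 if the top-ranked answer is correct, otherwise 0.0."""
    return 1.0 if judgments and judgments[0] else 0.0

def precision(judgments: List[bool]) -> float:
    """Share of returned answers that are correct."""
    return sum(judgments) / len(judgments) if judgments else 0.0

def average_precision(judgments: List[bool]) -> float:
    """Mean of precision-at-rank over the ranks holding a correct answer.

    Assumption: normalized by the correct answers returned, since the
    total number of correct answers that exist is not known here.
    """
    hits, total = 0, 0.0
    for rank, correct in enumerate(judgments, start=1):
        if correct:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

def mean(values: List[float]) -> float:
    """Arithmetic mean; 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0

# Hypothetical judgments for two questions (True = answer judged correct).
runs: Dict[str, List[bool]] = {
    "q1": [False, True, True],
    "q2": [True, False, False],
}

mrr = mean([reciprocal_rank(j) for j in runs.values()])         # MRR
trr = mean([total_reciprocal_rank(j) for j in runs.values()])   # mean TRR
fhs = mean([first_hit_success(j) for j in runs.values()])       # FHS
map_score = mean([average_precision(j) for j in runs.values()]) # MAP
mean_precision = mean([precision(j) for j in runs.values()])

print(f"MRR={mrr:.3f} TRR={trr:.3f} FHS={fhs:.3f} "
      f"MAP={map_score:.3f} P={mean_precision:.3f}")
```

Averaging each per-question score over the question set gives one figure per system, which is how systems such as those compared here could be placed side by side.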
