A two-sided academic landscape: snapshot of highly-cited documents in Google Scholar (1950-2013)
DOI: https://doi.org/10.3989/redc.2016.4.1405
Keywords: Google Scholar, academic search engines, highly-cited documents, academic books, open access
Abstract
The main objective of this paper is to identify and define the core characteristics of the set of highly-cited documents in Google Scholar (document types, language, free availability, sources, and number of versions), on the hypothesis that the wide coverage of this search engine may provide a different portrait of these documents from that offered by traditional bibliographic databases. To this end, one query per year was run for each year from 1950 to 2013, retrieving the top 1,000 documents that Google Scholar returned for that year and yielding a final sample of 64,000 documents, 40% of which provided a free link to the full text. The results show that the average highly-cited document is a journal or book article (62% of the top 1% most cited documents of the sample), written in English (92.5% of all documents) and available online in PDF format (86.0% of all documents). The existence of errors should nevertheless be noted, especially in detecting duplicates and linking citations correctly, although the focus on highly-cited papers minimizes the effects of these limitations. Given the high presence of books and, to a lesser extent, of other document types (such as proceedings or reports), the present research concludes that Google Scholar data offer an original and different view of the most influential academic documents (measured by their citation count), a set composed not only of strictly scientific material (journal articles) but also of academic material in its broadest sense.
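The sample size reported in the abstract follows from simple arithmetic, which the short sketch below reproduces. It is only an illustration: the variable names are hypothetical, the figures are the ones given in the abstract, and the count of documents with a free full-text link is an approximation derived from the rounded 40% share, not a number stated by the authors.

```python
# Illustrative check of the sample design described in the abstract.
# All names are hypothetical; only the figures come from the abstract.
FIRST_YEAR, LAST_YEAR = 1950, 2013   # one query per year, both ends included
TOP_N_PER_YEAR = 1000                # top documents kept for each yearly query
FREE_FULL_TEXT_SHARE = 0.40          # "40%" as reported; a rounded share

years = range(FIRST_YEAR, LAST_YEAR + 1)
sample_size = len(years) * TOP_N_PER_YEAR

# Assumption: applying the rounded share gives only an approximate count.
approx_free = round(sample_size * FREE_FULL_TEXT_SHARE)

print(f"years queried: {len(years)}")
print(f"sample size: {sample_size}")
print(f"approx. documents with a free full-text link: {approx_free}")
```

The inclusive year range is the detail most easily missed: 1950 to 2013 spans 64 years, not 63, which is what makes the total 64,000.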
License
Copyright (c) 2016 Consejo Superior de Investigaciones Científicas (CSIC)

This work is licensed under a Creative Commons Attribution 4.0 International License.
© CSIC. Manuscripts published in both the print and online versions of this journal are the property of the Consejo Superior de Investigaciones Científicas, and quoting this source is a requirement for any partial or full reproduction.
All contents of this electronic edition, except where otherwise noted, are distributed under a Creative Commons Attribution 4.0 International (CC BY 4.0) licence. You may read here the basic information and the legal text of the licence. The indication of the CC BY 4.0 licence must be expressly stated in this way when necessary.
Self-archiving in repositories, personal webpages or similar, of any version other than the final version of the work produced by the publisher, is not allowed.