Workshop on Research Evaluation
Friday, May 10, 2013 - Free University of Bozen-Bolzano, Italy
The invalidity of the most popular research performance indicators and rankings
Research performance should be evaluated with respect to the specific goals and objectives to be achieved, which leaves ample margins to the creative bibliometrician. Nevertheless, the impression is that most rankings and indicators are largely based on what can easily be counted rather than "what really counts". In fact, theory should always drive measurement. Because research activity is a production process, it is mainly the economic theory of production which should drive the formulation of indicators and methods of assessment. The quintessential performance indicator of any production unit is productivity. I will show how the most popular indicators, such as the h-index and the new crown indicator, fall short of measuring real research performance. I will then present a proxy indicator of labor productivity, named Fractional Scientific Strength, and measure the ranking distortions when performance is calculated using the most popular indicators.
Evaluating the unevaluable, or: two kinds of peer
The presentation argues that evaluating scientific research amounts to subjecting it to a procedure that is alien to its nature. Systematic evaluation lays a stress on scientific research that threatens to uproot it from its relation to truth and to transform it into a performance-driven practice. The pivotal figure of research evaluation is the so-called peer, who, from being one who has the right to be judged by his equals, has unawares turned into a functionary enlisted for the purpose of implementing this transformation.
Scientific Research Measures
We propose a family of Scientific Research Measures (SRM), based on bibliometrics and characteristics of the journals, that are: flexible, to fit the peculiarities of different areas and/or ages; calibrated to the scientific community; coherent, as they share the same structural properties; and inclusive, as they comprehend several popular indices. We associate to each author a vector of citations X, where the n-th component of X represents the number of citations of the n-th publication, and the components of X are ranked in decreasing order. We consider a family of performance curves, monotone increasing in q, and we define the Scientific Research Measure accordingly. The balance between the number of citations and the number of publications depends on the scientific area and age (seniority) and is captured by the family of performance curves, which must be calibrated from data. We also discuss the dual representation of such SRM, which provides additional novel features to this research evaluation method.
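The explicit SRM formula is not reproduced in this abstract. As a purely illustrative sketch (our own, not the authors' definition), a threshold-style measure over the decreasingly ranked citation vector shows how a family of performance curves can be plugged in, with the identity curve recovering the h-index as one member of the family:

```python
def ranked_citations(citations):
    """Sort an author's citation counts in decreasing order (the vector X)."""
    return sorted(citations, reverse=True)

def threshold_measure(citations, curve):
    """Illustrative threshold-style measure: the largest n such that the
    n-th ranked citation count is at least curve(n). With curve(n) = n
    this reduces to the h-index."""
    x = ranked_citations(citations)
    score = 0
    for n, c in enumerate(x, start=1):
        if c >= curve(n):
            score = n
        else:
            break
    return score

# With the identity curve this recovers the h-index:
print(threshold_measure([10, 8, 5, 4, 3, 0], lambda n: n))  # → 4
# A steeper curve demands more citations per rank and gives a lower score:
print(threshold_measure([10, 8, 5, 4, 3, 0], lambda n: 2 * n))  # → 2
```

Different choices of `curve` shift the balance between the number of publications and the citations each receives, which is the calibration question the abstract raises.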
Can the scientific assessment process be fair?
We will examine the very fundamental properties of impact functions, that is, the aggregation operators which may be used, for example, in the assessment of scientists by means of the citations received by their papers. It turns out that each impact function that gives noncontroversial valuations in disputable cases must necessarily be trivial. Moreover, we will show that for any set of authors with ambiguous citation records, we may construct an impact function that gives ANY desired ordering of the authors. Theoretically, then, there is considerable room for manipulation.
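The manipulation claim can be made concrete with a toy example (the data and weight vectors are invented for illustration): for two authors with ambiguous citation records, different weight choices in a simple weighted-sum impact function reverse the ordering.

```python
def weighted_impact(citations, weights):
    """Weighted-sum impact function over the decreasingly ranked citation vector."""
    x = sorted(citations, reverse=True)
    return sum(w * c for w, c in zip(weights, x))

a = [9, 1, 1]   # one highly cited paper
b = [4, 4, 4]   # evenly cited papers

top_heavy = [1.0, 0.1, 0.1]   # rewards a single hit
flat      = [1.0, 1.0, 1.0]   # plain citation total

print(weighted_impact(a, top_heavy), weighted_impact(b, top_heavy))  # 9.2 vs 4.8 → a ahead
print(weighted_impact(a, flat), weighted_impact(b, flat))            # 11.0 vs 12.0 → b ahead
```

Since neither record dominates the other, the evaluator's choice of weights, rather than the records themselves, decides who ranks first.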
A novel indicator to select a subset of elite papers, based on citers
The goal of this presentation is to introduce the citer-success-index (cs-index), i.e., an indicator that uses the number of different citers as a proxy for the impact of a generic set of papers. For each of the articles of interest, a comparison term is defined – representing the number of citers that an article published in a certain period and scientific field is expected, on average, to "infect" – which is compared with the actual number of citers of the article. Similarly to the recently proposed success-index (Franceschini et al., Scientometrics 92(3):621-641, 2011), the cs-index allows the selection of a subset of "elite" papers. Some advantages of the cs-index are that (i) it can be applied to multidisciplinary groups of papers, thanks to the field normalization it achieves at the level of the individual paper, and (ii) it is not significantly affected by self-citers and recurrent citers. Its main drawback is its computational complexity.
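A minimal sketch of the distinct-citer count at the heart of the cs-index (the comparison term is a made-up constant here; the actual field- and age-dependent normalization is the subject of the talk):

```python
def distinct_citers(citing_papers):
    """Count the distinct citing authors ('citers') of one paper.
    Input: a list of citing papers, each given as its list of authors."""
    return len({author for paper in citing_papers for author in paper})

def elite_subset(papers, comparison_term):
    """Select papers whose distinct-citer count exceeds the comparison
    term; a single constant stands in for the per-paper expected value."""
    return [name for name, citers in papers.items()
            if distinct_citers(citers) > comparison_term]

papers = {
    "P1": [["A", "B"], ["B", "C"], ["D"]],   # distinct citers: A, B, C, D → 4
    "P2": [["A"], ["A"]],                    # a recurrent citer counts once → 1
}
print(elite_subset(papers, comparison_term=2))  # → ['P1']
```

The example also shows why recurrent citers barely move the indicator: each citing author is counted once no matter how many times they cite the paper.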
Who is better: a cow or a horse?
Hirsch in 2005 opened a Pandora's box of different indices quantifying the publication activity of scientific individuals. We will discuss several such indices and present a general framework for them. After discussing the pros and cons of individual indices, we will consider them from the point of view of integration. The total number of citations can be seen as a standard Lebesgue (Choquet) integral, with total compensation between the number of papers and the number of citations. The h-index, on the other hand, admits no compensation and corresponds to the Sugeno integral. Several other types of integrals can be proposed, admitting compensation to some degree. In the next part, we will open the problem of co-authors and the problem of the date of publication. Finally, the problem of comparing citation-based scientific impact across different fields of science will be opened.
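The two ends of the compensation spectrum can be illustrated numerically (a sketch with invented data; for a decreasingly ranked citation vector x, the discrete Sugeno form of the h-index is max over n of min(n, x_n), while the total citation count is the plain sum):

```python
def total_citations(citations):
    """'Lebesgue-style' aggregation: full compensation between papers and citations."""
    return sum(citations)

def h_index(citations):
    """'Sugeno-style' aggregation, h = max_n min(n, x_n) over the
    decreasingly ranked citation vector; no compensation."""
    x = sorted(citations, reverse=True)
    return max((min(n, c) for n, c in enumerate(x, start=1)), default=0)

cow   = [100, 1, 1, 1]   # one blockbuster paper
horse = [5, 5, 5, 5]     # steady, even output

print(total_citations(cow), h_index(cow))      # 103 1
print(total_citations(horse), h_index(horse))  # 20 4
```

The compensating total ranks the cow first (103 vs 20), while the non-compensating h-index ranks the horse first (4 vs 1): which animal is "better" depends entirely on the integral chosen.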
* The star indicates the speaker